This topic describes how to use the data synchronization feature of DataWorks to migrate data from MaxCompute to Object Storage Service (OSS).

Prerequisites

Procedure

  1. Create a table in the DataWorks console.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where the target workspace resides. Find the target workspace and click Data Analytics in the Actions column.
    4. Right-click a created workflow and choose New > MaxCompute > Table.
    5. On the Create Table page, select the engine type and enter a table name.
    6. On the table editing page, click DDL Statement.
    7. In the DDL Statement dialog box, enter the following statement and click Generate Table Schema:
      create table Transs
      (name    string,
      id    string,
      gender    string);
    8. Click Submit to Production Environment.
  2. Import data into the Transs table.
    1. Click Import on the DataStudio page.
    2. In the Data Import Wizard dialog box that appears, enter at least three letters to search for the table into which you want to import data, and then click Next.
    3. In the dialog box that appears, set Select Data Import Method to Upload Local File and click Browse next to Select File. Select the local file that you want to import. Then, specify other parameters.
      Example:
      qwe,145,F
      asd,256,F
      xzc,345,M
      rgth,234,F
      ert,456,F
      dfg,12,M
      tyj,4,M
      bfg,245,M
      nrtjeryj,15,F
      rwh,2344,M
      trh,387,F
      srjeyj,67,M
      saerh,567,M
    4. Click Next.
    5. Select how destination table fields match the source fields.
    6. Click Import Data.
  3. Create a bucket and upload a file in the OSS console.
    1. Log on to the OSS console and create a bucket. For more information, see Create buckets.
    2. Upload the qwee.csv file to OSS. For more information, see Upload objects.
      Note: Make sure that the fields in the qwee.csv file are exactly the same as those in the Transs table.
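The field check described in the note can be scripted before you upload the file. The following Python sketch is illustrative only: it assumes the file is comma-delimited with no header row, as in the sample data, and the function name `check_csv_columns` is hypothetical. It reports every row whose field count differs from the three columns (name, id, gender) of the Transs table.

```python
import csv
import io

# Column names taken from the DDL statement for the Transs table.
TRANSS_COLUMNS = ["name", "id", "gender"]


def check_csv_columns(text, expected=TRANSS_COLUMNS):
    """Return (row_number, row) pairs whose field count differs from the table."""
    bad_rows = []
    for row_number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != len(expected):
            bad_rows.append((row_number, row))
    return bad_rows


# Rows copied from the sample data; an empty result means every row has three fields.
sample = "qwe,145,F\nasd,256,F\nxzc,345,M\n"
print(check_csv_columns(sample))

# A row with a missing field is reported together with its row number.
print(check_csv_columns("qwe,145\n"))
```

The sketch checks only the number of fields per row. It does not verify data types or field order, which you would still need to review by hand.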
  4. Add data sources in the DataWorks console.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. On the Workspaces page that appears, find the target workspace and click Data Integration in the Actions column.
    4. In the left-side navigation pane of the page that appears, click Connection. The Data Source page appears.
    5. In the upper-right corner, click New data source. In the dialog box that appears, click MaxCompute(ODPS).
    6. In the Add MaxCompute(ODPS) data source dialog box, specify the required parameters and click Complete. For more information, see Configure a MaxCompute connection.
    7. Add OSS as a data source. For more information, see Configure an OSS connection.
  5. Configure MaxCompute as the reader and OSS as the writer.
    1. Go to the Data Analytics page. Right-click the specified workflow and choose New > Data Integration > Offline Synchronization.
    2. In the Create Node dialog box, enter a node name and click Submit.
    3. In the top navigation bar, click the Conversion script icon.
    4. In script mode, click the icon that opens the Import Template dialog box.
    5. In the Import Template dialog box, specify the source type, source data source, destination type, and destination data source, and then click Confirm.
    6. Modify the JSON code and click the Run icon.
      Sample code:
      {
          "order":{
              "hops":[
                  {
                      "from":"Reader",
                      "to":"Writer"
                  }
              ]
          },
          "setting":{
              "errorLimit":{
                  "record":"0"
              },
              "speed":{
                  "concurrent":1,
                  "dmu":1,
                  "throttle":false
              }
          },
          "steps":[
              {
                  "category":"reader",
                  "name":"Reader",
                  "parameter":{
                      "column":[
                          "name",
                          "id",
                          "gender"
                      ],
                      "datasource":"odps_first",
                      "partition":[],
                      "table":"Transs"
                  },
                  "stepType":"odps"
              },
              {
                  "category":"writer",
                  "name":"Writer",
                  "parameter":{
                      "datasource":"Trans",
                      "dateFormat":"yyyy-MM-dd HH:mm:ss",
                      "encoding":"UTF-8",
                      "fieldDelimiter":",",
                      "fileFormat":"csv",
                      "nullFormat":"null",
                      "object":"qweee.csv",
                      "writeMode":"truncate"
                  },
                  "stepType":"oss"
              }
          ],
          "type":"job",
          "version":"2.0"
      }
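Before you run the node, it can help to check the script for internal consistency. The Python sketch below is an illustration under stated assumptions, not part of DataWorks: the helper `check_job` is hypothetical, and it inspects only fields that appear in the sample code (order.hops, steps, category, and name). It verifies that each hop refers to step names that exist and runs from a reader to a writer.

```python
import json

# Trimmed copy of the sample script; only the fields that the check reads are kept.
JOB_JSON = """
{
    "order": {"hops": [{"from": "Reader", "to": "Writer"}]},
    "steps": [
        {"category": "reader", "name": "Reader", "stepType": "odps",
         "parameter": {"column": ["name", "id", "gender"], "table": "Transs"}},
        {"category": "writer", "name": "Writer", "stepType": "oss",
         "parameter": {"fieldDelimiter": ",", "object": "qweee.csv"}}
    ]
}
"""


def check_job(job):
    """Return a list of problems found in a job dict (hypothetical helper)."""
    problems = []
    # Map each step name to its category so that hops can be validated.
    categories = {step["name"]: step["category"] for step in job["steps"]}
    for hop in job["order"]["hops"]:
        for end, wanted in (("from", "reader"), ("to", "writer")):
            name = hop[end]
            if name not in categories:
                problems.append(f"hop {end!r} names unknown step {name!r}")
            elif categories[name] != wanted:
                problems.append(f"step {name!r} should be a {wanted}")
    return problems


job = json.loads(JOB_JSON)
print(check_job(job) or "no problems found")
```

A check like this catches renamed or mistyped step names early. It does not validate data source names or connectivity, which are verified only when the node runs.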
  6. View the data of the newly created object in the OSS console. For more information, see Download objects.
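After you download the object, you can compare its contents with the source rows. The sketch below is a hedged illustration: it assumes that the downloaded file follows the writer settings in the sample script (comma delimiter, the string null for NULL values) and that the column order matches the reader's column list. `parse_export` is a hypothetical helper name, and it works on text that you have already downloaded rather than calling OSS.

```python
import csv
import io

COLUMNS = ["name", "id", "gender"]   # reader column order in the sample script
NULL_FORMAT = "null"                 # writer nullFormat in the sample script


def parse_export(text, delimiter=","):
    """Parse exported CSV text into dicts, mapping the null marker to None."""
    records = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        values = [None if field == NULL_FORMAT else field for field in row]
        records.append(dict(zip(COLUMNS, values)))
    return records


# Two illustrative rows; the second uses the null marker in the id column.
exported = "qwe,145,F\nasd,null,F\n"
for record in parse_export(exported):
    print(record)
```

Mapping the null marker back to `None` makes it easier to tell a missing value from a literal string when you compare the export with the rows in the Transs table.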