MaxCompute: Migrate data from MaxCompute to OSS

Last Updated: Oct 27, 2023

This topic describes how to use the data synchronization feature of DataWorks to migrate data from MaxCompute to Object Storage Service (OSS).

Prerequisites

Procedure

  1. Create a table in the DataWorks console.

    1. Log on to the DataWorks console.

    2. In the left-side navigation pane, click Workspaces.

    3. On the Workspaces page, find the workspace that you want to configure and click Data Development in the Actions column.

    4. Right-click an existing workflow and choose New > MaxCompute > Table.

    5. On the Create Table page, select the engine type and enter a table name.

    6. On the table editing page, click DDL Statement.

    7. In the DDL dialog box, enter the following CREATE TABLE statement and click Generate Table Schema.

      create table Transs
      (name    string,
      id    string,
      gender    string);
    8. Click Submit to Production Environment.

  2. Import data to the table Transs.

    1. Click Import on the DataStudio page.

    2. In the Data Import Wizard dialog box that appears, enter at least three letters to search for the table into which you want to import data, and then click Next Step.

    3. In the dialog box that appears, set Select Data Import Method to Upload Local File and click Browse next to Select File. Select the local file that you want to import and specify other parameters.

      Example:

      qwe,145,F
      asd,256,F
      xzc,345,M
      rgth,234,F
      ert,456,F
      dfg,12,M
      tyj,4,M
      bfg,245,M
      nrtjeryj,15,F
      rwh,2344,M
      trh,387,F
      srjeyj,67,M
      saerh,567,M
    4. Click Next.

    5. Select how destination table fields match the source fields.

    6. Click Import Data.

  3. Create a table in the OSS console.

    1. Log on to the OSS console and create a bucket. For more information, see Create buckets.

    2. Upload the file qwee.csv to OSS. For more information, see Upload objects.

      Note

      Make sure that fields in the file qwee.csv are exactly the same as the fields in the Transs table.

  4. Add data sources in the DataWorks console.

    1. Log on to the DataWorks console.

    2. In the left-side navigation pane, click Workspaces.

    3. On the Workspaces page that appears, find the target workspace and click Data Integration in the Actions column.

    4. In the left-side navigation pane of the Data Integration page, click Data Source to go to the Data Sources page.

    5. On the Data Sources page, click Create Data Source. In the Add data source dialog box, click MaxCompute.

    6. In the Add MaxCompute data source dialog box, configure the parameters and click Complete. For more information, see Add a MaxCompute data source.

    7. Add OSS as a data source. For more information, see Add an OSS data source.

  5. Configure MaxCompute as the reader and OSS as the writer.

    1. Go to the Data Analytics page. Right-click the workflow and choose New > Data Integration > Offline Synchronization.

    2. In the Create Node dialog box, enter a node name and click Submit.

    3. In the top navigation bar, click the Conversion script icon.

    4. In script mode, click the import template icon.

    5. In the Import Template dialog box, specify the source type, the source data source, the destination type, and the destination data source, and then click Confirm.

    6. Modify the JSON code and click the Run icon.

      Sample code:

      {
          "order":{
              "hops":[
                  {
                      "from":"Reader",
                      "to":"Writer"
                  }
              ]
          },
          "setting":{
              "errorLimit":{
                  "record":"0"
              },
              "speed":{
                  "concurrent":1,
                  "dmu":1,
                  "throttle":false
              }
          },
          "steps":[
              {
                  "category":"reader",
                  "name":"Reader",
                  "parameter":{
                      "column":[
                          "name",
                          "id",
                          "gender"
                      ],
                      "datasource":"odps_first",
                      "partition":[],
                      "table":"Transs"
                  },
                  "stepType":"odps"
              },
              {
                  "category":"writer",
                  "name":"Writer",
                  "parameter":{
                      "datasource":"Trans",
                      "dateFormat":"yyyy-MM-dd HH:mm:ss",
                      "encoding":"UTF-8",
                      "fieldDelimiter":",",
                      "fileFormat":"csv",
                      "nullFormat":"null",
                      "object":"qweee.csv",
                      "writeMode":"truncate"
                  },
                  "stepType":"oss"
              }
          ],
          "type":"job",
          "version":"2.0"
      }
  6. View the data of the newly created table in the OSS console. For more information, see Download objects.
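
After you download the object, you may want to confirm that it contains the rows that you imported into the Transs table. The following Python sketch is one possible way to do this offline. It reads the reader columns and the writer field delimiter from a trimmed copy of the job configuration in step 5 and compares the exported CSV text with the source rows. The function names step_parameter and check_export are hypothetical helpers, not part of DataWorks or OSS. The sketch assumes that the exported object has no header row, that fields are written in the order of the reader columns, and that row order in the object may differ from the source. It does not handle the nullFormat setting.

```python
import csv
import io
import json
from collections import Counter

# Trimmed copy of the job configuration from step 5.
# Only the fields that this check reads are kept.
JOB_JSON = """
{
  "steps": [
    {"category": "reader", "stepType": "odps",
     "parameter": {"column": ["name", "id", "gender"], "table": "Transs"}},
    {"category": "writer", "stepType": "oss",
     "parameter": {"fieldDelimiter": ",", "encoding": "UTF-8",
                   "object": "qweee.csv"}}
  ]
}
"""

# First rows of the example import file from step 2.
SOURCE_ROWS = [("qwe", "145", "F"), ("asd", "256", "F"), ("xzc", "345", "M")]


def step_parameter(job, category):
    """Return the parameter block of the first step in the given category."""
    for step in job["steps"]:
        if step["category"] == category:
            return step["parameter"]
    raise KeyError(category)


def check_export(job, exported_text, source_rows):
    """Compare exported CSV text with the source rows.

    Returns a list of problem descriptions. An empty list means that
    every row has the expected field count and the rows match the source
    (ignoring row order).
    """
    columns = step_parameter(job, "reader")["column"]
    delimiter = step_parameter(job, "writer")["fieldDelimiter"]

    reader = csv.reader(io.StringIO(exported_text), delimiter=delimiter)
    rows = [tuple(row) for row in reader if row]  # skip blank lines

    problems = []
    for number, row in enumerate(rows, start=1):
        if len(row) != len(columns):
            problems.append(
                f"row {number}: expected {len(columns)} fields, got {len(row)}"
            )

    # Compare as multisets so that row order does not matter.
    missing = Counter(source_rows) - Counter(rows)
    unexpected = Counter(rows) - Counter(source_rows)
    for row in missing:
        problems.append(f"missing from export: {row}")
    for row in unexpected:
        problems.append(f"not in source: {row}")
    return problems


job = json.loads(JOB_JSON)
# Assumption: this text stands in for the content of the downloaded object.
exported = "asd,256,F\nqwe,145,F\nxzc,345,M\n"
print(check_export(job, exported, SOURCE_ROWS))
```

To check a real export, replace the exported string with the text of the downloaded object, decoded with the encoding set in the writer configuration, and replace SOURCE_ROWS with the full contents of the local file that you imported.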