Data Integration supports data synchronization in wizard mode and script mode. Wizard mode is easier to use, while script mode is more flexible.

This topic describes how to use Data Integration to export full data, which is generated by the Put, Update, and Delete operations, from Table Store to MaxCompute.

Step 1. Create a Table Store data source

Note
  • Skip this step if a data source is already created.
  • If you do not want to create the data source, you can specify the endpoint, instanceName, AccessKeyID, and AccessKeySecret on the subsequent configuration page.

For more information about how to create a data source, see Create a Table Store data source.
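
If you specify the connection information directly instead of creating a data source, it goes into the reader parameters in Step 3. The following is a minimal sketch of this inline form; it assumes that the Table Store Reader accepts endpoint, accessId, accessKey, and instanceName parameters in place of datasource (check the Table Store Reader documentation for the exact parameter names), and all values shown are placeholders.

    "reader": {
      "plugin": "ots",
      "parameter": {
        "endpoint": "https://instance-name.cn-hangzhou.ots.aliyuncs.com",    # Placeholder Table Store endpoint
        "accessId": "<yourAccessKeyID>",    # AccessKeyID
        "accessKey": "<yourAccessKeySecret>",    # AccessKeySecret
        "instanceName": "instance-name",    # Name of the Table Store instance
        "table": ""    # The remaining parameters (table, column, and range) are the same as in Step 3
      }
    }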

Step 2. Create a MaxCompute data source

The procedure is similar to that in Step 1, except that you select MaxCompute as the data source type.

In this example, the data source is named OTS2ODPS.

Step 3. Create a full export tunnel

  1. On the Data IDE page, click Sync Tasks.
  2. Select Script Mode.
  3. In the Import Template dialog box that appears, set Source Type to Table Store (OTS) and Target Type to MaxCompute (ODPS).
  4. Click OK to go to the configuration page.
  5. Set configuration parameters.
    {
      "type": "job",
      "version": "1.0",
      "configuration": {
        "setting": {
          "errorLimit": {
            "record": "0"    # Maximum number of error records allowed
          },
          "speed": {
            "mbps": "1",    # Maximum traffic, in Mbps
            "concurrent": "1"    # Number of concurrent tasks
          }
        },
        "reader": {
          "plugin": "ots",    # Name of the reader plug-in
          "parameter": {
            "datasource": "",    # Name of the Table Store data source
            "table": "",    # Name of the table in Table Store
            "column": [    # Columns in Table Store to be exported to MaxCompute
              {
                "name": "column1"
              },
              {
                "name": "column2"
              },
              {
                "name": "column3"
              },
              {
                "name": "column4"
              },
              {
                "name": "column5"
              }
            ],
            "range": {    # Range of the data to be exported. For a full export, the range is from INF_MIN to INF_MAX.
              "begin": [    # Start position of the export. The minimum position is INF_MIN. The number of items in "begin" must equal the number of primary key columns of the table in Table Store.
                {
                  "type": "INF_MIN"
                },
                {
                  "type": "INF_MIN"
                },
                {
                  "type": "STRING",    # The export starts from "begin1" in the third primary key column.
                  "value": "begin1"
                },
                {
                  "type": "INT",    # The export starts from 0 in the fourth primary key column.
                  "value": "0"
                }
              ],
              "end": [    # End position of the export. The maximum position is INF_MAX.
                {
                  "type": "INF_MAX"
                },
                {
                  "type": "INF_MAX"
                },
                {
                  "type": "STRING",
                  "value": "end1"
                },
                {
                  "type": "INT",
                  "value": "100"
                }
              ],
              "split": [    # Custom split points that partition the export. Not configured in normal cases; if performance is poor, open a ticket for guidance.
                {
                  "type": "INF_MIN"
                },
                {
                  "type": "STRING",
                  "value": "splitPoint1"
                },
                {
                  "type": "STRING",
                  "value": "splitPoint2"
                },
                {
                  "type": "STRING",
                  "value": "splitPoint3"
                },
                {
                  "type": "INF_MAX"
                }
              ]
            }
          }
        },
        "writer": {
          "plugin": "odps",    # Name of the writer plug-in
          "parameter": {
            "datasource": "",    # Name of the MaxCompute data source
            "column": [],    # Columns in MaxCompute, in the same order as the Table Store columns above
            "table": "",    # Name of the destination table in MaxCompute. The table must be created before the task runs; otherwise, the task fails.
            "partition": "",    # Required if the table is partitioned; specify every partition level down to the last one. Do not set this parameter for non-partitioned tables.
            "truncate": false    # Whether to clear existing data before writing
          }
        }
      }
    }
    Note For more information about these configuration items, see Configure Table Store Reader and Configure MaxCompute Writer. A minimal full-export range sketch is shown after this procedure.
  6. Click Save.
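
As noted in the range comment, a true full export spans from INF_MIN to INF_MAX in every primary key column. The following is a minimal sketch of such a range for a table with four primary key columns (a hypothetical schema; the number of entries in begin and end must match the number of primary key columns of your table):

    "range": {
      "begin": [
        { "type": "INF_MIN" },    # One entry per primary key column
        { "type": "INF_MIN" },
        { "type": "INF_MIN" },
        { "type": "INF_MIN" }
      ],
      "end": [
        { "type": "INF_MAX" },
        { "type": "INF_MAX" },
        { "type": "INF_MAX" },
        { "type": "INF_MAX" }
      ]
    }

By contrast, the sample configuration above narrows the third and fourth primary key columns to the begin1-end1 and 0-100 ranges, so it exports only the rows within those bounds.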

Step 4. Run the task (test)

  1. At the top of the page, click Operation.

    If the configuration contains no variables, the task runs immediately. If it contains variables, you must enter their actual values and then click OK; the task then starts running.

  2. After the task is complete, you can check in the log whether it succeeded and view the number of exported rows.
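
Variables are referenced in the script with the ${...} syntax. As a hypothetical illustration, a configuration that reads the export start position from a startTime variable (such as the one defined in Step 5) would contain a fragment like the following, and the run dialog would then prompt for its value:

    {
      "type": "INT",
      "value": "${startTime}"    # Hypothetical variable; its value is supplied when the task runs
    }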

Step 5. Set scheduling parameters

  1. At the top of the page, click Data Development.
  2. On the Task Development tab, double-click the created task OTStoODPS.

  3. Click Scheduling Configuration to set the scheduling parameters.

    To set the task to start running on the next day, configure the scheduling parameters on this tab.



    The configurations are described as follows:

    • Scheduling status: Not selected by default, which means the task runs as scheduled.
    • Auto retry: We recommend that you select this option so that the system can retry automatically after an error occurs.
    • Activation date: The default value is recommended.
    • Scheduling period: Minute is used in this example.
    • Start time: Set to 00:00 in this example.
    • Interval: Set to 5 minutes in this example.
    • End time: Set to 23:59 in this example.
    • Dependency attribute: Set it based on your business needs, or retain the default value.
    • Cross-cycle dependency: Select Self-dependent; operation can continue after the conclusion of the previous scheduling period.
  4. Click Parameter Configuration to set the parameters.
    The parameters are described as follows.
    • ${bdp.system.bizdate}: Does not need to be configured.
    • startTime: The start time variable set in Scheduling Configuration. In this example, it is set to $[yyyymmddhh24miss-10/24/60], which is the scheduled start time of the task minus 10 minutes.
    • endTime: The end time variable set in Scheduling Configuration. In this example, it is set to $[yyyymmddhh24miss-5/24/60], which is the scheduled start time of the task minus 5 minutes.
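
    In these expressions, the offset is a fraction of a day: 10/24/60 of a day is 10 minutes, and 5/24/60 of a day is 5 minutes. As a quick check, assuming a hypothetical instance scheduled at 00:05:00 on October 10, 2018: startTime ($[yyyymmddhh24miss-10/24/60]) resolves to 20181009235500, that is, 10 minutes earlier on the previous day, and endTime ($[yyyymmddhh24miss-5/24/60]) resolves to 20181010000000.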

Step 6. Submit the task

  1. At the top of the page, click Submit.

  2. In the displayed box, click Confirm Submission.

    After the task is submitted, the current file becomes read-only.

Step 7. Check the task

  1. At the top of the page, click Operation Center.

  2. In the left-side navigation pane, click Task List > Cycle Task to view the newly created task OTStoODPS.
  3. The task starts running at 00:00 on the next day.
    • In the left-side navigation pane, click Task O&M > Cycle Instance to view the scheduled tasks to be run on that day. Click the instance name to view its details.
    • You can view the log when a task is running or after it is completed.

Step 8. View the data that has been imported to MaxCompute

  1. At the top of the page, click Data Management.

  2. In the left-side navigation pane, click All Data.
  3. Find the table into which the data was imported (ots_gps_data in this example), and click its name to go to the details page.
  4. On the right side, click the Data Preview tab to view the imported data.