
Full export (script mode)

Last Updated: Mar 20, 2018

Data Integration supports data synchronization in both wizard mode and script mode. Wizard mode is simpler, while script mode is more flexible.

This topic describes how to export the full set of data (generated by the Put, Update, and Delete operations) from Table Store to MaxCompute through Data Integration.

Step 1. Create a Table Store data source

Note:

  • Skip this step if a data source is already created.
  • If you do not want to create the data source, you can specify the endpoint, instanceName, AccessKeyID, and AccessKeySecret directly on the subsequent configuration page, as sketched at the end of this step.

For more information about how to create a data source, see Create a Table Store data source.
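If you take the second approach, the connection information is written directly into the reader section of the script in Step 3. The following is a minimal sketch; the parameter names (endpoint, accessId, accessKey, instanceName) and all values shown here are assumptions to verify against Configure Table Store Reader rather than confirmed plug-in syntax.

```json
"reader": {
    "plugin": "ots",
    "parameter": {
        "endpoint": "https://myinstance.cn-hangzhou.ots.aliyuncs.com",  # Assumed endpoint; use your instance's endpoint.
        "accessId": "<yourAccessKeyID>",       # Placeholder AccessKeyID.
        "accessKey": "<yourAccessKeySecret>",  # Placeholder AccessKeySecret.
        "instanceName": "myinstance",          # Placeholder instance name.
        "table": "mytable"                     # Placeholder table name; the remaining parameters (column, range) follow the template in Step 3.
    }
}
```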

Step 2. Create a MaxCompute data source

This operation is similar to Step 1. You only need to select MaxCompute as the data source.

In this example, the data source is named OTS2ODPS.

Step 3. Create a full export tunnel

  1. On the Data IDE page, click Sync Tasks.


  2. Select Script Mode.

  4. In the Import Template dialog box that appears, set the source type to Table Store (OTS) and the target type to MaxCompute (ODPS).

  4. Click OK to go to the configuration page.

  5. Set configuration parameters.

```json
{
    "type": "job",
    "version": "1.0",
    "configuration": {
        "setting": {
            "errorLimit": {
                "record": "0"          # Maximum number of error records allowed.
            },
            "speed": {
                "mbps": "1",           # Maximum traffic, in Mbps.
                "concurrent": "1"      # Number of concurrent tasks.
            }
        },
        "reader": {
            "plugin": "ots",           # Name of the reader plug-in.
            "parameter": {
                "datasource": "",      # Name of the Table Store data source.
                "table": "",           # Name of the table in Table Store.
                "column": [            # Columns in Table Store to be exported to MaxCompute.
                    { "name": "column1" },
                    { "name": "column2" },
                    { "name": "column3" },
                    { "name": "column4" },
                    { "name": "column5" }
                ],
                "range": {             # Range of the data to be exported. For a full export, the range is from INF_MIN to INF_MAX.
                    "begin": [         # Start position of the export. The minimum position is INF_MIN. The number of entries in "begin" must equal the number of primary key columns of the table in Table Store.
                        { "type": "INF_MIN" },
                        { "type": "INF_MIN" },
                        {
                            "type": "STRING",   # The export starts from value begin1 in the third primary key column.
                            "value": "begin1"
                        },
                        {
                            "type": "INT",      # The export starts from value 0 in the fourth primary key column.
                            "value": "0"
                        }
                    ],
                    "end": [           # End position of the export.
                        { "type": "INF_MAX" },
                        { "type": "INF_MAX" },
                        {
                            "type": "STRING",
                            "value": "end1"
                        },
                        {
                            "type": "INT",
                            "value": "100"
                        }
                    ],
                    "split": [         # Custom split points that control how the export is parallelized. Not configured in normal cases. If performance is poor, you can open a ticket for assistance.
                        { "type": "INF_MIN" },
                        { "type": "STRING", "value": "splitPoint1" },
                        { "type": "STRING", "value": "splitPoint2" },
                        { "type": "STRING", "value": "splitPoint3" },
                        { "type": "INF_MAX" }
                    ]
                }
            }
        },
        "writer": {
            "plugin": "odps",          # Name of the writer plug-in.
            "parameter": {
                "datasource": "",      # Name of the MaxCompute data source.
                "column": [],          # Columns in MaxCompute, in the same order as the Table Store columns above.
                "table": "",           # Name of the table in MaxCompute. The table must be created in advance; otherwise, the task fails.
                "partition": "",       # Required if the table is partitioned; do not set it for non-partitioned tables. Specify the partition down to the last level.
                "truncate": false      # Whether to clear previous data before writing.
            }
        }
    }
}
```

    Note: For detailed configurations, see Configure Table Store Reader and Configure MaxCompute Writer.
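To make the template concrete, the following is a minimal filled-in sketch for a full export of a hypothetical table that has a single primary key column. The names OTS_source, gps_data, gps_id, and gps_value are invented for illustration; OTS2ODPS and ots_gps_data reuse the data source from Step 2 and the target table from Step 8. Adapt every value to your own environment.

```json
{
    "type": "job",
    "version": "1.0",
    "configuration": {
        "setting": {
            "errorLimit": { "record": "0" },
            "speed": { "mbps": "1", "concurrent": "1" }
        },
        "reader": {
            "plugin": "ots",
            "parameter": {
                "datasource": "OTS_source",              # Hypothetical Table Store data source.
                "table": "gps_data",                     # Hypothetical table with one primary key column.
                "column": [
                    { "name": "gps_id" },
                    { "name": "gps_value" }
                ],
                "range": {
                    "begin": [ { "type": "INF_MIN" } ],  # One entry, because the table has one primary key column.
                    "end": [ { "type": "INF_MAX" } ]     # INF_MIN to INF_MAX exports the entire table.
                }
            }
        },
        "writer": {
            "plugin": "odps",
            "parameter": {
                "datasource": "OTS2ODPS",                # MaxCompute data source created in Step 2.
                "column": ["gps_id", "gps_value"],       # Same order as the reader columns.
                "table": "ots_gps_data",                 # Target table; create it in MaxCompute beforehand.
                "truncate": true                         # Clear previous data before the full import.
            }
        }
    }
}
```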

  6. Click Save.

Step 4. Run the task (test)

  1. At the top of the page, click Operation.

    If the configuration contains no variables, the task runs immediately. If it contains variables, enter the actual values of the variables and click OK; the task then starts running.

  2. After the task has run, you can check in the log whether it succeeded and view the number of exported data rows.

Step 5. Set scheduling parameters

  1. At the top of the page, click Data Development.

  2. On the Task Development tab, double-click the created task OTStoODPS.


  3. Click Scheduling Configuration to set the scheduling parameters.

    To set the task to start running on the next day, configure the parameters as shown in the following figure.

    ![Scheduling configuration](http://docs-aliyun.cn-hangzhou.oss.aliyun-inc.com/assets/pic/62869/intl_en/1516179279350/61034-9-en.png)

    The parameters are described as follows.

| Parameter | Description |
| --- | --- |
| Scheduling status | Not selected by default, which means the task runs as scheduled. |
| Auto retry | We recommend that you select this option so that the system retries the task after an error occurs. |
| Activation date | The default value is recommended. |
| Scheduling period | Set to Minute in this example. |
| Start time | Set to 00:00 in this example. |
| Interval | Set to 5 minutes in this example. |
| End time | Set to 23:59 in this example. |
| Dependency attribute | Set it based on your business needs, or retain the default value. |
| Cross-cycle dependency | Select Self-dependent; a new cycle runs only after the previous scheduling cycle finishes. |

  4. Click Parameter Configuration to set the parameters.

    The parameters are described as follows.

    | Parameter | Description |
    | --- | --- |
    | ${bdp.system.bizdate} | No configuration is required. |
    | startTime | The Start Time variable set in Scheduling Configuration. In this example, it is set to $[yyyymmddhh24miss-10/24/60], that is, the scheduled start time of the task minus 10 minutes. |
    | endTime | The End Time variable set in Scheduling Configuration. In this example, it is set to $[yyyymmddhh24miss-5/24/60], that is, the scheduled start time of the task minus 5 minutes. |
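    For example, the offset 10/24/60 equals 10/(24 × 60) of a day, that is, 10 minutes. If an instance is scheduled to start at 00:10:00 on March 21, 2018, startTime evaluates to 20180321000000 and endTime to 20180321000500, so each 5-minute cycle covers the 5-minute window that precedes it.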

Step 6. Submit the task

  1. At the top of the page, click Submit.


  2. In the dialog box that appears, click Confirm Submission.

    After the task is submitted, the current file is read-only.

Step 7. Check the task

  1. At the top of the page, click Operation Center.


  2. In the left-side navigation pane, click Task List > Cycle Task to view the newly created task OTStoODPS.

  3. The task starts running at 00:00 on the next day.

    • In the left-side navigation pane, click Task O&M > Cycle Instance to view scheduling tasks to be executed on the day. Click the instance name to view the details.

    • You can view the log when a task is running or after it is completed.

Step 8. View the data that has been imported to MaxCompute

  1. At the top of the page, click Data Management.


  2. In the left-side navigation pane, click All Data.

  3. Find the table (ots_gps_data) to which the data is imported, and click the table to go to its corresponding details page.

  4. On the right side, click the Data Preview tab to view the imported data.

