This topic describes how to synchronize data from one table to another table in Tablestore by using Tunnel Service, DataWorks, or DataX.

Prerequisites

The destination table is created. For more information, see Create a data table.

Notice The destination table must contain the columns that you want to synchronize from the source table.

Use Tunnel Service to synchronize data

After you create a tunnel for the source table, you can use a Tablestore SDK to consume the data in the tunnel and write it to the destination table. During synchronization, you can add custom logic to process the data based on your business requirements.

  1. Create a tunnel for the source table in the Tablestore console or by using a Tablestore SDK, and record the tunnel ID. For more information, see Quick start or SDK usage. A minimal sketch of creating a tunnel by using the SDK is provided after the sample code below.
  2. Synchronize data by using a Tablestore SDK.

    Sample code:

    public class TunnelTest {
    
        public static void main(String[] args){
           TunnelClient tunnelClient = new TunnelClient("endpoint",
                   "accessKeyId","accessKeySecret","instanceName");
    
            TunnelWorkerConfig config = new TunnelWorkerConfig(new SimpleProcessor());
    
            // You can view the tunnel ID on the Tunnels tab of the Tablestore console or call the DescribeTunnel operation to query the tunnel ID. 
            TunnelWorker worker = new TunnelWorker("tunnelId", tunnelClient, config);
            try {
                worker.connectAndWorking();
            } catch (Exception e) {
                e.printStackTrace();
                worker.shutdown();
                tunnelClient.shutdown();
            }
        }
    
        public static class SimpleProcessor implements IChannelProcessor{
        
           // Create a client that writes to the destination table. Use the endpoint and instance name of the destination instance. 
           SyncClient syncClient = new SyncClient("endpoint",
                   "accessKeyId","accessKeySecret","instanceName");
                   
           @Override
            public void process(ProcessRecordsInput processRecordsInput) {
            
                // Incremental data or full data is returned in ProcessRecordsInput. 
                List<StreamRecord> list = processRecordsInput.getRecords();
                for(StreamRecord streamRecord : list){
                    switch (streamRecord.getRecordType()){
                        case PUT:
                            // Specify the custom logic that you want to use to process data for your business. 
                            //putRow
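                            // The following lines are a minimal sketch (an assumption, not part of the original sample): 
                            // forward the PUT record to the destination table by using the syncClient defined above. 
                            // Replace "destTableName" with the name of the destination table. 
                            RowPutChange rowPutChange = new RowPutChange("destTableName", streamRecord.getPrimaryKey());
                            for (RecordColumn recordColumn : streamRecord.getColumns()) {
                                rowPutChange.addColumn(recordColumn.getColumn());
                            }
                            syncClient.putRow(new PutRowRequest(rowPutChange));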
                            break;
                        case UPDATE:
                            //updateRow
                            break;
                        case DELETE:
                            //deleteRow
                            break;
                    }
    
                    System.out.println(streamRecord.toString());
                }
            }
    
            @Override
            public void shutdown() {
                // Release the client that writes to the destination table. 
                syncClient.shutdown();
            }
        }
    }
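
    The following code provides a minimal sketch of step 1: it creates a tunnel on the source table by using Tablestore SDK for Java and prints the tunnel ID that you need to record. The table name ("sourceTableName") and tunnel name ("exampleTunnel") are placeholders that you must replace with your own values.

    TunnelClient tunnelClient = new TunnelClient("endpoint",
            "accessKeyId","accessKeySecret","instanceName");
    // TunnelType.BaseAndStream first consumes the existing data of the table and then the incremental data. 
    CreateTunnelRequest request = new CreateTunnelRequest(
            "sourceTableName", "exampleTunnel", TunnelType.BaseAndStream);
    CreateTunnelResponse resp = tunnelClient.createTunnel(request);
    // Record the tunnel ID. It is required when you construct the TunnelWorker. 
    System.out.println(resp.getTunnelId());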

Use DataWorks or DataX to synchronize data

You can use DataWorks or DataX to synchronize data from the source table to the destination table. This section describes how to synchronize data by using DataWorks.

  1. Add Tablestore data sources.

    Add the Tablestore instances of the source table and the destination table as data sources. For more information, see Add a data source.

  2. Create a synchronization task node.
    1. Log on to the DataWorks console as the project administrator.
      Note Only the project administrator role can be used to add data sources. Members who are assigned other roles can only view data sources.
    2. In the left-side navigation pane, click Workspaces and select a region.
    3. On the Workspaces page, find the workspace that you want to manage and click DataStudio in the Actions column.
    4. On the Scheduled Workflow page of the DataStudio console, click Business Flow and select a business flow.

      For information about how to create a business flow, see Create a workflow.

    5. Click Data Integration, and then click Offline synchronization. The Create Node dialog box appears.
    6. In the Create Node dialog box, enter a node name in the Name field and click Commit.
  3. Configure the data sources.
    1. Click Data Integration and double-click the name of the node that you created for the synchronization task.
    2. On the edit page of the synchronization task node, configure the Data source and Data Destination parameters in the Connections section.
      Set the type to OTS for both Data source and Data Destination, and then select a data source for each. Click the script icon or Click to convert to script to switch to the script mode.
      Notice Tablestore supports only the script mode. A minimal sample script is provided at the end of this topic for reference.
      • Configure Tablestore Reader

        Tablestore Reader reads data from Tablestore. You can specify a data range to extract incremental data. For more information, see Tablestore Reader.

      • Configure Tablestore Writer

        Tablestore Writer connects to Tablestore and writes data by using Tablestore SDK for Java. It provides features that optimize the write process, such as retries on write timeouts, retries on exceptions, and batch submission. For more information, see Tablestore Writer.

    3. Click the save icon to save the data source configurations.
  4. Run the synchronization task.
    1. Click the start icon.
    2. In the Parameters dialog box, select the resource group that you want to use for scheduling.
    3. Click OK to run the task.

      After the task is run, you can check whether the task succeeded and view the number of exported rows on the Runtime Log tab.

  5. (Optional) Execute the synchronization task at the scheduled time. For more information, see Configure recurrence and dependencies for a node.
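
The following sample is a minimal sketch of what a script-mode configuration may look like when both the reader and the writer are Tablestore (OTS) data sources. The data source names, table names, primary key, and column names are placeholders, and the exact parameter names may vary with your DataWorks version. For the authoritative parameter reference, see Tablestore Reader and Tablestore Writer.

    {
        "type": "job",
        "version": "2.0",
        "steps": [
            {
                "stepType": "ots",
                "category": "reader",
                "name": "Reader",
                "parameter": {
                    "datasource": "source_datasource",
                    "table": "source_table",
                    "column": [{"name": "pk1"}, {"name": "col1"}],
                    "range": {
                        "begin": [{"type": "INF_MIN"}],
                        "end": [{"type": "INF_MAX"}]
                    }
                }
            },
            {
                "stepType": "ots",
                "category": "writer",
                "name": "Writer",
                "parameter": {
                    "datasource": "dest_datasource",
                    "table": "dest_table",
                    "primaryKey": [{"name": "pk1", "type": "string"}],
                    "column": [{"name": "col1", "type": "string"}],
                    "writeMode": "UpdateRow"
                }
            }
        ],
        "setting": {
            "errorLimit": {"record": "0"},
            "speed": {"concurrent": 1, "throttle": false}
        },
        "order": {
            "hops": [{"from": "Reader", "to": "Writer"}]
        }
    }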