This topic describes the causes of and solutions to slow synchronization when Tablestore Reader is used to synchronize full data.

Problem description

When Tablestore Reader is used to synchronize full data, the synchronization is slow. The following script shows a sample configuration for full data synchronization:
"reader": {
  "plugin": "ots",
  "parameter": {
    "datasource": "",
    "table": "",
    "column": [],
    "range": {
      "begin": [
        {
          "type": "INF_MIN"
        }
      ],
      "end": [
        {
          "type": "INF_MAX"
        }
      ]
    }
  }
}

Cause

The amount of data to synchronize is large, but no split points are configured in the script. As a result, only one thread is created to read the data, which limits the synchronization speed.
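To illustrate this cause, the following minimal Python sketch shows how split points on the first primary key column divide the full range into sub-ranges. The helper name ranges_from_split_points is hypothetical and is not part of Tablestore Reader. The sketch assumes that the split points are already in ascending primary key order and that each sub-range can be read concurrently.

```python
def ranges_from_split_points(split_points):
    """Return the (lower, upper) sub-ranges produced by the given split points.

    Hypothetical helper for illustration only; not part of Tablestore Reader.
    split_points is assumed to be in ascending primary key order.
    """
    bounds = ["INF_MIN", *split_points, "INF_MAX"]
    # Pair each bound with its successor: N split points give N + 1 ranges.
    return list(zip(bounds[:-1], bounds[1:]))


# No split points: the whole table is a single range.
print(ranges_from_split_points([]))

# Three split points: the table is divided into four ranges.
print(ranges_from_split_points(["splitPoint1", "splitPoint2", "splitPoint3"]))
```

With an empty list, the whole table is one range, which matches the single-thread behavior described above.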

Solution

If you want to synchronize a large amount of data by using Tablestore Reader, configure split points in the script. To configure split points in the script, perform the following steps:
  1. Obtain the information about the required split points by using one of the following methods:
    • Use Tablestore SDK to call the ComputeSplitPointsBySize operation. For more information, see Split data by a specified size.

      Sample response:

      LowerBound:pkname1:INF_MIN, pkname2:INF_MIN
      UpperBound:pkname1:cbcf23c8cdf831261f5b3c052db3479e, pkname2:INF_MIN
      LowerBound:pkname1:cbcf23c8cdf831261f5b3c052db3479e, pkname2:INF_MIN
      UpperBound:pkname1:INF_MAX, pkname2:INF_MAX
    • Download the Tablestore CLI tool. Then, run the following command: points -s splitSize -t tablename. For more information, see Tablestore CLI.
      Note: The unit of the splitSize value is 100 MB. If the amount of data that you want to synchronize is small, you do not need to configure split points. If the amount of data that you want to synchronize is large, we recommend that you specify a value for the splitSize parameter based on the maximum number of concurrent threads supported in your environment.

      Sample response:

      [
          {
              "LowerBound": {
                  "PrimaryKeys": [
                      {
                          "ColumnName": "pkname1",
                          "Value": null,
                          "PrimaryKeyOption": 2
                      },
                      {
                          "ColumnName": "pkname2",
                          "Value": null,
                          "PrimaryKeyOption": 2
                      }
                  ]
              },
              "UpperBound": {
                  "PrimaryKeys": [
                      {
                          "ColumnName": "pkname1",
                          "Value": "cbcf23c8cdf831261f5b3c052db3479e\u0000",
                          "PrimaryKeyOption": 0
                      },
                      {
                          "ColumnName": "pkname2",
                          "Value": null,
                          "PrimaryKeyOption": 2
                      }
                  ]
              },
              "Location": "80310717938EDF503FB1E26F70710391"
          },
          {
              "LowerBound": {
                  "PrimaryKeys": [
                      {
                          "ColumnName": "pkname1",
                          "Value": "cbcf23c8cdf831261f5b3c052db3479e\u0000",
                          "PrimaryKeyOption": 0
                      },
                      {
                          "ColumnName": "pkname2",
                          "Value": null,
                          "PrimaryKeyOption": 2
                      }
                  ]
              },
              "UpperBound": {
                  "PrimaryKeys": [
                      {
                          "ColumnName": "pkname1",
                          "Value": null,
                          "PrimaryKeyOption": 3
                      },
                      {
                          "ColumnName": "pkname2",
                          "Value": null,
                          "PrimaryKeyOption": 3
                      }
                  ]
              },
              "Location": "80310717938EDF503FB1E26F70710391"
          }
      ]
      Find the values of the first primary key column in the response. In this example, the pkname1 value of the first LowerBound is null, the pkname1 value of the first UpperBound is "cbcf23c8cdf831261f5b3c052db3479e\u0000", the pkname1 value of the second LowerBound is "cbcf23c8cdf831261f5b3c052db3479e\u0000", and the pkname1 value of the second UpperBound is null. The only non-null value is therefore the split point. To synchronize full data, configure the following settings in the script:
      "split": [
        {
          "type": "STRING",
          "value": "cbcf23c8cdf831261f5b3c052db3479e\u0000"
        }
      ]

      When you run the script, Tablestore Reader splits the full data into two parts and reads them concurrently based on the (INF_MIN,cbcf23c8cdf831261f5b3c052db3479e\u0000) and [cbcf23c8cdf831261f5b3c052db3479e\u0000,INF_MAX) ranges. This accelerates data synchronization.

  2. Configure the split points in the script that is used to synchronize data. The following script shows a sample configuration with three split points:
    "range": {
      "begin": [
        {
          "type": "INF_MIN"
        }
      ],
      "end": [
        {
          "type": "INF_MAX"
        }
      ],
      "split": [
        {
          "type": "STRING",
          "value": "splitPoint1"
        },
        {
          "type": "STRING",
          "value": "splitPoint2"
        },
        {
          "type": "STRING",
          "value": "splitPoint3"
        }
      ]
    }
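The steps above can be sketched in Python as follows. This is a minimal sketch, not an official tool: the function name extract_split_points is hypothetical, the input is assumed to have the same JSON shape as the Tablestore CLI sample response in step 1, and the first primary key column is assumed to be of the STRING type. The sketch collects every non-null UpperBound value of the first primary key column and emits the matching "split" entries.

```python
import json


def extract_split_points(cli_response: str) -> list:
    """Build "split" entries from a CLI response (hypothetical helper).

    Assumes the JSON shape shown in the sample response and a STRING-typed
    first primary key column.
    """
    split = []
    for segment in json.loads(cli_response):
        # Only the first primary key column is used as the split point.
        first_pk = segment["UpperBound"]["PrimaryKeys"][0]
        value = first_pk["Value"]
        # A null value marks an unbounded end (INF_MIN or INF_MAX), so skip it.
        if value is not None:
            split.append({"type": "STRING", "value": value})
    return split


# Shortened copy of the sample response; a raw string keeps \u0000 as a JSON escape.
SAMPLE_RESPONSE = r'''
[
  {"LowerBound": {"PrimaryKeys": [{"ColumnName": "pkname1", "Value": null, "PrimaryKeyOption": 2}]},
   "UpperBound": {"PrimaryKeys": [{"ColumnName": "pkname1", "Value": "cbcf23c8cdf831261f5b3c052db3479e\u0000", "PrimaryKeyOption": 0}]}},
  {"LowerBound": {"PrimaryKeys": [{"ColumnName": "pkname1", "Value": "cbcf23c8cdf831261f5b3c052db3479e\u0000", "PrimaryKeyOption": 0}]},
   "UpperBound": {"PrimaryKeys": [{"ColumnName": "pkname1", "Value": null, "PrimaryKeyOption": 3}]}}
]
'''

# Print a fragment that can be merged into the "range" section of the script.
print(json.dumps({"split": extract_split_points(SAMPLE_RESPONSE)}, indent=2))
```

The printed fragment contains one entry per split point. If the first primary key column is not a string, the "type" value would need to be adjusted accordingly.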

If the synchronization remains slow after you configure split points, submit a ticket to contact technical support.