×
Community Blog How Does PolarDB-X Optimize Batch Insert?

How Does PolarDB-X Optimize Batch Insert?

This article describes how to specify an appropriate batch size and DOP based on your business requirements.

The Batch Insert statement is a common way to write data to databases. The PolarDB-X is compatible with the MySQL protocol and syntax. The batch insert syntax is listed below:

INSERT [IGNORE] [INTO] table_name(column_name, ...) VALUES (value1, ...), (value2, ...), ...;

The following factors can affect the performance of a batch insert query:

  1. Batch size
  2. The degree of parallelism (DOP)
  3. The number of shards
  4. The number of columns
  5. The number of global secondary indexes
  6. The number of sequences

Factors such as the number of shards, the number of columns, the number of GSIs, and the number of sequences are domestic demand factors. They are set according to actual requirements and often affect the read performance. For example, if the number of GSIs is large, the write performance will decrease, but the read performance will be improved. This topic does not describe the impacts of the factors. This topic describes how to specify an appropriate batch size and DOP based on your business requirements.

Test Environment

The following table displays the settings used in the test:

Specification Parameter
PolarDB-X Version polarx-kernel_5.4.11-16279028_xcluster-20210802
Node Specifications 16 CPU cores, 64 GB memory
Nodes 4

You can use the following statement to create a table:

CREATE TABLE `sbtest1` (
    `id` int(11) NOT NULL,
    `k` int(11) NOT NULL DEFAULT '0',
    `c` char(120) NOT NULL DEFAULT '',
    `pad` char(60) NOT NULL DEFAULT '',
    PRIMARY KEY (`id`),
    KEY `k_1` (`k`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;

BATCH_INSERT_POLICY=SPLIT

PolarDB-X is optimized for batch data writing. Batch Insert is optimized to ensure better concurrency. When the size of a single Batch Insert statement exceeds 256K, PolarDB-X dynamically splits the Batch Insert statement into multiple small batches and executes them in series between multiple small batches. This feature is called SPLIT.

The best performance is ensured through the BATCH_INSERT_POLICY=SPLIT mechanism while reducing the cost of batch inserts PolarDB-X in parallel execution. The load imbalance of multiple nodes is avoided in distributed areas as much as possible.

Parameters:

  1. BATCH_INSERT_POLICY: You can set the value to SPLIT or NONE. The default value is SPLIT. If the value of this parameter is set to SPLIT, the automatic split feature is enabled.
  2. MAX_BATCH_INSERT_SQL_LENGTH: This parameter specifies the value of a size threshold for SQL statements. The default value is 256, and the unit is KB. If the default value is used, the system automatically splits batch insert statements whose size exceeds 256 KB.
  3. BATCH_INSERT_CHUNK_SIZE_DEFAULT: This parameter specifies the maximum size of each batch after the statement is split. The default value is 200. They represent the batch size of small batches after each split when the dynamic split batch is triggered.

You can add the following hint statement: /*+ TDDL:CMD_EXTRA(BATCH_INSERT_POLICY=NONE)*/ to disable the BATCH_INSERT_POLICY=SPLIT mechanism. The goal of this parameter is to disable the BATCH_INSERT_POLICY policy. It ensures the batch size is not automatically split when the PolarDB-X is executed. It can be used to verify the performance when the batch size is 2000, 5000, and 10000. The test results show the increase is not obvious after the batch size exceeds 1000.

Performance Benchmark for a Single Table

Background

In distributed scenarios, an order table is only placed on one host. Its performance can be used as a basic performance baseline to evaluate the horizontal scaling capability of the partition table. Data are evenly distributed to multiple hosts through a partition table.

The test method is to perform the Batch Insert operation on a single table in the PolarDB-X. The data of a single table only exists in one data storage node. PolarDB-X writes data to the corresponding data storage node based on the table definition.

Scenario 1: Batch Size

Parameters: [parallelism: 16] [column: 4] [gsi: none] [sequence: none]

batch size 1 10 100 500 1000 2000 5000 10000
PolarDB-X [Single Table] Performance (Rows per Second) 5397 45653 153216 211976 210644 215103 221919 220529

Scenario 2: Parallelism

Parameters: [batch size:1,000] [column: 4] [gsi: none] [sequence: none]

thread 1 2 4 8 16 32 64 128
PolarDB-X [Single Table] Performance (Rows per Second) 22625 41326 76052 127646 210644 223431 190138 160858

Summary

We recommend specifying a batch size of 1,000 rows and setting the DOP to a value in the range of 16 to 32 to test the performance of a table.

Note: When the batch size is 2,000, 5,000, and 10,000, you need to add a hint parameter to disable the SPLIT feature. According to the test results, the increase is not obvious after the batch size exceeds 1000. Example: /*+ TDDL:CMD_EXTRA(BATCH_INSERT_POLICY=NONE)*/

Benchmark Testing for the Performance of Sharded Tables

The batch size and DOP affect the performance of batch insert queries. The tests described in this section test query performance based on batch size and DOP.

Scenario 1: Batch Size

When a batch insert statement is executed on a sharded table, the sharding function used to shard the table shards the values specified in the statement. The batch size of the data that is pushed down to data nodes is different from the specified batch size. The following figure shows the process:

1

Therefore, under data shards, the Batch Insert statement of the PolarDB-X can be larger, or the data of the same physical table can be put into one Batch Insert statement as much as possible. This ensures that the batch size that is pushed down to a single data shard after the shard function separates values is appropriate to improve the performance of the storage node.

Sub-Scenario 1: BATCH_INSERT_POLICY=SPLIT

Parameters: [BATCH_INSERT_POLICY: on] [parallelism: 32] [number of parts: 32] [column: 4] [gsi: none] [sequence: none]

batch size 1 10 100 500 1000 2000 5000 10000
PolarDB-X [Number of Parts: 32] Performance (Rows per Second) 12804 80987 229995 401215 431579 410120 395398 389176

Note: If batch size >= 2000, the BATCH_INSERT_POLICY policy is triggered.

Sub-Scenario 2: BATCH_INSERT_POLICY=NONE

Parameters: [BATCH_INSERT_POLICY: disabled] – The other parameters are the same.

batch size 1000 2000 5000 10000 20000 30000 50000
PolarDB-X [Number of Parts: 32] Performance (Rows per Second) 431579 463112 490350 526751 549990 595026 685500

Summary

  1. BATCH_INSERT_POLICY=SPLIT, batch size is 1000, and the overall performance is 43w rows per second, which is twice that of a single table.
  2. BATCH_INSERT_POLICY=NODE, the values in this test are randomly distributed, and the splitting function is Hash, so the data distributed to each shard is uniform. In theory, the performance will be better under the batch size with the number of shards * 1,000. When the maximum batch size is 50,000, the overall performance will be 680,000 rows per second.

Scenario 2: Parallelism

The appropriate criterion for determining the degree of parallelism is to fill up the CPU utilization of PolarDB-X data nodes or fill up the IOPS to achieve better performance. Since the Batch Insert statement has no calculation, the overhead of PolarDB-X compute nodes is small. The main overhead is in PolarDB-X data nodes. If the degree of parallelism is too small or too large, the performance will be affected. The factors that affect the value of parallelism include the number of nodes, node specifications (number of cores and CPU), and thread pool pressure. Therefore, it is difficult to obtain an exact number of parallelism. We recommend testing it in a practical environment to find out the best parallelism suitable for this environment.

Sub-Scenario 1: Test the different parallelism of the 1,000 batch size under four nodes.

Parameters: [batch size:1000] [column: 4] [gsi: none] [sequence: none]

thread 1 2 4 8 16 32 64 80 96
PolarDB-X [Number of Parts: 32] Performance (Rows per Second) 40967 80535 151415 246062 367720 431579 478876 499918 487173

Summary

In the test, the performance reaches the peak value when the value of DOP is in the range of 64 to 80. The highest performance is approximately 500,000 rows per second.

Sub-Scenario 2: A batch insert statement is executed on different sharded tables deployed on different groups of nodes.

Parameters: [2 nodes: 2CN*2DN] [batch size: 20000] [column: 4] [gsi: none] [sequence: none]

thread 4 8 12 16
PolarDB-X [Number of Parts: 16] Performance (Rows per Second) 159794 302754 296298 241444

Parameters: [3 nodes: 3CN*3DN] [batch size: 20000] [column: 4] [gsi: none] [sequence: none]

thread 9 12 15 18
PolarDB-X [Number of Parts: 24] Performance (Rows per Second) 427212 456050 378420 309052

Parameters: [4 nodes: 4CN*4DN] [batch size:20000] [column: 4] [gsi: none] [sequence: none]

thread 16 32 40 64
PolarDB-X [Number of Parts: 32] Performance (Rows per Second) 464612 549990 551992 373268

Summary

When the number of database nodes increases, the value of DOP needs to be increased to ensure high performance. The peak value under two nodes is 8 concurrent 30w. The peak value under three nodes is 12 concurrent 45w. The peak value under four nodes is 32 concurrent 55w. The overall linear rate is about 0.9~1 with the performance improvement of the number of nodes.

Sub-Scenario 3: The Degree of Parallelism Test under Different Node Specifications

Parameters: [batch size: 20000] [column: 4] [gsi: none] [sequence: none]

thread 4 8 10 12 16
PolarDB-X [4-core 16GB] Performance (Rows per Second) 165674 288828 276837 264873 204738
thread 8 10 12 16
PolarDB-X [8-core 32GB] Performance (Rows per Second) 292780 343498 315982 259892
thread 16 32 40 64
PolarDB- [16-core 64GB] Performance (Rows per Second) 464612 549990 551992 373268

Summary

If the node specifications are upgraded, the value of DOP needs to be increased to ensure high performance. If the statement is executed on a node whose specification is 4 cores and 16 GB of memory, the system reaches the highest performance when the DOP is set to 8. The highest performance is approximately 280,000 rows per second. If the statement is executed on a node whose specification is 8 cores and 32 GB of memory, the system reaches the highest performance when the DOP is set to 10. The highest performance is approximately 340,000 rows per second. If the statement is executed on a node whose specification is 16 cores and 64 GB of memory, the system reaches the highest performance when the DOP is set to 32. The highest performance is approximately 550,000 rows per second. The linear rate of the performance increase is approximately from 50% to 60% when the specification of the node is upgraded to the nearest specification.

Test Summary

  • We recommend specifying a batch size of 1,000 rows and setting the DOP to a value in the range of 16 to 32 to ensure high concurrency and resource usage of your PolarDB-X instance.
  • If you want to increase the batch size, we recommend setting the batch size to the number of shards * [100 to 1,000], such as 20,000 to 50,000. Make sure the size of each statement used to insert a batch of data is in the range of 2 MB to 8 MB. The maximum size is 16 MB. If you do not want to use the automatic split feature, you must use a hint to specify BATCH_INSERT_POLICY=NONE. If you use an SQL statement to query a large size of data, high workloads exist on each compute node in a distributed system, and issues may occur (such as high memory usage and unbalanced loads among multiple nodes).
  • Batch Insert queries consume IOPS resources. CPU utilization and memory usage are not the primary factors that cause performance bottleneck issues. If you want to increase the resources of your PolarDB-X instance to improve the performance of queries, we recommend adding nodes to your PolarDB-X instance.
  • We recommend using Batch Tool, a PolarDB-X supporting import, and export tool for batch import of offline text data. This tool will be open-source soon. Please visit this link for more information about how to use it.
0 0 0
Share on

ApsaraDB

202 posts | 17 followers

You may also like

Comments