PolarDB-X lets you control the degree of parallelism (DOP) for DDL statements, so you can trade off execution speed against resource consumption based on your cluster capacity and traffic patterns.
Prerequisites
Before you begin, ensure that:
Your PolarDB-X kernel version is V5.4.15-16715927 or later to control DOP for physically executed DDL statements.
Your PolarDB-X kernel version is V5.4.16-16825599 or later to control DOP for logically executed DDL statements.
Usage notes
Increasing DDL parallelism consumes more CPU and I/O on compute nodes (CNs) and data nodes (DNs). Before raising DOP, verify that your cluster has enough headroom and that the DOP adjustment is required based on your business requirements.
Most performance tuning applies to logically executed DDL — operations such as adding global secondary indexes (GSIs) or repartitioning large tables. Physically executed DDL (such as adding a local index) benefits from parallelism too, but the bottleneck and the parameters differ.
How it works
DDL statements in PolarDB-X execute in one of two modes. For the full list of which statement types use which mode, see Online DDL.
Physically executed DDL
CNs forward the DDL statement to DNs, which execute it directly on physical tables. The resource bottleneck is on the DNs. PolarDB-X exposes parameters that control how many physical DDL statements are sent in parallel.
Logically executed DDL
CNs handle the entire process — data backfilling, multi-write, verification, and metadata management. Data backfilling consumes most of the time and resources. PolarDB-X exposes parameters that control the backfilling DOP and throughput limits.
Track any running DDL with:
show ddl status;
Control DOP for physically executed DDL
For physically executed DDL, the following parameters control how physical DDL statements are forwarded to DNs.
| Parameter | Type | Level | Description | Default value | Max recommended value |
|---|---|---|---|---|---|
| MERGE_DDL_CONCURRENT | bool | Session or global | Send physical DDL statements in parallel. | false | true |
| MERGE_CONCURRENT | bool | Session or global | Establish connections to database shards in parallel. | false | true |
| PREFETCH_SHARDS | int | Session or global | Maximum DOP for sending physical DDL statements across all physical databases. | 1 × number of DNs | CPU cores per DN × number of DNs① |
① A single physical DDL statement saturates approximately one CPU core. Setting PREFETCH_SHARDS above the total CPU core count across all DNs yields diminishing returns.
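These parameters support the session or global level, so besides passing them per statement as a hint (as in the example below), you can set them for the whole session. A minimal sketch, assuming standard SET syntax applies to these variables:
set MERGE_DDL_CONCURRENT = true;  -- send physical DDL statements in parallel
set MERGE_CONCURRENT = true;      -- connect to database shards in parallel
set PREFETCH_SHARDS = 32;         -- total DOP across all physical databases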
Example
Instance: 2 CNs, 2 DNs, 16 CPU cores and 64 GB memory each. Add a local index to bmsql_order_line from the Transaction Processing Performance Council Benchmark C (TPC-C) dataset with 10,000 warehouses. With sufficient resources, pass the parameters as a hint:
/*+TDDL:cmd_extra(MERGE_DDL_CONCURRENT=true,MERGE_CONCURRENT=true,PREFETCH_SHARDS=32)*/
ALTER TABLE `bmsql_order_line` ADD LOCAL INDEX `i_ol_d_id`(`ol_d_id`);
Monitor concurrent execution on DNs:
show physical processlist where info like "%i_ol_d_id%";
Control DOP for logically executed DDL
For logically executed DDL, the bottleneck is data backfilling on CNs. The parameters differ depending on the DDL operation type.
Add GSIs, repartition tables, or perform Online Modify Column (OMC)
| Parameter | Type | Level | Description | Default value | Max recommended value |
|---|---|---|---|---|---|
| BACKFILL_PARALLELISM | int | Global | Maximum concurrent backfilling threads globally. | max(CPU_cores, 8) * 4 | Generally no adjustment needed |
| SLIDE_WINDOW_TIME_INTERVAL | int | Session or global | Delay before concurrent backfilling starts for large physical tables. Unit: seconds. | 60 | 0 when resources are sufficient |
| GSI_BACKFILL_PARALLELISM | int | Session or global | Maximum concurrent backfilling threads per DDL task. | -1 | Generally no adjustment needed |
| PHYSICAL_TABLE_BACKFILL_PARALLELISM | int | Session or global | Maximum concurrent backfilling threads per physical table. | 4 | CPU cores per CN × 2, based on load |
| GSI_BACKFILL_SPEED_LIMITATION | int | Session or global | Maximum backfilling rate. Unit: rows/s. | 150000 | -1 (no limit), based on load |
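For example, to throttle a GSI build during production hours beyond the defaults, you might lower the per-table DOP and the rate cap in a hint; a sketch with illustrative values, not tuned recommendations:
/*+TDDL:cmd_extra(PHYSICAL_TABLE_BACKFILL_PARALLELISM=2,GSI_BACKFILL_SPEED_LIMITATION=100000)*/
ALTER TABLE `bmsql_order_line` ADD GLOBAL INDEX `g_i_ol_d_id`(`ol_d_id`) partition by hash(`ol_d_id`) partitions 32;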
Move partitions or scale tables
| Parameter | Type | Level | Description | Default value | Max recommended value |
|---|---|---|---|---|---|
| BACKFILL_PARALLELISM | int | Global | Maximum concurrent backfilling threads globally. | max(CPU_cores, 8) * 4 | Generally no adjustment needed |
| SLIDE_WINDOW_TIME_INTERVAL | int | Session or global | Delay before concurrent backfilling starts for large physical tables. Unit: seconds. | 60 | 0 when resources are sufficient |
| SCALEOUT_BACKFILL_PARALLELISM | int | Session or global | Maximum concurrent backfilling threads per DDL task. | -1 | Generally no adjustment needed |
| PHYSICAL_TABLE_BACKFILL_PARALLELISM | int | Session or global | Maximum concurrent backfilling threads per physical table. | 4 | CPU cores per CN × 2, based on load |
| SCALEOUT_BACKFILL_SPEED_LIMITATION | int | Session or global | Maximum backfilling rate. Unit: rows/s. | 300000 | -1 (no limit), based on load |
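These parameters are passed the same way. A sketch on a rebalance operation, assuming a REBALANCE DATABASE statement drives the partition moves (disable throttling only when the cluster has headroom):
/*+TDDL:cmd_extra(SLIDE_WINDOW_TIME_INTERVAL=0,SCALEOUT_BACKFILL_SPEED_LIMITATION=-1)*/
REBALANCE DATABASE;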
Recommended configurations
| Scenario | Key parameters | Notes |
|---|---|---|
| Production hours | PHYSICAL_TABLE_BACKFILL_PARALLELISM=4, GSI_BACKFILL_SPEED_LIMITATION=150000 | Default settings limit rate to ~150,000 rows/s; minimal business traffic impact |
| Maintenance window (off-peak) | SLIDE_WINDOW_TIME_INTERVAL=0, PHYSICAL_TABLE_BACKFILL_PARALLELISM=8, GSI_BACKFILL_SPEED_LIMITATION=-1 | No throttling; CN CPU usage increases significantly — verify CN headroom first |
CN CPU usage is a significant factor for logically executed DDL. Before increasing PHYSICAL_TABLE_BACKFILL_PARALLELISM, confirm that your CNs have available capacity.
Example: add a global index at default settings
Instance: 2 CNs, 2 DNs, 16 CPU cores and 64 GB memory each. Add a global index to bmsql_order_line (TPC-C, 10,000 warehouses) using default parameter values:
/*+TDDL:cmd_extra(PURE_ASYNC_DDL_MODE=true)*/
ALTER TABLE `bmsql_order_line` ADD GLOBAL INDEX `g_i_ol_d_id`(`ol_d_id`) partition by hash(`ol_d_id`) partitions 32;
Check progress:
show ddl status;
+----------------------------------+-----------+
| METRIC | VALUE |
+----------------------------------+-----------+
| BACKFILL_PARALLELISM | 32 |
| BACKFILL_ROWS_FINISHED | 288679866 |
| BACKFILL_ROWS_SPEED | 150186 |
| BACKFILL_TASK_FAILED | 0 |
| BACKFILL_TASK_FINISHED | 1 |
| BACKFILL_TASK_TOTAL | 0 |
| BACKFILL_TIME_MILLIS | 3450 |
| CHANGESET_APPLY_PARALLELISM | 0 |
| CHANGESET_APPLY_ROWS_SPEED | 0 |
| CHECKER_ROWS_FINISHED | 0 |
| CHECKER_TIME_MILLIS | 0 |
| DDL_EXECUTION_TIME_MILLIS | 696412 |
| DDL_JOBS_FINISHED | 5 |
| DDL_JOBS_TOTAL | 6 |
| DDL_TASK_FAILED | 4 |
| DDL_TASK_FINISHED | 107 |
| DDL_TASK_TOTAL | 63 |
| FASTCHECKER_TASK_RUNNING | 0 |
| FASTCHECKER_TASK_WAITING | 0 |
| FASTCHECKER_THREAD_POOL_MAX_SIZE | 1 |
| FASTCHECKER_THREAD_POOL_NOW_SIZE | 0 |
| FASTCHECKER_THREAD_POOL_NUM | 2 |
| THROTTLE_RATE | 150000 |
+----------------------------------+-----------+
Key metrics to watch:
BACKFILL_PARALLELISM — total active backfilling threads
BACKFILL_ROWS_SPEED — current backfilling rate (rows/s)
THROTTLE_RATE — active rate limit; -1 means no limit
With default settings, THROTTLE_RATE is 150000, so the backfilling rate is capped at about 150,000 rows/s.
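The examples here run with PURE_ASYNC_DDL_MODE=true, so the ALTER returns immediately and the job continues in the background. If load spikes mid-backfill, you can suspend and resume the job with PolarDB-X's DDL job statements; a sketch, with a hypothetical job ID taken from the SHOW DDL output:
show ddl;                -- list running DDL jobs and their job IDs
pause ddl 1234567890;    -- hypothetical job ID: suspend backfilling
continue ddl 1234567890; -- resume the suspended job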
Example: add a global index at maximum DOP
To remove throttling and maximize parallelism (maintenance window or off-peak hours):
/*+TDDL:cmd_extra(SLIDE_WINDOW_TIME_INTERVAL=0,PURE_ASYNC_DDL_MODE=true,PHYSICAL_TABLE_BACKFILL_PARALLELISM=8,GSI_BACKFILL_PARALLELISM=32,GSI_BACKFILL_SPEED_LIMITATION=-1)*/
ALTER TABLE `bmsql_order_line` ADD GLOBAL INDEX `g_i_ol_d_id`(`ol_d_id`) partition by hash(`ol_d_id`) partitions 32;
With THROTTLE_RATE at -1 and all 32 threads active, the backfilling rate reaches approximately 860,000 rows/s:
show ddl status;
+----------------------------------+----------+
| METRIC | VALUE |
+----------------------------------+----------+
| BACKFILL_PARALLELISM | 32 |
| BACKFILL_ROWS_FINISHED | 37272038 |
| BACKFILL_ROWS_SPEED | 867669 |
| BACKFILL_TASK_FAILED | 0 |
| BACKFILL_TASK_FINISHED | 1 |
| BACKFILL_TASK_TOTAL | 0 |
| BACKFILL_TIME_MILLIS | 3450 |
| CHANGESET_APPLY_PARALLELISM | 0 |
| CHANGESET_APPLY_ROWS_SPEED | 0 |
| CHECKER_ROWS_FINISHED | 0 |
| CHECKER_TIME_MILLIS | 0 |
| DDL_EXECUTION_TIME_MILLIS | 112397 |
| DDL_JOBS_FINISHED | 3 |
| DDL_JOBS_TOTAL | 4 |
| DDL_TASK_FAILED | 2 |
| DDL_TASK_FINISHED | 75 |
| DDL_TASK_TOTAL | 42 |
| FASTCHECKER_TASK_RUNNING | 0 |
| FASTCHECKER_TASK_WAITING | 0 |
| FASTCHECKER_THREAD_POOL_MAX_SIZE | 1 |
| FASTCHECKER_THREAD_POOL_NOW_SIZE | 0 |
| FASTCHECKER_THREAD_POOL_NUM | 2 |
| THROTTLE_RATE | -1 |
+----------------------------------+----------+
23 rows in set (0.02 sec)
Performance testing
The following benchmarks use this instance configuration and test table:
Instance: 2 CNs + 2 DNs, each with 16 CPU cores and 64 GB memory
Table: bmsql_order_line in a TPC-C AUTO mode database (TPC-C 10,000 warehouses; ~3 billion rows, 230 GB). Import the dataset using Benchmark Boot.
CREATE TABLE `bmsql_order_line` (
`ol_w_id` int(11) NOT NULL,
`ol_d_id` tinyint(4) NOT NULL,
`ol_o_id` int(11) NOT NULL,
`ol_number` tinyint(4) NOT NULL,
`ol_i_id` int(11) NOT NULL,
`ol_delivery_d` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`ol_amount` decimal(6, 2) DEFAULT NULL,
`ol_supply_w_id` int(11) DEFAULT NULL,
`ol_quantity` int(11) DEFAULT NULL,
`ol_dist_info` char(24) COLLATE latin1_bin DEFAULT NULL,
PRIMARY KEY (`ol_w_id`, `ol_d_id`, `ol_o_id`, `ol_number`)
) ENGINE = InnoDB DEFAULT CHARSET = latin1 DEFAULT COLLATE = latin1_bin
PARTITION BY KEY(`ol_w_id`)
PARTITIONS 32;
Test 1: add a local index
/*+TDDL:cmd_extra(PURE_ASYNC_DDL_MODE=true,ENABLE_TWO_PHASE_DDL=false,ENBALE_DRDS_MULTI_PHASE_DDL=false,MERGE_DDL_CONCURRENT=true,MERGE_CONCURRENT=true,PREFETCH_SHARDS=xxx)*/
ALTER TABLE `bmsql_order_line` ADD LOCAL INDEX `i_ol_o_id`(`ol_o_id`);
| PREFETCH_SHARDS setting | Execution time (s) | Speedup | Avg DN CPU usage | Avg DN IOPS usage |
|---|---|---|---|---|
| 2 — default (DOP per DN: 1) | 6,880 | 1× | 135% | 55% |
| 4 (DOP per DN: 2) | 3,503 | 1.96× | 240% | 95% |
| 8 (DOP per DN: 4) | 2,514 | 2.69× | 470% | 100% |
| 16 (DOP per DN: 8) | 2,152 | 3.19× | 896% | 100% |
| 32 (DOP per DN: 16) | 1,954 | 3.52× | 1,590% | 100% |
IOPS saturates at PREFETCH_SHARDS=8. Beyond that, additional threads improve throughput only modestly while CPU usage continues to climb.
Test 2: add a global index
/*+TDDL:cmd_extra(PURE_ASYNC_DDL_MODE=true,SLIDE_WINDOW_TIME_INTERVAL=0,PHYSICAL_TABLE_BACKFILL_PARALLELISM=xxx,GSI_BACKFILL_SPEED_LIMITATION=xxx)*/
ALTER TABLE `bmsql_order_line` ADD GLOBAL INDEX `g_i_ol_o_id`(`ol_o_id`) PARTITION BY HASH(`ol_o_id`) PARTITIONS 32;
| Setting | Execution time (s) | Speedup | Avg DN CPU usage | Avg DN IOPS usage | Avg CN CPU usage |
|---|---|---|---|---|---|
| PHYSICAL_TABLE_BACKFILL_PARALLELISM=1, GSI_BACKFILL_SPEED_LIMITATION=150000 (no per-table concurrency, rate-limited) | 17,441 | 1× | 190% | 8% | 357% |
| SLIDE_WINDOW_TIME_INTERVAL=0, PHYSICAL_TABLE_BACKFILL_PARALLELISM=4, GSI_BACKFILL_SPEED_LIMITATION=-1 (default per-table DOP, no throttling) | 7,124 | 2.45× | 594% | 23% | 723% |
| SLIDE_WINDOW_TIME_INTERVAL=0, PHYSICAL_TABLE_BACKFILL_PARALLELISM=8, GSI_BACKFILL_SPEED_LIMITATION=-1 (8 threads per physical table, no throttling) | 4,020 | 4.34× | 1,175% | 54% | 1,476% |