PolarDB-X supports ultra-large transactions—transactions that span multiple shards, write large volumes of data, or execute a large number of SQL statements. This page describes how PolarDB-X behaves under each of these conditions, the system limits that apply, and how to size your shards accordingly.
Key limits
Two MySQL-level parameters govern the maximum transaction size in PolarDB-X:
| Parameter | Default | Maximum | Modifiable |
|---|---|---|---|
| MAX_ALLOWED_PACKET | 16 MB | — | Yes, via Parameter settings |
| MAX_BINLOG_CACHE_SIZE | 2 GB (2,147,483,648 bytes) | 4 GB (MySQL limit) | No |
These limits apply per shard, not per transaction. The total data a transaction can write is:
- INSERT-only transactions: number of shards × MAX_BINLOG_CACHE_SIZE. The binlog records inserted values only.
- UPDATE-only transactions: number of shards × (MAX_BINLOG_CACHE_SIZE ÷ 2). The binlog records both the before-image and the after-image, doubling the binlog footprint.
To increase transaction capacity, split your table into more shards.
MAX_BINLOG_CACHE_SIZE defaults to 2 GB and cannot be modified in PolarDB-X. Even in standard MySQL, the value cannot exceed 4 GB due to MySQL's own limits.
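The capacity rules above reduce to simple arithmetic. A minimal sketch in Python, assuming the default 2 GiB per-shard cap; the constant and function names are illustrative, not PolarDB-X configuration keys:

```python
# Per-transaction capacity implied by the per-shard binlog cap.
MAX_BINLOG_CACHE_SIZE = 2 * 1024**3  # 2 GiB per shard (default, not modifiable)

def insert_capacity(shards: int) -> int:
    """INSERT-only: the binlog stores each inserted value once."""
    return shards * MAX_BINLOG_CACHE_SIZE

def update_capacity(shards: int) -> int:
    """UPDATE-only: the binlog stores before- and after-images, halving capacity."""
    return shards * (MAX_BINLOG_CACHE_SIZE // 2)
```

For example, `insert_capacity(8)` gives 16 GiB and `update_capacity(8)` gives 8 GiB, matching the scenario results below.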
Test environment
All scenarios on this page use the following setup:
| Component | Specification |
|---|---|
| PolarDB-X version | polarx-kernel_5.4.11-16301083_xcluster-20210805 |
| Node specification | 4 CPU cores, 16 GB memory |
| Number of nodes | 4 |
| Client (Elastic Compute Service (ECS) instance) | 4 CPU cores, 16 GB memory, same CIDR block and vSwitch as the PolarDB-X instance |
The test table tb has the following schema. The data type of column c and the number of shards vary by scenario.
```sql
CREATE TABLE `tb` (
  `id` BIGINT(20) NOT NULL,
  `c` LONGBLOB/CHAR(255)/CHAR(1),
  PRIMARY KEY (`id`)
);
```
Shard configurations tested:
- 1 shard: PolarDB-X optimizes the transaction commit as a one-phase commit.
- 8 shards: split on the `id` column, 2 shards per data node.
- 16 shards: split on the `id` column, 4 shards per data node.
Scenario 1: Moderate number of statements, large data per statement
This scenario tests transactions that write a large total volume of data across a moderate number of INSERT statements.
| Parameter | Value |
|---|---|
| SQL statements per transaction | 2,048 |
| Data per statement | 256 KB–8 MB |
| Total data written | 512 MB–16 GB |
| Rows written | 2,048 |
| Column c type | LONGBLOB |
All statements follow this format:
```sql
INSERT INTO `tb` VALUES (id, c)
```
Each statement must fit within MAX_ALLOWED_PACKET. If you connect via JDBC and send a packet larger than this limit, the following error is reported:
```
CommunicationsException: Communications link failure
```
When all 2,048 INSERTs run in a single non-sharded transaction and the total LONGBLOB data exceeds 2 GB, binlog capacity is exhausted:
```
ERR-CODE: [TDDL-4614][ERR_EXECUTE_ON_MYSQL] Error occurs when execute on GROUP ...:
Multi-statement transaction required more than 'max_binlog_cache_size' bytes of storage;
increase this mysqld variable and try again
```
In a distributed transaction, each branch transaction runs on one shard. The binlog limit applies per shard, so the effective total INSERT capacity is number of shards × MAX_BINLOG_CACHE_SIZE. For 8 shards, the limit is 16 GB.
Conclusions:

- Each SQL statement must fit within MAX_ALLOWED_PACKET, regardless of whether a transaction is active.
- The data written to binlogs per shard per transaction cannot exceed MAX_BINLOG_CACHE_SIZE.
- To write more data in a single transaction, split the table into more shards.
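One practical consequence of the per-statement limit is that large batches must be split client-side so that no single INSERT exceeds MAX_ALLOWED_PACKET. A sketch under that assumption; `chunk_rows` is a hypothetical helper, not a PolarDB-X or JDBC API, and row sizes are approximate encoded sizes you supply yourself:

```python
MAX_ALLOWED_PACKET = 16 * 1024**2  # 16 MB default

def chunk_rows(row_sizes, packet_limit=MAX_ALLOWED_PACKET):
    """Group row indices so each INSERT's payload stays under one packet."""
    chunks, current, current_size = [], [], 0
    for i, size in enumerate(row_sizes):
        if size > packet_limit:
            # A single oversized row can never be sent; JDBC reports a
            # communications failure rather than a clean error.
            raise ValueError(f"row {i} ({size} bytes) exceeds max_allowed_packet")
        if current_size + size > packet_limit:
            chunks.append(current)
            current, current_size = [], 0
        current.append(i)
        current_size += size
    if current:
        chunks.append(current)
    return chunks
```

For instance, five 8 MB rows split into batches of two, two, and one under the 16 MB default.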
Scenario 2: Single statement, large number of rows updated
This scenario tests transactions that execute a single UPDATE touching a large number of rows.
| Parameter | Value |
|---|---|
| SQL statements per transaction | 1 |
| Data per row | ~256 bytes |
| Total data updated | 256 MB–8 GB |
| Rows updated | 1,048,576 (2^20) to 33,554,432 (2^25) |
| Column c type | CHAR(255) |
The table is pre-loaded with 33,554,432 rows. The id column contains sequential values from 0 to 2^25−1; the c column contains 255 consecutive a characters. Each transaction executes:
```sql
UPDATE `tb` SET c = 'BBBBB...BBB' WHERE id < ?
```
Setting ? to x updates x rows, 256 bytes each, for a total of x × 256 bytes. In this test, x ranges from 1,048,576 to 33,554,432.
Because UPDATE statements write both the before-image and after-image to the binlog, each 256-byte update generates 512 bytes of binlog. On a non-sharded table, the binlog limit is reached when about 1 GB of data is updated. With 8 shards, the limit rises to 8 GB.
When the limit is exceeded, the same binlog overflow error from Scenario 1 is reported.
Conclusions:
- For UPDATE transactions, the data updated per shard cannot exceed MAX_BINLOG_CACHE_SIZE ÷ 2 (1 GB by default), because binlogs record both the before-image and after-image.
- Non-sharded table: maximum UPDATE volume is ~1 GB. 8-shard table: ~8 GB.
- To update more data in a single transaction, split the table into more shards.
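These conclusions can be turned into a rough pre-flight check. The sketch below assumes updated rows spread evenly across shards and approximates binlog usage as twice the updated row bytes (before-image plus after-image); real binlog entries also carry event headers and metadata, so treat this as an estimate, not an exact size:

```python
MAX_BINLOG_CACHE_SIZE = 2 * 1024**3  # 2 GiB per shard

def update_fits(rows: int, row_bytes: int, shards: int) -> bool:
    """Estimate whether an UPDATE stays under the per-shard binlog cap."""
    binlog_per_shard = rows * row_bytes * 2 // shards  # 2x: before + after image
    return binlog_per_shard <= MAX_BINLOG_CACHE_SIZE
```

Plugging in the test figures: about 1 GB of 256-byte rows fits on a non-sharded table, 2 GB does not, and 8 GB fits across 8 shards.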
Scenario 3: Large number of statements, small data per statement
This scenario tests transactions that execute a very large number of small statements.
| Parameter | Value |
|---|---|
| SQL statements per transaction | 64,000–1,280,000 |
| Data per statement | Several bytes |
| Total data updated | 32 bytes |
| Rows in table | 32 |
| Column c type | CHAR(1) |
The table contains 32 rows with sequential id values (0–31) and c = 'a'. Each iteration runs the following 64 statements:
```sql
UPDATE `tb` SET c = 'b' WHERE id = 0;
UPDATE `tb` SET c = 'a' WHERE id = 0;
UPDATE `tb` SET c = 'b' WHERE id = 1;
UPDATE `tb` SET c = 'a' WHERE id = 1;
...
UPDATE `tb` SET c = 'b' WHERE id = 31;
UPDATE `tb` SET c = 'a' WHERE id = 31;
```
Conclusion:
PolarDB-X does not hit a performance bottleneck even at 1.28 million SQL statements per transaction, regardless of shard count. In practice, transaction performance is more likely to be limited by data volume than by statement count.
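The workload above can be reproduced with a small generator: each iteration flips all 32 rows to 'b' and back to 'a', so repeating it N times yields a 64 × N-statement transaction. A sketch (the function name is illustrative):

```python
def iteration_statements(rows: int = 32) -> list[str]:
    """One scenario-3 iteration: flip each row to 'b', then back to 'a'."""
    stmts = []
    for i in range(rows):
        stmts.append(f"UPDATE `tb` SET c = 'b' WHERE id = {i};")
        stmts.append(f"UPDATE `tb` SET c = 'a' WHERE id = {i};")
    return stmts
```

Running 1,000 to 20,000 iterations inside one transaction reproduces the 64,000 to 1,280,000 statement counts tested.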
Scenario 4: Effect of shard count on transaction performance
This scenario isolates how the number of shards affects transaction running time, independent of data volume.
| Parameter | Value |
|---|---|
| SQL statements per transaction | 1 |
| Data per row | 1 byte |
| Total data updated | 8 KB–8 MB |
| Shards tested | 1–2,048 |
| Column c type | CHAR(1) |
Each transaction runs:
```sql
UPDATE `tb` SET c = 'b' WHERE id < x
```
This updates x rows and x bytes of data.
Conclusions:
- Small data volume (8 KB): transaction commit time dominates. Running time increases with shard count, roughly linearly above 1,024 shards. Below 64 shards, DML execution time still contributes, so the increase is not strictly linear.
- Large data volume (8 MB): DML execution time dominates. Shard count has little effect on running time.
- More shards increase total transaction capacity but also increase commit overhead. For transactions that operate on small amounts of data across many shards, commit time becomes the bottleneck. Size your shard count based on the data volume your transactions actually process.
Sizing guidance:
| Data volume per transaction | Recommended approach |
|---|---|
| Large (hundreds of MB or more) | Increase shard count to raise capacity; commit overhead is negligible |
| Small (KB range) | Avoid excessive shard counts; commit time dominates and degrades performance |
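The trade-off behind this guidance can be illustrated with a toy cost model: commit time grows with shard count while DML time grows with data volume. The per-shard and per-MB constants below are invented for illustration only, not measured PolarDB-X numbers:

```python
def txn_time(shards: int, data_bytes: int,
             commit_ms_per_shard: float = 0.1,   # assumed commit overhead
             dml_ms_per_mb: float = 100.0) -> float:  # assumed DML cost
    """Toy model: total time = commit term (shards) + DML term (data volume)."""
    return shards * commit_ms_per_shard + (data_bytes / 1024**2) * dml_ms_per_mb
```

With these assumed constants, an 8 KB transaction slows down dramatically between 16 and 2,048 shards (the commit term dominates), while an 8 MB transaction barely changes (the DML term dominates), mirroring the scenario-4 observations.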
Scenario 5: High compute node workload
This scenario tests PolarDB-X stability when a single transaction drives high memory pressure on compute nodes through a large hash join.
The logical table tb is split into 16 shards and contains 67,108,864 (2^26) rows of CHAR(255) data (~16 GB). A second table tb2 has the same schema, data, and size. A temporary table tmp shares the same schema.
Each transaction runs:
```sql
INSERT INTO tmp
SELECT tb.id, tb.c
FROM tb, tb2
WHERE tb.c = tb2.c AND tb.id > x AND tb2.id > x
```
At x = 0, the hash join processes all data from both tables—32 GB in total on compute nodes with 16 GB of memory each.
The execution plan:
```
LogicalInsert(table="tmp", columns=RecordType(BIGINT id, CHAR c))
  Project(id="id", c="c")
    HashJoin(condition="c = c", type="inner")
      Gather(concurrent=true)
        LogicalView(tables="[000000-000015].tb_VjuI", shardCount=16,
          sql="SELECT `id`, `c` FROM `tb` AS `tb` WHERE (`id` > ?)")
      Gather(concurrent=true)
        LogicalView(tables="[000000-000015].tb2_IfrZ", shardCount=16,
          sql="SELECT `c` FROM `tb2` AS `tb2` WHERE (`id` > ?)")
```
Conclusion:
PolarDB-X handles high compute node workloads from large single transactions without errors. Transaction running time scales linearly with the amount of data processed.
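A back-of-envelope check makes the memory pressure concrete: at x = 0 the join reads nearly all rows from both ~16 GB tables, roughly 32 GB against 16 GB of memory per compute node. The helper below is an illustrative estimate using the test's ~256-byte rows, not a measurement of actual operator memory:

```python
def join_input_bytes(rows_per_table: int, row_bytes: int, x: int) -> int:
    """Bytes fed to the hash join when both tables are filtered by id > x
    (ids are sequential from 0 to rows_per_table - 1)."""
    matching = rows_per_table - 1 - x  # rows with id > x
    return 2 * matching * row_bytes    # both join inputs
```

At x = 0 with 2^26 rows of ~256 bytes each, the input is about 32 GB, double the memory of a single compute node; raising x shrinks the input linearly.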
Summary
| Rule | Detail |
|---|---|
| Per-statement size limit | Each SQL statement must fit within MAX_ALLOWED_PACKET (default 16 MB). This applies with or without an active transaction. |
| Per-shard binlog limit | Binlog data per shard per transaction cannot exceed MAX_BINLOG_CACHE_SIZE (default 2 GB, max 4 GB). |
| INSERT capacity | Total insertable data = number of shards × MAX_BINLOG_CACHE_SIZE. For INSERT-only transactions, the binlog records the inserted value only. |
| UPDATE capacity | Total updatable data = number of shards × (MAX_BINLOG_CACHE_SIZE ÷ 2). Binlog records both before-image and after-image, doubling the footprint. |
| Statement count | No performance bottleneck observed at up to 1.28 million statements per transaction. Data volume is the practical limit. |
| Shard count and commit overhead | More shards increase total capacity but also increase transaction commit time. For small-data transactions on many shards, commit time dominates. Size shards to match your actual data volumes. |
| Compute node stability | PolarDB-X delivers stable, linear performance under high compute node workloads caused by large single transactions. |