BLOB performance optimization - PolarDB - Alibaba Cloud Documentation Center

Writing Binary Large Objects (BLOBs) in MySQL under high concurrency can lead to performance issues, such as lock contention and redo log bottlenecks. PolarDB for MySQL addresses these challenges with kernel-level optimizations that improve the entire lifecycle of BLOB data, including writes, updates, and space reclamation. These features significantly improve the performance of both Data Manipulation Language (DML) and Data Definition Language (DDL) operations, ensuring your applications run efficiently under high-concurrency workloads.

How it works

A BLOB is a data type used to store large objects like images, documents, or long text. In the native MySQL InnoDB engine, when a BLOB exceeds a certain size, it is stored on separate external data pages (off-pages). This mechanism creates three main performance challenges:

Concurrent write bottleneck: Updating an off-page BLOB triggers pessimistic locking. This allows only one thread to perform BLOB writes on a single table at a time, which severely limits concurrency.
Increased I/O pressure: Large object fields generate large redo logs. If disk I/O is insufficient, log flushing becomes a bottleneck and blocks foreground writes.
Delayed space reclamation: The cleanup (Purge) of old BLOB data is inefficient. This can lead to storage space bloat and performance degradation.

PolarDB's optimizations solves these problems at the kernel level without changing how your application uses BLOBs. The following diagrams illustrates how it works:

Billing

The BLOB performance optimization feature is built into the PolarDB kernel and is free of charge.

Optimize BLOB write performance

If you experience write throughput bottlenecks or high latency for concurrent writes on BLOB tables during peak business hours, the cause is usually write lock contention. PolarDB provides page pre-allocation and index latch optimization to solve this problem.

Pre-allocated pages

This optimization breaks down the BLOB write process into faster, more concurrent steps:

Calculate and request all required data pages in a single batch.
Copy the data without a lock and fill the newly allocated pages with the BLOB content.
Attach the generated data page chain to the index record using a single optimistic lock operation.

During this process, only the initial space allocation phase is exclusive. The time-consuming data copy operation can run in parallel with other BLOB writes. This design minimizes the time a global lock is held, dramatically improving concurrent write throughput.

Index latch optimization

For tables with small (under 8 KB) or mixed-size BLOBs, native MySQL often abandons optimistic locking too early during an update and falls back to an inefficient pessimistic lock. The latch optimization solves this by performing more work on the optimistic path.

Maintain optimistic path: When the system detects that a BLOB field may need an update, the optimization logic does not immediately switch to a pessimistic lock. Instead, it continues to hold a shared lock (S-latch), which has a lower probability of conflict, and remains on the optimistic update path.
Pre-build and attempt update: On the optimistic path, it pre-builds the BLOB data structure required for the update and tries to modify the index page directly.
Check and commit or rollback:
- Success: If the updated record fits on the current index page, the operation completes quickly on the optimistic path.
- Failure: If the page has insufficient space, the optimization automatically gives up the attempt. It then seamlessly rolls back to the native MySQL pessimistic locking path to complete the update operation, which ensures data consistency.

This mechanism significantly increases the success rate of optimistic updates, boosting throughput for mixed-size BLOB workloads under high concurrency.

Enable write optimization

To enable the optimization, go to the Configurations And Management > Parameter Settings page in the PolarDB console to view and modify the following parameter:

Parameter	Description	Supported versions
`innodb_blob_prepare_pages`	Enables or disables BLOB page pre-allocation optimization. Valid values: ON: enabled OFF: disabled	MySQL 5.6: Not supported. MySQL 5.7: The minor engine version must be 5.7.1.0.35 or later. MySQL 8.0.1: The minor engine version must be 8.0.1.1.47 or later. MySQL 8.0.2: Not supported.
`innodb_blob_prepare_max_extern_size`	Sets the maximum length of a single BLOB column for BLOB page pre-allocation optimization.
`innodb_blob_latch_optimize`	Enables or disables BLOB index latch optimization.

Reduce redo log volume for BLOB writes

If your application writes a large amount of BLOB data and the redo log generation rate becomes an I/O bottleneck, you can enable redo compression to alleviate the I/O pressure.

This feature compresses redo log records for BLOBs in parallel before they are written to the log buffer. It can reduce the actual volume of redo logs written to disk by an average of 40%-60%, which lowers I/O pressure and prevents the redo module from impacting overall write performance. The logs are automatically decompressed during a database crash recovery.

Enable redo compression

To enable the optimization, go to the Configurations And Management > Parameter Settings page in the PolarDB console to view and modify the following parameters:

Parameter

Description

Supported versions

innodb_blob_redo_compress_algorithm

Sets the compression algorithm for BLOB redo logs. Valid values:

none (default)
lz4
zstd
zlib

MySQL 5.6: Not supported.
MySQL 5.7: The minor engine version must be 5.7.1.0.39 or later.
MySQL 8.0.1: Not supported.
MySQL 8.0.2: Not supported.

innodb_blob_redo_compress_level

Sets the compression level for the selected compression algorithm.

The value ranges from 0 to 9. The default value is 6.

Accelerate BLOB space reclamation

If your table space continues to grow even after you frequently delete or update BLOB data, the cause is the inefficient background Purge process in native MySQL. The native Purge process holds a core lock while performing potentially slow disk I/O to read BLOB pages. PolarDB separates these two operations to solve this bottleneck.

Identify target pages: Before locking and starting the cleanup, the Purge process first identifies the complete BLOB external page chain that needs to be reclaimed.
Asynchronous pre-read: It initiates asynchronous I/O requests to load the target pages that are not yet in memory from the disk into the buffer pool.
Fast cleanup: After all pages are loaded into memory, the Purge process acquires core locks, such as Index SX. It then quickly performs in-memory page releases and metadata modifications.

This mechanism avoids performing any disk I/O while holding a critical lock. It dramatically shortens the lock-holding time, which not only improves BLOB data reclamation efficiency but also reduces the impact of background cleanup on foreground operations.

Enable Purge pre-read

To enable the optimization, go to the Configurations And Management > Parameter Settings page in the PolarDB console to view and modify the following parameter:

Parameter

Description

Supported versions

innodb_purge_blob_read_ahead

Enables or disables page pre-read optimization for the BLOB Purge process.

Valid values:

ON: enabled
OFF: disabled

Supported on all versions.

Improve DDL execution speed

If ALTER TABLE operations on tables with large amounts of BLOB data are too slow, you can enable DDL-specific BLOB optimizations to significantly reduce the execution time.

Page pre-allocation optimization: Because BLOB overflow columns are part of the primary table, DDL operations that rebuild the primary table also involve writing BLOB fields. Page pre-allocation optimization uses a more direct and simplified locking method to write BLOB data. This avoids the overhead of handling complex concurrency conflicts and significantly improves rebuild efficiency.
Latch optimization: Parallel DDL splits the primary key tree, so multiple worker threads do not operate on the same data page. This optimization protects the new page allocation operation independently, which allows other operations to run concurrently and improves efficiency.
redo optimization:
1. Page aggregation: During a bulk load of data, instead of generating a redo log for each record, the system writes the changes of an entire data page as a single large redo record after the page is full. This improves redo operation efficiency.
2. Aggregation compression: The system compresses the large, aggregated redo records to further reduce the amount of redo logs generated during the DDL process.

Enable DDL optimization

To enable the optimization, go to the Configurations And Management > Parameter Settings page in the PolarDB console to view and modify the following parameters:

Note

Some parameters in the following table cannot be modified manually. To change these parameters, submit a ticket to contact us.

Parameter	Description	Supported versions
`innodb_blob_prepare_pages_in_ddl`	Specifies whether to enable large object optimization during DDL processes. Valid values: OFF (default) ON	MySQL 5.6: Not supported. MySQL 5.7: The minor engine version must be 5.7.1.0.39 or later. MySQL 8.0.1: The minor engine version must be 8.0.1.1.50 or later. MySQL 8.0.2: Not supported.
`innodb_bulk_load_without_index_lock_enable`	Specifies whether to enable index lock optimization during DDL processes. Valid values: OFF (default) ON	MySQL 5.6: Not supported. MySQL 5.7: The minor engine version must be 5.7.1.0.39 or later. MySQL 8.0.1: The minor engine version must be 8.0.1.1.47 or later. MySQL 8.0.2: The minor engine version must be 8.0.2.2.30 or later.
`innodb_bulk_load_page_grained_redo_enable`	Specifies whether to enable redo page aggregation during DDL bulk loading. Valid values: OFF ON (default)	MySQL 5.6: Not supported. MySQL 5.7: Supported. No minor engine version restrictions. MySQL 8.0.1: Supported. No minor engine version restrictions. MySQL 8.0.2: Supported. No minor engine version restrictions.
`innodb_bulk_load_redo_compress_algorithm`	Sets the redo compression algorithm for DDL bulk loading. Valid values: `none` (default) `lz4` `zstd`	MySQL 5.6: Not supported. MySQL 5.7: The minor engine version must be 5.7.1.0.39 or later. MySQL 8.0.1: Not supported. MySQL 8.0.2: The minor engine version must be 8.0.2.2.30 or later.
`innodb_bulk_load_redo_compress_enable`	Sets the application target for redo compression during DDL processes. Valid values: 0: none (default) 1: secondary index 2: all	MySQL 5.6: Not supported MySQL 5.7: The minor engine version must be 5.7.1.0.39 or later. MySQL 8.0.1: Not supported MySQL 8.0.2: The minor engine version must be 8.0.2.2.30 or later.

Related system optimizations

To achieve the best possible performance, PolarDB also optimizes related system modules that can become bottlenecks in high-throughput BLOB scenarios:

Parallel asynchronous redo writes: The system shards data in the redo buffer and uses multiple threads to issue asynchronous I/O tasks concurrently. This greatly improves the write throughput of redo logs, achieving up to 4 GB/s in tests.
File extension optimization: Based on PolarDB's self-developed distributed file system, tablespace extension operations only need to modify a small amount of metadata. The native file zero-filling operation is removed, so the time and lock overhead of file extension are no longer a bottleneck.
Lock-free dirty page flushing: The system uses Shadow Page technology. When flushing dirty pages, it first builds an in-memory copy, releases the page lock, and then uses the copy for I/O operations. This avoids blocking write requests by holding page locks during dirty page flushing.

Performance data

The following data shows the performance improvements in DML and DDL scenarios after enabling BLOB optimization.

DML performance In high-concurrency scenarios with row lengths of 100 KB and 200 KB, PolarDB's optimization solution for inserts and updates provides a performance improvement of nearly three times. When combined with redo compression, the performance can be improved by four to five times.
DDL performance For a 40 GB table that contains BLOB fields, the DDL write rate increases by 5 to 6 times and the total execution time is reduced by 84% after the optimization is enabled.