PolarDB: DingTalk secures App Store top rank with X-Engine

Last Updated: Mar 28, 2026

X-Engine, the storage engine built into PolarDB, cuts storage consumption by about 62% compared to InnoDB and requires no application code changes. DingTalk adopted X-Engine to serve hundreds of millions of users and to absorb the traffic surge that pushed the app to the top of the App Store download charts in 2020.

Why InnoDB was not enough

Enterprise instant messaging (IM) has a storage profile that strains conventional engines like InnoDB:

  • Permanent history: every message must be retained indefinitely, unlike transient sessions in individual messaging apps.

  • Multi-terminal roaming: messages must be accessible and consistent across all of a user's devices.

  • Read receipts: DingTalk must ensure timely delivery acknowledgment for all messages at scale.

InnoDB stores data in fixed-size pages and updates those pages in place. This causes page fragmentation and write amplification: updating a single row can dirty multiple pages, and each dirty page must then be flushed to storage. At DingTalk's scale (hundreds of millions of users, with traffic surging further during COVID-19), the storage volume and I/O cost of this model became a critical engineering constraint.

How X-Engine reduces storage costs

X-Engine uses a log-structured merge-tree (LSM tree) architecture, which writes new data to new read-only pages rather than modifying existing ones. This design enables three complementary cost-reduction techniques.
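To make the "never modify in place" idea concrete, here is a minimal sketch of an LSM-style write path. It is a toy model, not X-Engine's implementation: the class name `TinyLSM`, the `memtable_limit` threshold, and the in-memory tuples standing in for read-only pages are all assumptions made for illustration.

```python
from bisect import bisect_left


class TinyLSM:
    """Toy LSM tree: one mutable in-memory table plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}                    # newest writes, mutable
        self.runs = []                        # immutable sorted runs, newest first

    def put(self, key, value):
        # A write never touches an existing run; it only lands in the memtable.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Freeze the memtable into a new read-only sorted run (a tuple).
        run = tuple(sorted(self.memtable.items()))
        self.runs.insert(0, run)
        self.memtable = {}

    def get(self, key):
        # Search newest data first so later writes shadow older versions.
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:
            i = bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


db = TinyLSM(memtable_limit=2)
db.put("msg:1", "hello")
db.put("msg:2", "world")            # second write triggers a flush
db.put("msg:1", "hello (edited)")   # the update goes to the memtable only
print(db.get("msg:1"))              # the newest version wins on read
print(db.runs[0])                   # the flushed run still holds the old version
```

The old version of `msg:1` remains in the flushed run until a later compaction discards it, which is why an LSM engine needs periodic cleanup.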

Compact pages

Because pages are never modified in place, they can be stored compactly and compressed using prefix encoding. A compaction operation periodically removes invalid records, ensuring only valid records remain.
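As an illustration of prefix encoding, the sketch below stores each key in a sorted page as the number of characters it shares with the previous key plus the remaining suffix. The key layout (`conv:<id>:msg:<seq>`) and the function names are hypothetical; the source does not describe X-Engine's on-disk format.

```python
import os


def prefix_encode(sorted_keys):
    """Encode each key as (shared_prefix_len, suffix) relative to the previous key."""
    encoded, prev = [], ""
    for key in sorted_keys:
        shared = len(os.path.commonprefix([prev, key]))
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded


def prefix_decode(encoded):
    """Rebuild the original keys from (shared_prefix_len, suffix) pairs."""
    keys, prev = [], ""
    for shared, suffix in encoded:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


keys = [
    "conv:1001:msg:0001",
    "conv:1001:msg:0002",
    "conv:1001:msg:0003",
    "conv:1002:msg:0001",
]
encoded = prefix_encode(keys)
raw_chars = sum(len(k) for k in keys)
stored_chars = sum(len(suffix) for _, suffix in encoded)
print(encoded)
print(f"characters stored: {stored_chars} of {raw_chars}")
```

Each entry depends on its predecessor, so this kind of encoding fits pages that are written once and never updated; an in-place update would force neighboring entries to be re-encoded.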

Compared to InnoDB, X-Engine requires only 10–50% of the equivalent storage.
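The compaction step can be sketched as a merge of sorted runs that keeps only the newest version of each key. This is a simplified model under stated assumptions: runs are plain Python lists, `TOMBSTONE` is a hypothetical delete marker, and the merge is assumed to target the bottom level, where delete markers can be discarded safely.

```python
import heapq

TOMBSTONE = object()  # hypothetical marker meaning "this key was deleted"


def compact(runs):
    """Merge sorted runs (newest first) into one run holding only valid records.

    Each run is a list of (key, value) pairs sorted by key, with unique keys.
    """
    # Tag records with their run's age so the newest version of a key sorts first.
    tagged = (
        [(key, age, value) for key, value in run]
        for age, run in enumerate(runs)
    )
    merged, last_key = [], object()
    for key, _age, value in heapq.merge(*tagged, key=lambda rec: rec[:2]):
        if key == last_key:
            continue                     # older, superseded version: drop it
        last_key = key
        if value is not TOMBSTONE:
            merged.append((key, value))  # deleted keys are dropped entirely
    return merged


newer = [("a", "a2"), ("b", TOMBSTONE)]
older = [("a", "a1"), ("b", "b1"), ("c", "c1")]
print(compact([newer, older]))
```

Five input records shrink to two here: one stale version of `a`, the deleted `b`, and its delete marker are all reclaimed.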

Data compression and invalid-record cleanup

Pages can be further compressed using general-purpose algorithms: zlib, zstd, or Snappy. Data at lower levels of the LSM tree is compressed by default. After extensive comparative testing, the X-Engine team selected zstd as the default algorithm for its balance of compression ratio and decompression speed.

Compression trades compute for storage. Select a compression algorithm that matches your workload's tolerance for CPU overhead: algorithms and settings that achieve a lower compression ratio generally run faster.
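One way to get a feel for this trade-off is to compress the same data at several effort levels and compare size and time. The sketch below uses `zlib` only because it ships with the Python standard library; zstd and Snappy typically need third-party bindings. The sample "page" is synthetic, so the numbers say nothing about X-Engine itself.

```python
import time
import zlib


def measure(data, level):
    """Return (compressed size / original size, seconds spent) for one zlib level."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    assert zlib.decompress(packed) == data  # compression must be lossless
    return len(packed) / len(data), elapsed


# Synthetic page of message-like rows (hypothetical layout).
page = b"".join(
    b"conv:1001|user:%04d|text:hello team, meeting at 10\n" % (i % 50)
    for i in range(2000)
)

for level in (1, 6, 9):
    ratio, seconds = measure(page, level)
    print(f"level={level} compressed/original={ratio:.3f} time={seconds * 1e3:.2f} ms")
```

Timings vary by machine and data, so run a measurement like this against a sample of your own rows before settling on an algorithm.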

The X-Engine team also developed field-programmable gate array (FPGA) compaction, which offloads the compaction process to heterogeneous computing hardware. On hosts without FPGA hardware, X-Engine uses a scheduling algorithm that reduces the performance impact of compaction.

Intelligent hot/cold data separation

Most access requests target a small fraction of the total dataset. X-Engine maps this pattern directly onto the LSM tree hierarchy:

  • Frequently accessed (hot) data stays at higher LSM tree levels, on fast storage such as non-volatile memory (NVM) or dynamic random-access memory (DRAM).

  • Infrequently accessed (cold) data is pushed to lower levels on slower, cheaper storage.

The separation algorithm runs during compaction and dump processes:

  • Pages and records with the lowest predicted access probability are moved to the bottom of the LSM tree.

  • Hot data is backfilled into BlockCache and RowCache to prevent cache-hit jitter from degrading performance.

  • An AI-based prefetch algorithm identifies data likely to be accessed soon and preloads it into memory, improving first-access cache hit rates.

Accurate hot/cold classification avoids unnecessary compression and decompression of data that will be immediately accessed, improving overall system throughput.
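The classification idea can be sketched with a deliberately simple stand-in. The source does not say how X-Engine predicts access probability, so the example below assumes raw access frequency as the predictor; `split_hot_cold` and `hot_capacity` are hypothetical names.

```python
from collections import Counter


def split_hot_cold(access_log, all_keys, hot_capacity):
    """Partition keys into (hot, cold) sets using raw access frequency.

    Assumption: frequency stands in for "predicted access probability".
    """
    counts = Counter(access_log)
    # Most frequently accessed first; ties broken by key for determinism.
    ranked = sorted(all_keys, key=lambda k: (-counts[k], k))
    hot = set(ranked[:hot_capacity])    # stays in upper levels and caches
    cold = set(ranked[hot_capacity:])   # candidate for the bottom level
    return hot, cold


access_log = ["m3", "m3", "m1", "m3", "m1", "m2"]
all_keys = ["m1", "m2", "m3", "m4", "m5"]
hot, cold = split_hot_cold(access_log, all_keys, hot_capacity=2)
print("hot:", sorted(hot))
print("cold:", sorted(cold))
```

A production predictor would also weigh recency and access patterns, but the shape is the same: rank by expected access, keep the top of the ranking on fast storage, and move the rest down during compaction.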

Storage efficiency benchmarks

X-Engine storage efficiency was tested against two datasets: Link-Bench and Alibaba internal transaction data.

Comparison

Baseline                           | X-Engine storage reduction
InnoDB with compression enabled    | 2x less storage
InnoDB with compression disabled   | 3–5x less storage
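This document states savings in two units: a percentage saved (about 62%) and a reduction factor ("2x less", "3–5x less"). The small helpers below convert between them so the figures can be compared. They are plain arithmetic, not part of any X-Engine tooling, and the function names are made up for this example.

```python
def reduction_to_factor(percent_saved):
    """Convert 'saves P% of storage' into 'N times less storage'."""
    return 1 / (1 - percent_saved / 100)


def factor_to_remaining(factor):
    """Convert 'N times less storage' into the percentage of the original size left."""
    return 100 / factor


print(f"62% saved is about {reduction_to_factor(62):.2f}x less storage")
for factor in (2, 3, 5):
    print(f"{factor}x less storage leaves {factor_to_remaining(factor):.1f}% of the original size")
```

By this arithmetic, a 62% saving corresponds to roughly a 2.6x reduction, which sits between the compressed-InnoDB result and the uncompressed-InnoDB result.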

Migration compatibility

X-Engine supports transactions and secondary indexes. Existing code running on InnoDB can be migrated to ApsaraDB RDS instances powered by X-Engine without changes.
