GeminiStateBackend is a key-value (KV) storage engine built for stream processing, and the default state backend for Realtime Compute for Apache Flink. This topic describes its core design and compares its performance with RocksDBStateBackend.
## When to use GeminiStateBackend
GeminiStateBackend is the default backend for Realtime Compute for Apache Flink.
GeminiStateBackend is particularly well suited for the following scenarios:
| Scenario | Why GeminiStateBackend helps |
|---|---|
| Jobs with large state — state data exceeds or risks exceeding local disk capacity | Decouples storage and compute so state operates independently of local disks |
| Dual-stream or multi-stream joins — low join success rates or large state values | KV separation significantly improves throughput in state-intensive join workloads |
| Jobs sensitive to checkpoint duration — slow or unstable checkpoints in large-state jobs | Decouples checkpoints from the LSM compaction mechanism to make them faster and more predictable |
| Operators with varied access patterns — different operators in the same job need different tuning | Adaptive parameter tuning eliminates manual configuration across all operators |
## How it works
Stream processing places two demands on state storage:
- High random-access volume, few range queries: state lookups are mostly point reads, not scans.
- Dynamic traffic and hot spots: access patterns shift frequently, and different concurrent instances of the same operator can behave differently.
GeminiStateBackend addresses these demands with an architecture built on a Log-Structured Merge-tree (LSM tree). It combines three mechanisms:
- Adaptive adjustments based on data scale and access patterns
- Tiered storage for hot and cold data
- Flexible switching between anti-caching and caching architectures
A hash-based storage structure handles random queries on top of this foundation.
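To make the point-read argument concrete, here is a toy sketch (not Gemini's actual code) of why a hash index over an append-only, LSM-style log serves point-read-heavy state access well: a lookup is a single hash probe rather than a search over sorted runs. All class and variable names are illustrative.

```python
# Toy KV store: an in-memory hash index maps each key to its position
# in an append-only (LSM-style) value log. Illustrative only.

class HashIndexedStore:
    def __init__(self):
        self.log = []      # append-only log of (key, value); newest entry wins
        self.index = {}    # hash index: key -> offset into the log

    def put(self, key, value):
        self.index[key] = len(self.log)   # O(1) index update
        self.log.append((key, value))

    def get(self, key):
        # A point lookup is a single O(1) hash probe: no tree walk,
        # no binary search over sorted runs.
        pos = self.index.get(key)
        return None if pos is None else self.log[pos][1]

store = HashIndexedStore()
store.put("user:42", {"clicks": 3})
store.put("user:42", {"clicks": 4})   # overwrite: index now points at the newer entry
print(store.get("user:42"))           # -> {'clicks': 4}
```

The trade-off, as the demands above suggest, is that a hash index gives up efficient range scans, which matters little when state access is dominated by point reads.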
## Key capabilities
### Storage and compute decoupling
Problem: When local disk space is limited, jobs with large state fail from disk exhaustion. With RocksDBStateBackend, the typical workaround is to add resources, such as increasing concurrency, just to stay within the disk limit.
Solution: GeminiStateBackend decouples state storage from compute, so state operates independently of local disks and jobs no longer fail when state data exceeds local disk capacity.
For configuration details, see Storage and compute decoupling configuration.
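The mechanism can be pictured with a hypothetical two-tier store: when the local tier fills up, cold entries spill to remote storage instead of failing the job. Class names, the LRU policy, and the capacity threshold below are all illustrative, not Gemini's real implementation.

```python
# Hedged sketch of storage-compute decoupling: cold state spills from a
# bounded local tier to remote storage instead of exhausting local disk.
from collections import OrderedDict

class TieredStateStore:
    def __init__(self, local_capacity):
        self.local_capacity = local_capacity
        self.local = OrderedDict()   # hot tier (LRU order); stands in for local disk
        self.remote = {}             # cold tier; stands in for remote storage

    def put(self, key, value):
        self.local[key] = value
        self.local.move_to_end(key)
        # Evict least-recently-used entries to the remote tier instead of
        # failing when the local tier is full.
        while len(self.local) > self.local_capacity:
            cold_key, cold_value = self.local.popitem(last=False)
            self.remote[cold_key] = cold_value

    def get(self, key):
        if key in self.local:
            self.local.move_to_end(key)
            return self.local[key]
        if key in self.remote:            # local miss: fetch back from remote
            value = self.remote.pop(key)
            self.put(key, value)
            return value
        return None

store = TieredStateStore(local_capacity=2)
for i in range(5):
    store.put(f"k{i}", i)
# Only 2 entries fit locally; the other 3 spilled instead of crashing the job.
print(len(store.local), len(store.remote))   # -> 2 3
```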
### Adaptive KV separation
Problem: Dual-stream and multi-stream joins are among the most state-intensive workloads in stream processing. When join success rates are low or state values are large, state storage becomes the bottleneck.
Solution: GeminiStateBackend introduces KV separation to address this. The feature is fully adaptive and requires no extra configuration or tuning. During the Double 11 shopping festival, Alibaba Group's core services verified the following results:
- Job throughput capacity increased by 50%–70%
- Average compute resource utilization increased by 50%
- In the most-improved scenarios, utilization increased by 100%–200%
For configuration details, see KV separation configuration.
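The core idea of KV separation (in the spirit of WiscKey-style designs) can be sketched as follows: large values go to a separate value log, while the LSM keyspace stores only small pointers, so compaction rewrites pointers rather than copying big join-state values. The threshold and names below are illustrative, not Gemini's internals.

```python
# Illustrative KV-separation sketch: large values live in an append-only
# value log; the keyspace holds only (key -> pointer) for them.

VALUE_SEPARATION_THRESHOLD = 32  # bytes; made-up cutoff for illustration

class KVSeparatedStore:
    def __init__(self):
        self.keyspace = {}    # stands in for the LSM tree: key -> small entry
        self.value_log = []   # append-only log holding large values

    def put(self, key, value):
        if len(value) >= VALUE_SEPARATION_THRESHOLD:
            self.value_log.append(value)
            # The keyspace keeps only a pointer, which is cheap for
            # compaction to rewrite.
            self.keyspace[key] = ("ptr", len(self.value_log) - 1)
        else:
            self.keyspace[key] = ("inline", value)

    def get(self, key):
        kind, payload = self.keyspace[key]
        return self.value_log[payload] if kind == "ptr" else payload

store = KVSeparatedStore()
store.put("small", b"ok")          # stored inline in the keyspace
store.put("big", b"x" * 100)       # separated into the value log
print(store.get("big") == b"x" * 100)   # -> True
```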
### Lightweight job snapshots
Problem: LSM compaction can interfere with checkpoint and snapshot completion, making them slower and less stable for large-state jobs.
Solution: GeminiStateBackend supports more fine-grained job snapshots and decouples checkpoints from the LSM compaction mechanism, which makes checkpoints faster and more stable. It also supports native incremental savepoints. Combined with the native snapshots provided by Realtime Compute for Apache Flink, savepoint performance approaches that of checkpoints, greatly improving snapshot availability for large-state jobs.
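The incremental idea behind such snapshots can be sketched in a few lines: persist only state files whose content is not already in the previous snapshot, so snapshot cost tracks the delta rather than total state size. This is a hedged illustration of the general technique, not Gemini's actual snapshot protocol; all names are made up.

```python
# Hedged sketch of an incremental snapshot: upload only files not already
# persisted by the previous snapshot.

def incremental_snapshot(current_files, previous_hashes):
    """current_files: {filename: content_hash} of live state files.
    previous_hashes: set of content hashes already persisted."""
    to_upload = {name: h for name, h in current_files.items()
                 if h not in previous_hashes}
    manifest = dict(current_files)   # full logical view of this snapshot
    return manifest, to_upload

prev = {"h1", "h2"}                                   # persisted by the last snapshot
live = {"sst-1": "h1", "sst-2": "h2", "sst-3": "h3"}  # current state files
manifest, to_upload = incremental_snapshot(live, prev)
print(sorted(to_upload))   # -> ['sst-3']  only the new file is uploaded
```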
### Adaptive parameter tuning
Problem: Different operators within the same job often have different state access patterns and require different parameter combinations for optimal performance. Manual tuning across all operators is impractical at scale.
Solution: GeminiStateBackend automatically adjusts parameters at runtime based on current access patterns and traffic. During the Double 11 shopping festival, Alibaba Group's core services verified the following results:
- Manual tuning eliminated in over 95% of cases
- Single-core throughput capacity increased by 10%–40%
For configuration details, see Adaptive parameter tuning configuration.
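As a rough intuition for runtime tuning, consider a toy feedback loop that grows a per-operator cache while the observed hit rate is poor and shrinks it when memory is wasted. The thresholds and step sizes here are invented for illustration; Gemini's actual tuning policy is internal and not described by this sketch.

```python
# Toy feedback loop for adaptive parameter tuning: adjust a cache size
# based on the observed hit rate. All constants are illustrative.

def tune_cache_size(cache_mb, hit_rate, lo=0.80, hi=0.98, step=0.25,
                    min_mb=64, max_mb=4096):
    if hit_rate < lo:                    # too many misses: grow the cache
        cache_mb = min(max_mb, int(cache_mb * (1 + step)))
    elif hit_rate > hi:                  # nearly all hits: reclaim memory
        cache_mb = max(min_mb, int(cache_mb * (1 - step)))
    return cache_mb

size = 256
for observed_hit_rate in [0.60, 0.70, 0.85, 0.99]:
    size = tune_cache_size(size, observed_hit_rate)
print(size)   # -> 300  (grew twice, held steady, then shrank)
```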
## Nexmark performance comparison
The following results are from Nexmark state-bottlenecked use cases, tested on identical hardware. Performance is measured as single-core throughput capacity (TPS/Core).
The Nexmark website is a third-party site. Access may be slow or unavailable.
| Case | Gemini TPS/Core | RocksDB TPS/Core | Improvement |
|---|---|---|---|
| q4 | 83.63 K/s | 53.26 K/s | 57.02% |
| q5 | 84.52 K/s | 57.86 K/s | 46.08% |
| q8 | 468.96 K/s | 361.37 K/s | 29.77% |
| q9 | 59.42 K/s | 26.56 K/s | 123.72% |
| q11 | 93.08 K/s | 48.82 K/s | 90.66% |
| q18 | 150.93 K/s | 87.37 K/s | 72.75% |
| q19 | 143.46 K/s | 58.50 K/s | 145.23% |
| q20 | 75.69 K/s | 22.44 K/s | 237.30% |
In about half of these cases, GeminiStateBackend outperforms RocksDBStateBackend by over 70%.
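The "Improvement" column follows directly from the two throughput columns: improvement = Gemini TPS/Core divided by RocksDB TPS/Core, minus 1, expressed as a percentage. The snippet below reproduces the column from the table's raw numbers.

```python
# Recompute the "Improvement" column: 100 * (gemini / rocksdb - 1).

results = {                     # case: (Gemini K/s, RocksDB K/s), from the table above
    "q4": (83.63, 53.26), "q5": (84.52, 57.86), "q8": (468.96, 361.37),
    "q9": (59.42, 26.56), "q11": (93.08, 48.82), "q18": (150.93, 87.37),
    "q19": (143.46, 58.50), "q20": (75.69, 22.44),
}
for case, (gemini, rocksdb) in results.items():
    print(f"{case}: {100 * (gemini / rocksdb - 1):.2f}%")
# -> q4: 57.02% ... q20: 237.30%, matching the table
```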
## What's next
- To create, view, and delete state sets and recover from a specified state, see Manage job state sets.
- For migration efficiency and job performance differences when moving state data from RocksDB to Gemini, see Overview.
- For the compatibility impact of SQL modifications, see SQL modifications and compatibility.
- To test Realtime Compute for Apache Flink performance using Nexmark, see Performance Whitepaper (Nexmark Performance Testing).
- For frequently asked questions about system checkpoints or job snapshots, see System checkpoints or job snapshots.