This topic describes the enterprise-level state backend storage GeminiStateBackend and compares the performance between GeminiStateBackend and RocksDBStateBackend.

Overview

Stateful computing is a complex and challenging scenario in stream processing. Data access for stream processing has the following characteristics:
  • A large number of random accesses exist and few range queries are performed.
  • Data traffic and hotspots frequently change. In this case, different parallel threads of the same operator use different data access modes.
GeminiStateBackend is a key-value storage engine that is designed for stream processing based on these characteristics. GeminiStateBackend serves as the default state backend of Realtime Compute for Apache Flink and is widely used in the production of Alibaba Group and Alibaba Cloud customers. GeminiStateBackend has the following core design highlights:
  • Uses a new architecture and data structure design to improve the overall data processing performance.

    The overall architecture of GeminiStateBackend is designed based on the log-structured merge-tree (LSM tree) data structure. GeminiStateBackend provides three capabilities: adaption with changes in the data volume and access characteristics, tiered storage of hot and cold data, and switchover between anti-caching and caching architectures. GeminiStateBackend also supports the hash storage structure that allows random access. The performance comparison by using Nexmark shows that GeminiStateBackend provides better performance than RocksDBStateBackend.

  • Supports compute-storage separation to eliminate the dependency of state data on local disks.

    The space of local disks is limited. Therefore, a job that has a large amount of state data often encounters the issue of insufficient local disk space. In most cases, if a job that runs based on RocksDBStateBackend encounters the issue of insufficient local disk space, you need to increase the parallelism of threads or use other methods to increase resources. GeminiStateBackend supports compute-storage separation. This way, state storage can be independent of local disks. This prevents job failures that are caused by excessive local state data. For more information about the configurations of compute-storage separation, see Parameters for compute-storage separation.

  • Supports key-value separation to significantly improve the performance of jobs that involve dual-stream JOIN or multi-stream JOIN.

    Dual-stream JOIN or multi-stream JOIN is one of the most challenging scenarios in stream processing and a typical scenario in which state storage encounters a bottleneck. GeminiStateBackend is developed based on key-value separation to significantly improve the performance of jobs that involve dual-stream JOIN or multi-stream JOIN. The verification during Double 11 Shopping Festival of Alibaba Group shows that the computing resource utilization can be increased by an average of 50% after key-value separation is enabled. In typical scenarios, the computing resource utilization can be increased by 100% to 200%. For more information about the configurations of key-value separation, see Parameters for key-value separation.

  • Supports adaptive parameter tuning.

    In stream processing tasks, different operators have different state access modes. In most cases, different combinations of parameters are required to implement the optimal performance of state storage. The configurations of these parameters involve underlying technologies. Manual parameter tuning has high learning and understanding costs. To resolve this issue, GeminiStateBackend supports the adaptive parameter tuning technology. When a job is running, parameter configurations can be automatically tuned based on the current data access mode and traffic to achieve the optimal performance of state storage in various scenarios. The verification during Double 11 Shopping Festival of Alibaba Group shows that this technology can reduce manual parameter tuning by more than 95% and increase the single-core throughput by 10% to 40%. For more information about the configurations of adaptive parameter tuning, see Parameters for adaptive parameter tuning.

Performance comparison by using Nexmark

In this example, the use cases related to state bottlenecks and hardware resources in Nexmark are used to compare the performance between RocksDBStateBackend and GeminiStateBackend. The comparison result shows that GeminiStateBackend significantly optimizes the overall performance (single-core throughput) of jobs. The following table shows the comparison result.
Case name Gemini TPS/core RocksDB TPS/core Improved by
q4 83.63 K/s 53.26 K/s 57.02%
q5 84.52 K/s 57.86 K/s 46.08%
q8 468.96 K/s 361.37 K/s 29.77%
q9 59.42 K/s 26.56 K/s 123.72%
q11 93.08 K/s 48.82 K/s 90.66%
q18 150.93 K/s 87.37 K/s 72.75%
q19 143.46 K/s 58.5 K/s 145.23%
q20 75.69 K/s 22.44 K/s 237.30%