This topic describes the major updates and bug fixes of the Realtime Compute for Apache Flink version released on September 19, 2022.
Overview
This release ships two new Ververica Runtime (VVR) versions: VVR 4.0.15 for Apache Flink 1.13 and VVR 6.0.2 for Apache Flink 1.15.
VVR 6.0.2 is the first enterprise-grade Flink engine built on Apache Flink 1.15, bringing upstream improvements to window table-valued functions, CAST functions, type systems, and JSON functions to the cloud platform.
State management is now centralized. Checkpoints and savepoints are managed independently in a status set, decoupled from the deployment lifecycle. Savepoints are no longer deleted when you cancel a deployment. The native savepoint format significantly improves savepoint creation and restoration speed and reduces storage overhead. Object Storage Service (OSS) storage costs drop by 15–40% per year through status set management. You can also start a deployment from a savepoint that belongs to a different deployment, which simplifies A/B testing and dual-run validation.
Scheduled tuning lets you define time-based resource policies for deployments with predictable peak and off-peak hours, reducing manual intervention and labor costs.
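The idea of a time-based resource policy can be illustrated with a small plain-Python sketch. It is not the platform's configuration format or API: the `ResourcePlan` class, the window boundaries, and the parallelism values are all hypothetical and only show how a time of day could map to a preset resource size.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ResourcePlan:
    """Hypothetical resource size for one time window (not a platform type)."""
    name: str
    start: time        # inclusive
    end: time          # exclusive
    parallelism: int


# Illustrative policy: scale up for a daytime peak, scale down otherwise.
PLANS = [
    ResourcePlan("peak", time(9, 0), time(21, 0), parallelism=16),
    ResourcePlan("off-peak", time(21, 0), time(9, 0), parallelism=4),
]


def in_window(now: time, start: time, end: time) -> bool:
    """Return True if now is in [start, end), allowing windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def plan_for(now: time, plans=PLANS) -> ResourcePlan:
    """Pick the first plan whose window contains the given time of day."""
    for plan in plans:
        if in_window(now, plan.start, plan.end):
            return plan
    raise ValueError(f"no plan covers {now}")


print(plan_for(time(10, 30)).name)
print(plan_for(time(23, 15)).name)
```

The off-peak window wraps past midnight, which is why the window check handles the case where the start time is later than the end time.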
This version also supports quick task restart, which speeds up recovery from deployment failover and improves business continuity. If your business can tolerate duplicate and lost data and has high continuity requirements, you can configure quick task restart to recover failed tasks quickly. The delay caused by deployment failover can drop from minutes to as little as milliseconds.
In this version, the feature cannot prevent data duplication or data loss, so confirm that your business can tolerate both before you use it. Quick task restart is disabled by default. To enable it for a deployment, you must add configuration items. For the principles and configuration details, see Configure quick task restart.
Health score introduces a diagnostic scoring model for deployments in any state. The feature runs expert rules against your deployment and surfaces actionable suggestions.
Flink complex event processing (CEP) has been verified in production and is now available to all users. The hot update feature lets you update CEP rules during peak hours without restarting the deployment, eliminating the 10-minute task rerelease interruption that risk-control workloads previously experienced. CEP SQL syntax is also enhanced: the new MATCH_RECOGNIZE extensions let you express complex patterns in SQL instead of DataStream API code, and new metrics (patternMatchedTimes, patternMatchingAvgTime) give you visibility into pattern-matching behavior.
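As a rough illustration of one kind of rule that CEP expresses, the following plain-Python sketch reports keys whose "start" event is not followed by an "end" event within an interval. It is neither Flink code nor MATCH_RECOGNIZE syntax. The `Event` type, the event kinds, and the `unmatched_starts` function are hypothetical names chosen for the example, and the input is assumed to be sorted by timestamp.

```python
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    key: str    # hypothetical correlation key, for example an order ID
    kind: str   # "start" or "end"
    ts: int     # event time in seconds


def unmatched_starts(events: Iterable[Event], within: int, now: int) -> list[str]:
    """Return keys whose "start" had no "end" within `within` seconds.

    Events are assumed to be sorted by timestamp. A start without any end is
    reported only if the interval has fully elapsed at time `now`.
    """
    pending: dict[str, int] = {}   # key -> timestamp of the open start
    timed_out: list[str] = []
    for ev in events:
        if ev.kind == "start":
            pending[ev.key] = ev.ts
        elif ev.kind == "end" and ev.key in pending:
            started = pending.pop(ev.key)
            if ev.ts - started > within:
                timed_out.append(ev.key)   # the end arrived, but too late
    # Starts that never saw an end and whose interval has expired.
    timed_out.extend(k for k, ts in pending.items() if now - ts > within)
    return timed_out


events = [
    Event("a", "start", 0),
    Event("b", "start", 5),
    Event("a", "end", 8),
    Event("c", "start", 20),
    Event("b", "end", 40),
]
print(unmatched_starts(events, within=10, now=45))
```

A streaming engine evaluates such rules incrementally with timers and state rather than over a finished list, so the sketch shows only the matching logic, not how it is executed.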
Data integration: A new version of the platform API is available to help you integrate the product into your own services.
Performance: Dual-stream Join deployments see an average 40%+ performance improvement through automatic key-value separation inference in GeminiStateBackend. Deployment startup speed improves by an average of 15%.
Connector and catalog additions: Hive Catalog now supports Hive 2.1.0–2.3.9 and Hive 3.1.0–3.1.3. The built-in Java Database Connectivity (JDBC) connector supports source, dimension, and sink tables. Tablestore incremental log reading, AnalyticDB for MySQL catalog, and database synchronization to Kafka are also included in this release.
New features
<table> <thead> <tr> <td><p><b>Feature</b></p></td> <td><p><b>Description</b></p></td> <td><p><b>References</b></p></td> </tr> </thead> <colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup> <tbody> <tr> <td><p>Status set management</p></td> <td><p>Manage all checkpoints and savepoints for a deployment in one place, independent of start and stop operations. Savepoints are no longer deleted when you cancel a deployment. Create and delete savepoints on a dedicated management page on a schedule you define.</p></td> <td> <ul> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/444393.html#task-2233675">Status set management</a></p></li> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/188702.html#task-2047214">Start a Flink SQL deployment</a></p></li> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/207352.html#task-2047214">Start a Flink JAR deployment</a></p></li> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/208019.html#task-2047214">Start a Flink Python deployment</a></p></li> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/197389.html#task-2021003">Cancel a deployment</a></p></li> </ul></td> </tr> <tr> <td><p>Scheduled tuning</p></td> <td><p>Define time-based resource policies for deployments with clear peak and off-peak patterns. Resources are automatically adjusted to preset sizes at the configured times, reducing data fluctuations and eliminating manual scaling.</p></td> <td> <ul> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/173651.html#task-2563742">Configure automatic tuning</a></p></li> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/290059.html#task-2101351">Configure a deployment</a></p></li> </ul></td> </tr> <tr> <td><p>Health score</p></td> <td><p>Get a diagnostic health score for any deployment that is starting or running. 
The feature applies expert rules to detect issues and surfaces suggestions, giving you a clearer view of deployment status without manual investigation.</p></td> <td><p><a href="https://www.alibabacloud.com/help/en/document_detail/405994.html#task-2177393">Perform intelligent deployment diagnostics</a></p></td> </tr> <tr id="row_ypf_ui4_agw" props="china"> <td id="entry_mvi_l5k_42h" colspan="1" rowspan="1"><p id="b8528f0037sgw">OpenAPI</p></td> <td id="entry_q2l_saj_3qf" colspan="1" rowspan="1"><p id="b8528f0137k6j">Fully managed Flink provides a new version of the API to help you integrate the product into your own services. The new API is partially incompatible with the earlier SDK version but covers the complete set of core product capabilities.</p></td> <td id="entry_d2q_zm9_2sl" colspan="1" rowspan="1"><p id="b8528f02375ss"><a data-tag="xref" id="xref_h1j_f8i_ofv" href="t2004183.dita#concept_354409" baseurl="t2238209_v3_0_0.dita" data-node="2614651"></a></p></td> </tr> <tr id="row_ap6_i95_4aw" props="limitout"> <td id="entry_mci_2u9_c6q" colspan="1" rowspan="1"><p id="b852b61037ss6">Batch operations on deployments</p></td> <td id="entry_51v_b9r_mr4" colspan="1" rowspan="1"><p id="b852b61137bbc">Start or cancel multiple deployments at the same time, so that you can quickly start or stop the target deployments when you handle issues in batches. For example, if a shared UDF fails, you can cancel all online deployments that use the UDF and restart them after the issue is fixed.</p></td> <td id="entry_i20_93v_y85" colspan="1" rowspan="1"><ul data-tag="ul" id="ul_o72_7xf_ehb"><li data-tag="li" id="li_8c3_6iy_d69"><p id="b852b612378or"><a data-tag="xref" id="xref_siq_k48_s2t" href="t2021003.dita#task_2021003" baseurl="t2238209_v3_0_0.dita" data-node="2699643">Cancel a deployment</a></p></li><li data-tag="li" id="li_tm7_e0q_a45"><p id="b852dd2037qlh"><a data-tag="xref" id="xref_482_0yj_0i4" href="t1990470.dita#task_2047214" baseurl="t2238209_v3_0_0.dita" data-node="2580259"></a></p></li></ul></td> </tr> <tr> <td><p>Process optimization on granting permissions to an account</p></td> <td><p>All RAM users are now listed for selection when you grant permissions to a namespace. 
Manual entry of RAM user information is no longer required.</p></td> <td><p><a href="https://www.alibabacloud.com/help/en/document_detail/175771.html#task-2569065">Grant permissions on namespaces</a></p></td> </tr> <tr> <td><p>Flink CEP</p></td> <td><p>Complex event processing (CEP) is a capability for matching patterns over real-time data streams. CEP rules are stored in an external database and loaded dynamically via the DataStream API, so rule updates take effect without restarting the deployment. The hot update capability eliminates the 10-minute task rerelease interruption that previously affected risk-control workloads during peak hours.</p></td> <td> <ul> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/459860.html#concept-2258817">Definitions of rules in the JSON format in dynamic Flink CEP</a></p></li> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/459880.html#task-2258065">Get started with dynamic Flink CEP</a></p></li> </ul></td> </tr> <tr> <td><p>Enhancement of CEP SQL</p></td> <td><p>The MATCH_RECOGNIZE statement now supports extended SQL syntax for expressing CEP rules, including output of events that do not arrive within a specified interval and relaxed non-contiguity via <code>notFollowedBy()</code>. 
This lets you simplify complex DataStream deployments into SQL deployments and integrate them into data governance lineage systems.</p> <p>Two new metrics are available:</p> <ul> <li><p><b>patternMatchedTimes:</b> number of successful pattern matches.</p></li> <li><p><b>patternMatchingAvgTime:</b> average duration of pattern matches.</p></li> </ul></td> <td><p><a href="https://www.alibabacloud.com/help/en/document_detail/457157.html#concept-2255968">CEP statements</a></p></td> </tr> <tr id="row_gv2_esw_186" props="limitout"> <td id="entry_yor_nnd_8cg" colspan="1" rowspan="1"><p id="b853796237v8x">Fast recovery capability in case of deployment failover</p></td> <td id="entry_4km_bcz_fyb" colspan="1" rowspan="1"><p id="b853796337q2u">After the quick task restart feature is enabled, you can restart only the failed task to reduce the impact of deployment failover on the deployment if an exception occurs.</p><note data-tag="note" id="note_mbm_n56_rcb" type="warning"><p id="b853796437g08">This is an experimental feature. Make sure that your business is tolerant of loss and duplicate copies of data before you use the feature.</p></note></td> <td id="entry_lz5_tjx_68u" colspan="1" rowspan="1"><p id="b853796537vfh"><a data-tag="xref" id="xref_ch2_c99_jxn" href="t2236600.dita#task_2236600" baseurl="t2238209_v3_0_0.dita" data-node="3763982"></a></p></td> </tr> <tr> <td><p>Database synchronization that supports synchronizing data to Kafka</p></td> <td><p>Write data from MySQL database tables to the corresponding Upsert Kafka tables. 
Downstream deployments can then read from the Kafka tables instead of the MySQL tables, which reduces the load on the MySQL database when multiple deployments consume the same data.</p></td> <td> <ul> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/446799.html#task-2240113">Synchronize data from all tables in a MySQL database to Kafka</a></p></li> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/428806.html#task-2159375">Manage Kafka JSON catalogs</a></p></li> </ul></td> </tr> <tr> <td><p>A DDL statement that defines a partitioned table in a Hologres result table</p></td> <td><p>PARTITION BY is supported when creating a Hologres sink table.</p></td> <td><p><a href="https://www.alibabacloud.com/help/en/document_detail/374303.html#concept-2156279">CREATE TABLE AS statement</a></p></td> </tr> <tr> <td><p>Timeout period for performing an asynchronous request in a Hologres dimension table</p></td> <td><p>Set the <code>asyncTimeoutMs</code> parameter to bound the duration of asynchronous data requests in a Hologres dimension table.</p></td> <td><p><a href="https://www.alibabacloud.com/help/en/document_detail/184860.html#concept-878569">Create a Hologres dimension table</a></p></td> </tr> <tr> <td><p>You can configure table attributes when you create a Hologres table.</p></td> <td><p>Configure physical table properties in the WITH clause when creating a Hologres table to control how data is sorted and queried efficiently.</p></td> <td><p><a href="https://www.alibabacloud.com/help/en/document_detail/290056.html#task-1970619">Manage Hologres catalogs</a></p></td> </tr> <tr> <td><p>MaxCompute sink connectors support the Binary data type.</p></td> <td> <ul> <li><p><b>Binary data type:</b> MaxCompute supports Binary values up to 8 MB.</p></li> <li><p><b>Streaming Tunnel:</b> MaxCompute Streaming Tunnel is now supported.</p></li> <li><p><b>Flush efficiency:</b> MaxCompute flush performance is optimized.</p></li> </ul></td> <td><p><a 
href="https://www.alibabacloud.com/help/en/document_detail/177931.html#concept-pql-cdz-lgb">Create a MaxCompute result table</a></p></td> </tr> <tr> <td><p>Hive Catalog supports more Hive versions.</p></td> <td><p>Hive Catalog now supports Hive 2.1.0–2.3.9 and Hive 3.1.0–3.1.3.</p></td> <td><p><a href="https://www.alibabacloud.com/help/en/document_detail/187802.html#task-1970619">Manage Hive catalogs</a></p></td> </tr> <tr> <td><p>Tablestore connector</p></td> <td><p>Read incremental logs from Tablestore source tables.</p></td> <td><p><a href="https://www.alibabacloud.com/help/en/document_detail/451302.html#concept-62526-zh">Create a Tablestore source table</a></p></td> </tr> <tr> <td><p>JDBC connector</p></td> <td><p>The Java Database Connectivity (JDBC) connector is now built in, with support for source, sink, and dimension tables.</p></td> <td> <ul> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/445351.html#concept-1953492">Create a JDBC source table</a></p></li> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/445677.html#concept-1953492">Create a JDBC result table</a></p></li> <li><p><a href="https://www.alibabacloud.com/help/en/document_detail/444957.html#concept-1953492">Create a JDBC dimension table</a></p></li> </ul></td> </tr> <tr> <td><p>The parallelism for a Message Queue for Apache RocketMQ source table can be greater than the number of partitions that are defined in a Message Queue for Apache RocketMQ message topic.</p></td> <td><p>Set parallelism higher than the current partition count to pre-allocate resources for anticipated partition growth before consumption begins.</p></td> <td><p><a href="https://www.alibabacloud.com/help/en/document_detail/178353.html#concept-62523-zh">Create a Message Queue for Apache RocketMQ source table</a></p></td> </tr> <tr> <td><p>The message keys of a Message Queue for Apache RocketMQ result table can be specified.</p></td> <td><p>Specify the message key for a Message Queue for 
Apache RocketMQ result table to control message routing and ordering.</p></td> <td><p><a href="https://www.alibabacloud.com/help/en/document_detail/178355.html#concept-62528-zh">Create a Message Queue for Apache RocketMQ result table</a></p></td> </tr> <tr> <td><p>AnalyticDB for MySQL catalog</p></td> <td><p>Read metadata directly from AnalyticDB for MySQL using the catalog. Manual table registration is no longer required, which reduces setup time and eliminates schema drift between the catalog and the actual database.</p></td> <td><p><a href="https://www.alibabacloud.com/help/en/document_detail/456454.html#task-2257227">Manage AnalyticDB for MySQL catalogs</a></p></td> </tr> </tbody> </table>
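The RocketMQ parallelism entry in the table above can be pictured with a short plain-Python sketch. It assumes a simple round-robin assignment of partitions to source subtasks, which is an illustrative assumption and not necessarily the connector's actual strategy. The function names are hypothetical.

```python
def assign_partitions(partition_count: int, parallelism: int) -> dict[int, list[int]]:
    """Round-robin partitions over subtasks (an assumed strategy, for illustration only)."""
    if parallelism <= 0:
        raise ValueError("parallelism must be positive")
    assignment: dict[int, list[int]] = {subtask: [] for subtask in range(parallelism)}
    for partition in range(partition_count):
        assignment[partition % parallelism].append(partition)
    return assignment


def idle_subtasks(assignment: dict[int, list[int]]) -> list[int]:
    """Return the subtasks that currently have no partition to read."""
    return [subtask for subtask, parts in assignment.items() if not parts]


# Parallelism 6 with 4 partitions: two subtasks stay idle, reserved for growth.
before = assign_partitions(partition_count=4, parallelism=6)
print(idle_subtasks(before))

# After the topic grows to 6 partitions, every subtask has work.
after = assign_partitions(partition_count=6, parallelism=6)
print(idle_subtasks(after))
```

Under this assumption, the extra subtasks hold resources without reading data until the topic gains partitions, which is the trade-off of pre-allocating for anticipated growth.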
Performance optimization
The native savepoint format is introduced to address the timeout issues that affected canonical-format savepoints in deployments with large state. All savepoint operations benefit from the native format:
<table> <thead> <tr> <td><p><b>Metric</b></p></td> <td><p><b>Improvement</b></p></td> </tr> </thead> <colgroup></colgroup> <colgroup></colgroup> <tbody> <tr> <td><p>Savepoint creation speed</p></td> <td><p>5 to 10 times faster on average; up to 100 times faster in typical deployments. The gain grows as the incremental state gets smaller.</p></td> </tr> <tr> <td><p>Deployment restoration speed</p></td> <td><p>About 5 times faster on average. The gain grows with state size.</p></td> </tr> <tr> <td><p>Savepoint storage overhead</p></td> <td><p>About 2 times smaller on average. The saving ratio improves with state size.</p></td> </tr> <tr> <td><p>Savepoint network overhead</p></td> <td><p>5 to 10 times smaller on average. The saving ratio improves with the proportion of incremental state.</p></td> </tr> </tbody> </table>
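To make the table concrete, the following sketch applies improvement factors to a hypothetical baseline. It assumes the figures are read as multiplicative factors (for example, a fivefold speedup divides the duration by five). The 600-second and 100 GB baselines are invented for illustration and are not measurements from this release.

```python
def scaled_down(baseline: float, factor: float) -> float:
    """Divide a baseline duration or size by an improvement factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return baseline / factor


# Hypothetical baseline: a canonical-format savepoint that takes 600 s
# to create and occupies 100 GB. These numbers are assumptions.
baseline_seconds = 600.0
baseline_gb = 100.0

print(scaled_down(baseline_seconds, 5))    # creation time at a 5x speedup
print(scaled_down(baseline_seconds, 10))   # creation time at a 10x speedup
print(scaled_down(baseline_gb, 2))         # storage at a 2x saving
```

Because the table reports averages that vary with state size and the share of incremental state, actual results for a given deployment may differ from this arithmetic.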
Dual-stream Join optimization: JOIN operators in SQL streaming deployments now automatically infer whether to enable key-value separation in GeminiStateBackend based on deployment characteristics. In typical scenario benchmarks, the average performance improvement exceeds 40%. For more information, see Optimize Flink SQL and Configurations of GeminiStateBackend.
Deployment startup speed improves by an average of 15%.
Bug fixes
The following issues are fixed:
- The modification time of a deployment was abnormally updated.
- The state could not be determined after specific deployments were suspended and restarted.
- JAR packages could not be uploaded locally from Alibaba Finance Cloud.
- The total number of resources configured for running a deployment was inconsistent with that on the Statistics page.
- Users could not log on to the Logs page.
- An error occurred when accessing Upsert Kafka tables via the Kafka catalog.
- A NullPointerException was returned when intermediate results were used in nested operations of multiple user-defined functions (UDFs).
- In MySQL CDC, abnormal chunks and out-of-memory (OOM) errors occurred, and the time zone of initialization data was inconsistent with that of incremental data. For more information, see Create a MySQL CDC source table.