Ververica Runtime (VVR) 4.0.13, based on Apache Flink 1.13, was released on May 16, 2022. This release expands connector capabilities (Kafka catalogs, Hologres full+incremental sync, MaxCompute Streaming Tunnel), improves sharded database multi-table synchronization, and enhances the developer experience in the console.
New features
Summary
| Feature | Category |
|---|---|
| Kafka catalog support | New connector |
| Multi-table sync from sharded databases via CREATE DATABASE AS | SQL enhancement |
| Hologres connector: full and incremental data consumption | Connector enhancement |
| ApsaraDB for Redis result table: TTL for keys | Connector enhancement |
| MaxCompute Streaming Tunnel and data compression | Connector enhancement |
| Hologres DataStream connector | New connector |
| Elasticsearch connector: retry_on_conflict parameter | Connector enhancement |
| Flink CDC 2.2 compatibility for MySQL CDC and Postgres CDC connectors | Connector enhancement |
| Heartbeat events for binary log position tracking | Connector enhancement |
| UNSIGNED FLOAT, DOUBLE, and DECIMAL data types for MySQL CDC connector | Connector enhancement |
| JDBC parameters for MySQL CDC connector | Connector enhancement |
| Forced termination of session clusters | Console |
| Intelligent analysis of JobManager exceptions | Console |
| Built-in Alibaba Cloud documentation | Console |
| Service notices | Console |
| UI optimization | Console |
Kafka catalog support
Kafka catalogs automatically parse Kafka messages to infer table schema, so you can access topics in a Kafka cluster directly in Flink SQL without writing DDL statements. JSON-formatted messages are supported for schema inference.
What it means to you: Eliminates the need to manually define source or result tables for Kafka topics, reducing boilerplate and the risk of schema mismatches.
Reference: Manage Kafka JSON catalogs
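As a sketch, once a Kafka JSON catalog has been registered in the console, a topic can be queried directly with no DDL. The catalog and topic names below are illustrative:

```sql
-- Assumes a Kafka JSON catalog named `kafka-catalog` has been registered;
-- `my-topic` is a hypothetical topic whose schema is inferred from its
-- JSON-formatted messages.
SELECT * FROM `kafka-catalog`.`kafka`.`my-topic`;
```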
Multi-table sync from sharded databases via CREATE DATABASE AS
The CREATE DATABASE AS statement now supports regular expressions for database names, matching source tables across multiple database shards. After merging the shard data, Realtime Compute for Apache Flink synchronizes it to a downstream destination table whose name corresponds to each source table—eliminating manual table-by-table configuration.
What it means to you: Significantly reduces setup effort when synchronizing data from large sharded databases into a single destination such as Hologres.
Reference: CREATE DATABASE AS statement
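A minimal sketch of the statement, assuming a MySQL catalog over shards named `order_db01`, `order_db02`, and so on, merged into a Hologres database (all catalog and database names are illustrative; see the reference above for the exact syntax):

```sql
-- Hypothetical example: the regular expression order_db[0-9]+ matches all
-- database shards; tables with the same name are merged into one
-- destination table per source table name.
CREATE DATABASE IF NOT EXISTS `holo`.`orders_merged`
AS DATABASE `mysql-catalog`.`order_db[0-9]+` INCLUDING ALL TABLES;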
Hologres connector: full and incremental data consumption
The Hologres connector now supports a combined full-then-incremental sync mode in a single job. The job first performs a full data snapshot from a Hologres source table and then smoothly switches to incremental binary log consumption.
What it means to you: Simplifies building real-time data warehousing pipelines. Previously, full and incremental synchronization required separate jobs or manual handover.
Reference: Create a Hologres source table
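A sketch of a source table that performs the full snapshot first and then switches to binlog consumption. The parameter names (`binlog`, `binlogStartUpMode`) follow the VVR Hologres connector conventions; verify them against the connector documentation for your version:

```sql
CREATE TEMPORARY TABLE holo_source (
  id BIGINT,
  name STRING
) WITH (
  'connector' = 'hologres',
  'dbname' = '<yourDatabase>',
  'tablename' = '<yourTable>',
  'username' = '<accessId>',
  'password' = '<accessKey>',
  'endpoint' = '<endpoint>',
  'binlog' = 'true',
  'binlogStartUpMode' = 'initial'  -- full snapshot, then incremental binlog
);
```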
ApsaraDB for Redis result table: TTL for keys
When writing to an ApsaraDB for Redis result table, configure a time to live (TTL) for keys directly in the connector parameters. This ensures data expires automatically according to your retention policy.
Reference: Create an ApsaraDB for Redis result table
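A sketch only: the TTL option name and unit below are assumptions for illustration; check the connector reference above for the exact parameter:

```sql
CREATE TEMPORARY TABLE redis_sink (
  k STRING,
  v STRING
) WITH (
  'connector' = 'redis',
  'host' = '<host>',
  'password' = '<password>',
  -- Hypothetical TTL parameter: keys expire automatically per your
  -- retention policy.
  'expiration' = '86400000'
);
```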
MaxCompute Streaming Tunnel and data compression
Two enhancements are available for MaxCompute connectors:
- Streaming Tunnel: Write data to MaxCompute in streaming mode. For jobs that do not require exactly-once semantics, Streaming Tunnel avoids the performance issues that occur when checkpoints are created at a low speed.
- Data compression: Both Streaming Tunnel and Batch Tunnel now support data compression to improve transmission efficiency.
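A sketch of enabling streaming-mode writes on a MaxCompute result table. The option name `useStreamTunnel` is assumed from the VVR connector conventions; confirm it against the connector documentation:

```sql
CREATE TEMPORARY TABLE odps_sink (
  id BIGINT,
  payload STRING
) WITH (
  'connector' = 'odps',
  'endpoint' = '<endpoint>',
  'project' = '<project>',
  'tablename' = '<table>',
  'accessId' = '<accessId>',
  'accessKey' = '<accessKey>',
  'useStreamTunnel' = 'true'  -- streaming mode; no exactly-once guarantee
);
```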
Hologres DataStream connector
The Hologres DataStream connector is now supported.
Elasticsearch connector: retry_on_conflict parameter
Configure the retry_on_conflict parameter to specify the maximum number of retries when version conflicts occur during data updates to an Elasticsearch result table.
Reference: Create an Elasticsearch result table
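A sketch of the parameter in a result table definition (host and index values are placeholders):

```sql
CREATE TEMPORARY TABLE es_sink (
  id STRING,
  doc STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'elasticsearch-7',
  'hosts' = '<yourHosts>',
  'index' = '<yourIndex>',
  'retry_on_conflict' = '3'  -- retry up to 3 times on version conflicts
);
```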
Flink CDC 2.2 compatibility for MySQL CDC and Postgres CDC connectors
The MySQL Change Data Capture (CDC) connector and Postgres CDC connector are now fully compatible with Flink CDC 2.2. All bug fixes from the Flink CDC 2.2 release are included.
Heartbeat events for binary log position tracking
The MySQL CDC connector now uses heartbeat events to track the latest binary log file position read from the source. For slowly updated tables, the connector advances the binary log position based on heartbeat events rather than waiting for update events—preventing the binary log position from expiring when tables have low write volume.
Reference: Create a MySQL CDC source table
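A sketch of configuring the heartbeat interval on a slowly updated table. The option name `heartbeat.interval` follows the open source Flink CDC MySQL connector convention; the value and table names are illustrative:

```sql
CREATE TEMPORARY TABLE mysql_source (
  id BIGINT,
  updated_at TIMESTAMP(3)
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = '<host>',
  'username' = '<user>',
  'password' = '<password>',
  'database-name' = '<db>',
  'table-name' = '<slow_table>',
  -- Advance the binlog position via heartbeat events even when the table
  -- receives few updates.
  'heartbeat.interval' = '30s'
);
```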
UNSIGNED FLOAT, DOUBLE, and DECIMAL data types for MySQL CDC connector
The UNSIGNED FLOAT, DOUBLE, and DECIMAL data types are now supported by the MySQL CDC connector and MySQL catalogs.
Reference: Create a MySQL CDC source table
JDBC parameters for MySQL CDC connector
Java Database Connectivity (JDBC) parameters can now be configured directly on the MySQL CDC connector to control how it connects to MySQL instances.
Reference: Create a MySQL CDC source table
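A sketch of passing JDBC URL parameters through the `jdbc.properties.*` prefix, following the Flink CDC convention; the specific properties shown are illustrative:

```sql
CREATE TEMPORARY TABLE mysql_source (
  id BIGINT,
  name STRING
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = '<host>',
  'username' = '<user>',
  'password' = '<password>',
  'database-name' = '<db>',
  'table-name' = '<table>',
  -- Each 'jdbc.properties.<key>' entry is appended to the JDBC URL used to
  -- connect to the MySQL instance.
  'jdbc.properties.useSSL' = 'false',
  'jdbc.properties.characterEncoding' = 'utf-8'
);
```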
Forced termination of session clusters
Session clusters can now be forcefully terminated from the console. Because session clusters share resources across jobs, an abnormal session cluster can affect all jobs running in it.
Do not run production jobs in session clusters. If a job fails because of a session cluster exception, force-terminate the session cluster to recover.
Reference: Configure a development and test environment (session cluster)
Intelligent analysis of JobManager exceptions
When a job fails, the JobManager records TaskManager exceptions into logs, which you can view on the Logs tab in the development console of Realtime Compute for Apache Flink. Exception log retention has been extended, and logs are now classified by type, making it faster to identify the actual root cause when a job fails repeatedly.
Reference: View the exception logs of a deployment
Built-in Alibaba Cloud documentation
Documentation from the Alibaba Cloud Documentation Center is now accessible directly within the fully managed Flink console, eliminating the need to switch browser windows during job development and O&M.
Service notices
Service notices, including product updates and maintenance announcements, are now displayed in the Realtime Compute for Apache Flink console, so you no longer depend solely on SMS, internal messages, or DingTalk group notifications, which may fail to reach you.
UI optimization
- The new Alibaba Cloud theme style is applied across the console.
- Job status descriptions are updated for clarity.
Performance optimization
No performance optimizations in this release.
Fixed issues
Log Service connector: shard list not refreshed after shard count changes
Previously, if the number of shards changed, the Log Service connector failed to obtain the updated shard list, causing data reads to stop. This is now fixed.
Aggregation optimization error: `[J cannot be cast to [Ljava.lang.Object;`
Previously, aggregation optimization features such as miniBatch triggered a `[J cannot be cast to [Ljava.lang.Object;` error in certain cases. This is now fixed.
ApsaraDB for HBase result table: out-of-order data during async writes
Previously, data written to an ApsaraDB for HBase result table became out-of-order when asynchronous processing was enabled. This is now fixed.
Null pointer in two-stream join operations
Previously, join operations between two data streams could trigger a null pointer exception. This is now fixed.
Checkpointing failure when using MySQL CDC connector with Apache Hudi
Previously, checkpointing consistently failed when a job used the MySQL CDC connector to write data to Apache Hudi. This is now fixed.
`pendingRecords` metric for Message Queue for Apache Kafka source tables
The logic used to compute the `pendingRecords` metric for Message Queue for Apache Kafka source tables has been optimized so that the reported value is accurate.
Member names not displayed in the development console
Previously, specific member names were not shown in the development console of Realtime Compute for Apache Flink. This is now fixed.
DDL syntax validation error for valid statements
Previously, certain valid DDL statements triggered an error during syntax validation. This is now fixed.