Ververica Runtime (VVR) 4.0.13, based on Apache Flink 1.13, was released on May 16, 2022. This release expands connector capabilities (Kafka catalogs, Hologres full+incremental sync, MaxCompute Streaming Tunnel), improves sharded database multi-table synchronization, and enhances the developer experience in the console.
New features
Summary
| Feature | Category |
|---|---|
| Kafka catalog support | New connector |
| Multi-table sync from sharded databases via CREATE DATABASE AS | SQL enhancement |
| Hologres connector: full and incremental data consumption | Connector enhancement |
| ApsaraDB for Redis result table: TTL for keys | Connector enhancement |
| MaxCompute Streaming Tunnel and data compression | Connector enhancement |
| Hologres DataStream connector | New connector |
| Elasticsearch connector: retry_on_conflict parameter | Connector enhancement |
| Flink CDC 2.2 compatibility for MySQL CDC and Postgres CDC connectors | Connector enhancement |
| Heartbeat events for binary log position tracking | Connector enhancement |
| UNSIGNED FLOAT, DOUBLE, and DECIMAL data types for MySQL CDC connector | Connector enhancement |
| JDBC parameters for MySQL CDC connector | Connector enhancement |
| Forced termination of session clusters | Console |
| Intelligent analysis of JobManager exceptions | Console |
| Built-in Alibaba Cloud documentation | Console |
| Service notices | Console |
| UI optimization | Console |
Kafka catalog support
Kafka catalogs automatically parse Kafka messages to infer table schema, so you can access topics in a Kafka cluster directly in Flink SQL without writing DDL statements. JSON-formatted messages are supported for schema inference.
What it means to you: Eliminates the need to manually define source or result tables for Kafka topics, reducing boilerplate and the risk of schema mismatches.
Reference: Manage Kafka JSON catalogs
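As a sketch, once a Kafka JSON catalog has been registered in the console, a topic can be queried directly with no DDL. The catalog and topic names below are illustrative:

```sql
-- Assumes a Kafka JSON catalog named `kafka-catalog` has been registered;
-- `my-topic` is a hypothetical topic whose schema is inferred from its
-- JSON-formatted messages.
SELECT * FROM `kafka-catalog`.`kafka`.`my-topic`;
```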
Multi-table sync from sharded databases via CREATE DATABASE AS
The CREATE DATABASE AS statement now supports regular expressions for database names, matching source tables across multiple database shards. After merging the shard data, Realtime Compute for Apache Flink synchronizes it to a downstream destination table whose name corresponds to each source table—eliminating manual table-by-table configuration.
What it means to you: Significantly reduces setup effort when synchronizing data from large sharded databases into a single destination such as Hologres.
Reference: CREATE DATABASE AS statement
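A minimal sketch of the statement, assuming a MySQL catalog over shards named `order_db01`, `order_db02`, and so on, merged into a Hologres database (all catalog and database names are illustrative; see the reference above for the exact syntax):

```sql
-- Hypothetical example: the regular expression order_db[0-9]+ matches all
-- database shards; tables with the same name are merged into one
-- destination table per source table name.
CREATE DATABASE IF NOT EXISTS `holo`.`orders_merged`
AS DATABASE `mysql-catalog`.`order_db[0-9]+` INCLUDING ALL TABLES;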
Hologres connector: full and incremental data consumption
The Hologres connector now supports a combined full-then-incremental sync mode in a single job. The job first performs a full data snapshot from a Hologres source table and then smoothly switches to incremental binary log consumption.
What it means to you: Simplifies building real-time data warehousing pipelines. Previously, full and incremental synchronization required separate jobs or manual handover.
Reference: Create a Hologres source table
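A sketch of a source table that performs the full snapshot first and then switches to binlog consumption. The parameter names (`binlog`, `binlogStartUpMode`) follow the VVR Hologres connector conventions; verify them against the connector documentation for your version:

```sql
CREATE TEMPORARY TABLE holo_source (
  id BIGINT,
  name STRING
) WITH (
  'connector' = 'hologres',
  'dbname' = '<yourDatabase>',
  'tablename' = '<yourTable>',
  'username' = '<accessId>',
  'password' = '<accessKey>',
  'endpoint' = '<endpoint>',
  'binlog' = 'true',
  'binlogStartUpMode' = 'initial'  -- full snapshot, then incremental binlog
);
```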
ApsaraDB for Redis result table: TTL for keys
When writing to an ApsaraDB for Redis result table, configure a time to live (TTL) for keys directly in the connector parameters. This ensures data expires automatically according to your retention policy.
Reference: Create an ApsaraDB for Redis result table
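A sketch only: the TTL option name and unit below are assumptions for illustration; check the connector reference above for the exact parameter:

```sql
CREATE TEMPORARY TABLE redis_sink (
  k STRING,
  v STRING
) WITH (
  'connector' = 'redis',
  'host' = '<host>',
  'password' = '<password>',
  -- Hypothetical TTL parameter: keys expire automatically per your
  -- retention policy.
  'expiration' = '86400000'
);
```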
MaxCompute Streaming Tunnel and data compression
Two enhancements are available for MaxCompute connectors:
- Streaming Tunnel: Write data to MaxCompute in streaming mode. For jobs that do not require exactly-once semantics, Streaming Tunnel avoids the performance issues that occur when checkpoints are created at a low speed.
- Data compression: Both Streaming Tunnel and Batch Tunnel now support data compression to improve transmission efficiency.
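A sketch of enabling streaming-mode writes on a MaxCompute result table. The option name `useStreamTunnel` is assumed from the VVR connector conventions; confirm it against the connector documentation:

```sql
CREATE TEMPORARY TABLE odps_sink (
  id BIGINT,
  payload STRING
) WITH (
  'connector' = 'odps',
  'endpoint' = '<endpoint>',
  'project' = '<project>',
  'tablename' = '<table>',
  'accessId' = '<accessId>',
  'accessKey' = '<accessKey>',
  'useStreamTunnel' = 'true'  -- streaming mode; no exactly-once guarantee
);
```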
Hologres DataStream connector
The Hologres DataStream connector is now supported.
Elasticsearch connector: retry_on_conflict parameter
Configure the retry_on_conflict parameter to specify the maximum number of retries when version conflicts occur during data updates to an Elasticsearch result table.
Reference: Create an Elasticsearch result table
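A sketch of the parameter in a result table definition (host and index values are placeholders):

```sql
CREATE TEMPORARY TABLE es_sink (
  id STRING,
  doc STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'elasticsearch-7',
  'hosts' = '<yourHosts>',
  'index' = '<yourIndex>',
  'retry_on_conflict' = '3'  -- retry up to 3 times on version conflicts
);
```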
Flink CDC 2.2 compatibility for MySQL CDC and Postgres CDC connectors
The MySQL Change Data Capture (CDC) connector and Postgres CDC connector are now fully compatible with Flink CDC 2.2. All bug fixes from the Flink CDC 2.2 release are included.
Heartbeat events for binary log position tracking
The MySQL CDC connector now uses heartbeat events to track the latest binary log file position read from the source. For slowly updated tables, the connector advances the binary log position based on heartbeat events rather than waiting for update events—preventing the binary log position from expiring when tables have low write volume.
Reference: Create a MySQL CDC source table
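A sketch of configuring the heartbeat interval on a slowly updated table. The option name `heartbeat.interval` follows the open source Flink CDC MySQL connector convention; the value and table names are illustrative:

```sql
CREATE TEMPORARY TABLE mysql_source (
  id BIGINT,
  updated_at TIMESTAMP(3)
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = '<host>',
  'username' = '<user>',
  'password' = '<password>',
  'database-name' = '<db>',
  'table-name' = '<slow_table>',
  -- Advance the binlog position via heartbeat events even when the table
  -- receives few updates.
  'heartbeat.interval' = '30s'
);
```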
UNSIGNED FLOAT, DOUBLE, and DECIMAL data types for MySQL CDC connector
The UNSIGNED FLOAT, DOUBLE, and DECIMAL data types are now supported by the MySQL CDC connector and MySQL catalogs.
Reference: Create a MySQL CDC source table
JDBC parameters for MySQL CDC connector
Java Database Connectivity (JDBC) parameters can now be configured directly on the MySQL CDC connector to control how it connects to MySQL instances.
Reference: Create a MySQL CDC source table
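A sketch of passing JDBC URL parameters through the `jdbc.properties.*` prefix, following the Flink CDC convention; the specific properties shown are illustrative:

```sql
CREATE TEMPORARY TABLE mysql_source (
  id BIGINT,
  name STRING
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = '<host>',
  'username' = '<user>',
  'password' = '<password>',
  'database-name' = '<db>',
  'table-name' = '<table>',
  -- Each 'jdbc.properties.<key>' entry is appended to the JDBC URL used to
  -- connect to the MySQL instance.
  'jdbc.properties.useSSL' = 'false',
  'jdbc.properties.characterEncoding' = 'utf-8'
);
```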
Forced termination of session clusters
Session clusters can now be forcefully terminated from the console. Because session clusters share resources across jobs, an abnormal session cluster can affect all jobs running in it.
Do not run production jobs in session clusters. If a job fails because of a session cluster exception, force-terminate the session cluster to recover.
Reference: Configure a development and test environment (session cluster)
Intelligent analysis of JobManager exceptions
When a job fails, the JobManager records TaskManager exceptions into logs, which you can view on the Logs tab in the development console of Realtime Compute for Apache Flink. Exception log retention has been extended, and logs are now classified by type, making it faster to identify the actual root cause when a job fails repeatedly.
Reference: View the exception logs of a deployment
Built-in Alibaba Cloud documentation
Documentation from the Alibaba Cloud Documentation Center is now accessible directly within the fully managed Flink console, eliminating the need to switch browser windows during job development and O&M.
Service notices
Service notices, including product updates and maintenance announcements, are now displayed in the Realtime Compute for Apache Flink console, so you no longer depend solely on SMS, internal messages, or DingTalk group notifications, which may fail to reach you.
UI optimization
- The new Alibaba Cloud theme style is applied across the console.
- Job status descriptions are updated for clarity.
Performance optimization
No performance optimizations in this release.
Fixed issues
Log Service connector: shard list not refreshed after shard count changes
Previously, if the number of shards changed, the Log Service connector failed to obtain the updated shard list, causing data reads to stop. This is now fixed.
Aggregation optimization error: `[J cannot be cast to [Ljava.lang.Object;`
Previously, aggregation optimization features such as miniBatch triggered a `[J cannot be cast to [Ljava.lang.Object;` error in certain cases. This is now fixed.
ApsaraDB for HBase result table: out-of-order data during async writes
Previously, data written to an ApsaraDB for HBase result table became out-of-order when asynchronous processing was enabled. This is now fixed.
Null pointer in two-stream join operations
Previously, join operations between two data streams could trigger a null pointer exception. This is now fixed.
Checkpointing failure when using MySQL CDC connector with Apache Hudi
Previously, checkpointing consistently failed when a job used the MySQL CDC connector to write data to Apache Hudi. This is now fixed.
`pendingRecords` metric for Message Queue for Apache Kafka source tables
The logic used to compute the `pendingRecords` metric for Message Queue for Apache Kafka source tables has been optimized so that the reported value is accurate.
Member names not displayed in the development console
Previously, specific member names were not shown in the development console of Realtime Compute for Apache Flink. This is now fixed.
DDL syntax validation error for valid statements
Previously, certain valid DDL statements triggered an error during syntax validation. This is now fixed.