This topic describes the key features and bug fixes in the Realtime Compute for Apache Flink release of April 9, 2026.
This upgrade will be rolled out in phases across all regions. For the specific schedule, see the Latest Announcements on the right side of the Realtime Compute console. If you cannot see the new features, your account has not yet been upgraded. To request an expedited upgrade, submit a ticket and we will arrange it.
Overview
On April 9, 2026, we officially released the new engine version for Realtime Compute for Apache Flink: Ververica Runtime (VVR) 11.6.0. This upgrade focuses on enhancing the multimodal processing capabilities of the AI Function, enabling real-time inference and cleaning of unstructured data such as images and PDFs. It also introduces the Variant type and related functions, significantly improving the efficiency of semi-structured data processing. The data ingestion (Change Data Capture (CDC) YAML) feature has been upgraded to support complex scenarios like merging multiple columns into a data lake, append-only writes to partitions, and clearing primary keys; it is now generally available after exiting public preview. Connector enhancements include support for Elasticsearch 8.x source and dimension tables, a new PolarDB-X CDC Source, and optimizations for OceanBase side-channel import. We have also improved the stability and usability of core connectors, including MySQL CDC, Kafka, and Hologres. Additionally, this version incorporates bug fixes from the Apache Flink 1.20.2 and 1.20.3 community releases.
Engine enhancements
We continue to enhance AI inference, data types, and observability to better support real-time data processing and intelligent analytics.
AI Function multimodal capabilities
Multimodal data processing: Added new built-in functions for PDF-to-image conversion, file content retrieval (from OSS and MNS), image clarity detection (based on OpenCV), image compression, and Base64 image pass-through. These functions support image modal data inference by calling Vision Language Models (VLMs) such as Qwen-VL.
MNS connector: Added a new connector for Message Service (MNS) that lets you subscribe to OSS change events, completing the end-to-end real-time AI processing pipeline.
SQL enhancements
Variant type support: Added the Variant type and its field access syntax (variant.field and variant['key']). This type supports conversion to and from basic types and can be written to Paimon sinks.
New built-in functions: Added support for hash functions such as MD5 in data ingestion transforms. CDC YAML now includes a parse_json function to convert JSON strings to the Variant type.
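As a hedged illustration of the new syntax (the table and field names below are hypothetical, not from the release), a query might parse a JSON string into a Variant value and read its fields with either dot or bracket access:

```sql
-- Hypothetical source table: events(payload STRING)
-- parse_json converts the JSON string to a Variant value;
-- fields are then accessed with variant.field or variant['key'].
SELECT
  parse_json(payload).user_id        AS user_id,
  parse_json(payload)['order']['id'] AS order_id
FROM events;
```

Verify the exact function signatures and conversion rules against the VVR 11.6.0 reference before relying on this sketch.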
Data ingestion (CDC YAML)
The data ingestion feature is now generally available, exiting public preview.
The YAML connectors for Paimon, StarRocks, Hologres, MySQL, and Kafka are now generally available.
The YAML connectors for Doris, OceanBase, MaxCompute, SLS, MongoDB, Postgres, and Fluss have entered public preview.
Merge multiple columns into a data lake: You can now merge multiple fields from an upstream JSON source with different names or casing into a single target column. This feature supports rules for regular expression matching, case normalization, and custom mappings.
Append-only writes to partitions: The Paimon sink now supports writing to partitioned tables without a primary key (for append-only use cases), removing the requirement to include the partition key in the primary key.
Transform enhancements:
You can now completely clear a primary key or partition key by passing a null value.
You can now define complex table name routing logic by using regular expressions.
End-to-end Variant support: CDC YAML now supports field access and type conversion for the Variant type and allows writing Variant data to Paimon.
Source enhancements:
Added a new polardbx-cdc source that supports table-level binlog subscription with high parallelism.
The SLS source now allows you to enforce specific field parsing types.
The Kafka source can now split a single message into multiple records based on fields and write them to different target tables (field routing). It also supports custom partitioners.
Sink enhancements:
The Paimon sink now allows you to separately configure the parallelism for commit nodes.
The MaxCompute sink now supports DATETIME type mapping and optimized commit logic that reduces QPS consumption.
The Iceberg sink now supports built-in catalog references and automatic retrieval of connection information, such as URLs and credentials, for configuration reuse.
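The features above are configured in a CDC YAML pipeline. The following is a minimal sketch only; the host, table names, credentials, and several option values are illustrative assumptions, not tested configuration:

```yaml
source:
  type: mysql
  hostname: mysql.example.internal   # illustrative host
  port: 3306
  username: flink
  password: "${secret}"
  tables: app_db.orders.*            # regex table capture

transform:
  - source-table: app_db.orders.*
    projection: \*, parse_json(detail) AS detail_variant  # JSON string -> Variant

route:
  - source-table: app_db.orders.\d+  # regex-based table name routing
    sink-table: ods.orders

sink:
  type: paimon
  # Paimon-side options (catalog, warehouse path, commit parallelism)
  # are omitted here; consult the connector reference for the exact keys.
```

The pipeline structure (source, transform, route, sink) follows the Flink CDC YAML convention; the regex routing and parse_json transform correspond to the enhancements listed above.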
Connectors
Kafka:
The sink now supports writing three-part table IDs (Database.Schema.Table) in Debezium JSON format.
Resolved an issue where a stateful restart after changing topics would consume from both the old and new topics simultaneously. Instead, the connector now throws an incompatible state exception.
MySQL CDC:
Improved the error message for expired GTIDs to clearly state the root cause.
Added the consumer server ID to logs to simplify troubleshooting.
PolarDB-X: Now officially supported as a CDC YAML source (in public preview).
OceanBase: Refactored the JDBC sink write logic to support manual transaction rollbacks and connection pool reuse. This resolves disconnection issues caused by wait_timeout.
Elasticsearch: Source tables and dimension tables now support version 8.x (compatible with the ES7 client).
Doris: Improved the error message for incorrect port configurations.
Data lakehouse integration
Iceberg:
The sink now reports the numRecordsOutOfSinkPerSecond (OUT RPS) metric.
Added support for configuring Hadoop-related parameters to improve connection flexibility.
Data ingestion jobs can now write to Data Lake Formation (DLF) Iceberg.
Hologres:
The binlog source table now supports consuming from the LATEST offset.
The connector catalog now supports index information such as secondary indexes and prefix scan keys.
Added support for reading the varchar[] array type.
Optimized parameter detection caching to prevent initialization timeouts when there are many tables.
The sink now supports a parallelism greater than the number of shards when sink.reshuffle-by-holo-distribution-key.enabled is set.
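The option named in the last Hologres item is set in the sink table's WITH clause. The DDL below is a hedged sketch: the table schema, endpoint, and credential placeholders are assumptions, and only the reshuffle option is taken from this release:

```sql
CREATE TEMPORARY TABLE holo_sink (
  id   BIGINT,
  name STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'hologres',
  'endpoint'  = '<endpoint>',   -- placeholder
  'dbname'    = '<database>',
  'tablename' = 'public.target',
  'username'  = '<access-key-id>',
  'password'  = '<access-key-secret>',
  -- lets the sink parallelism exceed the Hologres shard count
  'sink.reshuffle-by-holo-distribution-key.enabled' = 'true'
);
```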
MaxCompute:
The catalog now supports paginated queries, which resolves freezing issues in the metadata center.
Optimized the YAML sink commit logic to reduce OOM errors caused by QPS limits.
Hive: The catalog now allows you to specify a storage format (such as Parquet) when creating a table.
Paimon: Added support for the Lance file format.
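For the Hive catalog item above, the storage format can be declared when the table is created. A hedged sketch in Hive dialect (the database and table names are hypothetical; switching dialects via table.sql-dialect is standard Flink SQL, but confirm the catalog is registered first):

```sql
-- Assumes a Hive catalog is registered and selected.
SET table.sql-dialect = hive;       -- switch to Hive dialect
CREATE TABLE analytics.page_views (
  url  STRING,
  hits BIGINT
) STORED AS PARQUET;                -- storage format specified at creation
```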
Observability
Added local disk usage metrics (geminiDB.disk_space_*).
Added Gemini native memory metrics (geminiDB.native_memory_usage/limit).
Added Autopilot operator parallelism limit metrics (such as sourceParallelismUpperBound).
Suppressed non-essential WARN logs, such as alerts when a format does not support snapshots, to reduce noise.
Bug fixes
Stability fixes:
Incorporated important community fixes from Flink 1.20.2 and 1.20.3.
Fixed an issue where the Kafka connector lost data when reading from Kafka and writing to Object Storage Service (OSS) after transactions were enabled.
Fixed an issue where a dropped PolarDB-X connection caused a sharp increase in latency and an EOFException error.
Fixed frequent disconnections in the OceanBase Java Database Connectivity (JDBC) sink caused by wait_timeout.
Correctness fixes:
Fixed an issue in Change Data Capture (CDC) YAML where the Canal Protobuf data format was inconsistent for the timestamp format and tinyint type.
Fixed an issue where the debugging feature for the MySQL CDC source displayed one table fewer than expected when reuse was enabled.
Fixed a Metaspace OutOfMemory (OOM) error in the YAML MaxCompute (ODPS) sink that was caused by frequent commits.
Experience optimizations:
Improved the clarity of error messages for Temporal Join syntax.
Hid internal WARN logs, such as Cannot snapshot the table, by changing their log level to DEBUG.
Fixed an exception that occurred when some fields were null while consuming Hologres binary logs (binlogs).