Realtime Compute for Apache Flink: June 4, 2025

Last Updated: Jul 02, 2025

This topic describes the major feature changes and bug fixes in the version of Realtime Compute for Apache Flink released on June 4, 2025.

Important

The version upgrade is gradually released to users. For more information, see the latest announcement in the Realtime Compute for Apache Flink console. You can use the new features in this version only after the upgrade is complete for your account. To apply for an expedited upgrade, submit a ticket.

Overview

This release includes updates to the platform, engine, and connectors, along with performance optimizations and bug fixes.

Platform updates

Platform updates focus on user experience and O&M efficiency. Highlights:

  • Git integration: Integrates with GitHub, GitLab, and Gitee remote repositories, optimizing code version management and improving team collaboration efficiency.

  • SLS integration: Persists job startup logs, runtime event logs, and resource usage data to Alibaba Cloud Simple Log Service (SLS) for historical data query and auditing.

  • Job log experience optimization: Optimizes log output configuration and supports visual configuration in the console, reducing the risk of misoperation.

  • CloudMonitor alert experience optimization: Displays job names in alert notifications, facilitating the identification of abnormal jobs.

Engine updates

Ververica Runtime (VVR) version 11.1 is officially released. Built on Apache Flink 1.20.1, VVR 11.1 provides additional optimizations and enhancements. Highlights:

Before you upgrade

See Upgrade the engine version of a deployment and VVR version numbering strategy.

Incompatible changes

  • Java SDK: JDK 11 is now the standard runtime environment, and support for JDK 8 has officially ended. JAR jobs compiled with JDK 8 must be recompiled and repackaged with JDK 11. SQL jobs are not affected.

  • Hologres connector: The Hologres connector has undergone architectural optimizations, with some connector options being modified or removed.

New features

  • Real-time vectorization and inference powered by LLMs

    • Vector construction: Integrates with Alibaba Cloud Model Studio to vectorize streaming data in real time, supporting real-time feature computation for recommendation systems and intelligent search.

    • Text inference: Dynamically generates summaries, translations and other content by using pre-trained models, supporting intelligent text processing for your business use cases.

Capability enhancements

  • Backfilling historical data in materialized tables on schedule: Supports scheduled backfilling of historical data in materialized tables.

  • Performing real-time lookup joins with a StarRocks dimension table: Supports using StarRocks tables as dimension tables in real-time lookup joins.

  • SLS CDC and SLS-Paimon schema evolution: YAML CDC adds support for SLS data sources and automatic schema evolution from SLS to Paimon, simplifying metadata management in data lakehouse scenarios.

  • Data ingestion engine upgraded to Flink CDC version 3.4: Supports core capabilities of Flink CDC version 3.4, with enhancements to data capture capability.

  • Ingesting data from SLS via YAML: Supports data ingestion and schema evolution from SLS to Paimon.

  • Ingesting data into MaxCompute via YAML: Ingests up to a terabyte of data into MaxCompute.

  • Optimizing set operations in Flink SQL: Supports retaining field aliases in the debugging results of drafts that contain UNION ALL.
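For example, in a draft like the following minimal sketch (table and column names are hypothetical), the user_id and channel aliases are now retained in the debug results:

```sql
-- Hypothetical source tables; the field aliases (user_id, channel) appear in the debug output.
SELECT id AS user_id, 'web' AS channel FROM web_events
UNION ALL
SELECT uid AS user_id, 'app' AS channel FROM app_events;
```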

Performance improvements

  • Enhanced snapshot reading performance for the PostgreSQL CDC connector: Implements an optimized data segmentation mechanism featuring asynchronous segmentation and parallel reading, significantly improving snapshot reading performance.

  • Enhanced cache policy for the Tair (Redis OSS-compatible) connector: Supports time-based cache disabling functionality, allowing users to specify days or time periods for cache deactivation. This helps prevent resource competition during peak hours and improves overall cache stability.

Experience optimization

Input/output volume tracking: Monitors input and output volumes at source and sink operators, enabling users to verify data integrity and optimize job performance.

Security enhancements

  • Security enhancement for the Tair (Redis OSS-compatible) connector: The Tair (Redis OSS-compatible) connector now supports TLS/SSL connections, enabling secure data communication. This enhancement is beneficial for industries with stringent security requirements, such as finance and government services.

  • Fixed Apache Parquet vulnerability: Upgraded Apache Parquet to 1.15.1 to fix a remote code execution (RCE) vulnerability associated with data deserialization in the Parquet format. This update mitigates potential security risks in data processing using Parquet.

  • MaxCompute identity reporting: The service now supports reporting applicationName to the MaxCompute server, enabling third-party services to accurately track resource usage by Flink jobs. This improves the auditability of cross-system resource access.

Features

  • Integrate with Git repositories: Supports bidirectional synchronization of SQL code and job configurations with remote Git repositories (GitHub, GitLab, and Gitee), with built-in conflict resolution mechanisms. For more information, see Integrate with Git (public preview).

  • Deliver messages to SLS: Delivers job startup logs, runtime events, and resource usage data to SLS. For more information, see Deliver messages.

  • Experience optimization of the log configuration page: Exports job logs to other datastores (such as SLS, OSS, and Kafka) in the console. For more information, see Configure parameters to export logs of a deployment.

  • Ingest data into MaxCompute via YAML: Supports using the MaxCompute connector as a data ingestion sink for jobs developed in YAML. For more information, see MaxCompute connector.

  • Optimize data ingestion: The converter-after-transform option now supports the FIELD_NAME_LOWER_CASE converter type, which automatically converts uppercase field names in source tables to lowercase, streamlining data preprocessing during the cleansing phase. For more information, see Data ingestion development references.

  • Elasticsearch connector optimization: Supports ignoring null values when updating Elasticsearch tables, enhancing the robustness of data writing. For more information, see Elasticsearch connector.

  • Hologres connector optimization: Adjusts and removes some connector options to optimize the system architecture and improve maintenance efficiency. Functionality and usage vary based on the VVR version you are using. To ensure job compatibility and operational stability, see the document specific to your VVR version.

  • MongoDB connector optimization: Adds the ignore.delete-events.enabled option to support filtering MongoDB delete events during change data capture, reducing the load of data synchronization and improving the efficiency of incremental synchronization. For more information, see MongoDB connector.
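A minimal sketch of how the new option might appear in a MongoDB source table definition. Only ignore.delete-events.enabled comes from this release note; the connector identifier, connection options, and table names are placeholder assumptions, so check the MongoDB connector topic for the exact option set.

```sql
CREATE TEMPORARY TABLE orders_src (
  _id STRING,
  amount DOUBLE,
  PRIMARY KEY (_id) NOT ENFORCED
) WITH (
  'connector' = 'mongodb',                              -- assumed connector identifier
  'hosts' = 'dds-xxxx.mongodb.rds.aliyuncs.com:3717',   -- placeholder connection options
  'database' = 'mydb',
  'collection' = 'orders',
  'ignore.delete-events.enabled' = 'true'               -- new option: filter out delete events during CDC
);
```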

  • MySQL connector optimization: The default value of the property-version option is changed to 1. For more information, see Manage MySQL Catalog.

  • Kafka connector optimization (for more information, see Kafka connector):
    • Supports the canal-json.infer-schema.strategy option, which allows you to configure schema parsing policies: schemas can be parsed based on either the JSON data or the sqlType fields in Canal JSON data.
    • Adds the json.decode.parser-table-id.fields option to support generating table schema fields by parsing JSON data.
    • Supports the debezium-json.include-schema.enabled sink option for data ingestion, which specifies whether Debezium JSON messages contain schema information.

  • ApsaraMQ for RocketMQ connector optimization: Adds deliveryTimestampMode and other connector options that allow you to deliver messages on schedule and configure triggering rules flexibly. This provides fine-grained control over using message queue systems for time-series tasks. For more information, see ApsaraMQ for RocketMQ connector.

  • Tair (Redis OSS-compatible) connector optimization (for more information, see Tair (Redis OSS-compatible) connector):
    • Supports reading a hashmap with multiple values for a key when cache is set to ALL.
    • Adds the cacert.filepath option to support TLS/SSL encryption for data links.
    • Adds the cacheReloadTimeBlackList option to disable caching during specific time periods daily.

  • StarRocks connector optimization:
    • Supports compatible column type conversions.
    • Supports real-time lookup joins with StarRocks dimension tables, enabling complex real-time analytical scenarios.
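A minimal sketch of a lookup join against a StarRocks dimension table using the standard Flink SQL FOR SYSTEM_TIME AS OF syntax. The table names, columns, and connector WITH options are illustrative assumptions, not the exact option set of the StarRocks connector.

```sql
CREATE TEMPORARY TABLE orders (
  order_id BIGINT,
  product_id BIGINT,
  proc_time AS PROCTIME()          -- processing-time attribute required for lookup joins
) WITH (
  'connector' = 'datagen'          -- placeholder source for illustration
);

CREATE TEMPORARY TABLE dim_products (
  product_id BIGINT,
  product_name STRING
) WITH (
  'connector' = 'starrocks',                       -- assumed identifier; see the StarRocks connector topic
  'jdbc-url' = 'jdbc:mysql://starrocks-fe:9030',   -- placeholder connection options
  'database-name' = 'demo',
  'table-name' = 'dim_products'
);

-- Each order row is enriched with the product_name looked up from StarRocks at processing time.
SELECT o.order_id, d.product_name
FROM orders AS o
JOIN dim_products FOR SYSTEM_TIME AS OF o.proc_time AS d
  ON o.product_id = d.product_id;
```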

  • Paimon connector optimization: Supports writing and consuming VARIANT data through the PARSE_JSON and TRY_PARSE_JSON built-in functions, improving the performance of querying and processing JSON data.
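A minimal sketch of writing JSON strings into a VARIANT column; the Paimon catalog, database, table, and upstream table names are hypothetical.

```sql
-- Assumes a Paimon catalog named paimon_catalog is already configured.
CREATE TABLE IF NOT EXISTS paimon_catalog.db1.events (
  id BIGINT,
  payload VARIANT
);

-- TRY_PARSE_JSON is the error-tolerant variant of PARSE_JSON for malformed input.
INSERT INTO paimon_catalog.db1.events
SELECT id, TRY_PARSE_JSON(raw_json)
FROM kafka_events;   -- hypothetical upstream table with a STRING column raw_json
```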

  • AnalyticDB for PostgreSQL connector optimization: The writeMode option supports COPY mode. For more information, see AnalyticDB for PostgreSQL connector.
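A minimal sketch of a result table that switches to COPY-based writes. Apart from writeMode, the connector identifier and connection options are assumptions; see the AnalyticDB for PostgreSQL connector topic for the exact keys.

```sql
CREATE TEMPORARY TABLE adbpg_sink (
  id BIGINT,
  name STRING
) WITH (
  'connector' = 'adbpg',                                        -- assumed connector identifier
  'url' = 'jdbc:postgresql://gp-xxxx.aliyuncs.com:5432/mydb',   -- placeholder connection options
  'tablename' = 'target_table',
  'username' = 'user',
  'password' = '***',
  'writeMode' = 'copy'                                          -- new: write with PostgreSQL COPY
);
```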

  • Materialized table optimization:
    • Supports modifying the SQL queries and connector options of materialized tables.
    • Supports creating workflows for materialized tables for periodic scheduling and data backfilling.
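A minimal sketch, based on Apache Flink's materialized table syntax, of defining a materialized table and then backfilling one historical partition on demand; the table names, freshness interval, and partition column are hypothetical.

```sql
-- Periodically refreshed aggregate over a hypothetical source table dwd_orders.
CREATE MATERIALIZED TABLE dws_daily_orders
PARTITIONED BY (ds)
FRESHNESS = INTERVAL '30' MINUTE
AS SELECT ds, COUNT(*) AS order_cnt, SUM(amount) AS gmv
   FROM dwd_orders
   GROUP BY ds;

-- Backfill (re-refresh) a single historical partition, for example from a scheduled workflow.
ALTER MATERIALIZED TABLE dws_daily_orders REFRESH PARTITION (ds = '2025-06-01');
```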

  • Flink SQL support for Hive Kerberos authentication: Securely accesses Hive data in a Kerberized cluster through SQL. This ensures bidirectional identity verification and encrypted transmission between the client and the server, effectively preventing data theft and unauthorized access.

  • PyFlink Docker image upgrade: The base Docker image of PyFlink is upgraded to improve compatibility with different Python and glibc versions.

  • Window function optimization: Supports the improved SESSION window behavior introduced in Apache Flink 1.20, which offers more flexibility than the SESSION window in VVR 8.x, where it had to be coupled with aggregation statements. We recommend upgrading to the latest version to leverage these enhancements. For more information, see Window aggregation.
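A minimal sketch of the SESSION window table-valued function available since Apache Flink 1.20; the clicks table and its columns are hypothetical.

```sql
-- Sessionize click events per user with a 5-minute gap, then aggregate per session window.
SELECT window_start, window_end, user_id, COUNT(*) AS clicks_in_session
FROM TABLE(
  SESSION(TABLE clicks PARTITION BY user_id, DESCRIPTOR(click_time), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end, user_id;
```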

  • Built-in support for the SelectDB connector: The SelectDB connector has completed its public preview and is now officially a built-in connector of Realtime Compute for Apache Flink. For more information, see SelectDB connector (public preview).

  • Table API job optimization: Supports calling built-in functions in Table API queries. For more information, see Supported functions.

  • LLM-powered real-time vectorization and inference:
    • Introduces data definition language (DDL) statements for AI models.
    • Adds the ML_PREDICT function, which uses AI models for inference in real-time computation.
    • Integrates with Alibaba Cloud Model Studio.
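A minimal sketch of the model DDL plus ML_PREDICT, loosely following Apache Flink's model syntax. The model name, INPUT/OUTPUT schema, provider options, and input table are hypothetical placeholders; check the VVR documentation for the exact DDL and option keys.

```sql
-- Register an AI model; all WITH options below are illustrative placeholders.
CREATE TEMPORARY MODEL text_embedding
INPUT (content STRING)
OUTPUT (embedding ARRAY<FLOAT>)
WITH (
  'provider' = 'dashscope',   -- hypothetical: an Alibaba Cloud Model Studio model
  'endpoint' = '...',         -- placeholder
  'api-key' = '...'           -- placeholder
);

-- Enrich a hypothetical stream of documents with embeddings in real time.
SELECT doc_id, embedding
FROM ML_PREDICT(TABLE documents, MODEL text_embedding, DESCRIPTOR(content));
```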

Major bug fixes

Connectors

Fixed the following issues:

  • Null was returned when the SLS connector based on the new architecture consumed data containing escape characters.

  • Messages were lost in a delayed ApsaraMQ for RocketMQ topic.

  • Conflicts occurred during concurrent writes to two Tair (Redis OSS-compatible) databases.

  • NullPointerException errors occurred during lookup joins with a Hologres dimension table.

  • IllegalStateException errors occurred when data was written to a Paimon table with a primary key.

  • Only a single row was matched during lookup joins with a Lindorm dimension table.

SQL and transformation

Fixed the following issues:

  • Issues with YAML transform operators, specifically compilation errors caused by the Calcite parser not recognizing string comparison operators.

  • Conflicts occurred during schema consolidation through CTAS.

  • Access to a non-Hive table in a Hive catalog was denied.

Stability and performance

Fixed the following issues:

  • Paimon sinks took too long to close.

  • Table filtering exceptions. Previously, these could be bypassed by using the debezium.table.exclude.list option.

  • Data was inconsistent due to MiniBatch.

  • Incompatibility issues between PyFlink Table API and Realtime Compute for Apache Flink's built-in functions.