Realtime Compute for Apache Flink: October 23, 2023

Last Updated: Nov 21, 2023

This topic describes the major updates and bug fixes in the version of fully managed Flink that was released on October 23, 2023, and provides links to relevant references.

Important

The upgrade is rolled out as a canary release that is expected to be complete within two weeks. If you cannot use the new features in fully managed Flink, the new version is not yet available for your account. To upgrade at the earliest opportunity, submit a ticket to apply for an upgrade. For more information about the upgrade plan, see the most recent announcement on the right side of the homepage of the Realtime Compute for Apache Flink console.

Overview

A new official version of fully managed Flink was released on October 23, 2023. This version includes the platform upgrade, engine updates, connector updates, performance optimization, and bug fixes.

Ververica Runtime (VVR) 8.0.3, an enterprise-level Flink engine based on Apache Flink 1.17.1, is released. This version of fully managed Flink enhances the interoperability between Flink and various storage and computing services of Alibaba Cloud. The MaxCompute, StarRocks, ApsaraDB for Redis, Simple Log Service, and ApsaraMQ for RocketMQ connectors are optimized to improve connector performance and stability. The usability of the CREATE TABLE AS statement is also improved.

In this version of fully managed Flink, the following changes are implemented on the platform:

  • Automatic tuning is optimized. Autopilot supports the stable strategy, which converges the tuning results of a deployment to prevent the deployment from being frequently restarted.

  • Multiple groups of scheduled tuning strategies can be configured in a scheduled tuning plan. You can change the scheduled tuning plan for a deployment without the need to cancel the deployment.

  • Field-level data lineage is supported for SQL deployments to help you manage real-time assets.

  • Dynamic parameter updates are supported to reduce the number of times a deployment is canceled and restarted.

  • You can log on to the Realtime Compute for Apache Flink console by using a RAM role, a member of a resource directory, or a CloudSSO user.

Note

The DataStream APIs for the related connectors are planned to be released in the next version.

The canary release is expected to be complete across the entire network within two weeks. After the canary release is complete, the platform capabilities are upgraded and you can view the new engine version in the Engine Version drop-down list of your deployment. You can then upgrade the engine that is used by your deployment to the new version. For more information, see Upgrade the engine version of deployments. We look forward to your feedback.
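The idea behind the stable strategy mentioned above can be pictured with a small sketch. This is not Autopilot's actual algorithm: the `ResourceConfig` class, the `stable_decision` function, and the per-slot capacity model are all hypothetical, and the sketch assumes that "suitable for the entire running cycle" means sized for the peak load observed over that cycle. It only illustrates why sizing against a whole cycle, rather than against the latest sample, leaves the configuration unchanged during ordinary fluctuations.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ResourceConfig:
    """Hypothetical resource configuration; only parallelism is modeled."""
    parallelism: int


def required_parallelism(load: float, capacity_per_slot: float) -> int:
    """Smallest parallelism that can absorb `load` under the assumed capacity model."""
    return max(1, math.ceil(load / capacity_per_slot))


def stable_decision(
    current: ResourceConfig,
    cycle_loads: Sequence[float],
    capacity_per_slot: float,
) -> ResourceConfig:
    """Keep `current` unless the whole cycle calls for a different configuration.

    Assumption: a configuration suits the cycle if it covers the cycle's peak load.
    """
    candidate = ResourceConfig(
        required_parallelism(max(cycle_loads), capacity_per_slot)
    )
    # Return the existing object untouched when nothing better was found.
    return current if candidate == current else candidate


current = ResourceConfig(parallelism=4)

# Load fluctuates inside the cycle, but the peak is already covered: no change.
unchanged = stable_decision(current, [120, 380, 150, 400], capacity_per_slot=100)

# The cycle as a whole needs more capacity: a new configuration is proposed.
changed = stable_decision(current, [120, 380, 150, 800], capacity_per_slot=100)
```

In this sketch a purely reactive tuner would resize at every sample, whereas the cycle-based decision changes the configuration only in the second case.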

Features

| Feature | Description | References |
|---|---|---|
| Optimized automatic tuning capabilities | Autopilot supports the stable strategy. The strategy helps the system identify a resource configuration that is suitable for the entire running cycle of a deployment. The system adjusts the resource configuration of a deployment only when it finds a configuration that is more suitable for the entire running cycle. Otherwise, the existing resource configuration is left unchanged. This prevents traffic bursts from triggering changes to the resources of the entire deployment, reduces unnecessary changes and fluctuations, and helps the deployment converge to a stable state. | Configure automatic tuning |
| Scheduled tuning plans | Multiple scheduled tuning plans can be configured for a deployment, and the scheduled tuning plan in use can be changed without the need to cancel the deployment. One scheduled tuning plan can also contain multiple scheduled tuning strategies. | Configure automatic tuning |
| Data lineage of Flink SQL deployments | Data lineage of Flink SQL deployments can be viewed. You can use data lineage to find the Flink SQL deployments that use a specific field in a table, and to identify the field-level relationships between the source and result tables of a Flink SQL deployment. This helps you manage deployments and data assets efficiently. | View data lineage |
| Dynamic updates of parameter configurations | The Parallelism parameter and specific runtime parameters of a Flink deployment can be dynamically modified without the need to cancel the deployment. | Dynamically update the parameter configuration of a deployment |
| Optimization suggestions on SQL deployments | When you perform a syntax check on an SQL deployment, risk information and optimization suggestions for the deployment are returned. You can optimize the SQL statements based on this information. | Develop an SQL draft |
| Label-based search | When you create an SQL deployment, a JAR deployment, or a Python deployment, you can specify a label for the deployment. You can then search by label key or label value on the Deployments page in the console of fully managed Flink to find all deployments that use the label. This helps you manage deployments efficiently. | Create a deployment |
| Enhanced deployment sorting and filtering capabilities | On the Deployments page in the console of fully managed Flink, deployments can be sorted by health score or business latency and can be filtered by the user who modified a deployment. | N/A |
| Logon authentication for RAM roles, members of a resource directory, and CloudSSO users | Realtime Compute for Apache Flink is fully adapted to the Alibaba Cloud account system. Alibaba Cloud accounts, RAM users, RAM roles, members of a resource directory, and CloudSSO users are supported. This simplifies user identity and access management and provides more comprehensive resource management and authorization mechanisms. | |
| Complex data structures and exclusive Tunnel access supported by the MaxCompute connector | The MaxCompute connector can be used to read and write data of the JSON, ARRAY, MAP, and STRUCT types in MaxCompute. The MaxCompute connector also allows you to specify an exclusive Tunnel for service access. | MaxCompute connector |
| Support for the ALL cache policy when the ApsaraDB for Redis connector is used for a dimension table | When the ApsaraDB for Redis connector is used for a dimension table, the cache parameter can be set to ALL. This helps improve the processing performance. | ApsaraDB for Redis connector |
| ApsaraDB RDS for MySQL instances with transparent data encryption (TDE) enabled supported by MySQL catalogs | ApsaraDB RDS for MySQL instances for which TDE is enabled are supported by MySQL catalogs. | N/A |
| OSS-HDFS endpoint supported by the oss.endpoint parameter of a Data Lake Formation (DLF) catalog | The oss.endpoint parameter can be set to an OSS-HDFS endpoint for a DLF catalog. | |
| Type normalization mode for the schema change in the CREATE TABLE AS statement when the StarRocks connector is used for a result table | The type normalization mode for schema changes is supported when a deployment executes the CREATE TABLE AS statement to synchronize data and uses the StarRocks connector for the result table. In this mode, if the length of a field of a specific data type changes in the source table but the new length is compatible with the length of the related field in the StarRocks result table, the synchronization is not affected. | N/A |
| Column of the DATE data type used as a partition key for a Hologres result table | If a column of the DATE data type is used as a partition key, a partitioned table can be automatically created. | Hologres connector |
| Data reading from and writing to ApsaraMQ for RocketMQ 5.x tables by using the ApsaraMQ for RocketMQ connector | The ApsaraMQ for RocketMQ connector can be used to read data from or write data to ApsaraMQ for RocketMQ 5.x tables. | ApsaraMQ for RocketMQ connector |
| Performance optimization for miniBatch and JOIN operations in specific scenarios | The performance of the miniBatch feature and JOIN operations is optimized. If a JOIN operation follows an operation such as Change Data Capture (CDC) or deduplication and aggregation, the throughput in specific scenarios can be increased by about 130%. | N/A |
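The type normalization behavior described for the StarRocks connector can be read as a compatibility check between a changed source column and the existing result column. The sketch below is only an illustration of that reading: the `Column` class and `is_length_compatible` function are hypothetical names, and the rule it applies (same type name, and a source length that does not exceed the result column length) is an assumption, since these release notes do not define "compatible" precisely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical column description used only for this illustration."""
    name: str
    type_name: str  # for example "VARCHAR"
    length: int


def is_length_compatible(source: Column, sink: Column) -> bool:
    """Assumed rule: same type, and the source length still fits the sink column."""
    return source.type_name == sink.type_name and source.length <= sink.length


# Existing column in the (hypothetical) StarRocks result table.
sink_col = Column("city", "VARCHAR", 32)

# The source column grows from VARCHAR(10) to VARCHAR(20): still fits.
fits = is_length_compatible(Column("city", "VARCHAR", 20), sink_col)

# The source column grows to VARCHAR(64): exceeds the sink length.
too_long = is_length_compatible(Column("city", "VARCHAR", 64), sink_col)
```

Under this assumed rule, only the first change would leave the synchronization unaffected. How the second case is handled is not covered in these release notes.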

Bug fixes

  • The following issue is fixed: An error occurs when MaxCompute uses a catalog to write data to a partitioned table.

  • The following issue is fixed: The performance of JOIN operations on Hologres dimension tables is poor.