Realtime Compute for Apache Flink: Configure a deployment

Last Updated: Sep 13, 2024

Before you start a Realtime Compute for Apache Flink deployment, you must configure the deployment. This topic describes how to configure a deployment.

Prerequisites

  • The required permissions are granted to the Alibaba Cloud account or Resource Access Management (RAM) user that you want to use to access a namespace in the Realtime Compute for Apache Flink console and perform operations such as deployment resource configuration. For more information, see Grant namespace permissions.

  • A deployment is created. For more information, see Create a deployment.

Procedure

  1. Log on to the Realtime Compute for Apache Flink console.

  2. Find the workspace that you want to manage and click Console in the Actions column.

  3. In the left-side navigation pane, click O&M > Deployments. On the Deployments page, find the deployment that you want to manage and click its name.

  4. On the Configuration tab, find the section in which you want to configure parameters and click Edit in the upper-right corner.

    Note

    To modify the basic configuration of a deployment, you must go back to the Development > ETL page, edit the draft, and deploy it again. After you click Edit in the upper-right corner of the Basic section, a message appears. To continue editing the deployment, click OK.

  5. Modify the configurations of the deployment.

    You can modify the deployment configurations in the Basic, Parameters, and Logging sections. Each section is described later in this topic.

  6. After the configurations are complete, click Save.

Basic section

Deployment type

Description

SQL deployment

You can write SQL code and configure the Engine Version, Additional Dependencies, Description, and Label parameters. For more information about the parameters, see Develop an SQL draft.

Note

After you click Edit in the upper-right corner of the Basic section, a message appears. If you want to modify the deployment configurations, click OK. Then, you are redirected to the SQL Editor page to edit and deploy the deployment.

JAR deployment

You can configure the Engine Version, JAR Uri, Entry Point Class, Entry Point Main Arguments, Additional Dependencies, Description, Kerberos Name, and Label parameters. For more information about the parameters, see Create a deployment.

Python deployment

You can configure the Engine Version, Python Uri, Entry Module, Entry Point Main Arguments, Python Libraries, Python Archives, Additional Dependencies, Description, Kerberos Name, and Label parameters. For more information about the parameters, see Create a deployment.

Parameters section

Parameter

Description

Checkpointing Interval

The interval at which a checkpoint is generated. If you do not configure this parameter, the checkpointing feature is disabled.

Checkpointing Timeout time

The timeout period for checkpoint generation. Unit: minutes. Default value: 10. If a checkpoint is not generated within the time specified by this parameter, the checkpoint generation fails.

Min Interval Between Checkpoints

The minimum interval between two consecutive checkpoints. This parameter applies when the maximum parallelism of checkpoints is 1, which means that only one checkpoint can be in progress at a time.
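The following Python sketch illustrates how the three checkpointing parameters can interact. It is a simplified model, not the scheduler that Realtime Compute for Apache Flink uses. It assumes that at most one checkpoint is in progress at a time and that the minimum interval is measured from the end of the previous checkpoint. The names CheckpointSettings, next_trigger_time, and checkpoint_outcome are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CheckpointSettings:
    """Hypothetical container for the three checkpointing parameters (in seconds)."""
    interval_s: float            # Checkpointing Interval
    timeout_s: float = 600.0     # Checkpointing Timeout time (10 minutes by default)
    min_pause_s: float = 0.0     # Min Interval Between Checkpoints


def next_trigger_time(settings: CheckpointSettings,
                      last_trigger_s: float,
                      last_end_s: float) -> float:
    """Earliest time at which the next checkpoint may start.

    Assumption: the next checkpoint must respect both the regular interval
    (counted from the previous trigger) and the minimum pause (counted from
    the end of the previous checkpoint).
    """
    return max(last_trigger_s + settings.interval_s,
               last_end_s + settings.min_pause_s)


def checkpoint_outcome(settings: CheckpointSettings, duration_s: float) -> str:
    """A checkpoint that takes longer than the timeout is treated as failed."""
    return "completed" if duration_s <= settings.timeout_s else "failed (timeout)"


if __name__ == "__main__":
    settings = CheckpointSettings(interval_s=60, min_pause_s=30)
    # A slow checkpoint (ended at t=50) delays the next trigger beyond t=60.
    print(next_trigger_time(settings, last_trigger_s=0, last_end_s=50))
    # A fast checkpoint (ended at t=10) leaves the regular interval in control.
    print(next_trigger_time(settings, last_trigger_s=0, last_end_s=10))
    print(checkpoint_outcome(settings, duration_s=700))
```

In this model, a minimum interval only changes the schedule when checkpoints take long enough that the regular interval alone would leave too little time between them.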

table.exec.state.ttl

The time-to-live (TTL) of the state data of a deployment. Default value: 36 h. The default value indicates that the state data of a deployment expires after 36 hours. The system automatically removes the expired data.

Important

The default value is determined based on the best practices of Alibaba Cloud. This default value is different from the default value of the TTL provided by Apache Flink. The default value of the TTL provided by Apache Flink is 0, which indicates that the state data does not expire.

The first time data enters the system and is processed, the data is stored in the state backend memory. If data with the same primary key value enters the system, the system performs data computation based on the stored state data and updates the access time of the data. This process is the core of Realtime Compute for Apache Flink and relies on continuously processed streaming data. If the data is not accessed again within the specified TTL, the system considers the data as expired data and removes the data from the state backend storage.

You must properly configure the TTL to ensure accurate data computing and timely removal of expired data. This reduces the usage of state backend memory, eases the burden on the system memory, and improves computing efficiency and system stability.
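The effect of the TTL on computation results can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not the state backend implementation: time is passed in explicitly as seconds, the access time is refreshed whenever a key is written, and the names TtlState and count_event are hypothetical.

```python
class TtlState:
    """Toy keyed state in which entries expire after ttl_s seconds without access."""

    def __init__(self, ttl_s: float):
        self.ttl_s = ttl_s
        self._entries = {}  # key -> (value, last_access_s)

    def get(self, key, now_s: float):
        """Return the stored value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, last_access_s = entry
        if now_s - last_access_s >= self.ttl_s:
            # The entry was not accessed within the TTL: remove it.
            del self._entries[key]
            return None
        return value

    def update(self, key, value, now_s: float) -> None:
        """Store the value and refresh the access time of the key."""
        self._entries[key] = (value, now_s)


def count_event(state: TtlState, key, now_s: float) -> int:
    """Running count per key, computed from the stored state."""
    previous = state.get(key, now_s) or 0
    state.update(key, previous + 1, now_s)
    return previous + 1


if __name__ == "__main__":
    hour = 3600
    state = TtlState(ttl_s=36 * hour)            # 36 h, as in the default above
    print(count_event(state, "order-1", 0))          # first record for the key
    print(count_event(state, "order-1", 1 * hour))   # state is still valid
    print(count_event(state, "order-1", 38 * hour))  # state expired, count restarts
```

The last call shows the trade-off: when a key reappears after its state has expired, the computation starts over, so a TTL that is shorter than the real gap between related records produces inaccurate results, while a longer TTL keeps more state in storage.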

Flink Restart Policy

In a Realtime Compute for Apache Flink deployment, the restart policy for failed tasks depends on the topology of the deployment. If multiple tasks of the deployment fail within a short period of time, the connections between the tasks in the deployment topology have an impact on the restart policy.

  • If tasks that are not connected with each other fail, each task is independently restarted. Each restart is separately recorded.

  • If a task that is connected with other tasks fails, the connected tasks are also restarted. However, the restarts of these tasks are counted as only one restart event.

If the restart policy is not specified, Realtime Compute for Apache Flink determines whether to restart a deployment based on whether the checkpointing feature is enabled. If the restart policy is specified, Realtime Compute for Apache Flink restarts a deployment based on the specified policy. Valid values:

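  • Failure Rate: The deployment is restarted after each failure until the number of failures within the specified interval exceeds the upper limit. After the upper limit is exceeded, the deployment fails and is no longer restarted.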

    If you select Failure Rate from the Flink Restart Policy drop-down list, you must configure the Failure Rate Interval, Max Failures per Interval, and Delay Between Restart Attempts parameters.

  • Fixed Delay: The deployment is restarted after a fixed delay each time it fails, up to the specified number of restart attempts. This is the default value.

    If you select Fixed Delay from the Flink Restart Policy drop-down list, you can change the values of the Number of Restart Attempts and Delay Between Restart Attempts parameters based on your business requirements.

  • No Restarts: The deployment is not restarted.
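The decision that each restart policy makes can be sketched in Python as follows. This is a simplified illustration, not the code that Realtime Compute for Apache Flink runs. It assumes that a deployment under Failure Rate keeps restarting until the failures within the interval exceed the limit, and the exact boundary behavior in the engine may differ. The class names are hypothetical, and the delay between restart attempts is stored but not simulated.

```python
from collections import deque


class FailureRatePolicy:
    """Allow restarts while failures within the interval do not exceed the limit."""

    def __init__(self, max_failures_per_interval: int,
                 failure_rate_interval_s: float, delay_s: float):
        self.max_failures_per_interval = max_failures_per_interval
        self.failure_rate_interval_s = failure_rate_interval_s
        self.delay_s = delay_s          # Delay Between Restart Attempts (not simulated)
        self._failures = deque()        # timestamps of recent failures

    def on_failure(self, now_s: float) -> bool:
        """Record a failure; return True if a restart is allowed."""
        self._failures.append(now_s)
        # Forget failures that fall outside the sliding interval.
        while now_s - self._failures[0] > self.failure_rate_interval_s:
            self._failures.popleft()
        return len(self._failures) <= self.max_failures_per_interval


class FixedDelayPolicy:
    """Allow a fixed number of restart attempts."""

    def __init__(self, restart_attempts: int, delay_s: float):
        self.remaining = restart_attempts
        self.delay_s = delay_s          # Delay Between Restart Attempts (not simulated)

    def on_failure(self, now_s: float) -> bool:
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True


class NoRestartPolicy:
    """Never restart."""

    def on_failure(self, now_s: float) -> bool:
        return False


if __name__ == "__main__":
    policy = FailureRatePolicy(max_failures_per_interval=2,
                               failure_rate_interval_s=60, delay_s=10)
    for t in (0, 10, 20):
        print(t, policy.on_failure(t))   # the third failure within 60 s is refused
```

Note that, as described above, the restart of several connected tasks counts as a single restart event, so one call to on_failure in this sketch corresponds to one such event rather than to one task.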

Other Configuration

Other Realtime Compute for Apache Flink settings. For example, you can specify akka.ask.timeout: 10.

Logging section

Parameter

Description

Log Archiving

Specifies whether to enable log archiving. By default, Allow Log Archives is turned on. After you turn on Allow Log Archives in the Logging section, you can view the logs of a historical deployment on the Logs tab. For more information, see View the logs of a historical deployment.

Note
  • In Ververica Runtime (VVR) 3.X, only VVR 3.0.7 and later minor versions support the log archiving feature.

  • In VVR 4.X, only VVR 4.0.11 and later minor versions support the log archiving feature.

Log Archives Expires

The retention period of archived log files. By default, archived log files are retained for seven days.

Root Log Level

The root log level. You can specify one of the following levels, which are listed in ascending order of severity:

  1. TRACE: records finer-grained information than DEBUG logs.

  2. DEBUG: records the status of the system.

  3. INFO: records important system information.

  4. WARN: records information about potential issues.

  5. ERROR: records information about errors and exceptions that occur.
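The ordering of the levels determines which messages are recorded. The following Python sketch models that ordering under the usual logging convention that a message is recorded when its level is at or above the configured root level. The names Level and is_recorded are hypothetical and are not part of any Realtime Compute for Apache Flink API.

```python
from enum import IntEnum


class Level(IntEnum):
    """Log levels in ascending order, as listed above."""
    TRACE = 1
    DEBUG = 2
    INFO = 3
    WARN = 4
    ERROR = 5


def is_recorded(message_level: Level, root_level: Level) -> bool:
    """Assumption: a message is recorded if its level is at or above the root level."""
    return message_level >= root_level


if __name__ == "__main__":
    root = Level.WARN
    # Lists the names of the levels that are recorded when the root level is WARN.
    print([level.name for level in Level if is_recorded(level, root)])
```

Under this model, a lower root level such as DEBUG or TRACE records more messages, which helps troubleshooting but increases log volume.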

Log Levels

The name of a logger and the log level that you want to apply to that logger.

Logging Profile

The log template that you want to use. You can select default or Custom Template from the drop-down list.

References

  • You can configure the TTL for the state data of an operator to control the size of the state data. This reduces the resource consumption of deployments that have a large amount of state data. For more information, see the "State TTL hints" section of the Hints topic.

  • The log parameter configurations that are described in this topic are configurations for a single deployment. For more information about how to configure parameters to export the logs of all deployments in a namespace, see the "Configure parameters to export the logs of all deployments in a namespace" section of the Configure parameters to export logs of a deployment topic.