Before you start a job, you must configure its deployment. This topic describes how to configure a job deployment.
Prerequisites
Grant the required permissions to the Alibaba Cloud account or Resource Access Management (RAM) user that you use to access the project and configure job resources. For more information, see Grant permissions in the development console.
A job is deployed. For more information, see Deploy a job.
Procedure
Log on to the Realtime Compute for Apache Flink console.
Find the target workspace and click Console in the Actions column.
On the page that appears, click the name of the target job.
On the Deployment Details tab, click Edit to the right of the target section.
Note: To edit the basic configuration of an SQL job, you must return to the SQL development page, edit the job draft, and redeploy the job. After you click Edit to the right of the Basic Configuration section, click OK in the dialog box that appears to continue.
Modify the job deployment information. You can modify the information in the sections described below.
Click Save.
Basic configuration
Job type | Description |
SQL job | Includes the SQL code and the Engine Version, Additional Dependencies, Description, and Job Tags parameters. For more information about the parameters, see Job development map. Note After you click Edit to the right of the Basic Configuration section, click OK in the dialog box that appears. Then, return to the SQL development page to edit the job draft and redeploy the job. |
JAR job | Includes Engine Version, JAR Uri, Entry Point Class, Entry Point Main Arguments, Additional Dependencies, Description, Kerberos Cluster, and Job Tags. For more information about the parameters, see Deploy a job. |
Python job | Includes Engine Version, Python Uri, Entry Module, Entry Point Main Arguments, Python Libraries, Python Archives, Additional Dependencies, Description, Kerberos Cluster, and Job Tags. For more information about the parameters, see Deploy a job. |
Runtime parameter configuration
Parameter | Description |
System checkpoint interval | The interval at which system checkpoints are periodically performed. If you leave this parameter empty, system checkpoints are disabled. |
System checkpoint timeout | The timeout period of a system checkpoint. The default value is 10 minutes. If a system checkpoint is not completed within this period, the checkpoint fails. |
Minimum interval between system checkpoints | The minimum interval between two consecutive system checkpoints. If the maximum number of concurrent system checkpoints is 1, this setting ensures a minimum time gap between the completion of one checkpoint and the start of the next. |
State data TTL | The time-to-live (TTL) of state data, in hours. The default value is 36 hours, which means that state data automatically expires and is purged 36 hours after it was last accessed. Important This default is based on Alibaba Cloud best practices and differs from the open source default of 0, which means that state data never expires. When data first enters the system and is processed, it is stored in state. When new data with the same primary key arrives, the system uses the stored state data for computation and updates its access time. If state data is not accessed again within the configured TTL window, the system considers it expired and purges it from state storage. A proper TTL value maintains computational accuracy while promptly cleaning up stale data. This reduces state storage usage, lessens the memory load on the system, and improves both computational efficiency and system stability. |
Flink restart policy | The restart behavior of a Flink job is determined by two policies: a job-level restart policy and a task-level fault recovery policy.
Job-level restart policy: If you do not explicitly configure a job-level restart policy, Flink decides whether to restart the job based on whether checkpointing is enabled. You can also explicitly configure a job-level restart policy.
Task-level fault recovery policy: This policy controls how failed tasks are restarted without restarting the entire job.
For more information, see Restart Strategies. Note If a job contains two source-to-sink pipelines that are not connected and a task in one pipeline fails, the region fault recovery policy restarts only the pipeline that contains the failed task. The other pipeline continues to run unaffected. This can cause the start times of the two pipelines to differ, which is normal. |
Other configurations | Set other Flink configurations in this section. Note Certain parameters cannot be configured in this section. |
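For reference, most of the runtime parameters above correspond to configuration options in open source Apache Flink. The following sketch shows what equivalent settings could look like as key-value pairs; the key names are from the open source Flink documentation, the values are illustrative examples only, and the console may expose some of these settings through its own UI fields instead:

```yaml
# Checkpointing (open source Apache Flink option keys; example values)
execution.checkpointing.interval: 180s      # system checkpoint interval
execution.checkpointing.timeout: 10min     # checkpoint timeout (10 minutes matches the default above)
execution.checkpointing.min-pause: 60s     # minimum interval between two checkpoints

# State TTL for SQL jobs (36h matches the console default described above)
table.exec.state.ttl: 36h

# Restart behavior
restart-strategy.type: fixed-delay            # an explicitly configured job-level restart policy
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10s
jobmanager.execution.failover-strategy: region  # task-level (region) fault recovery
```

The region failover strategy restarts only the pipeline region that contains the failed task, which is what produces the note above about unconnected pipelines restarting independently.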
Log configuration
Parameter | Description |
Log archiving | Log archiving is enabled by default. After you enable log archiving, you can view the logs of historical job instances on the job logs page. For more information, see View logs of historical job instances. |
Archived log validity period | The default validity period for archived logs is 7 days. |
Root log level | The log level of the root logger. The log levels in ascending order of severity are TRACE, DEBUG, INFO, WARN, and ERROR. |
Class log level | Enter the logger name (typically a fully qualified class or package name) and the log level. This allows you to set a different log level for a specific class or package than the root log level. |
Log template | You can select the default system template or a custom template. If you select a custom template, you can output logs to other storage services. For more information, see Configure job log output. |
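To illustrate how root and class log levels relate, the settings above correspond to logger definitions in the log4j 2 properties format that Flink uses. This is a hypothetical sketch: the logger ID `checkpointing` and the package name are placeholder examples, not values from this document:

```properties
# Root logger level: applies to all classes unless overridden below
rootLogger.level = INFO

# Class log level: a hypothetical example that raises verbosity
# for one package only (logger ID and package are placeholders)
logger.checkpointing.name = org.apache.flink.runtime.checkpoint
logger.checkpointing.level = DEBUG
```

With this configuration, most classes log at INFO and above, while classes under the named package also emit DEBUG messages, which is useful when diagnosing a single component without flooding the logs.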
References
You can set the TTL for an operator to more precisely control the state size of each operator and save resources for jobs with large states. For more information, see Operator state lifecycle (State TTL) hints.
This topic describes how to configure logs for a single job. To configure logs for all jobs in a project, see Configure log output channels for all jobs in a project.