
Realtime Compute for Apache Flink: Configure job deployment

Last Updated: Mar 26, 2026

Configure a deployment's settings before starting a job.

Prerequisites

Before you begin, make sure that you have:

  • The required permissions to access a namespace and perform job resource configuration. For more information, see Grant permissions on namespaces.

  • A deployment already created. For more information, see Deploy a job.

Configure deployment settings

  1. Log on to the Realtime Compute for Apache Flink console.

  2. Find the workspace and click Console in the Actions column.

  3. In the left-side navigation pane, choose O&M > Deployments. On the Deployments page, click the deployment name.

  4. On the Configuration tab, find the section to edit and click Edit in its upper-right corner.

  5. Modify the settings in the sections described below.

  6. Click Save.

Basic

The following parameters are available in the Basic section. Availability depends on the deployment type.

| Parameter | Deployment type | Description |
| --- | --- | --- |
| Engine Version | SQL, JAR, Python | The Flink engine version to use. |
| Additional Dependencies | SQL, JAR, Python | Additional dependency files for the deployment. |
| Description | SQL, JAR, Python | A text description of the deployment. |
| Label | SQL, JAR, Python | Labels for organizing and identifying the deployment. |
| JAR Uri | JAR | The URI of the JAR file to run. |
| Entry Point Class | JAR | The fully qualified name of the main class. |
| Entry Point Main Arguments | JAR, Python | Arguments passed to the entry point. |
| Kerberos Name | JAR, Python | The Kerberos principal name for authentication. |
| Python Uri | Python | The URI of the Python file to run. |
| Entry Module | Python | The Python module to use as the entry point. |
| Python Libraries | Python | Additional Python library dependencies. |
| Python Archives | Python | Archive files required by the Python job. |

For parameter details, see Develop an SQL draft (SQL deployments) or Deploy a job (JAR and Python deployments).

Note

For SQL deployments, clicking Edit in the Basic section displays a confirmation message. Click OK to confirm, and you are redirected to the SQL Editor page to edit and redeploy the deployment.

Parameters

Checkpointing

| Parameter | Description |
| --- | --- |
| Checkpointing Interval | How often a checkpoint is generated. If not set, checkpointing is disabled. |
| Checkpointing Timeout | The maximum time allowed for a checkpoint to complete. Default: 10 minutes. A checkpoint that does not complete within this time fails. |
| Min Interval Between Checkpoints | The minimum gap between two consecutive checkpoints. When the maximum checkpoint parallelism is 1, this defines the minimum interval between checkpoints. |
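If you prefer to keep these settings in version control, the same behavior can be expressed through Apache Flink's standard checkpointing options in the Other Configuration field. A sketch, assuming the console fields above take precedence when both are set:

```yaml
# Sketch of Other Configuration equivalents; these are Apache Flink's
# standard checkpointing keys, not console-specific settings.
execution.checkpointing.interval: 3min   # how often a checkpoint is triggered
execution.checkpointing.timeout: 10min   # checkpoint fails if not finished in time
execution.checkpointing.min-pause: 1min  # minimum gap between consecutive checkpoints
```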

State expiration

| Parameter | Description |
| --- | --- |
| State Expiration Time | The time-to-live (TTL) of a deployment's state data. Default: 36 hours. State data not accessed within this period is automatically removed from the state backend, freeing up memory. |
Important

The default TTL of 36 hours is based on Alibaba Cloud best practices and differs from Apache Flink's default of 0, which means state data never expires. Set the TTL based on your data access patterns to balance computation accuracy and memory usage.

How state data works:

When data first enters the system, it is stored in the state backend. If data with the same primary key arrives again, Flink computes against the stored state and updates the last-access time. If the data is not accessed again within the TTL, Flink treats it as expired and removes it. Reducing TTL lowers memory consumption and improves system stability, but may affect accuracy for late-arriving data.
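For SQL deployments, the TTL described above corresponds to Apache Flink's `table.exec.state.ttl` option, which you could also set as a raw key-value pair. A minimal sketch, assuming the console's State Expiration Time field maps to this key:

```yaml
# Sketch: Apache Flink's SQL state TTL option. 36h matches the console
# default described above; Flink's own default is 0 (state never expires).
table.exec.state.ttl: 36h
```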

Restart policy

Flink controls job restart behavior through two independent mechanisms: the job restart policy and the task failure recovery policy.

Job restart policy

The job restart policy determines whether and how the job restarts after a failure.

Default behavior (when no policy is explicitly set):

  • If checkpointing is enabled: the job restarts using Fixed Delay.

  • If checkpointing is disabled: the job does not restart.

Override the default by selecting one of these policies:

| Policy | Description | Additional parameters |
| --- | --- | --- |
| No Restarts | The job does not restart if it fails. | None |
| Fixed Delay (default) | The job restarts at a fixed interval after each failure. | Number of Restart Attempts, Delay Between Restart Attempts |
| Failure Rate | The job restarts as long as the failure rate stays below a defined threshold. | Failure Rate Interval, Max Failures per Interval, Delay Between Restart Attempts |
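The console policies above map onto Apache Flink's standard `restart-strategy` options. A sketch of the equivalent raw configuration, assuming the Other Configuration field accepts these keys:

```yaml
# Sketch: Fixed Delay policy expressed as Apache Flink restart-strategy keys.
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3   # Number of Restart Attempts
restart-strategy.fixed-delay.delay: 10s    # Delay Between Restart Attempts

# Or, for the Failure Rate policy:
# restart-strategy: failure-rate
# restart-strategy.failure-rate.failure-rate-interval: 5min   # Failure Rate Interval
# restart-strategy.failure-rate.max-failures-per-interval: 3  # Max Failures per Interval
# restart-strategy.failure-rate.delay: 10s                    # Delay Between Restart Attempts
```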

Task failure recovery policy

The task failure recovery policy determines which tasks are restarted when a failure occurs. Configure it by setting jobmanager.execution.failover-strategy in the Other Configuration field.

| Value | Behavior |
| --- | --- |
| full | Restarts the entire job when any task fails. |
| region (default) | Restarts only the minimum set of tasks needed to recover the failed pipelined region, leaving other regions unaffected. |
Note

When jobmanager.execution.failover-strategy is set to region, different regions may have different start timestamps after recovery. This is expected behavior.

For more information, see Task Failure Recovery in the Apache Flink documentation.

Other configuration

Use Other Configuration to set additional Flink parameters as key-value pairs, for example:

```yaml
akka.ask.timeout: 10
jobmanager.execution.failover-strategy: full
```
Note

GC type settings (such as -XX:+UseG1GC) cannot be modified via env.java.opts.

Logging

| Parameter | Description |
| --- | --- |
| Log Archiving | Whether to archive logs. Enabled by default. When enabled, historical deployment logs are available on the Logs tab. In VVR 3.X, only VVR 3.0.7 and later support log archiving; in VVR 4.X, only VVR 4.0.11 and later. For more information, see View the logs of a historical deployment. |
| Log Archives Expires | How long archived logs are retained. Default: 7 days. |
| Root Log Level | The minimum severity level to log. Levels in ascending order of urgency: TRACE, DEBUG, INFO, WARN, ERROR. |
| Log Levels | Custom logger name and level pairs for specific loggers. |
| Logging Profile | The log template to use. Select default or Custom Template. A custom template lets you export logs to external storage. For more information, see Configure parameters to export logs of a deployment. |
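Under the hood, Flink's logging is typically driven by Log4j 2, so the Root Log Level and Log Levels settings above correspond to a properties-style configuration. A sketch for illustration only; the console manages the actual logging configuration, and the logger name shown is a hypothetical example:

```properties
# Sketch of the Log4j 2 equivalent of the console settings above.
rootLogger.level = INFO                          # Root Log Level
logger.connector.name = org.apache.flink.connector  # a Log Levels entry: logger name...
logger.connector.level = DEBUG                      # ...and its level
```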

References

  • To control state size at the operator level using TTL, see the "State TTL hints" section in Hints.

  • The logging settings on this page apply to a single deployment. To configure log export for all deployments in a namespace, see Configure parameters to export logs of a deployment.