You can enable the debugging feature to simulate a deployment run, check outputs, and verify the business logic of SELECT and INSERT statements. This feature improves development efficiency and reduces the risk of poor data quality. This topic describes how to debug a deployment.
The deployment debugging feature allows you to verify the correctness of deployment logic in the console of fully managed Flink. During debugging, data is not written to the result table, regardless of the type of the result table. You can use upstream online data or specify debugging data, and you can debug complex deployments that include multiple SELECT or INSERT statements. SQL statements can also include UPSERT operations that update previously emitted results.
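For example, a simple draft with one INSERT statement can be debugged this way. The table names and schemas below are hypothetical and for illustration only:

```sql
-- Hypothetical source and result tables for a debuggable draft.
CREATE TEMPORARY TABLE orders (
  order_id BIGINT,
  amount DECIMAL(10, 2),
  order_time TIMESTAMP(3)
) WITH (
  'connector' = 'datagen'  -- during debugging, input comes from mock or online data
);

CREATE TEMPORARY TABLE order_totals (
  window_start TIMESTAMP(3),
  total_amount DECIMAL(10, 2)
) WITH (
  'connector' = 'blackhole'  -- during debugging, no data is written to the sink
);

INSERT INTO order_totals
SELECT TUMBLE_START(order_time, INTERVAL '1' MINUTE), SUM(amount)
FROM orders
GROUP BY TUMBLE(order_time, INTERVAL '1' MINUTE);
```

When you click Debug, the console shows the computed results instead of writing them to the result table.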
To use the deployment debugging feature, you must create a session cluster. Session clusters are suitable for development and test environments in non-production environments. You can deploy or debug deployments in a session cluster to improve the resource utilization of a JobManager and accelerate the deployment startup. Fully managed Flink supports per-job clusters and session clusters. The two types of clusters have the following differences:
Per-Job Clusters: By default, fully managed Flink deploys or debugs deployments in a per-job cluster. Each deployment requires a separate JobManager to achieve resource isolation between deployments. Therefore, the resource utilization of JobManagers for deployments that process a small amount of data is low. This type of cluster is suitable for deployments that consume a large number of resources or deployments that run in a continuous and stable manner.
Session Clusters: This type of cluster allows multiple deployments to share the same JobManager, which increases the resource utilization of the JobManager. However, if multiple deployments run on the same JobManager, the stability of each deployment may be affected. Session clusters also do not support the monitoring and alerting feature for a single deployment. Therefore, session clusters are suitable only for testing deployments.
Note
You must make sure that a session cluster is created regardless of which type of cluster is used. When you create a session cluster, the cluster resources are consumed. The resource consumption is based on the configurations that you select when you create the cluster.
An additional 0.5 compute units (CUs) are consumed while a session cluster that uses Ververica Runtime (VVR) 3.0.4 or later is running.
You can debug only SQL deployments by clicking Debug on the SQL Editor page in the console of fully managed Flink.
You cannot debug deployments in which the CREATE TABLE AS or CREATE DATABASE AS statement is executed.
MySQL CDC source tables are not written in append-only mode. Therefore, you cannot debug data of MySQL CDC source tables for session clusters of VVR 4.0.8 or an earlier version.
By default, fully managed Flink reads a maximum of 1,000 data records. If the number of data records that are read by fully managed Flink reaches the upper limit, fully managed Flink stops reading data.
Metrics of deployments that are deployed in session clusters cannot be displayed. Session clusters do not support the monitoring and alerting feature and the Autopilot feature.
Session clusters are suitable for development and test environments. We recommend that you do not use session clusters in production environments. If you use session clusters in production environments, the following stability issues may occur:
If a JobManager is faulty, all deployments of a cluster that runs on the JobManager are affected.
If a TaskManager is faulty, the deployments that have tasks running on the TaskManager are affected.
If processes are not isolated for tasks that run on the same TaskManager, the tasks may be affected by each other.
If the session cluster uses the default configurations, take note of the following points:
For small deployments, we recommend that the total number of such deployments in a cluster be no more than 100.
For complex deployments, we recommend that the parallelism of a single deployment be no more than 512, and that no more than 32 medium-sized deployments with a parallelism of 64 run in a cluster. Otherwise, issues such as heartbeat timeouts may occur and the stability of the cluster may be affected. In that case, you must increase the heartbeat interval and heartbeat timeout period.
If you want to run more tasks at the same time, you must increase the resource configuration of the session cluster.
Step 1: Create a session cluster
Go to the Session Clusters page.
Log on to the Realtime Compute for Apache Flink console.
On the Fully Managed Flink tab, find the workspace that you want to manage and click Console in the Actions column.
In the left-side navigation pane, click Session Clusters.
In the upper-left corner of the Session Clusters page, click Create Session Cluster.
Configure the parameters.
The following table describes the parameters.
The name of the cluster.
The desired state of the cluster. Valid values:
STOPPED: The cluster is stopped after it is configured, and the deployments that are deployed in the cluster are also stopped.
RUNNING: The cluster keeps running after it is configured.
You can configure labels for deployments in the Labels section. This allows you to find a deployment on the Overview page in an efficient manner.
The version of the Flink engine that is used by the current deployment. For more information about engine versions, see Engine version and Lifecycle policies. We recommend that you use a recommended version or a stable version. Engine versions are classified into the following types:
Recommend: the latest minor version of the current latest major version.
Stable: the latest minor version of a major version that is still in the service period of the product. Defects in previous versions are fixed in such a version.
Normal: other minor versions that are still in the service period of the product.
Deprecated: the version that exceeds the service period of the product.
Flink Restart Policy
Failure Rate: Deployments are restarted based on the failure rate.
If you select this option, you must configure the Failure Rate Interval, Max failures per interval, and Delay Between Restart Attempts parameters.
Fixed Delay: Deployments are restarted with a delay.
If you select this option, you must configure the Number of Restart Attempts and Delay Between Restart Attempts parameters.
No Restarts: Deployments are not restarted if the deployments fail.
If you leave this parameter empty, the default Apache Flink restart policy is used: if a task fails and checkpointing is disabled, the deployment is not restarted; if checkpointing is enabled, the deployment is automatically restarted.
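If you prefer to set the restart policy through Additional Flink Configuration instead of the console options, a fixed-delay policy might look like the following. The key names follow open source Apache Flink; verify them against your engine version, and the values shown are only examples:

```yaml
# Restart up to 3 times, waiting 10 seconds between attempts.
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10 s
```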
Additional Flink Configuration
Configure other Flink settings, such as taskmanager.numberOfTaskSlots.
Number of Task Managers
By default, the value is the same as the parallelism.
JobManager CPU Cores
Default value: 1.
JobManager Memory
Minimum value: 1 GiB. Recommended value: 4 GiB. Use GiB or MiB as the unit. For example, you can set this parameter to 1024 MiB or 1.5 GiB.
We recommend that you configure JobManager resources and heartbeat-related parameters for the JobManager. When you configure this parameter, take note of the following points:
The JobManager handles TaskManager heartbeats, task serialization, and resource scheduling. Therefore, we recommend that the resource configuration for the JobManager be no less than the default configuration. Adjust the configuration based on the workload of your cluster.
To ensure cluster stability, you must prevent heartbeat timeout that is caused by the busy main thread of the JobManager. Therefore, we recommend that you set the heartbeat interval to at least 10 seconds and the heartbeat timeout period to at least 50 seconds. The heartbeat interval is specified by the heartbeat.interval parameter and the heartbeat timeout period is specified by the heartbeat.timeout parameter. You can increase the values of these parameters based on the number of TaskManagers and the increase in the number of deployments.
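Based on the recommendations above, the heartbeat parameters can be set in Additional Flink Configuration. The values are in milliseconds and are only a starting point, not a definitive tuning:

```yaml
heartbeat.interval: 10000   # heartbeat interval; at least 10 seconds as recommended above
heartbeat.timeout: 50000    # heartbeat timeout; at least 50 seconds as recommended above
```

Increase these values as the number of TaskManagers and deployments grows.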
TaskManager CPU Cores
Default value: 2.
TaskManager Memory
Minimum value: 1 GiB. Recommended value: 8 GiB. Use GiB or MiB as the unit. For example, you can set this parameter to 1024 MiB or 1.5 GiB.
We recommend that you specify the number of slots for each TaskManager and the amount of resources that are available for TaskManagers. The number of slots is specified by the taskmanager.numberOfTaskSlots parameter. When you configure this parameter, take note of the following points:
For a single small deployment, we recommend that you set the CPU-to-memory ratio of a single slot to 1:4 and configure at least 1 CPU core and 2 GiB of memory for each slot.
For a complex deployment, we recommend that you configure at least 1 CPU core and 4 GiB of memory for each slot. If you use the default resource configuration, you can configure two slots for each TaskManager.
We recommend that you use the default resource configuration for each TaskManager and set the number of slots to 2.
Important
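Assuming the default resource configuration, the slot count can be set in Additional Flink Configuration as follows:

```yaml
# Two slots per TaskManager, matching the recommendation above.
taskmanager.numberOfTaskSlots: 2
```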
If the resources configured for a TaskManager are insufficient, the stability of the deployments that run on the TaskManager is affected. In addition, if a TaskManager has too few slots, the fixed overhead of the TaskManager is spread across fewer slots, which reduces resource utilization.
If you configure a large number of resources for a TaskManager, a large number of deployments run on the TaskManager. If the TaskManager is faulty, all the deployments are affected.
Root Log Level
The following log levels are supported, listed in ascending order of severity:
TRACE: records finer-grained information than DEBUG logs.
DEBUG: records the status of the system.
INFO: records important system information.
WARN: records the information about potential issues.
ERROR: records the information about errors and exceptions that occur.
The name and level of the log.
The log template. You can use a system template or configure a custom template.
Note
For more information about the options related to the integration between Flink and resource orchestration frameworks such as Kubernetes and Yarn, see Resource Orchestration Frameworks.
Click Create Session Cluster.
After a session cluster is created, you can select the session cluster in the Debug dialog box when you debug a deployment or select the session cluster in the Deploy draft dialog box when you deploy a draft.
Step 2: Debug a deployment
Create an SQL deployment and write code for the deployment. For more information, see Develop an SQL draft.
In the upper-right corner of the SQL Editor page, click Debug. In the Debug dialog box, select a session cluster from the Session Cluster drop-down list. Then, click Next.
Configure debugging data.
If you use online data for debugging, click Confirm.
If you use debugging data to debug a deployment, click Download mock data template, enter the debugging data in the template, and then click Upload mock data to upload the debugging data.
The following table describes the parameters in the Debug Mock Data step.
Download mock data template
You can download the debugging data template to edit data. The template is adapted to the data structure of the source table.
Upload mock data
If you need to debug a deployment by using debugging data, you can download the debugging data template, upload the data after you edit the template, and then select Use mock data.
Limits on debugging data files:
Only a CSV file is supported.
A CSV file must contain a table header, such as id (INT).
A CSV file can contain a maximum of 1,000 data records but cannot be greater than 1 MB.
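Following these limits, a minimal mock data file for a hypothetical source table with columns id (INT) and name (VARCHAR) might look like this:

```csv
id (INT),name (VARCHAR)
1,Alice
2,Bob
```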
After you upload the debugging data, click the icon on the left side of the source table name to preview the data and download the debugging data.
The deployment debugging feature automatically modifies the DDL statements in source tables and result tables. However, this feature does not change the code in deployments. You can preview code details in the lower part of Code Preview.
After you click Confirm, the debugging result appears in the lower part of the SQL script editor.