
Realtime Compute for Apache Flink:Debug a job

Last Updated:Feb 28, 2026

Job debugging simulates a job run to inspect output and verify the business logic of SELECT or INSERT statements without writing data to production sinks. No data reaches downstream systems during debugging, regardless of the sink table type.

Debugging supports:

  • Live upstream data or test data that you provide.

  • Complex jobs with multiple SELECT or INSERT statements.

  • UPSERT queries, including statements that produce update operations, such as count(*) aggregations.
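For example, an aggregate query such as the following produces update (UPSERT) output and can still be debugged; the orders table and its columns are illustrative:

```sql
-- Hypothetical source table `orders`; during debugging, results
-- are displayed but never written to any real sink.
SELECT item_id, COUNT(*) AS order_cnt
FROM orders
GROUP BY item_id;
```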

Limitations

  • Requires a session cluster.

  • Only SQL jobs are supported.

  • CREATE TABLE AS SELECT (CTAS) and CREATE DATABASE AS (CDAS) statements are not supported.

  • By default, debugging automatically pauses after reading a maximum of 1,000 records.

  • Each debugging session in a session cluster is limited to three minutes. This limit maintains cluster stability and manages the cluster lifecycle.

Usage notes

  • Creating a session cluster consumes resources. The amount depends on the resource configuration selected during creation.

  • Do not use session clusters in a production environment. Session clusters are intended for development and testing only. While the JobManager (JM) reuse mechanism improves resource utilization during debugging, it can reduce job stability in production:

    • A single point of failure (SPOF) in the JobManager affects all jobs in the cluster.

    • A SPOF in a TaskManager affects jobs with tasks running on it.

    • Within the same TaskManager, no process isolation exists between tasks. Tasks may interfere with each other.

  • If the session cluster uses default configurations, follow these guidelines:

    • For small jobs with a single degree of parallelism, keep the total number of jobs at or below 100.

    • For complex jobs, the maximum degree of parallelism for a single job should not exceed 512. Do not run more than 32 medium-sized jobs with a degree of parallelism of 64 on a single cluster. Exceeding these limits may cause heartbeat timeouts and affect cluster stability. If this occurs, increase the heartbeat interval and timeout.

    • To run more tasks concurrently, increase the resource configuration of the session cluster.
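If heartbeat timeouts occur under load, the interval and timeout can be raised through the cluster's Other configurations. The keys below are standard Flink options; the values are illustrative, not recommendations:

```yaml
# Illustrative values; Flink's defaults are 10000 ms and 50000 ms.
heartbeat.interval: 20000
heartbeat.timeout: 120000
```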

Procedure

Step 1: Create a session cluster

  1. Go to the Session Management page.

    1. Log on to the Realtime Compute for Apache Flink console.

    2. In the Actions column of the target workspace, click Console.

    3. In the left-side navigation pane, choose Operation Center > Session Management.

  2. Click Create Session Cluster.

  3. Configure the session cluster settings. The following tables describe the available parameters.

Basic configurations

  • Name: The name of the session cluster.

  • Deployment target: The target resource queue. For more information, see Manage resource queues.

  • State: The desired state of the cluster after creation. RUNNING: The cluster starts running. STOPPED: The cluster stops, and all jobs in the session cluster also stop.

  • Scheduled session management: Automatically shuts down the cluster if no jobs are running for a specified period. This prevents resource waste from idle session clusters.

  • Tag Name: Tags for locating jobs on the Overview page.

  • Tag value: The value of the tag.

Configuration

  • Engine version: The Flink engine version. Select a Recommended or Stable version. Version tags: Recommended is the latest minor version of the latest major version; Stable is the latest minor version of a major version that is still in its service period, with bug fixes from prior versions; Normal covers other minor versions still in their service period; EOS marks a version past its end-of-service date. For details, see Engine versions and Lifecycle policy.

  • Flink restart strategy: Failure Rate restarts the job based on the failure rate and requires Failure Rate Interval, Max Failures Per Interval, and Delay Between Restarts. Fixed Delay restarts the job at a fixed interval and requires Restart Attempts and Delay Between Restarts. No Restarts does not restart the job when a task fails. If left unconfigured, the default Apache Flink restart strategy applies: when checkpointing is disabled, the JobManager process does not restart on task failure; when checkpointing is enabled, it restarts.

  • Other configurations: Additional Flink configuration entries, for example, taskmanager.numberOfTaskSlots: 1.
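Entries in Other configurations use Flink's YAML-style key-value syntax, one entry per line. A sketch with illustrative values (both keys are standard Flink options):

```yaml
taskmanager.numberOfTaskSlots: 2
table.exec.state.ttl: 1 h
```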

Resource configurations

  • Number of TaskManagers: By default, this value equals the degree of parallelism.

  • JobManager CPU cores: Default value: 1.

  • JobManager memory: Minimum: 1 GiB. Recommended: 4 GiB. Use GiB or MiB as the unit (for example, 1024 MiB or 1.5 GiB).

  • TaskManager CPU cores: Default value: 2.

  • TaskManager memory: Minimum: 1 GiB. Recommended: 8 GiB. Use GiB or MiB as the unit (for example, 1024 MiB or 1.5 GiB).

TaskManager slot sizing recommendations:

  • For small jobs with a single degree of parallelism, use a CPU-to-memory ratio of 1:4 per slot with at least 1 core and 2 GiB of memory.

  • For complex jobs, use at least 1 core and 4 GiB of memory per slot. With the default resource configuration, two slots per TaskManager are available.

  • Avoid making TaskManager resources too small or too large. Use the default resource configuration with 2 slots as a starting point.

Important
  • If a single TaskManager has insufficient resources, job stability may be affected. With fewer slots, the TaskManager overhead cannot be shared effectively, which reduces resource utilization.

  • If a single TaskManager has excessive resources, many jobs run on it. A single point of failure in that TaskManager has a widespread impact.
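As a worked example of the default sizing: a TaskManager with 2 CPU cores and 8 GiB of memory, divided into 2 slots, gives each slot 1 core and 4 GiB of memory, which satisfies the per-slot recommendation for complex jobs:

```yaml
# 2 CPU cores / 2 slots = 1 core per slot; 8 GiB / 2 slots = 4 GiB per slot.
taskmanager.numberOfTaskSlots: 2
```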

Log configurations

  • Root log level: Log levels in ascending order of severity: TRACE (finer-grained information than DEBUG), DEBUG (system running status), INFO (noteworthy information), WARN (potential system errors), and ERROR (system errors and exceptions).

  • Class log level: The log name and level.

  • Log template: A system template or a custom template.
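Under the hood, a class-level override on top of the root level can be expressed in Flink's log4j 2 configuration roughly as follows; the logger id and package name are illustrative:

```properties
# Root logger stays at INFO; one package is raised to DEBUG.
rootLogger.level = INFO
logger.kafka.name = org.apache.flink.connector.kafka
logger.kafka.level = DEBUG
```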
Note

For options related to the integration of Flink with resource orchestration frameworks such as Kubernetes and YARN, see Resource Orchestration Frameworks.

  4. Click Create Session Cluster.

    After creation, select the session cluster on the job debugging page or the deployment page.

Step 2: Debug the job

  1. Write the SQL code for the job. For more information, see Job development map.
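A minimal job suitable for debugging can be sketched with Flink's built-in datagen connector, which needs no external systems; the table and column names are illustrative:

```sql
-- Generated source table; rows are produced at a fixed rate.
CREATE TEMPORARY TABLE orders (
  item_id BIGINT,
  amount  DOUBLE
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '10'
);

-- The statement whose output you verify during debugging.
SELECT item_id % 10 AS bucket, COUNT(*) AS cnt
FROM orders
GROUP BY item_id % 10;
```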

  2. On the ETL page, click Debug, select a debug cluster, and then click Next.

  3. Configure the test data.

    • To use live data, click OK.

    • To use test data, click Download Data Template, fill in the template with test data, and then upload the file.

    The following table describes the options on the test data configuration page.

    • Download data template: Download the data template, which matches the data structure of the source table.

    • Upload test data: Download the data template, edit the data locally, upload the file, and then select Use Test Data. Test data files must be in CSV format, must contain a table header (for example, id(INT)), and are limited to 1 MB or 1,000 records.

    • Data preview: After you upload test data, click the expand icon (+) to the left of the source table name to preview and download the data.

    • Debug code preview: The debugging feature automatically modifies the DDL statements for the source and sink tables but does not change the actual code in the job. Preview the modified code here.
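A test data file for a two-column source table might look like the following; the column names and types are illustrative, and the header follows the column(TYPE) format described above:

```csv
id(INT),name(VARCHAR)
1,Alice
2,Bob
```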
  4. Click OK.

    The debug results appear below the SQL editor.

Related topics