
Dataphin: Create FLINK_SQL tasks

Last Updated: Jan 21, 2025

This topic explains how to create FLINK_SQL tasks using the Ververica Flink engine.

Prerequisites

Before you begin, make sure that the project has enabled the real-time engine and configured the Ververica Flink compute source. For more information, see Create a general project.

Remarks

Only super administrators, project administrators, and developers can create FLINK_SQL tasks.

Step 1: Create FLINK_SQL tasks

  1. In the top menu bar of the Dataphin home page, select Development > Data Development.

  2. In the top menu bar, select Project. If in Dev-Prod mode, also select Environment.

  3. In the left-side navigation pane, select Data Processing > Script Task. In the compute task list on the right, click the create icon and select FLINK_SQL.

  4. In the Create FLINK_SQL Task dialog box, configure the parameters.


    Task Name

    The naming conventions are as follows:

    • Only lowercase English letters, numbers, and underscores (_) are allowed.

    • The name must be 4 to 63 characters in length.

    • Duplicate names are not allowed within the same project.

    • The name must start with an English letter.

    Production Environment Cluster

    Select the cluster where the FLINK_SQL task resides.

    Production Engine Version

    Select the engine version for running tasks in the production environment.

    Note

    If your project space is in Basic mode, this configuration item is Engine Version.

    Development Environment Cluster And Engine Version

    You can select System Default Configuration or Custom Configuration.

    • System Default Configuration: The default option, which uses the same environment cluster and engine version as the production environment.

    • Custom Configuration: You can manually select the environment cluster and engine version for running tasks in the development environment.

    Note

    If your project space is in Basic mode, this configuration item is not required.

    Storage Directory

    Select the directory where the task is stored.

    If no directory is created, you can Create Folder. The steps are as follows:

    1. Above the compute task list on the left side of the page, click the Create Folder icon to open the Create Folder dialog box.

    2. In the Create Folder dialog box, enter the folder Name and select the Directory location as needed.

    3. Click Confirm.

    Creation Method

    The following methods are supported: Blank Creation, Reference Sample Code, and Use Template.

    • Blank Creation: Create a blank FLINK_SQL task.

    • Reference Sample Code: Quickly create a task by referencing built-in sample code.

    • Use Template: Quickly create a task based on a real-time computing task template.

    Description

    Provide a brief description of the FLINK_SQL task, within 1000 characters.

  5. Click OK.
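Taken together, the Task Name rules above amount to the regular expression `^[a-z][a-z0-9_]{3,62}$` (a lowercase letter first, then lowercase letters, digits, or underscores, for 4 to 63 characters in total). As an illustrative sketch only, assuming the REGEXP built-in function is available in your Flink engine version, you could sanity-check a candidate name like this:

```sql
-- Illustrative sketch: checks a candidate task name against the
-- documented naming rules. 'my_flink_task' is a hypothetical name.
SELECT REGEXP('my_flink_task', '^[a-z][a-z0-9_]{3,62}$');
```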

Step 2: Develop and precompile FLINK_SQL node code

  1. On the FLINK_SQL code page, write the code for the task.

    After writing the code, click the Format button in the menu bar to automatically adjust the SQL code format.

  2. Click Precompile to check the task code for syntax and permission issues.

    If precompilation succeeds, a Precompilation Successful message appears. If it fails, a Precompilation Failed message appears; click Console at the bottom of the page to view the precompilation failure log.
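The code written in step 1 can be any valid Flink SQL job. The following is a minimal sketch in which the table names, fields, and connectors (the built-in datagen and print connectors) are illustrative assumptions, not part of this procedure:

```sql
-- Minimal illustrative FLINK_SQL task: table names, fields, and
-- connectors are assumptions for demonstration purposes only.
CREATE TEMPORARY TABLE orders_src (
  order_id   BIGINT,
  amount     DECIMAL(10, 2),
  order_time TIMESTAMP(3)
) WITH (
  'connector' = 'datagen'   -- generates random test rows
);

CREATE TEMPORARY TABLE orders_sink (
  order_id BIGINT,
  amount   DECIMAL(10, 2)
) WITH (
  'connector' = 'print'     -- writes rows to the task log
);

INSERT INTO orders_sink
SELECT order_id, amount
FROM orders_src
WHERE amount > 0;
```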

Step 3: Configure FLINK_SQL task

  1. Click Configuration in the right sidebar of the current compute task.

  2. In the configuration panel, set up the relevant configuration information for the FLINK_SQL task in both Real-time Mode and Offline Mode.

    Note

    Dataphin real-time computing supports stream-batch integrated tasks on a unified stream-batch compute engine. You can configure both Stream and Batch task settings on a single piece of code, generating instances in different modes from the same code. To enable batch processing, activate offline mode on the task configuration page and configure the related resources, schedule dependencies, and so on.

    • Real-time mode

      • Resource configuration (required): You must configure the cluster, engine version, Job Manager CPUs, and Job Manager Memory to match the production and development environments for the task. For configuration instructions, see Configure Ververica Flink real-time mode resources.

      • Variable Configuration: Variables for this node can be defined directly in the code without prior declaration. The system automatically extracts them into the parameter list, where you can adjust their types and set their values. For configuration instructions, see Real-time Mode Variable Configuration.

      • Checkpoint Configuration: Setting up the Checkpoint for a Flink SQL task is crucial for enabling the restoration of the task to its pre-crash state in the event of an unexpected failure. For guidance on how to configure this feature, see Real-time mode Checkpoint configuration.

      • State Configuration: Set the interval for automatic data cleanup within the State. For guidance on configuration, see Real-time Mode State Configuration.

      • Run parameters: You can control the execution behavior and performance of Flink applications by configuring run parameters. For configuration instructions, see real-time mode run parameter configuration.

      • Dependency files: Set up the resource files required by the task. For configuration instructions, refer to real-time mode dependency file configuration.

      • Dependency relationships: Setting up dependency relationships facilitates a swift understanding of the data's upstream and downstream tasks during troubleshooting. For configuration instructions, see the configuration of real-time mode dependency relationships.

    • Offline mode (Beta)

      • Schedule Configuration (Required): The schedule configuration is essential for establishing the recurring schedule pattern of a node within the production environment. Through the schedule properties, you can set the task's scheduling cycle and its effective date. For instructions on how to configure, see Offline Mode Schedule Configuration.

      • Resource Configuration (Required): You must configure the cluster, engine version, degree of parallelism, number of Task Managers, Job Manager Memory, and Task Manager Memory to match the production and development environments of the task. For instructions on configuration, see Configure Ververica Flink offline mode resources.

      • Runtime parameters: You can configure runtime parameters to control the execution behavior and performance of Flink applications. For configuration instructions, see Offline mode runtime parameter configuration.

      • Dependency files: Set up the resource files required by the Flink SQL task. For configuration instructions, see Offline mode dependency file configuration.

      • Dependency Relationships (Required): Configuring dependency relationships is essential for quickly understanding the data's upstream and downstream tasks during troubleshooting. For more information, see Offline Mode Dependency Relationship Configuration.

  3. Click OK.
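Several of the panel settings above correspond to standard Flink configuration options, and variables can be referenced directly in task code. The following is a hedged sketch: the configuration keys are standard Flink options, and the `${bizdate}` variable name and table names are hypothetical; whether the panels accept these exact keys depends on your engine version and project setup.

```sql
-- Illustrative only: standard Flink option keys mapped to the panels
-- described above; exact supported keys depend on the engine version.
SET 'execution.checkpointing.interval' = '60s';  -- Checkpoint Configuration
SET 'table.exec.state.ttl' = '1 d';              -- State Configuration
SET 'parallelism.default' = '4';                 -- Run parameters

-- Variable Configuration: ${bizdate} needs no prior declaration; the
-- system extracts it into the parameter list, where you set its value.
INSERT INTO daily_report
SELECT order_id, amount
FROM orders_src
WHERE ds = '${bizdate}';
```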

Step 4: Test FLINK_SQL task code

  1. Dataphin supports testing the developed FLINK_SQL code. Click the Test button in the top menu bar to sample data for the task and run a local test to verify code correctness.

  2. In the test configuration dialog box, select Real-time Pattern - FLINK Stream Node for real-time pattern testing or Offline Pattern - FLINK Batch Node for offline pattern testing.

    • Real-time Pattern Testing: Samples the corresponding real-time physical data. After data sampling completes, a local test runs in the Flink Stream pattern. For more information, see real-time pattern testing.

    • Offline Pattern Testing: Uses data from the corresponding offline physical table. After data sampling completes, a local test runs in the Flink Batch pattern. For more information, see offline pattern testing.

Note

Currently, only single pattern testing is supported. After selecting a pattern, you can sample the corresponding pattern table data for testing.

Step 5: Submit FLINK_SQL task

  1. Click the Submit button in the top menu bar.

  2. In the Submit dialog box, review the Submission Content and Pre-check information, and fill in the Submission Remarks.

  3. Click Confirm And Submit.

    Note

    If your project follows a Dev-Prod pattern, you must publish the Flink SQL node to the production environment. For detailed instructions, see manage publish nodes.

What to do next

In the Operation Center, view and maintain FLINK_SQL nodes to ensure their normal operation. For specific operations, see view and manage real-time instances or view and manage real-time nodes.