DataWorks: ADB Spark SQL node

Last Updated: Feb 05, 2026

The ADB Spark SQL node in DataWorks lets you develop and periodically schedule AnalyticDB for MySQL Spark SQL tasks and integrate them with other jobs. This topic explains how to develop a task with an ADB Spark SQL node.

Background

AnalyticDB for MySQL Serverless Spark is a managed big data analytics and compute service built on Apache Spark. It supports submitting Spark SQL statements directly in the AnalyticDB for MySQL console, simplifying data analysis with Spark for data developers. After you bind AnalyticDB for MySQL Serverless Spark as a compute resource in DataWorks, you can use an ADB Spark SQL node to develop and run Spark SQL tasks. For more information, see Overview.

Prerequisites

AnalyticDB for MySQL prerequisites:

  • You have created an AnalyticDB for MySQL Basic Edition cluster in the same region as your workspace. For more information, see Create a cluster.

  • You have created an interactive resource group in the AnalyticDB for MySQL cluster with its engine type set to Spark to run Spark SQL tasks in DataWorks. For more information, see Create an interactive resource group.

  • If you plan to use Object Storage Service (OSS) in an ADB Spark SQL node, ensure that the OSS bucket and the cluster are in the same region.

DataWorks prerequisites:

  • You have a workspace with a bound resource group.

  • The resource group is bound to the same Virtual Private Cloud (VPC) as the AnalyticDB for MySQL cluster, and the IP address of the resource group is added to the whitelist of the AnalyticDB for MySQL cluster. For more information, see Configure a whitelist.

  • You have added the AnalyticDB for MySQL cluster to DataWorks as an AnalyticDB for Spark compute resource and tested the connectivity through the resource group. For more information, see Associate a computing resource.

  • You have created an ADB Spark SQL node. For more information, see Create a node for a scheduled workflow.

Step 1: Develop the ADB Spark SQL node

  1. Create an external database.

    Develop SQL code in the ADB Spark SQL node editor. The following example creates an external database. To create a managed table, see Use Spark SQL to create an internal table.

    CREATE DATABASE IF NOT EXISTS `adb_spark_db` location 'oss://dw-1127/db_home';
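
    If you need a managed (internal) table instead, the location clause is omitted so that Spark manages the table's storage under the database path. The following is a minimal sketch only (the table name tb_order_managed is hypothetical); see Use Spark SQL to create an internal table for the authoritative steps.

    -- Hypothetical managed table: no explicit location, so Spark manages the storage
    CREATE TABLE IF NOT EXISTS `adb_spark_db`.`tb_order_managed`(id int, name string, age int) USING parquet;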
  2. Develop the task.

    Develop the task code in the SQL editor. You can define variables in the ${variable_name} format and assign values to them in the Scheduling Parameters section of the Properties tab. This enables dynamic parameter passing for scheduled jobs. For more information about scheduling parameters, see Sources and expressions of scheduling parameters. The following is an example:

    CREATE TABLE IF NOT EXISTS adb_spark_db.tb_order_${var}(id int, name string, age int) 
    USING parquet 
    location 'oss://dw-1127/db_home/tb1' 
    tblproperties ('parquet.compress'='SNAPPY');
    
    CREATE TABLE IF NOT EXISTS adb_spark_db.tb_order_result_${var}(id int, name string, age int) 
    USING parquet 
    location 'oss://dw-1127/db_home/tb2' 
    tblproperties ('parquet.compress'='SNAPPY');
    
    INSERT INTO adb_spark_db.tb_order_result_${var} SELECT * FROM adb_spark_db.tb_order_${var};
    Note

    In this example, you can set the ${var} variable to $[yyyymmdd] for daily incremental data processing.
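
    For example, if you assign $[yyyymmdd] to ${var} and an instance runs with a scheduling time of January 1, 2025 (an assumed date for illustration), the first CREATE TABLE statement resolves to:

    -- ${var} replaced by the resolved scheduling date 20250101
    CREATE TABLE IF NOT EXISTS adb_spark_db.tb_order_20250101(id int, name string, age int) 
    USING parquet 
    location 'oss://dw-1127/db_home/tb1' 
    tblproperties ('parquet.compress'='SNAPPY');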

Step 2: Debug the ADB Spark SQL node

  1. Configure the debug properties for the ADB Spark SQL node.

    On the Properties tab in the right pane, configure parameters such as Compute Engine, ADB Compute Resource Group, Resource Group, and Compute CUs, as described below.

    Compute resource parameters:

    • Compute Engine: Select the AnalyticDB for Spark compute resource that you bound.

    • ADB Compute Resource Group: Select the interactive resource group that you created in the AnalyticDB for MySQL cluster. The engine of the interactive resource group must be Spark. For more information, see Create and manage a resource group.

    Resource group parameters:

    • Resource Group: Select the resource group that passed the connectivity test when you bound the AnalyticDB for Spark compute resource.

    • Compute CUs: The node uses the default CU value. You can leave this parameter unchanged.

  2. Debug and run the ADB Spark SQL node.

    To run the task, click Save and then Run.
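
    After the run succeeds, you can query the result table in the same editor to verify the inserted data. A minimal check (the LIMIT value is arbitrary):

    -- Verify that rows were copied into the result table
    SELECT * FROM adb_spark_db.tb_order_result_${var} LIMIT 10;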

Step 3: Schedule the ADB Spark SQL node

  1. Configure the scheduling properties for the ADB Spark SQL node.

    To run the task periodically, configure the following parameters in the Scheduling Policy section on the Properties tab. For more information about other parameters, see Node scheduling configuration.

    • Compute Engine: Select the AnalyticDB for Spark compute resource that you bound.

    • ADB Compute Resource Group: Select the interactive resource group that you created in the AnalyticDB for MySQL cluster. The engine of the interactive resource group must be Spark. For more information, see Create and manage a resource group.

    • Resource Group: Select the resource group that passed the connectivity test when you bound the AnalyticDB for Spark compute resource.

    • Compute CUs: The node uses the default CU value. You can leave this parameter unchanged.

  2. Deploy the ADB Spark SQL node.

    After configuring the node, deploy it. For more information, see Node and workflow deployment.

Next steps

After the task is deployed, you can view its status in Operation Center. For more information, see Getting started with Operation Center.