
DataWorks:Serverless Kyuubi node

Last Updated:Feb 05, 2026

In DataWorks, you can use the Serverless Kyuubi node to develop and periodically schedule Kyuubi tasks using EMR Serverless Spark computing resources and integrate them with other jobs.

Limits and prerequisites

  • Computing resources: Only EMR Serverless Spark computing resources can be attached. Ensure that network connectivity is available between the resource group and the computing resources.

  • Resource group: This type of task requires a Serverless resource group.

  • (Optional) If you are a Resource Access Management (RAM) user, ensure that you have been added to the workspace for task development and have been assigned the Developer or Workspace Administrator role. The Workspace Administrator role has extensive permissions. Grant this role with caution. For more information about adding members, see Add members to a workspace.

    If you use an Alibaba Cloud account, you can skip this step.

Create a node

For instructions, see Create a node.

Develop a node

Develop your code in the SQL editor. Define variables in your code using the ${variable_name} format and assign their values in the Scheduling configuration > Scheduling parameters section. This enables dynamic parameter substitution during scheduling. For more information about scheduling parameters, see Sources and expressions of scheduling parameters. The following code provides an example:

SHOW TABLES;
SELECT * FROM kyuubi040702 WHERE age >= '${a}'; -- The variable '${a}' can be set by using scheduling parameters.
Note

The maximum size of an SQL statement is 130 KB.
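The following minimal Python sketch illustrates the substitution behavior described above: a ${variable_name} placeholder in the SQL is resolved to its configured value before the statement runs. This is only an illustration of the concept, not the actual DataWorks implementation.

```python
import re

def substitute_params(sql: str, params: dict) -> str:
    """Replace ${name} placeholders with scheduling-parameter values."""
    return re.sub(r"\$\{(\w+)\}", lambda m: str(params[m.group(1)]), sql)

# Example: the variable 'a' is assigned the value 18 in scheduling parameters.
sql = "SELECT * FROM kyuubi040702 WHERE age >= '${a}';"
print(substitute_params(sql, {"a": 18}))
# SELECT * FROM kyuubi040702 WHERE age >= '18';
```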

Debug the node

  1. In the Run Configuration pane, configure the compute resource and resource group.

    • Compute resource: Select a bound EMR Serverless Spark computing resource. The compute resource must have a Kyuubi connection configured. For more information, see Configure a Kyuubi connection. If no compute resources are available, select Create compute resource from the drop-down list.

    • Resource group: Select a resource group that is bound to the workspace.

    • Script parameter: If you define variables in your code by using the ${parameter_name} format, you must specify a Parameter name and Parameter value here. At runtime, the system dynamically replaces the variables with their configured values. For more information, see Sources and expressions of scheduling parameters.

    • ServerlessSpark node parameter: Specify native Spark configuration properties in the following format: spark.eventLog.enabled : false. For more information, see open-source Spark properties and Custom SparkConf parameters.

    Note

    You can set global Spark parameters for all modules in a workspace and specify whether these global parameters take precedence over module-specific Spark parameters. For more information about how to configure global Spark parameters, see Configure global Spark parameters.
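For reference, the following fragment shows a few commonly used open-source Spark properties in the key : value format described above. The property names come from the Apache Spark configuration reference; the values are illustrative, not recommendations.

```
spark.eventLog.enabled : false
spark.executor.memory : 4g
spark.executor.cores : 2
spark.sql.shuffle.partitions : 200
```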

  2. In the toolbar at the top of the node editor, click Run.

    Important

    Before publishing the node, you must sync settings from Run Configuration > ServerlessSpark node parameter to Scheduling configuration > ServerlessSpark node parameter.

Next steps

  • Node scheduling configuration: To run a node on a recurring schedule, configure its Time Property and related scheduling properties in the Scheduling configuration panel on the right side of the page.

  • Publish a node: To publish a node to the production environment, click the publish icon in the toolbar. Only nodes that are published to the production environment are scheduled.

  • Task O&M: After you publish a node, you can monitor its scheduled runs in the O&M Center. For more information, see Getting started with Operation Center.