
DataWorks:Serverless Kyuubi node

Last Updated: Mar 26, 2026

The Serverless Kyuubi node lets you run Kyuubi SQL tasks on EMR Serverless Spark computing resources directly from DataWorks—without provisioning or managing infrastructure. Write SQL once, schedule it on a recurring basis, and integrate it with other nodes in your workflow.

Prerequisites

Before you begin, ensure that you have:

  • A Serverless resource group bound to your workspace. This node type requires a Serverless resource group.

  • An EMR Serverless Spark compute resource with a Kyuubi connection configured. Network connectivity must be available between the resource group and the compute resource.

  • (Resource Access Management (RAM) users only) Your Alibaba Cloud account administrator must add you to the workspace and assign you the Developer or Workspace Administrator role. The Workspace Administrator role carries extensive permissions—request only what you need. For details, see Add members to a workspace.

If you use an Alibaba Cloud account rather than a RAM user, you can skip the role assignment step.

Create a node

For instructions, see Create a node.

Develop a node

Write your SQL in the SQL editor. To pass dynamic values at runtime, define variables using the ${variable_name} format, then assign their values in Scheduling configuration > Scheduling parameters.

SHOW TABLES;
SELECT * FROM kyuubi040702 WHERE age >= '${a}'; -- Set ${a} via scheduling parameters at runtime.

For details on scheduling parameter syntax and expressions, see Sources and expressions of scheduling parameters.
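
As a hedged illustration of how such a variable resolves at runtime (the table `sales_log`, partition column `ds`, and parameter name `dt` below are examples, not names from this topic; `$[yyyymmdd-1]` is a DataWorks scheduling-parameter expression for the day before the scheduled data timestamp):

```sql
-- Assume Scheduling configuration > Scheduling parameters contains: dt = $[yyyymmdd-1]
-- At runtime, the system replaces ${dt} with the resolved date string before execution.
SELECT *
FROM sales_log          -- illustrative table
WHERE ds = '${dt}';     -- illustrative partition column
```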

The maximum size of an SQL statement is 130 KB.

Run and debug a node

  1. In the Run Configuration pane, configure the following settings:

    • Compute resource: Select a bound EMR Serverless Spark compute resource. The resource must have a Kyuubi connection configured. If no compute resources are available, select Create compute resource from the drop-down list.
    • Resource group: Select a resource group bound to the workspace.
    • Script parameter: If your code uses ${parameter_name} variables, specify a Parameter name and Parameter value here. The system replaces the variables with these values at runtime. For details, see Sources and expressions of scheduling parameters.
    • ServerlessSpark node parameter: Specify native Spark configuration properties in the format spark.eventLog.enabled : false. Supported values come from open-source Spark properties and Custom SparkConf parameters. You can also set workspace-level global Spark parameters that apply to all nodes. To configure them and control whether they override node-specific parameters, see Configure global Spark parameters.
  2. In the toolbar at the top of the node editor, click Run.
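
The ServerlessSpark node parameter field described above accepts one property per line in key : value form. For illustration, the property names below are standard open-source Spark settings; the values are examples, not recommendations:

```
spark.eventLog.enabled : false
spark.driver.memory : 2g
spark.executor.instances : 4
```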

Important

Before publishing the node, sync your ServerlessSpark node parameter settings from Run Configuration to Scheduling configuration > ServerlessSpark node parameter.

What's next

  • Set up a recurring schedule: Configure Time Property and other scheduling settings in the Scheduling configuration panel. See Node scheduling configuration.

  • Publish to production: Click the publish icon in the toolbar to publish the node. Only nodes published to the production environment are scheduled.

  • Monitor runs: After publishing, track scheduled runs in Operation Center (O&M Center). See Getting started with Operation Center.