DataWorks:Lindorm Spark SQL node

Last Updated:Mar 26, 2026

This topic describes how to use a Lindorm Spark SQL node in DataWorks to develop and periodically schedule Lindorm Spark SQL tasks.

Overview

The Lindorm compute engine is a distributed computing service built on a cloud-native architecture. It supports community-edition compute models, is compatible with Spark interfaces, and is deeply integrated with the Lindorm storage engine. It leverages the features and indexing capabilities of the underlying data storage to handle large-scale data processing, interactive analytics, machine learning, and graph computing.

A Lindorm Spark SQL node is the DataWorks development object that encapsulates your SQL logic. After you develop and debug a node, you publish it to make it available for periodic scheduling.

The typical workflow is: Create a node → Develop the SQL logic → Debug the node → Configure scheduling → Publish the node to enable periodic runs.

Prerequisites

Before you begin, make sure that you have:

  • A Lindorm instance created and bound to the DataWorks workspace. For details, see Associate a Lindorm computing resource.

  • (Optional) If you use a Resource Access Management (RAM) user, the user must be added to the target workspace and assigned the Developer or Workspace Administrator role. For details, see Add members to a workspace. Alibaba Cloud account users can skip this step.

Create a Lindorm Spark SQL node

See Create a Lindorm Spark SQL node.

Develop a Lindorm Spark SQL node

In the SQL editor, define variables using the ${variable_name} syntax. Assign values to these variables in the Run Configuration or Scheduling Configuration panel on the right side of the node editor.

The following example creates a partitioned Parquet table and inserts daily incremental data into a specific partition. The variable ${var} is a scheduling parameter that controls which partition receives data at runtime — for example, setting it to 2025-04-25 inserts data into the 2025-04-25 partition of lindorm_table_job.

CREATE TABLE IF NOT EXISTS lindorm_table_job (
  id INT,
  name STRING,
  data STRING
)
USING parquet
PARTITIONED BY (partition_date DATE);

INSERT OVERWRITE TABLE lindorm_table_job PARTITION (partition_date='${var}')
VALUES (1, 'Alice', 'Sample data 1'), (2, 'Bob', 'Sample data 2');

For more information about scheduling parameters, see Sources and expressions of scheduling parameters.
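As an illustration, suppose you assign ${var} the scheduling parameter expression $[yyyy-mm-dd-1], which resolves to the day before the scheduled run time (the exact expression depends on your scheduling setup). For a node instance scheduled on 2025-04-26, the INSERT statement above would then execute as:

```sql
-- Illustrative only: ${var} resolves to 2025-04-25 for an instance
-- scheduled on 2025-04-26 when var is assigned $[yyyy-mm-dd-1]
INSERT OVERWRITE TABLE lindorm_table_job PARTITION (partition_date='2025-04-25')
VALUES (1, 'Alice', 'Sample data 1'), (2, 'Bob', 'Sample data 2');
```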

For more Lindorm Spark SQL operations, see SQL reference.

Debug a Lindorm Spark SQL node

  1. In the Run Configuration panel on the right, configure the runtime properties.

    | Parameter | Description |
    | --- | --- |
    | Compute Resource | Select the Lindorm compute resource bound to this workspace. |
    | Lindorm Resource Group | Select the Lindorm resource group specified when you bound the Lindorm compute resource. |
    | Resource Group | Select the resource group that passed the connectivity test when you bound the Lindorm Spark compute resource. |
    | Script Parameter | Provide a value for each variable defined with the ${variable_name} syntax in the node code. For details, see Sources and expressions of scheduling parameters. |
    | Spark Parameter | Set runtime parameters for the Spark program. For more information about Spark configurations, see Configure parameters for jobs. |
  2. Click Save, then click Run to execute the node.
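The Spark Parameter field accepts standard Spark configuration properties as key-value pairs. A minimal sketch (the keys are standard Spark properties; the values are placeholders that you should size for your workload):

```
spark.executor.instances=4
spark.executor.cores=2
spark.executor.memory=4g
```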

What's next