DataWorks: Develop and schedule SQL tasks with database nodes

Last Updated: Mar 26, 2026

Database nodes let you run SQL tasks against external relational databases, schedule them on a recurring basis, and integrate them into broader DataWorks workflows.

Prerequisites

Before you begin, ensure that you have:

  • A workspace in which your RAM user is assigned the Developer or Workspace Administrator role. This prerequisite applies only if you work as a RAM user. The Workspace Administrator role carries broad permissions, so assign it with caution. For more information, see Add members to a workspace.

  • A DataWorks data source created using a JDBC connection string. Other connection types are not supported for database nodes. For more information, see Data Source Management.

  • Network connectivity confirmed between your serverless resource group and the data source. For more information, see Network connectivity solutions.

  • A data source that supports database nodes. See Supported data sources.

  • A database node already created. For more information, see Create a node for a scheduled workflow.

Step 1: Develop the database node

Select a data source

In the Select Data Source drop-down list, click the icon to open the dialog box, then select the data source to use for task development. If the data source you need is not listed, click Add Data Source to add it.

In a Standard mode workspace, the list shows only data sources that are configured for both the development and production environments. Database nodes support only data sources created using a JDBC connection string.

Write the SQL script

In the SQL editor, write the SQL statements for your task. The following example queries a table and uses a placeholder parameter:

SELECT * FROM your_table_name;  -- Query the table.
SELECT '${var}';                -- Use a placeholder parameter.

Write SQL based on the syntax supported by your configured data source.
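
As one illustration of why the dialect matters, limiting a result set to ten rows is written differently across some of the supported databases. The sketch below reuses the your_table_name placeholder from the example above; only the variant that matches your data source would be kept in the node.

```sql
-- MySQL, MariaDB, PostgreSQL: LIMIT clause.
SELECT * FROM your_table_name LIMIT 10;

-- SQL Server: TOP clause (uncomment for a SQL Server data source).
-- SELECT TOP 10 * FROM your_table_name;

-- Oracle 12c and later: FETCH FIRST clause (uncomment for an Oracle data source).
-- SELECT * FROM your_table_name FETCH FIRST 10 ROWS ONLY;
```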

Stored procedures are not supported in DataWorks script development, even if the underlying database supports them natively.

Configure the resource group and parameters

Click Run Configuration to open the debugging panel. Configure the following settings before running:

| Setting | Required | Description |
| --- | --- | --- |
| DataWorks Resource Group | Required | A serverless resource group with network connectivity to your data source. Required for both public network and VPC access. See Network connectivity solutions. |
| Script Parameter | Optional | Values assigned to placeholder parameters (such as ${var}) defined in your SQL script. |
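
To make the Script Parameter setting concrete, the following sketch filters a table by a placeholder. The table orders and its columns are hypothetical. The example assumes that var is set to 2026-03-25 in Script Parameter and that the value is substituted into the statement as plain text before the statement is sent to the database.

```sql
-- Hypothetical table and columns; replace them with your own.
-- Assumption: with var = 2026-03-25, the filter becomes order_date = '2026-03-25'.
SELECT order_id, amount
FROM orders
WHERE order_date = '${var}';
```

Because the placeholder sits inside single quotes, as in the earlier example, the substituted value would be read by the database as a string literal.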

Save and run

Click the Save icon to save the node, then click the Run icon to run the SQL script and verify that it works as expected.
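
If you want to confirm that the resource group can reach the data source before running a real workload, a constant query is a low-risk first run. This is a suggestion rather than a documented DataWorks step. The commented variants reflect the dummy-table syntax that some of the supported databases require.

```sql
-- MySQL, MariaDB, PostgreSQL, SQL Server: no FROM clause is needed.
SELECT 1;

-- Oracle (uncomment for an Oracle data source).
-- SELECT 1 FROM DUAL;

-- SAP HANA (uncomment for an SAP HANA data source).
-- SELECT 1 FROM DUMMY;

-- DB2 (uncomment for a DB2 data source).
-- SELECT 1 FROM SYSIBM.SYSDUMMY1;
```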

Configure scheduling

After the SQL script runs successfully, click Scheduling Configuration on the right side of the SQL editor to set the node's schedule. For more information, see Node scheduling configuration.
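
Because a scheduled node may be rerun for the same date, it can help to write the script so that repeated runs produce the same result. The following is a minimal sketch of that pattern, not a DataWorks requirement. The tables orders and daily_summary and their columns are hypothetical, and var is assumed to receive the target date from the parameters configured for the node.

```sql
-- Remove any rows left by an earlier run for the same date.
DELETE FROM daily_summary
WHERE stat_date = '${var}';

-- Recompute the summary for that date from the hypothetical orders table.
INSERT INTO daily_summary (stat_date, order_count, total_amount)
SELECT order_date, COUNT(*), SUM(amount)
FROM orders
WHERE order_date = '${var}'
GROUP BY order_date;
```

The two statements are not wrapped in a transaction here. Whether they run atomically depends on the data source and its autocommit behavior.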

Step 2: Publish and maintain the node

  1. Submit and publish the database node to the production environment. For more information, see Node and workflow deployment.

  2. After publishing, the task runs automatically on the configured schedule. To monitor and manage it, go to Operation Center > Task O&M > Scheduled Task O&M > Scheduled Task. For more information, see Getting started with Operation Center.

Supported data sources

Database nodes require data sources created using a JDBC connection string. The following databases are supported:

| Database | Description |
| --- | --- |
| MySQL | A widely used relational database management system (RDBMS) known for its small footprint, high speed, and low cost. See MySQL. |
| SQL Server | A Microsoft RDBMS providing reliable, efficient, and secure data management and analysis. See SQL Server. |
| Oracle | An RDBMS providing reliable, efficient, and secure data management and analysis. See Oracle. |
| PostgreSQL | An open-source RDBMS with a robust data model, high extensibility, and a rich set of features. See PostgreSQL. |
| DRDS | A distributed database service that scales a MySQL-compatible relational database horizontally to handle large data volumes and traffic. See Product overview. |
| PolarDB for MySQL | A cloud-native database built on a compute-storage separation architecture and 100% compatible with MySQL. See What is PolarDB for MySQL Enterprise Edition?. |
| PolarDB for PostgreSQL | A cloud-native relational database 100% compatible with PostgreSQL and highly compatible with Oracle syntax. Supports the Ganos multi-model spatio-temporal engine and the open-source PostGIS geospatial engine. See What is PolarDB for PostgreSQL Enterprise Edition?. |
| Apache Doris | A high-performance, real-time analytical database suited for reporting, ad hoc queries, and federated queries on data lakes. See Introduction to Doris. |
| MariaDB | An open-source RDBMS that is highly compatible with MySQL and can serve as a drop-in replacement. See MariaDB. |
| SelectDB | A multi-cloud native real-time data warehouse built on Apache Doris, designed for enterprise-grade big data analysis. See SelectDB. |
| Amazon Redshift | A fully managed, petabyte-scale cloud data warehouse service. Supports serverless access without provisioning. See Amazon Redshift. |
| SAP HANA | A high-performance in-memory database and application platform with enterprise-grade in-memory computing capabilities. See SAP HANA. |
| Vertica | A high-performance columnar storage database management system (DBMS) optimized for big data analytics and real-time queries. See Vertica. |
| DM (Dameng) | An OLTP database for business systems, combining distributed architecture, elastic computing, and cloud computing for flexible, reliable, and secure operation. See DM (Dameng). |
| KingbaseES | A large-scale RDBMS supporting the SQL standard, designed for enterprise workloads requiring high concurrency and high availability (HA). See KingbaseES. |
| OceanBase | A distributed relational database developed by Ant Group and Alibaba Cloud, providing strong consistency, high availability (HA), online scalability, and broad SQL compatibility. See What is OceanBase Database?. |
| DB2 | An IBM RDBMS for complex queries and transaction processing involving high throughput, large datasets, and data warehousing. See DB2. |
| GBase 8a | An RDBMS for large-scale data storage and high-concurrency read/write operations, widely used in government, finance, telecommunications, and energy sectors. Supports data partitioning, load balancing, disaster recovery, and backup. See GBase 8a. |

What's next