DataWorks: Database nodes

Last updated: Dec 24, 2025

DataWorks lets you create multiple types of database nodes for SQL task development, periodic scheduling, and integration with other jobs.

Prerequisites

  • (Optional) Add a Resource Access Management (RAM) user to the workspace.

    A RAM user for task development must be added to the workspace and granted either the Developer or Workspace Administrator role. The Workspace Administrator role has extensive permissions. Grant this role with caution. For more information about adding members and granting permissions, see Add members to a workspace.

  • Create a DataWorks data source.

    • Ensure that the data source can connect to the serverless resource group. For more information, see Network connectivity solutions.

    • Ensure that the data source is created using a Java Database Connectivity (JDBC) connection string. For more information, see Data source management.

    • Ensure that the data source is suitable for creating database nodes. For more information, see Supported data sources.

  • Before you develop a database node, you must create a node for the corresponding data source. For more information, see Create nodes for a scheduling workflow.
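
The JDBC requirement above can be illustrated with a minimal sketch. It assumes the common MySQL URL form `jdbc:mysql://host:port/database`. The helper names `build_mysql_jdbc_url` and `looks_like_jdbc_url`, the host, and the database name are hypothetical and are not part of DataWorks. Other databases use different URL forms, so check the driver documentation for your data source.

```python
def build_mysql_jdbc_url(host: str, port: int, database: str) -> str:
    """Build a MySQL-style JDBC URL (hypothetical helper, not a DataWorks API)."""
    if not host or not database:
        raise ValueError("host and database must be non-empty")
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return f"jdbc:mysql://{host}:{port}/{database}"


def looks_like_jdbc_url(url: str) -> bool:
    """Rough shape check: 'jdbc:<subprotocol>:<rest>' with non-empty parts."""
    parts = url.split(":", 2)
    return len(parts) == 3 and parts[0] == "jdbc" and all(parts[1:])


if __name__ == "__main__":
    # Hypothetical host and database, used only for illustration.
    url = build_mysql_jdbc_url("db.example.internal", 3306, "sales")
    print(url)
    print(looks_like_jdbc_url(url))
```

The shape check only confirms the `jdbc:` prefix and a subprotocol. It does not confirm that the database is reachable from the resource group.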

Step 1: Develop a database node

  1. After you create a database node, you can configure it for development.

    1. Select a data source.

      Next to the Select Data Source drop-down list, click the icon to open the data source selection dialog box. In the dialog box, select the data source for your task. If the required data source is not available, click Add Data Source to add it.

      Note
      • In a standard mode workspace, only data sources configured for both the development and production environments are displayed.

      • Database nodes support development only with data sources created using a connection string.

    2. Develop an SQL script.

      In the SQL editor, write the SQL statements for your task. The following example shows a simple query and a placeholder parameter.

      SELECT * FROM your_table_name;  -- Query a table.
      SELECT '${var}'; -- Configure a placeholder parameter.
      Note

      You can write statements using the syntax that is supported by the configured data source.

    3. Configure a resource group for debugging.

      Click Debugging Configurations. From the Computing Resource > DataWorks Resource Group drop-down list, select a serverless resource group that can connect to your data source.

      Note

      To access a data source over the public network or in a VPC environment, you must use a scheduling resource group that has passed the connectivity test with the data source. For more information, see Network connectivity solutions.

    4. Configure debugging parameters.

      Click Debugging Configurations. In the Script Parameters section, assign values to the parameters that are configured in the database node script.

    5. After you configure the debugging settings, save the SQL node. Then, run the SQL script and verify that it runs as expected.

  2. After you debug the SQL script, click Scheduling Configurations on the right side of the SQL editor to configure scheduling properties for the database node. For more information, see Configure scheduling properties for a node.
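
The placeholder and debugging-parameter steps above can be sketched locally. This is an illustration under stated assumptions, not a description of how DataWorks resolves parameters internally: it uses Python's `string.Template` to replace `${var}` with a debugging value and runs the result against an in-memory SQLite database. The function name `render_sql` and the sample value are hypothetical.

```python
import sqlite3
from string import Template


def render_sql(script: str, params: dict) -> str:
    """Replace ${name} placeholders with values (illustrative only).

    Template.substitute raises KeyError when a placeholder has no value,
    so a missing parameter is caught before the statement runs.
    """
    return Template(script).substitute(params)


if __name__ == "__main__":
    script = "SELECT '${var}';"
    sql = render_sql(script, {"var": "20251224"})  # hypothetical debug value
    print(sql)

    # Run the rendered statement against a throwaway in-memory database.
    conn = sqlite3.connect(":memory:")
    try:
        print(conn.execute(sql).fetchone())
    finally:
        conn.close()
```

Plain text substitution like this is suitable only for trusted values such as your own debugging parameters.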

Step 2: Publish the database node and perform O&M

  1. After you configure the scheduling properties, submit and publish the database node to the production environment. For more information, see Publish a node or workflow.

  2. After the task is published, it runs periodically based on its configured scheduling properties. You can go to Operation Center > Task O&M > Auto Triggered Task O&M > Auto Triggered Tasks to view the published auto triggered task and perform O&M operations. For more information, see Get started with Operation Center.

Supported data sources

DataWorks supports database nodes for the following types of data sources.

Note
  • Data sources for database nodes must be created using a JDBC connection string.

  • Although some databases support stored procedures, you cannot use them in DataWorks Data Development.

Creating database node data sources

Data source type

Data Source Overview

MySQL

MySQL is a popular relational database management system (RDBMS) for storing and processing data. It is small, fast, and has a low total cost of ownership. For more information, see MySQL.

SQL Server

SQL Server is an RDBMS for storing and processing data. It provides reliable, efficient, and secure data management and analysis services. For more information, see SQL Server.

Oracle

Oracle is an RDBMS for storing and processing data. It provides reliable, efficient, and secure data management and analysis services. For more information, see Oracle.

PostgreSQL

PostgreSQL is a powerful and flexible open source RDBMS. It has a powerful data model, high scalability and stability, and rich core features. For more information, see PostgreSQL.

DRDS

DRDS is a distributed database service. It lets you horizontally scale a relational database to a distributed system that supports mass data storage and access while maintaining the original features of a relational database such as MySQL. For more information, see Product overview.

PolarDB MySQL

PolarDB for MySQL is a new-generation cloud-native database developed by Alibaba Cloud. It uses a storage-compute decoupled architecture and combines the benefits of software and hardware to provide a database service with high elasticity, high performance, mass storage, and high security and reliability. It is 100% compatible with MySQL. For more information, see What is PolarDB for MySQL Enterprise Edition.

PolarDB PostgreSQL

PolarDB for PostgreSQL is a cloud-native relational database developed by Alibaba Cloud. It is 100% compatible with PostgreSQL and highly compatible with Oracle syntax. It provides a database service with rapid elasticity, high performance, mass storage, and high security and reliability. It also supports Ganos, a multi-dimensional and multi-model spatiotemporal information engine developed by Alibaba Cloud, and PostGIS, an open source geographic information engine. For more information, see What is PolarDB for PostgreSQL Enterprise Edition.

Doris

Apache Doris is a high-performance, real-time analytic database that meets the requirements of scenarios such as report analysis, ad hoc queries, and data lake federated query acceleration. For more information, see Introduction to Apache Doris.

MariaDB

MariaDB is an open source RDBMS that is highly compatible with MySQL. It can seamlessly replace MySQL. After you uninstall MySQL, you can install and use MariaDB in the original location of MySQL without changing the application code. For more information, see MariaDB.

SelectDB

SelectDB is a new-generation multi-cloud native real-time data warehouse built on Apache Doris. It focuses on meeting the real-time big data analysis needs of enterprises and provides cost-effective and easy-to-use data analysis services. For more information, see SelectDB.

Redshift

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. With Amazon Redshift Serverless, you can access and analyze data without configuring a provisioned data warehouse. For more information, see Amazon Redshift.

SAP HANA

SAP HANA is a high-performance in-memory database and application platform. It combines database, data processing, and application platform features to provide enterprise-level in-memory computing capabilities. For more information, see SAP HANA.

Vertica

Vertica is a high-performance, column-oriented database management system (DBMS) that can process and query large-scale datasets at high speeds. It is mainly used for big data analysis and real-time queries. For more information, see the Vertica official website.

DM

Dameng (DM) is an online transaction processing (OLTP) database that integrates into business systems. It combines the advantages of distributed computing, elastic computing, and cloud computing, and is flexible, easy to use, reliable, and highly secure. For more information, see the Dameng (DM) official website.

KingbaseES

KingbaseES is a large RDBMS that supports SQL standards. It is suitable for enterprise-level application scenarios that require processing large amounts of data, high concurrency, and high availability. For more information, see the KingbaseES official website.

OceanBase

OceanBase is a distributed relational database developed by Ant Group and Alibaba Cloud. It features strong data consistency, high availability, high performance, online scalability, and low costs. It is also highly compatible with SQL standards and mainstream relational databases. For more information, see What is OceanBase.

DB2

DB2 is an RDBMS used to store, retrieve, and manage data. It is suitable for complex queries, high-throughput transaction processing, large datasets, and data warehouses. For more information, see the DB2 official website.

GBase 8a

GBase 8a is an RDBMS that supports large data volume storage and high-concurrency read and write capabilities. It is commonly used in sectors such as government, finance, telecommunications, and energy. GBase 8a supports SQL standards and provides a series of enterprise-level features, such as data partitioning, load balancing, and disaster recovery and backup. For more information, see the GBase 8a official website.