Build KingbaseES Data Pipelines in DataWorks via JDBC

Use the KingbaseES node in DataWorks to run SQL tasks against a KingbaseES database on a recurring schedule.

KingbaseES is a large-scale relational database management system (RDBMS) that supports the SQL standard. It is suited to enterprise applications that handle large data volumes and require high concurrency and high availability. For more information, see the official KingbaseES website.

Supported regions

China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, Malaysia (Kuala Lumpur), Germany (Frankfurt), US (Silicon Valley), and US (Virginia).

Prerequisites

Before you begin, ensure that you have:

A Business Flow in DataStudio. DataStudio organizes development by Business Flows. For more information, see Create a workflow.
A KingbaseES data source configured in DataWorks using a Java Database Connectivity (JDBC) connection string. For more information, see Data Source Management and KingbaseES data sources.
Network connectivity established between the data source and the resource group. For more information, see Network connection solutions.
(Required only if you use a RAM user) The RAM user is added to the workspace and assigned the Develop or Workspace Administrator role. The Workspace Administrator role has extensive permissions, so grant it with caution. For more information, see Add members to a workspace.

Step 1: Create a KingbaseES node

Go to the DataStudio page. Log in to the DataWorks console. In the top navigation bar, select the region. In the left-side navigation pane, choose Data Development and O&M > Data Development. Select the workspace from the drop-down list and click Go to Data Development.
Right-click the target Business Flow and choose New Node > Database > KingbaseES.
In the Create Node dialog box, enter a Name and click Confirm. The node is created and the configuration page opens.

Step 2: Develop the KingbaseES task

(Optional) Select a data source

If the workspace has multiple KingbaseES data sources, select the one to use on the node configuration page. If only one exists, it is used by default.

KingbaseES nodes support only KingbaseES data sources created using a JDBC connection string.

Write SQL code

In the code editor, write the SQL to run.

Basic example

SELECT * FROM usertablename;

Example with scheduling parameters

DataWorks scheduling parameters let you pass dynamic values into your SQL at runtime. Define variables in the ${Variable name} format in your code, then assign values in the Parameters section on the Schedule tab.

SELECT '${var}'; -- var is assigned in the Parameters section of the Schedule tab

For supported formats and configuration details, see Supported formats of scheduling parameters and Configure and use scheduling parameters.
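
A common use of a scheduling parameter is to filter rows by the date that the task instance processes. The following is a minimal sketch, not a definitive pattern. The table orders, its columns, and the variable name bizdate are hypothetical. It assumes bizdate is assigned in the Parameters section (for example, bizdate=$bizdate, which resolves to a yyyymmdd string).

```sql
-- Hypothetical table and column names; adjust to your schema.
-- ${bizdate} is replaced with the assigned value before the SQL runs,
-- so the database receives a literal such as '20240101'.
SELECT order_id, customer_id, amount
FROM orders
WHERE order_date = TO_DATE('${bizdate}', 'YYYYMMDD');
```

Because the placeholder is replaced as plain text, keep the quotation marks around it when the value is used as a string literal.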

Step 3: Configure task scheduling

Click Scheduling Configuration on the right panel and set the scheduling properties.

Configure the Rerun Property and Upstream Dependent Node before submitting.

For a full reference of scheduling options, see Overview.
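
If the Rerun Property allows an instance to run again, it can help to write the SQL so that a second run for the same date produces the same result. The following is a minimal sketch under stated assumptions: the tables orders and daily_order_summary and the variable bizdate are hypothetical, and the sketch assumes that the node runs multiple statements in sequence, which this topic does not confirm.

```sql
-- Assumption: bizdate is assigned in the scheduling parameters
-- and resolves to a yyyymmdd string.

-- Remove any rows written by a previous run for the same date.
DELETE FROM daily_order_summary
WHERE summary_date = TO_DATE('${bizdate}', 'YYYYMMDD');

-- Recompute and insert the summary for that date.
INSERT INTO daily_order_summary (summary_date, order_count, total_amount)
SELECT order_date, COUNT(*), SUM(amount)
FROM orders
WHERE order_date = TO_DATE('${bizdate}', 'YYYYMMDD')
GROUP BY order_date;
```

Deleting before inserting keeps a rerun from duplicating rows for the same date.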

Step 4: Debug the task

(Optional) Select a debugging resource group and assign parameter values. Click the icon in the toolbar. In the Parameters dialog box, select a resource group and assign values to any scheduling parameters. For parameter assignment logic, see Task debugging process.
Save and run the task. Click the save icon in the toolbar, then click the run icon.
(Optional) Run a smoke test. Run a smoke test during or after submission to verify execution in the development environment. For more information, see Perform smoke testing.

Step 5: Submit and publish the task

Click the save icon in the toolbar to save the node.
Click the submit icon in the toolbar to submit the node. In the Submit dialog box, enter a Change Description and select code review options.

- Configure the Rerun Property and Upstream Dependent Node before submitting.
- If code review is enabled, a reviewer must approve the code before publication. For more information, see Code review.
In standard mode workspaces, click Publish in the upper-right corner to deploy to production. For more information, see Publish tasks.

What's next

You have created a KingbaseES node, written and debugged your SQL, configured a recurring schedule, and published the task to production. The task now runs automatically based on the schedule you defined.

To monitor task execution, click O&M in the upper-right corner of the node configuration tab to open Operation Center. In Operation Center, view the scheduling and running status of recurring tasks. For more information, see Manage recurring tasks.