PolarDB for MySQL nodes let you run SQL tasks against a PolarDB for MySQL database directly from DataWorks, schedule them to run periodically, and chain them with other jobs in your pipeline.
Background information
PolarDB for MySQL is a cloud-native database developed by Alibaba Cloud. It decouples storage from compute and combines software and hardware optimizations to provide an elastic, high-performance, and secure database service with large storage capacity. The PolarDB family is 100% compatible with the MySQL and PostgreSQL ecosystems and highly compatible with Oracle syntax; PolarDB for MySQL is the MySQL-compatible edition. For more information, see PolarDB for MySQL.
Prerequisites
Before you begin, ensure that you have:
- A Business Flow. DataStudio organizes all development work inside Business Flows. If you haven't created one yet, see Create a workflow.
- A PolarDB for MySQL data source configured with a Java Database Connectivity (JDBC) connection string. This is the only supported connection type. To add a data source, see Data Source Management. For details on how DataWorks uses PolarDB data sources, see PolarDB data source.
- Network connectivity between the data source and the resource group that you want to use. See Network connection solutions.
- (RAM users only) Workspace membership with the Develop or Workspace Administrator role. Grant the Workspace Administrator role with caution because it carries elevated privileges. See Add members to a workspace.
Supported regions
PolarDB for MySQL nodes are available in the following regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, Malaysia (Kuala Lumpur), Germany (Frankfurt), US (Silicon Valley), and US (Virginia).
Step 1: Create a PolarDB for MySQL node
- Log on to the DataWorks console. In the top navigation bar, select a region. In the left-side navigation pane, choose Data Development and O&M > Data Development. Select your workspace from the drop-down list and click Go to Data Development.
- In DataStudio, right-click the target Business Flow and choose New Node > Database > PolarDB for MySQL.
- In the Create Node dialog box, enter a Name and click OK.
Step 2: Develop the task
(Optional) Select a data source
If your workspace has multiple PolarDB for MySQL data sources, select the one you want from the drop-down list on the node editing page. If only one data source exists, it is selected by default.
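To confirm that the node is connected to the intended database, a quick sanity query can help. The following is a minimal sketch that uses standard MySQL information functions; it assumes the account behind the data source is allowed to run simple SELECT statements.

```sql
-- Returns the default database, the authenticated account, and the server version.
SELECT DATABASE()     AS current_db,
       CURRENT_USER() AS connected_as,
       VERSION()      AS server_version;
```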
Write SQL
In the code editor, write the SQL to execute. For example:
SELECT * FROM <your_table_name>;
Replace <your_table_name> with the actual table name.
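While developing, it is often safer to select only the columns you need and to cap the number of rows returned. The following is a sketch only: the table orders and its columns are hypothetical and stand in for your own schema.

```sql
-- Hypothetical table and columns; replace them with your own schema.
SELECT order_id, customer_id, amount, created_at
FROM orders
WHERE created_at >= '2024-01-01'
ORDER BY created_at DESC
LIMIT 100;  -- cap the result set while developing
```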
Use scheduling parameters
To pass dynamic values into your SQL at run time, define variables using the ${variable_name} format:
SELECT '${var}'; -- You can use this with scheduling parameters.
Assign values to the variable in the Scheduling Parameters section under Schedule in the right panel. For supported formats, see Supported formats of scheduling parameters. For step-by-step configuration, see Configure and use scheduling parameters.
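As an illustration, a scheduling parameter can drive a date filter. The sketch below assumes a variable named bizdate that is assigned a yyyymmdd value (for example 20240101) in the Scheduling Parameters section; the variable name, the table orders, and the column order_date are assumptions for this example. The variable is quoted in the same way as in the sample above, so the value reaches the database as a string literal.

```sql
-- Assumes ${bizdate} is assigned a yyyymmdd string in Scheduling Parameters.
-- The table orders and the column order_date are hypothetical.
SELECT COUNT(*) AS order_count
FROM orders
WHERE order_date = STR_TO_DATE('${bizdate}', '%Y%m%d');
```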
Step 3: Configure scheduling
Click Scheduling Configuration in the right panel and set the schedule properties. For a full reference, see Scheduling overview.
Configure the Rerun Property and Upstream Dependent Node before submitting the node.
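If the Rerun Property allows the node to be rerun, it helps to write SQL that gives the same result no matter how many times it runs for a given date. One common MySQL pattern, sketched here under assumptions, is an upsert keyed on the date. The tables, columns, and the bizdate variable below are hypothetical, and the pattern relies on daily_order_summary having a primary or unique key on stat_date.

```sql
-- Hypothetical tables:
--   orders(order_date, amount)
--   daily_order_summary(stat_date PRIMARY KEY, total_amount)
-- Assumes ${bizdate} is assigned a yyyymmdd string in Scheduling Parameters.
INSERT INTO daily_order_summary (stat_date, total_amount)
SELECT order_date, SUM(amount)
FROM orders
WHERE order_date = STR_TO_DATE('${bizdate}', '%Y%m%d')
GROUP BY order_date
-- A rerun overwrites the existing row instead of inserting a duplicate.
-- VALUES() is deprecated in MySQL 8.0.20 and later but still accepted.
ON DUPLICATE KEY UPDATE total_amount = VALUES(total_amount);
```

With this shape, rerunning the node for the same date replaces that date's total rather than adding to it.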
Step 4: Debug the task
- (Optional) Choose a resource group and assign parameter values for debugging. In the toolbar, click the icon that opens the Parameters dialog box. In the dialog box, select a resource group and assign values to any scheduling parameters. For how parameter assignment works during debugging, see Task debugging process.
- Save and run. Click the save icon to save the node, then click the run icon to run it.
- (Optional) Run a smoke test to verify that the task executes correctly in the development environment before you submit it. See Perform smoke testing.
Step 5: Submit and publish the task
- Click the save icon to save the node.
- Click the submit icon to submit the node. In the Submit dialog box, enter a Change Description and configure code review options. If code review is enabled, a reviewer must approve the code before it can be published. See Code review.
- In standard mode workspaces, click Publish in the upper-right corner to deploy the node to production. See Publish tasks.
What's next
After the node is published, it runs on a recurring schedule based on its configuration. Click O&M in the upper-right corner to open Operation Center, where you can monitor the scheduling and execution status of recurring tasks. See Manage recurring tasks.