Periodically run Oracle tasks in DataWorks - DataWorks - Alibaba Cloud Documentation Center

Background information

Oracle is a relational database management system (RDBMS) used for storing and processing data. It provides reliable, efficient, and secure data management and analysis services. For more information, see Oracle.

Prerequisites

The business process has been created.

Data Studio performs engine-specific development operations based on business flows. Before creating a node, create a business flow first. For more information, see Create a business flow.
An Oracle data source has been created.

Before you can access your Oracle database, you must add it to DataWorks as an Oracle data source. For more information, see Oracle data source.

Note
Oracle nodes support only Oracle data sources that are created with a JDBC connection string.
(Optional; required for Resource Access Management (RAM) users) The RAM user used for task development has been added to the target workspace and assigned either the Development or Workspace Administrator role (which grants broad permissions—assign with caution). For more information about adding members and granting permissions, see Add members to a workspace.

Limitations

Supported regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, Malaysia (Kuala Lumpur), Germany (Frankfurt), US (Silicon Valley), and US (Virginia).

Step 1: Create Oracle node

Log on to the DataWorks console. In the target region, click Data Development and O&M > Data Development in the left-side navigation pane. Select a workspace from the drop-down list and click Go to Data Development.
Right-click the target workflow and choose Create Node > Database > Oracle.
In the Create Node dialog box, enter a Name for the node and click OK. After the node is created, you can develop and configure your task.

Step 2: Develop Oracle task

(Optional) Select Oracle data source

If your workspace has multiple Oracle data sources, you must select the appropriate one. If only one Oracle data source exists, it is used by default.

Note

Oracle nodes support only Oracle data sources that are created with a JDBC connection string.

Develop SQL code: Simple example

In the code editor, write the task. The following sample code queries basic information about all tables that the current user has permission to access:

SELECT * FROM ALL_TABLES;

Develop SQL code: Use scheduling parameters

DataWorks provides Scheduling Parameter that allow you to dynamically pass values to your code for scheduled tasks. You can define variables in your task code by using the ${variable_name} format. Then, go to Scheduling Settings > Scheduling Parameter in the right-side pane of the node editor to assign a value to the variable. For more information about supported formats and how to configure scheduling parameters, see Supported formats for scheduling parameters and Configure and use scheduling parameters.

Example: Query the owners, names, and creation dates of tables created from the previous month to the present. The scheduling parameter a is set to $[add_months(yyyymm,-1)], which represents the previous month.

The following code provides an example.

SELECT owner, object_name, created
FROM all_objects
WHERE object_type = 'TABLE' AND created >= TO_TIMESTAMP('${a}', 'YYYY-MM-DD');

Step 3: Configure task scheduling

To periodically run the node task, click Scheduling on the right side of the node editing page and configure scheduling settings based on your needs. For more information, see Overview of task scheduling properties.

Note

You must configure the node’s Rerun attribute and Parent Nodes before you can submit the node.

Step 4: Test task code

Perform the following test operations as needed to verify that the task behaves as expected.

(Optional) Select a resource group and assign custom parameter values.
- Click the icon in the toolbar. In the Parameter dialog box, select the schedule resource group for testing.
- If your task code uses scheduling parameter variables, assign values to them here for testing. For more information about parameter assignment logic, see Task debugging process.
Save and run the task code.

Click the icon in the toolbar to save your task code. Then click the icon to run the task.
(Optional) Perform smoke testing.

To run smoke testing in the development environment and verify that the scheduled node task executes as expected, perform smoke testing either during or after node submission. For more information, see Perform smoke testing.

Step 5: Submit and publish the task

After configuring the node task, submit and publish it. Once published, the node runs periodically based on its scheduling configuration.

Click the icon in the toolbar to save the node.
Click the icon in the toolbar to submit the node task.

In the Submission dialog box, enter a Change Description. Optionally, choose whether to require code review after submission.
Note
- You must configure the node’s Rerun attribute and Parent Nodes before you can submit the node.
- Code review helps ensure code quality and prevents errors caused by unreviewed code being published directly to production. If code review is enabled, the submitted node code must be approved by reviewers before it can be published. For more information, see Code review.

If you are using a workspace in standard mode, after successfully submitting the task, click Publish in the upper-right corner of the node editing page to deploy the task to the production environment. For more information, see Publish a task.

Next steps

After a task is published, it runs periodically based on its configuration. To monitor its status, click O&M in the upper-right corner of the node editor page to go to Operation Center and view its scheduling and running status. For more information, see Manage scheduled tasks.