
DataWorks:Create a CDH Presto node

Last Updated: Mar 25, 2026

A Presto node for Cloudera's Distribution Including Apache Hadoop (CDH) lets you run distributed SQL queries against real-time data in your CDH environment directly from DataWorks DataStudio. Use this workflow to create the node, write Presto SQL, configure scheduling, and debug and deploy your task.

Prerequisites

Before you begin, ensure that a CDH cluster is registered to your DataWorks workspace. A CDH Presto node runs only against a registered CDH compute engine instance.

Limitations

CDH Presto tasks run on serverless resource groups or old-version exclusive resource groups. We recommend that you run tasks on serverless resource groups.

Step 1: Create a CDH Presto node

  1. Go to the DataStudio page. Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and O&M > Data Development. Select your workspace from the drop-down list and click Go to Data Development.

  2. On the DataStudio page, find the desired workflow, right-click the workflow name, and choose Create Node > CDH > CDH Presto.

    Alternatively, move the pointer over the Create icon at the top of the Scheduled Workflow pane and create a CDH node as prompted.
  3. In the Create Node dialog box, configure the Name parameter and click Confirm.

Step 2: Develop a Presto task

Double-click the node name to open its configuration tab, then perform the following operations.

Select a CDH compute engine instance (optional)

If multiple CDH clusters are registered to your workspace, select one from the Engine Instance CDH drop-down list. If only one CDH cluster is registered, skip this step.


Write SQL code

In the SQL editor, enter your Presto SQL statements. Example:

show tables; -- List the tables in the current schema.

select * from userinfo; -- Query a sample table. Replace userinfo with the name of your table.
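The example above contains two statements separated by semicolons; conceptually, they are executed one after another. The sketch below illustrates that splitting logic in Python (an illustration only, not the actual DataWorks parser; it assumes no semicolons appear inside string literals):

```python
def split_statements(script: str) -> list[str]:
    # Naive split on ';' — assumes no semicolons inside quoted strings.
    return [s.strip() for s in script.split(";") if s.strip()]

script = "show tables;\nselect * from userinfo;"
for stmt in split_statements(script):
    print(stmt)
# Prints:
# show tables
# select * from userinfo
```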

Use scheduling parameters

DataWorks scheduling parameters let you substitute dynamic values into task code at run time. Define variables in your SQL using the ${Variable} format, then assign values in the Scheduling Parameter section of the Properties tab.

select '${var}'; -- Replace var with a scheduling parameter value.

For supported formats, see Supported formats of scheduling parameters.
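Conceptually, parameter substitution replaces each ${Variable} placeholder with the value assigned in the Properties tab before the SQL is executed. The following Python sketch illustrates the idea (a simplified illustration, not DataWorks' actual substitution engine; the convention that a business-date parameter resolves to the day before the scheduled run, formatted as yyyymmdd, is an assumption here):

```python
import re
from datetime import date, timedelta

def render(sql: str, params: dict) -> str:
    # Replace each ${name} placeholder with its assigned value.
    return re.sub(r"\$\{(\w+)\}", lambda m: str(params[m.group(1)]), sql)

# Assumed convention: the business date is the scheduled date minus one day.
run_date = date(2026, 3, 25)
params = {"var": (run_date - timedelta(days=1)).strftime("%Y%m%d")}
print(render("select '${var}';", params))  # select '20260324';
```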

Step 3: Configure task scheduling properties

Click Properties in the right-side navigation pane to configure how and when the task runs.

| Configuration area | What to set | Reference |
| --- | --- | --- |
| Basic properties | Basic task settings | Configure basic properties |
| Scheduling cycle and rerun | Run frequency, rerun policy, and parent node dependencies | Configure time properties |
| Scheduling dependencies | Same-cycle dependencies between nodes | Configure same-cycle scheduling dependencies |
| Resource properties | Resource group assignment for scheduling | Configure the resource property |
Important

Configure Rerun and Parent Nodes on the Properties tab before you commit the task.

If the node needs to access the Internet or a virtual private cloud (VPC), select a resource group for scheduling that is connected to the target network. See Network connectivity solutions.

Step 4: Debug task code

  1. (Optional) Select a resource group and assign values to custom parameters.

  2. Save and run the SQL statements. Click the Save icon to save the task, then click the Run icon to run it.

  3. (Optional) Perform smoke testing. You can perform smoke testing on the task in the development environment when you commit the task or after you commit the task. See Perform smoke testing.

What's next

Commit and deploy the task:

  1. Click the Save icon to save the task.

  2. Click the Submit icon to commit the task.

  3. In the Submit dialog box, fill in the Change description field and click Confirm.

  4. If your workspace is in standard mode, deploy the task to the production environment: click Deploy in the top navigation bar of DataStudio. See Deploy tasks.

View and monitor the task:

  1. Click Operation Center in the upper-right corner of the node configuration tab to go to Operation Center in the production environment.

  2. View your scheduled task. See View and manage auto triggered tasks.

To view more information about the task, click Operation Center in the top navigation bar of the DataStudio page. For more information, see Overview.