DataWorks: CDH Presto node

Last Updated: Mar 26, 2026

Presto is a distributed SQL query engine for real-time data analytics. A CDH Presto node lets you write and run Presto SQL against your CDH cluster from DataWorks.

Prerequisites

Before you begin, ensure that you have:

  • An Alibaba Cloud CDH cluster registered in DataWorks, with the Presto component installed and configured at bind time. For setup instructions, see Data Studio: Associate a CDH computing resource.

    Important

    The Presto component must be installed on the CDH cluster and its settings must be configured when you bind the cluster to your workspace.

  • (Optional) If you use a RAM user, the user must be added to the workspace and assigned the Developer or Workspace Administrator role. The Workspace Administrator role grants extensive permissions, so assign it with caution. For details, see Add members to a workspace. Root account users can skip this step.

  • A Hive data source configured in DataWorks with a passing connectivity test. For setup instructions, see Data Source Management.

Create a node

See Create a node for instructions.

Write SQL

Write your Presto SQL in the SQL editor. A minimal example:

SHOW TABLES;
SELECT * FROM userinfo;
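
The minimal example above relies on the session's default catalog and schema. As a hedged sketch, the same queries can be written with fully qualified names and a row limit. The catalog name hive and the schema name default are assumptions here, not values taken from your cluster, so replace them with the ones your Presto installation actually exposes.

```sql
-- Assumption: the Hive connector is registered as a catalog named "hive"
-- and userinfo lives in the "default" schema. Adjust both for your cluster.
SHOW TABLES FROM hive.default;

-- Preview a handful of rows rather than returning the whole table.
SELECT *
FROM hive.default.userinfo
LIMIT 10;
```

Qualifying the table name keeps the query from depending on whichever catalog and schema the session happens to start in.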

Use scheduling parameters

The editor supports scheduling parameters using the ${variable_name} format. Define the variable in your code, then assign its value in Scheduling configuration > Scheduling parameters on the right panel. This lets you pass dynamic values to scheduled runs without modifying the code.

-- ${var} is resolved at runtime from Scheduling parameters
SELECT '${var}';
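
A common use of a scheduling parameter is to filter each run to one day of data. The sketch below assumes that userinfo has a string partition column named dt (a hypothetical name) and that a parameter named bizdate is assigned the value $bizdate in Scheduling configuration > Scheduling parameters. Check the scheduling parameter reference for the exact value formats available.

```sql
-- Assumptions (not from the original example):
--   * userinfo has a string partition column named dt (hypothetical)
--   * Scheduling parameters contains the assignment: bizdate=$bizdate
-- ${bizdate} is replaced with the assigned value before the query is submitted.
SELECT count(*) AS row_count
FROM userinfo
WHERE dt = '${bizdate}';
```

Because the value is substituted into the SQL text, keep the single quotes around ${bizdate} when the column is a string.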

For the full list of supported parameter formats and expressions, see Sources and expressions of scheduling parameters.

Run and debug

  1. In Run Configuration > Compute resource, set the following:

    Field | What to set
    Compute resource | Your registered CDH cluster
    Resource group | A scheduling resource group that has passed the data source connectivity test. See Network connectivity solutions if none are available.
  2. Click Run on the toolbar.

Next steps

  • Schedule the node: To run the node on a recurring schedule, configure Time Property and related properties in the Scheduling configuration panel. See Node scheduling configuration.

  • Publish to production: Click the publish icon to publish the node. Only published nodes run on a schedule in the production environment. See Publish a node.

  • Monitor runs: After publishing, track scheduled runs in Operation Center. See Getting started with Operation Center.