ClickHouse SQL nodes allow you to use a distributed SQL query engine to process structured
data. This improves the task efficiency. This topic describes how to create and use
a ClickHouse SQL node to develop data.
Prerequisites
- An EMR ClickHouse cluster or an ApsaraDB for ClickHouse cluster is created. The security
group to which the cluster belongs has the following inbound rule:
- Action: Allow
- Protocol type: Custom TCP
- Port range: 8898/8898
- Authorization object: 100.104.0.0/16
- The ClickHouse compute engine is associated with your DataWorks workspace. For more
information, see Configure a workspace.
Note After you associate the ClickHouse compute engine with your workspace, you can view
the directory of the ClickHouse compute engine on the DataStudio page.
- The exclusive resource group for scheduling is created and is associated with the
VPC to which the ClickHouse cluster belongs. For more information, see Create and use an exclusive resource group for scheduling.
Limits
You can use only exclusive resource groups for scheduling to run ClickHouse SQL nodes.
Procedure
- Go to the DataStudio page.
- Log on to the DataWorks console.
- In the left-side navigation pane, click Workspaces.
- In the top navigation bar, select the region where the required workspace resides,
find the workspace, and then click Data Analytics.
- Move the pointer over the
icon and choose .
Alternatively, right-click a workflow and select ClickHouse, and choose .
- In the Create Node dialog box, set the Node Name and Location parameters.
Note The node name must be 1 to 128 characters in length and can contain letters, digits,
underscores (_), and periods (.).
- Click Commit.
- Develop data in the code editor.
You can run SQL nodes based on your business requirements. Sample code:
CREATE DATABASE if not EXISTS ck_test;
CREATE TABLE if not EXISTS ck_test.first_table (
`product_code` String,
`package_name` String
) ENGINE = MergeTree ORDER BY package_name SETTINGS index_granularity = 8192;
insert into ck_test.first_table (product_code, package_name) VALUES ('1', '1');
select * from ck_test.first_table;
- Save and commit the node.
Notice You must set the Rerun and Parent Nodes parameters before you can commit the node.
- Click the
icon in the toolbar to save the node.
- Click the
icon in the toolbar.
- In the Commit Node dialog box, enter your comments in the Change description field.
- Click OK.
In a workspace in standard mode, you must click
Deploy in the upper-right corner after you commit the node. For more information, see
Deploy nodes.
- Test the node. For more information, see View auto triggered nodes.