ClickHouse SQL nodes allow you to use a distributed SQL query engine to process structured data. This improves the task efficiency. This topic describes how to create and use a ClickHouse SQL node to develop data.

Prerequisites

  • An EMR ClickHouse cluster or an ApsaraDB for ClickHouse cluster is created. The security group to which the cluster belongs has the following inbound rule:
    • Action: Allow
    • Protocol type: Custom TCP
    • Port range: 8898/8898
    • Authorization object: 100.104.0.0/16
  • The ClickHouse compute engine is associated with your DataWorks workspace. For more information, see Configure a workspace.
    Note After you associate the ClickHouse compute engine with your workspace, you can view the directory of the ClickHouse compute engine on the DataStudio page.
  • The exclusive resource group for scheduling is created and is associated with the VPC to which the ClickHouse cluster belongs. For more information, see Create and use an exclusive resource group for scheduling.

Limits

You can use only exclusive resource groups for scheduling to run ClickHouse SQL nodes.

Procedure

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where your workspace resides, find the workspace, and then click Data Analytics in the Actions column.
  2. Move the pointer over the Create icon and choose ClickHouse > Click SQL.
    Alternatively, right-click a workflow and select ClickHouse, and choose Create > Click SQL.
  3. In the Create Node dialog box, set the Node Name and Location parameters.
    Note The node name must be 1 to 128 characters in length and can contain letters, digits, underscores (_), and periods (.).
  4. Click Commit.
  5. Develop data in the code editor.
    You can run SQL nodes based on your business requirements. Sample code:
    CREATE DATABASE if not EXISTS ck_test;
    CREATE TABLE if not EXISTS  ck_test.first_table  (
    `product_code` String,
    `package_name` String
    ) ENGINE = MergeTree ORDER BY package_name SETTINGS index_granularity = 8192;
    insert into ck_test.first_table (product_code, package_name) VALUES ('1', '1');
    select * from ck_test.first_table;
  6. Save and commit the node.
    Notice You must set the Rerun and Parent Nodes parameters before you can commit the node.
    1. Click the Save icon in the toolbar to save the node.
    2. Click the Commit icon in the toolbar.
    3. In the Commit Node dialog box, enter your comments in the Change description field.
    4. Click OK.
    In a workspace in standard mode, you must click Deploy in the upper-right corner after you commit the node. For more information, see Deploy nodes.
  7. Test the node. For more information, see View auto triggered nodes.