To run data synchronization or data development tasks against an ApsaraDB for ClickHouse cluster in DataWorks, add the cluster as a computing resource. This establishes the connection between DataWorks and the cluster.
Prerequisites
Before you begin, make sure you have:
- An ApsaraDB for ClickHouse cluster. Create the cluster in the same region as your DataWorks workspace. If they are in different regions, the cluster can be used only for data synchronization tasks, not for computing tasks in Data Studio or Operation Center.
- A DataWorks workspace with your RAM user added and assigned the Workspace Administrator role.
- A resource group associated with the workspace, with network connectivity to the ClickHouse cluster confirmed (see Configure network access).
Limitations
- SSL restriction: If SSL authentication is enabled for the ClickHouse compute engine, the cluster cannot be used for data development or periodic scheduling tasks.
- Supported regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Chengdu), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), and Indonesia (Jakarta).
- Permissions:

  | Operator | Required permissions |
  |---|---|
  | Alibaba Cloud account | None |
  | RAM user or RAM role | O&M or Workspace Administrator role in the workspace, or the AliyunDataWorksFullAccess permission. See Grant a user the Workspace Administrator permissions. |
Configure network access
By default, ApsaraDB for ClickHouse clusters deny connections from all IP addresses. Before associating the computing resource, add your resource group's IP addresses to the cluster whitelist.
Depending on the resource group type, add one of the following to the ApsaraDB for ClickHouse cluster whitelist: the vSwitch CIDR block of the resource group, the EIP of a legacy resource group, or the EIP of the VPC associated with a Serverless resource group. For information about how to obtain these values, see Add IP addresses to the DataWorks whitelist.
For legacy exclusive resource groups, add the IP addresses of each group type that your scenario uses: exclusive resource group for integration, exclusive resource group for scheduling, and exclusive resource group for services.
If you skip this step, the connectivity test fails.
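Before running the connectivity test, it can help to confirm that the address your resource group connects from is covered by the entries you added. The following is a minimal sketch using Python's standard `ipaddress` module. The helper name `is_whitelisted` and all addresses are hypothetical placeholders, and the check only mirrors the whitelist logic locally; it does not query the cluster.

```python
import ipaddress


def is_whitelisted(address: str, whitelist: list[str]) -> bool:
    """Return True if `address` falls inside any whitelist entry.

    Entries may be single IPs ("203.0.113.10") or CIDR blocks ("192.168.0.0/24").
    """
    ip = ipaddress.ip_address(address)
    # strict=False tolerates entries whose host bits are set, e.g. "192.168.0.5/24".
    return any(ip in ipaddress.ip_network(entry, strict=False) for entry in whitelist)


# Hypothetical values: a vSwitch CIDR block and one EIP.
whitelist = ["192.168.0.0/24", "203.0.113.10"]
print(is_whitelisted("192.168.0.57", whitelist))  # True
print(is_whitelisted("10.0.0.5", whitelist))      # False
```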
Associate a ClickHouse computing resource
The steps differ depending on whether Use Data Studio (New Version) is enabled for your workspace.
New Data Studio
Use this procedure if Use Data Studio (New Version) is enabled for your workspace.
Go to the computing resources page
- Log on to the DataWorks console. In the top navigation bar, select the target region. In the left navigation pane, choose More > Management Center. Select the workspace from the drop-down list and click Go To Management Center.
- In the left navigation pane, click Computing Resources.
Configure and associate the computing resource
- Click Associate Computing Resource.
- On the Associate Computing Resource page, set the computing resource type to ClickHouse. The Associate ClickHouse Computing Resource page opens.
- Set the parameters described in the following table.

  If SSL authentication is enabled for the ClickHouse compute engine, the cluster cannot be used for data development or periodic scheduling tasks.

  | Parameter | Description | Required |
  |---|---|---|
  | Configuration Mode | Only Connection String Mode is supported. | Yes |
  | JDBC URL | Connection string in the format `jdbc:clickhouse://<ip>:<port>/<dbname>`. See the JDBC URL parameter details below. | Yes |
  | Username and password | The account and password for your ApsaraDB for ClickHouse cluster. | Yes |
  | Authentication Method | Select No Authentication or SSL Authentication. | Yes |
  | SSL CA Certificate | If you selected SSL Authentication, click Add Authentication File and upload the CA certificate from the cluster's Cluster Information page. | Only for SSL Authentication |
  | Computing Resource Instance Name | A custom name for this computing resource instance. | Yes |

  JDBC URL parameter details:

  | Parameter | Value |
  |---|---|
  | `<ip>` | The VPC Address or Public Address from the ClickHouse Cluster Information page. For example: cc-bp1xxx..clickhouse.ads.aliyuncs.com. |
  | `<port>` | No Authentication: use the VPC HTTP Port Number (`8123`). SSL Authentication: use the VPC HTTPS Port Number (`8443`). Both values are on the Cluster Information page. |
  | `<dbname>` | The ClickHouse database to connect to. Defaults to `default`. Create a new database if needed. |
- In the connection configuration section, select the resource group for running ClickHouse node tasks. Click Test Network Connectivity to verify connectivity. For troubleshooting, see Overview of network connection solutions.
- Click OK.
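As an illustration of how the JDBC URL parts fit together, the sketch below assembles a URL in the `jdbc:clickhouse://<ip>:<port>/<dbname>` format and picks the port from the authentication method. The function name `build_jdbc_url`, the `"none"`/`"ssl"` keys, and the host `cc-example.clickhouse.ads.aliyuncs.com` are hypothetical; the ports are the values named in the parameter table and should be confirmed on the Cluster Information page.

```python
from typing import Optional

# Ports named in the parameter table; confirm the actual values
# on the Cluster Information page before relying on them.
DEFAULT_PORTS = {"none": 8123, "ssl": 8443}


def build_jdbc_url(host: str, auth: str = "none",
                   dbname: str = "default", port: Optional[int] = None) -> str:
    """Assemble jdbc:clickhouse://<ip>:<port>/<dbname> without stray whitespace."""
    if auth not in DEFAULT_PORTS:
        raise ValueError(f"auth must be one of {sorted(DEFAULT_PORTS)}")
    host, dbname = host.strip(), dbname.strip()
    if not host or any(ch.isspace() for ch in host + dbname):
        raise ValueError("host and dbname must be non-empty and contain no whitespace")
    if port is None:
        port = DEFAULT_PORTS[auth]
    return f"jdbc:clickhouse://{host}:{port}/{dbname}"


# Hypothetical host name, for illustration only.
print(build_jdbc_url("cc-example.clickhouse.ads.aliyuncs.com"))
# jdbc:clickhouse://cc-example.clickhouse.ads.aliyuncs.com:8123/default
print(build_jdbc_url("cc-example.clickhouse.ads.aliyuncs.com", auth="ssl"))
# jdbc:clickhouse://cc-example.clickhouse.ads.aliyuncs.com:8443/default
```

Generating the string this way, rather than typing it by hand, avoids accidental spaces in the value pasted into the JDBC URL field.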
Legacy DataStudio
Use this procedure if your workspace does not use Data Studio (New Version).
Go to the computing resources page
- Log on to the DataWorks console. In the top navigation bar, select the target region. In the left navigation pane, choose Data Development and O&M > Data Development. Select the workspace from the drop-down list and click Go to Data Development.
- In the left navigation pane, click the icon to open the Computing Resources page.
Configure and associate the computing resource
- Click Create Computing Resource.
- On the Create Computing Resource page, set the computing resource type to ClickHouse. The Create Computing Resource configuration page opens.
- Set the parameters described in the following table.

  If SSL authentication is enabled for the ClickHouse compute engine, the cluster cannot be used for data development or periodic scheduling tasks.

  | Parameter | Description | Required |
  |---|---|---|
  | Data Source Name | A custom name for the computing resource. | Yes |
  | Configuration Mode | Only Connection String Mode is supported. | Yes |
  | Host Address/IP Address | The VPC Address or Public Address from the Cluster Information page. For example: cc-bp1xxx..clickhouse.ads.aliyuncs.com. | Yes |
  | Port | No Authentication: use the VPC HTTP Port (`8123`). SSL Authentication: use the VPC HTTPS Port (`8443`). Both values are on the Cluster Information page. | Yes |
  | Database Name | The ClickHouse database to connect to. Defaults to `default`. Create a new database if needed. | Yes |
  | Username and password | The account and password for your ApsaraDB for ClickHouse cluster. | Yes |
  | Version | The version of the cluster to associate. | Yes |
  | Advanced Parameters | Optional properties. Click Add Property to add entries. | No |
  | Authentication Method | Select No Authentication or SSL Authentication. | Yes |
  | SSL CA Certificate | If you selected SSL Authentication, click Add Authentication File and upload the CA certificate from the cluster's Cluster Information page. | Only for SSL Authentication |
- In the connection configuration section, select the resource group for running ClickHouse tasks. Click Test Network Connectivity to verify connectivity. For troubleshooting, see Overview of network connection solutions.
- Click Create and Associate Computing Resource with DataStudio.
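If the connectivity test fails and you have a machine inside the same VPC, a plain TCP probe against the Host Address and Port values can help separate a wrong host or port from a whitelist problem. This is only a rough sketch with a hypothetical helper name, `can_connect`. It runs from wherever the script is executed, not from the DataWorks resource group, so a successful probe does not guarantee that Test Network Connectivity will pass.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example call with a hypothetical host; replace with the VPC Address and
# port shown on the Cluster Information page:
# can_connect("cc-example.clickhouse.ads.aliyuncs.com", 8123)
```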
What's next
After the computing resource is associated, create tasks to work with your ClickHouse data:
| DataStudio version | Task type | Use for |
|---|---|---|
| New Data Studio | Batch synchronization node | Data synchronization |
| New Data Studio | ClickHouse SQL node | Data development |
| Legacy DataStudio | Data Integration > Offline Synchronization node | Data synchronization |
FAQ
Error: "not support data sync channel, error code: 0001."
The JDBC URL contains spaces or extra characters. Check the URL and remove any whitespace or unexpected characters, then retry.
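Stray whitespace is easy to miss by eye, especially after copy and paste. The sketch below reports whitespace and shape problems in a JDBC URL before you submit it. The function name `find_url_problems` is hypothetical, and the pattern is an assumption based only on the `jdbc:clickhouse://<ip>:<port>/<dbname>` format described in this topic, so it deliberately rejects anything beyond that shape.

```python
import re

# Assumed shape, taken from this topic: jdbc:clickhouse://<ip>:<port>/<dbname>
JDBC_PATTERN = re.compile(r"jdbc:clickhouse://[A-Za-z0-9.\-]+:\d{1,5}/\w+")


def find_url_problems(url: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if url != url.strip():
        problems.append("leading or trailing whitespace")
    if any(ch.isspace() for ch in url.strip()):
        problems.append("whitespace inside the URL")
    # Only check the overall shape once whitespace has been ruled out.
    if not problems and not JDBC_PATTERN.fullmatch(url):
        problems.append("does not match jdbc:clickhouse://<ip>:<port>/<dbname>")
    return problems


# Hypothetical host; note the space after the colon.
print(find_url_problems("jdbc:clickhouse://cc-example.clickhouse.ads.aliyuncs.com: 8123/default"))
# ['whitespace inside the URL']
```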
Error: `ru.yandex.clickhouse.except.ClickHouseUnknownException: ClickHouse exception, code: 1002`
The IP address in the JDBC URL or Host Address/IP Address field is incorrect. Verify it matches the VPC Address or Public Address shown on the ClickHouse Cluster Information page.
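One way to double-check the address is to extract the host from the configured JDBC URL and compare it with the value copied from the Cluster Information page. The following sketch uses Python's standard `urllib.parse.urlsplit`; the helper names and the host `cc-example.clickhouse.ads.aliyuncs.com` are hypothetical.

```python
from urllib.parse import urlsplit


def jdbc_host_port(url: str):
    """Split a jdbc:clickhouse:// URL into (host, port)."""
    if not url.startswith("jdbc:"):
        raise ValueError("expected a URL starting with 'jdbc:'")
    # Drop the "jdbc:" prefix so urlsplit sees clickhouse://host:port/db.
    parts = urlsplit(url[len("jdbc:"):])
    return parts.hostname, parts.port


def host_matches(url: str, expected_address: str) -> bool:
    """Compare the URL host with the address from the Cluster Information page."""
    host, _ = jdbc_host_port(url)
    return host == expected_address.strip().lower()


# Hypothetical values for illustration.
url = "jdbc:clickhouse://cc-example.clickhouse.ads.aliyuncs.com:8123/default"
print(jdbc_host_port(url))
# ('cc-example.clickhouse.ads.aliyuncs.com', 8123)
print(host_matches(url, "cc-example.clickhouse.ads.aliyuncs.com"))
# True
```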