DataWorks: Create a node to synchronize data from MaxCompute

Last Updated: Jun 25, 2025

DataWorks provides the one-click synchronization feature to help you efficiently synchronize data from MaxCompute to a Hologres database. This capability makes the data available for analysis in Hologres with high performance and low latency. This topic describes how to configure and use the feature.

Background information

As an alternative to the one-click synchronization feature, you can import MaxCompute data directly into a Hologres database by using SQL statements. The SQL method typically provides better performance. For more information, see Import data from MaxCompute by using SQL statements.
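As a rough illustration of the SQL-based approach, the following Python sketch assembles the two statements such an import usually involves: mapping the MaxCompute table as a foreign table, then copying rows into an internal table. The function name, the project and table names, and the choice of the public schema are hypothetical. The statement shapes follow standard PostgreSQL syntax, so check them against the linked topic before use.

```python
def build_import_sql(project, source_table, dest_table, columns,
                     where=None, server="odps_server", foreign_schema="public"):
    """Return SQL text that maps a MaxCompute table as a foreign table and copies rows from it.

    Assumption: identifiers are trusted and need no quoting. Real input should be validated.
    """
    column_list = ", ".join(columns)
    # Statement 1: expose the MaxCompute table in Hologres as a foreign table.
    import_stmt = (
        f"IMPORT FOREIGN SCHEMA {project} LIMIT TO ({source_table}) "
        f"FROM SERVER {server} INTO {foreign_schema};"
    )
    # Statement 2: copy the selected columns into the Hologres internal table.
    insert_stmt = (
        f"INSERT INTO {dest_table} ({column_list}) "
        f"SELECT {column_list} FROM {foreign_schema}.{source_table}"
    )
    if where:
        insert_stmt += f" WHERE {where}"
    return import_stmt + "\n" + insert_stmt + ";"


if __name__ == "__main__":
    print(build_import_sql("demo_project", "orders", "public.orders_holo",
                           ["id", "amount"], where="ds = '20250625'"))
```

The destination internal table is assumed to exist already. Creating it, including its indexes, is outside this sketch.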

Prerequisites

Create the node

Create the one-click MaxCompute data synchronization node.

Configure the node

Go to the one-click MaxCompute data synchronization node editing page and configure the node.

Select a source MaxCompute table

Configure the related parameters based on the information about the source table that you want to synchronize.

| Parameter | Description |
| --- | --- |
| Project | The name of the MaxCompute project that you created. |
| Schema | The schema of the MaxCompute project. This parameter is displayed only when tenant-level schema syntax is enabled. |
| Table Name | The name of the source MaxCompute table that you want to synchronize. |
| Filter Condition | If the source table is partitioned, the system automatically generates a filter condition for it. You can adjust the filter condition. Only the data that meets the filter condition is synchronized. Note: A filter condition is the expression that follows the WHERE keyword in an SQL statement. |
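To make the note about filter conditions concrete, here is a minimal Python sketch of how a filter condition could be attached to a query as its WHERE clause. The helper name and the sample query are hypothetical, and this is not how DataWorks itself builds the statement.

```python
import re


def with_filter(select_sql, filter_condition):
    """Append a filter condition to a SELECT statement as its WHERE clause."""
    condition = filter_condition.strip().rstrip(";").strip()
    # The field holds only the expression after WHERE, so a leading keyword is dropped if present.
    condition = re.sub(r"^where\s+", "", condition, flags=re.IGNORECASE)
    base = select_sql.strip().rstrip(";")
    if not condition:
        # An empty filter means every row is selected.
        return base + ";"
    return f"{base} WHERE {condition};"


if __name__ == "__main__":
    print(with_filter("SELECT id FROM orders", "ds = '20250625' AND amount > 0"))
```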

Set a destination Hologres table

Configure the related parameters based on the information about the destination table to which you want to synchronize data.

| Parameter | Description |
| --- | --- |
| Instance | The destination Hologres instance. After you configure the Hologres data source in Connections, the system automatically identifies the instance. Note: You can click Pages for Managing Destination next to Connections to go to the Holo console (instance monitoring), Slow Query, Active connection management, DB authorization, and User management pages. |
| Database | The database of the destination Hologres instance. |
| Schema | The schema of the destination Hologres instance. |
| Table Name | The name of the Hologres internal table. If a table with this name already exists, Hologres handles the existing table based on whether the new table is partitioned. Non-partitioned: Hologres deletes the existing table and its data, and creates a new table. Partitioned: Hologres retains the existing table and its data, and creates a new partition subtable based on the partition value to import the data. Note: If the structure of the existing table differs from that of the new table, such as in field names, field order, or the number of fields, the synchronization task fails. |
| Synchronization Field | The table fields that you want to synchronize. |
| Partition Configurations | The partition in the source MaxCompute table from which you want to synchronize data. Note: The destination Hologres table supports a single partition level. If the source table contains multiple partition levels, you must specify one partition field to be used as the first-level partition in Hologres. All other partition fields are mapped to regular columns in the destination table. |
| Index Configuration | The index on the Hologres internal table, which optimizes queries on the synchronized MaxCompute data. For more information about how to create an index, see CREATE TABLE. |
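The rules for an existing destination table and for multi-level partitions can be summarized in a short Python sketch. Both function names and the returned descriptions are hypothetical. The sketch also assumes that the structure check applies only when the existing table is retained, which is how the note above reads.

```python
def plan_existing_table(new_is_partitioned, existing_columns, new_columns):
    """Describe what happens when the destination table name already exists."""
    if not new_is_partitioned:
        return "drop the existing table and its data, then create a new table"
    # The existing table is kept, so field names, order, and count must all match.
    if list(existing_columns) != list(new_columns):
        raise ValueError("structure mismatch: the synchronization task would fail")
    return "keep the existing table and create a partition subtable for the partition value"


def map_partition_columns(partition_columns, chosen):
    """Split source partition fields into the Hologres partition field and regular columns."""
    if chosen not in partition_columns:
        raise ValueError(f"{chosen!r} is not a partition field of the source table")
    regular = [name for name in partition_columns if name != chosen]
    return {"partition_column": chosen, "regular_columns": regular}


if __name__ == "__main__":
    print(plan_existing_table(True, ["id", "ds"], ["id", "ds"]))
    print(map_partition_columns(["ds", "region"], "ds"))
```

For a source table partitioned by ds and region, choosing ds leaves region as a regular column in the destination table.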

Configure other parameters

| Parameter | Description |
| --- | --- |
| GUC Parameter | The GUC parameters that you need to set before you import MaxCompute data. Only GUC settings are allowed here. Other SQL statements are not supported. For more information about the supported GUC parameters, see GUC parameters. |
| External Server | The default value is odps_server. |
| SQL Script | DataWorks generates the SQL statement for the current data synchronization job from the synchronization configuration. You cannot edit the generated SQL script. After you update the synchronization configuration, refresh the SQL script to generate a new SQL statement. You can also go to the code editor of Hologres and run the data synchronization job in SQL mode. |
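Because the GUC Parameter field accepts only GUC settings, it can help to check the text before saving the node. The Python sketch below is one hypothetical way to do that, assuming that GUC settings are written as standard SET statements. The parameter in the example is illustrative only and may not be in the supported list.

```python
import re

# Matches "SET name = value" and "SET name TO value" (case-insensitive).
_SET_STATEMENT = re.compile(
    r"^set\s+[A-Za-z_][\w.]*\s*(?:=|\s+to\s+)\s*\S.*$", re.IGNORECASE
)


def check_guc_block(text):
    """Return the SET statements in text, or raise if any other statement is present."""
    # Naive split: a semicolon inside a quoted value would break this sketch.
    statements = [part.strip() for part in text.split(";") if part.strip()]
    for statement in statements:
        if not _SET_STATEMENT.match(statement):
            raise ValueError(f"only SET statements are expected here: {statement!r}")
    return statements


if __name__ == "__main__":
    print(check_guc_block("SET statement_timeout = '30min';"))
```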

Test the node

Configure the test information based on your business requirements.

  1. Configure the properties of the node for testing.

    You can configure Computing Resource and Resource Group in the Debugging Configurations section on the right side of the data synchronization node editing page. The following table describes the parameters.

    | Parameter | Description |
    | --- | --- |
    | Computing Resource | Select the Hologres computing resource that you attached. |
    | Virtual Warehouse | Use the default value. |
    | Resource Group | Select the resource group that passed the connectivity test when you attached the Hologres computing resource. |
    | CUs for Computing | Use the default CU value. |
    | Script Parameter | If you define a variable in the filter condition in the format of ${Parameter name}, configure Parameter Name and Parameter Value in the Script Parameter section. When the task runs, the variable is replaced with the actual value. For more information, see Node scheduling. |

  2. Click Save and Run to run the data synchronization task.
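The Script Parameter substitution described in step 1 behaves like ordinary placeholder replacement. The following Python sketch imitates it with the standard library's string.Template, which happens to use the same ${name} form. The parameter name sync_date is hypothetical, and the actual replacement is performed by DataWorks at run time, not by this code.

```python
from string import Template


def render_filter(filter_condition, script_parameters):
    """Replace ${name} placeholders in a filter condition with configured values."""
    # substitute() raises KeyError when a placeholder has no configured value.
    # Note that Template also treats $name and $$ specially, unlike the ${...}-only form above.
    return Template(filter_condition).substitute(script_parameters)


if __name__ == "__main__":
    print(render_filter("ds = '${sync_date}'", {"sync_date": "20250625"}))
```

A missing value surfaces as an error here, which mirrors the need to configure both Parameter Name and Parameter Value before a run.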

Next steps

  • Node scheduling: If you want to periodically schedule and run a node in the project directory, you need to set Scheduling Policies in Properties on the right side of the node and configure the related scheduling properties.

  • Node publishing: If you want to publish a task to the production environment for execution, click the publish icon to start the publishing process. A node in the project directory is periodically scheduled only after the node is published to the production environment.

  • After MaxCompute data is synchronized, you can use HoloWeb to query the data in the Hologres table. For more information, see HoloWeb.

FAQ

  • Error message: get table columns occurs Invalid name:xxx.

  • Solution: Check whether the project name that you configured for the source is correct, and make sure that it does not contain spaces or other invalid characters.
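A quick way to spot the problem is to scan the configured project name for unexpected characters. The Python sketch below does this under the assumption that a valid name contains only ASCII letters, digits, and underscores. That rule and the function name are assumptions for illustration, so confirm the exact naming rules in the MaxCompute documentation.

```python
import re


def find_invalid_characters(project_name):
    """Return the characters in project_name other than ASCII letters, digits, and underscores."""
    return sorted(set(re.findall(r"[^A-Za-z0-9_]", project_name)))


if __name__ == "__main__":
    # A trailing space is easy to miss when the name is pasted into the Project field.
    print(find_invalid_characters("my_project "))
```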