DataWorks: Node for synchronizing data to Hologres

Last Updated: Jul 25, 2025

DataWorks allows you to create a node that synchronizes data from a single MaxCompute table to Hologres, so that you can run big data analysis and real-time queries on the data in Hologres. This topic describes how to configure such a node.

Background information

When you run a node to synchronize data from a MaxCompute internal table to a Hologres internal table, the data is first imported into a Hologres foreign table and then synchronized from the foreign table to a Hologres internal table. Data synchronization from MaxCompute to a Hologres foreign table is implemented by executing the IMPORT FOREIGN SCHEMA statement.
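
The two-step flow described above can be sketched as a pair of SQL statements. The following Python helper is a minimal illustration, not the statements that DataWorks actually generates: the function name, the `foreign_stage` schema, and the table names are hypothetical, and `odps_server` is the default external server named later in this topic.

```python
def build_sync_statements(mc_project, mc_table, target_table,
                          foreign_schema="foreign_stage",
                          server="odps_server", where=None):
    """Return SQL text for the two steps: foreign table first, then internal table.

    All names are illustrative assumptions; the node builds its own statements.
    """
    # Step 1: expose the MaxCompute table as a Hologres foreign table.
    import_stmt = (
        f"IMPORT FOREIGN SCHEMA {mc_project} LIMIT TO ({mc_table}) "
        f"FROM SERVER {server} INTO {foreign_schema};"
    )
    # Step 2: copy rows from the foreign table into the internal table.
    insert_stmt = f"INSERT INTO {target_table} SELECT * FROM {foreign_schema}.{mc_table}"
    if where:
        insert_stmt += f" WHERE {where}"
    insert_stmt += ";"
    return [import_stmt, insert_stmt]


for statement in build_sync_statements(
    "my_project", "orders", "public.orders_holo", where="ds = '20250725'"
):
    print(statement)
```

The optional `where` argument corresponds to the filter condition that is configured for the source.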

Prerequisites

Limits

You can create a Hologres foreign table and read data from the foreign table only if the related MaxCompute internal table exists.

Create a synchronization node

First, create a node for synchronizing data to Hologres and open the configuration tab of the node. For more information, see Create an auto triggered node.

Manage the Hologres data source

On the configuration tab of the synchronization node, you can perform the following operations to select and manage the Hologres data source:

  1. From the Connections drop-down list, select the Hologres data source that was generated when you associated the Hologres instance with the workspace as a computing resource.

  2. Click Pages for Managing Destination next to the drop-down list, and then select one of the following options to manage the Hologres instance that corresponds to the data source:

    • Holo console (instance monitoring): Allows you to manage the Hologres instance in the Hologres console.

    • Slow Query: Allows you to view and analyze the historical slow queries of the Hologres instance in a visualized manner.

    • Active connection management: Allows you to diagnose and manage connections in the Hologres instance.

    • DB authorization: Allows you to create databases in the Hologres instance or grant permissions on the databases created in the Hologres instance.

    • User management: Allows you to use the user management module of the Hologres console to add users to or delete users from the Hologres instance and grant permissions to users.

Configure the synchronization node

After you select the Hologres data source, you can configure the synchronization node by referring to the following instructions:

Configure settings related to the source

You can configure the source based on the following parameter descriptions.

| Parameter | Description |
| --- | --- |
| Source Object Type | The type of the object from which you want to synchronize data. The value of this parameter is fixed as MaxCompute Table. |
| Project | The name of the MaxCompute project from which you want to synchronize data. |
| Schema | The name of the MaxCompute schema that you want to use. |
| Table Name | The name of the table from which you want to synchronize data. |
| Filter Condition | The condition that you want to use to filter data. The system automatically generates a filter condition based on the partitioned table that you use. You can also modify the filter condition based on your business requirements. Data that meets the filter condition is retained. Note: A filter condition is the content of the clause after WHERE in an SQL statement. |
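
Because a filter condition is only the text that follows WHERE, it can be assembled from partition columns and values. The sketch below is illustrative: the helper name and the `ds` and `region` columns are hypothetical, and the exact text that the system generates for a partitioned table is not shown in this topic.

```python
def build_filter_condition(partitions):
    """Join partition column/value pairs into the text that follows WHERE."""
    parts = []
    for column, value in partitions.items():
        # Double any single quote so the value stays a valid SQL string literal.
        escaped = str(value).replace("'", "''")
        parts.append(f"{column} = '{escaped}'")
    return " AND ".join(parts)


# A value may be a scheduling variable that is resolved when the node runs.
print(build_filter_condition({"ds": "${bizdate}", "region": "cn"}))
# prints: ds = '${bizdate}' AND region = 'cn'
```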

Configure settings related to the destination

You can configure the destination based on the following parameter descriptions.

| Parameter | Description |
| --- | --- |
| Instance | The name of the Hologres instance that you want to use. The system automatically matches the Hologres instance based on the Hologres data source that you select from the Connections drop-down list. |
| Database | The name of the Hologres database that you want to use. The system automatically matches the database based on the Hologres data source that you select from the Connections drop-down list. |
| Schema | The name of the Hologres schema to which the desired Hologres internal table belongs. |
| Table Name | The name of a Hologres internal table. You can configure this parameter based on your business requirements. If the table name that you specify already exists, the policy used to process the situation varies based on the table type. |

  • New Non-partitioned Table: If an internal table with the same name already exists, the system deletes the existing internal table and its data, and then creates the new table in the Hologres database.

  • New Partitioned Table: If an internal table with the same name already exists, the system retains the existing internal table and its data, creates a partition in the existing table based on the partition values, and then imports the synchronized data into the partition.

Note

If the schema of the new table is different from that of the existing table, an error is reported.
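
The Table Name policy above can be summarized as a small decision function. This is a sketch of the documented behavior only: the function name and action strings are hypothetical, and applying the schema check to the partitioned case is an assumption, because the note does not say which table type it covers.

```python
def plan_table_actions(table_type, table_exists, schema_matches=True,
                       partition_value=None):
    """Return the ordered actions implied by the Table Name policy."""
    if table_type == "non_partitioned":
        actions = []
        if table_exists:
            # The existing table and its data are removed before re-creation.
            actions.append("drop existing table and its data")
        actions += ["create table", "import data"]
        return actions
    if table_type == "partitioned":
        if table_exists and not schema_matches:
            # Assumption: the schema-mismatch error applies to this case.
            raise ValueError("schema of the new table differs from the existing table")
        actions = [] if table_exists else ["create partitioned table"]
        actions += [f"create partition {partition_value}",
                    "import data into partition"]
        return actions
    raise ValueError(f"unknown table type: {table_type}")


print(plan_table_actions("non_partitioned", table_exists=True))
print(plan_table_actions("partitioned", table_exists=True, partition_value="20250725"))
```

The practical consequence is that re-running a non-partitioned synchronization replaces the table, whereas a partitioned synchronization only adds or refills one partition.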

Fields

| Parameter | Description |
| --- | --- |
| Synchronization Field | You can select the fields to which you want to write data and configure the data types of the fields in the Hologres internal table. |
| Partition Configurations | You can configure the partition key column of the Hologres internal table based on your business requirements. |
| Index Configuration | You can create an index for the Hologres internal table that stores the synchronized MaxCompute data to facilitate subsequent data queries. For information about how to create indexes, see CREATE TABLE. |

  • Storage Mode: Hologres supports row-oriented storage, column-oriented storage, and row-column hybrid storage. You can configure the storage format of the table based on the usage scenarios of the table.

  • Lifecycle (s): The lifecycle of the table data, in seconds. The lifecycle starts when data is first written to the table. After the lifecycle elapses, the table data is cleared at an unspecified later time rather than immediately. The default lifecycle is Permanent.

  • Binlog: Specifies whether to enable the binary logging feature for the table. For more information, see Subscribe to Hologres binary logs.

  • Lifecycle of Binary Logs: The lifecycle of the binary logs of the table. The default lifecycle is Permanent.

  • Configure Field Properties: You can search for a field to view the information about the field and configure properties for the field.
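
The index options above correspond to table properties in Hologres. The sketch below builds `set_table_property` calls for them. The property names (`orientation`, `time_to_live_in_seconds`, `binlog.level`, `binlog.ttl`) are assumptions drawn from Hologres documentation outside this topic and should be verified against CREATE TABLE and Subscribe to Hologres binary logs; the helper and table names are hypothetical.

```python
# Assumed mapping from the Storage Mode option to the orientation property value.
ORIENTATIONS = {"row": "row", "column": "column", "hybrid": "row,column"}


def build_property_statements(table, storage_mode, ttl_seconds=None,
                              binlog=False, binlog_ttl_seconds=None):
    """Return CALL statements for the selected index options."""
    def call(name, value):
        return f"CALL set_table_property('{table}', '{name}', '{value}');"

    statements = [call("orientation", ORIENTATIONS[storage_mode])]
    if ttl_seconds is not None:  # None keeps the Permanent default
        statements.append(call("time_to_live_in_seconds", ttl_seconds))
    if binlog:
        statements.append(call("binlog.level", "replica"))
        if binlog_ttl_seconds is not None:
            statements.append(call("binlog.ttl", binlog_ttl_seconds))
    return statements


for statement in build_property_statements(
    "public.orders_holo", "column", ttl_seconds=2592000, binlog=True
):
    print(statement)
```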

Configure advanced parameters

You can configure GUC parameters and an external server in the Configure Advanced Settings section of the configuration tab of the synchronization node.

| Parameter | Description |
| --- | --- |
| GUC Parameters | The GUC parameters that you want to set for the synchronization node. For information about the supported GUC parameters, see GUC parameters. Only GUC parameter settings are accepted; other SQL statements are not supported. |
| External Server | The external server that you want to use. The default value is odps_server. |
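
Because only GUC settings are accepted in the GUC Parameters field, it can be useful to check the text before saving. The sketch below assumes that settings are written as semicolon-separated `SET name = value` statements, which this topic does not confirm. The validator is hypothetical and deliberately simple: it does not handle semicolons inside quoted values.

```python
import re

# Matches "SET name = value" or "SET name TO value", case-insensitively.
_SET_RE = re.compile(r"^set\s+[a-z_][a-z0-9_.]*\s*(?:=|\s+to\s)\s*.+$",
                     re.IGNORECASE)


def split_guc_statements(text):
    """Split the field text into statements and reject anything but SET."""
    statements = [part.strip() for part in text.split(";") if part.strip()]
    rejected = [s for s in statements if not _SET_RE.match(s)]
    if rejected:
        raise ValueError(f"only SET statements are allowed, got: {rejected}")
    return statements


print(split_guc_statements("SET statement_timeout = '5min';"))
```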

Debug the synchronization node

To debug and run the synchronization node, configure debugging information based on your business requirements.

  1. Configure properties for debugging the synchronization node.

    You can click Debugging Configurations in the right-side navigation pane of the configuration tab of the synchronization node, and configure the following parameters.

    | Parameter | Description |
    | --- | --- |
    | Computing Resource | Select the Hologres computing resource that is associated with the workspace. |
    | Resource Group | Select the resource group that passed the connectivity test when you associated the Hologres computing resource with the workspace. |
    | CUs for Computing | Specify the number of CUs that you want to use to run the synchronization node. The default value is 0.25. |
    | Script Parameters | If you define variables in the ${Parameter name} format in the filter condition, you must configure the Parameter Name and Parameter Value parameters in the Script Parameters section. When the synchronization node is run, the variables are replaced with actual values. For more information, see Node scheduling. |

  2. To debug and run the synchronization node, click Save and Run.
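
The variable replacement that the Script Parameters setting performs can be illustrated with a short substitution routine. This is a sketch of the idea only, not the DataWorks implementation; the function name and the `bizdate` parameter are hypothetical.

```python
import re


def render_filter(condition, params):
    """Replace each ${name} in the filter condition with its configured value."""
    def replace(match):
        name = match.group(1)
        if name not in params:
            # A variable without a configured value cannot be resolved.
            raise KeyError(f"no value configured for ${{{name}}}")
        return str(params[name])

    return re.sub(r"\$\{(\w+)\}", replace, condition)


print(render_filter("ds = '${bizdate}'", {"bizdate": "20250725"}))
# prints: ds = '20250725'
```

Raising an error for an unconfigured variable mirrors the requirement that every variable in the filter condition has a matching Parameter Name and Parameter Value.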

What to do next

  • Node scheduling: To periodically schedule a node in a workspace directory, click Properties in the right-side navigation pane of the configuration tab of the node and configure the parameters in the Scheduling Policies section.

  • Node deployment: To deploy a node to the production environment, click the deployment icon in the top toolbar of the configuration tab of the node to initiate the deployment process. Nodes in a workspace directory can be periodically scheduled only after they are deployed to the production environment.

Additional information

  • Field type mismatch: If the field types configured for the synchronization node do not match, the node fails. Check whether the data types of the fields in the Hologres table are correctly configured. For information about mappings between MaxCompute data types and Hologres data types, see Data type mappings between MaxCompute and Hologres.

  • Inconsistency between the data that is actually synchronized from a partition and the original data in the partition: You must check whether the filter condition is correctly configured in the source.
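
A field type mismatch can be caught before the node runs by comparing the configured Hologres types against the expected mapping. The sketch below uses a small, non-exhaustive subset of mappings that is assumed here for illustration; the authoritative list is in Data type mappings between MaxCompute and Hologres. The function and column names are hypothetical.

```python
# Assumed, partial mapping: MaxCompute type -> expected Hologres type.
EXPECTED_TYPES = {
    "BIGINT": "BIGINT",
    "STRING": "TEXT",
    "DOUBLE": "DOUBLE PRECISION",
    "BOOLEAN": "BOOLEAN",
}


def find_type_mismatches(mc_columns, holo_columns):
    """Return (column, maxcompute_type, hologres_type, expected_type) tuples."""
    mismatches = []
    for column, mc_type in mc_columns.items():
        expected = EXPECTED_TYPES.get(mc_type.upper())
        actual = holo_columns.get(column)
        if expected is None or actual is None:
            continue  # type or column not covered by this sketch
        if actual.upper() != expected:
            mismatches.append((column, mc_type, actual, expected))
    return mismatches


print(find_type_mismatches(
    {"id": "BIGINT", "name": "STRING"},
    {"id": "BIGINT", "name": "BIGINT"},
))
# prints: [('name', 'STRING', 'BIGINT', 'TEXT')]
```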