DataWorks: Synchronize data to Hologres

Last Updated: Mar 27, 2026

Use a DataWorks data synchronization node to synchronize data from a single MaxCompute table into Hologres for real-time queries and big data analytics.

How it works

The synchronization follows a two-stage process:

  1. DataWorks imports the MaxCompute Internal Table into a Hologres External Table using the IMPORT FOREIGN SCHEMA command.

  2. Data is then synchronized from the External Table into a Hologres Internal Table.
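The two stages can be sketched as a pair of SQL statements. The example below is a minimal illustration, not the exact SQL that DataWorks generates: the function name, its arguments, and the `if_table_exist 'update'` option are assumptions chosen for the sketch, and identifiers are interpolated without quoting, so it suits trusted names only.

```python
def build_sync_statements(mc_project, mc_table, foreign_schema,
                          dest_schema, dest_table,
                          server="odps_server", where=None):
    """Return two SQL strings that mirror the two synchronization stages."""
    # Stage 1: expose the MaxCompute table as a Hologres External Table.
    import_stmt = (
        f"IMPORT FOREIGN SCHEMA {mc_project} LIMIT TO ({mc_table}) "
        f"FROM SERVER {server} INTO {foreign_schema} "
        f"OPTIONS (if_table_exist 'update');"
    )
    # Stage 2: copy rows from the External Table into the Internal Table.
    copy_stmt = (
        f"INSERT INTO {dest_schema}.{dest_table} "
        f"SELECT * FROM {foreign_schema}.{mc_table}"
    )
    if where:
        # Optional row filter, written like a SQL WHERE clause.
        copy_stmt += f" WHERE {where}"
    copy_stmt += ";"
    return [import_stmt, copy_stmt]
```

Calling `build_sync_statements("mc_proj", "orders", "ext", "public", "orders", where="ds = '20260327'")` returns the import statement followed by the filtered `INSERT ... SELECT` statement. All names in that call are hypothetical.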

Prerequisites

Before you begin, ensure that you have:

Limitations

An External Table can only be created and read when the MaxCompute source table exists.

Synchronize a MaxCompute table to Hologres

Before you start, note that the node configuration involves four sets of decisions:

| Decision area | What you configure |
| --- | --- |
| Destination data source | Which Hologres instance and database to write to |
| Source table | Which MaxCompute project, schema, and table to read from, and any row filter |
| Destination table | Target schema, table name, fields, partitioning, and index options |
| Run configuration | Compute resource and scheduling parameters |

Step 1: Create a synchronization node

Create a synchronization node for Hologres and open its configuration page before proceeding.

Step 2: Select the destination data source

On the node configuration page:

  1. From the Data Source dropdown, select the Hologres data source you have bound.

  2. Click Destination Management to open the management dialog. The available operations are:

    | Operation | Description |
    | --- | --- |
    | HoloWeb (Instance Monitoring) | Manage the Hologres instance in the HoloWeb console |
    | Slow Query | View and analyze historical slow queries |
    | Active Connection Management | Diagnose and manage connections to the instance |
    | Database Authorization | Add a database or grant database permissions |
    | User management | Add or remove users and manage permissions in HoloWeb |

Step 3: Select the MaxCompute source table

| Parameter | Description |
| --- | --- |
| Source Object Type | The default value is MaxCompute Table. |
| Project | The MaxCompute project that contains the data to synchronize. |
| Schema | The schema within the project. |
| Table | The table to synchronize. |
| Filter Condition | Filters which rows to synchronize; equivalent to the WHERE clause in SQL. For partitioned tables, the system generates this condition automatically, and you can modify it as needed. To use dynamic values, define variables in the format ${ParameterName} and configure them in the Run Configuration tab. |
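As an illustration of how `${ParameterName}` variables in a filter condition resolve to concrete values, the sketch below performs the substitution locally. It is only a model of the behavior described here: the function name `render_filter` and the parameter names `bizdate` and `region` are hypothetical, and DataWorks performs the real replacement at runtime.

```python
import re

# Matches ${Name}, where Name is made of letters, digits, or underscores.
_VAR = re.compile(r"\$\{(\w+)\}")

def render_filter(condition, params):
    """Replace each ${Name} in the filter condition with its value from params."""
    def substitute(match):
        name = match.group(1)
        if name not in params:
            # Fail loudly rather than run the task with an unresolved variable.
            raise KeyError(f"no value configured for ${{{name}}}")
        return str(params[name])
    return _VAR.sub(substitute, condition)
```

For example, `render_filter("ds = '${bizdate}' AND region = '${region}'", {"bizdate": "20260327", "region": "cn"})` yields `ds = '20260327' AND region = 'cn'`.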

Step 4: Configure the Hologres destination table

| Parameter | Description |
| --- | --- |
| Instance | Populated automatically from the data source selected in step 2. |
| Database | Populated automatically from the data source selected in step 2. |
| Schema | The schema to which the Hologres Internal Table belongs. |
| Table | The name for the Hologres Internal Table. See table naming behavior below. |
| Synchronization Field (under Field) | The fields to include in the synchronization and their data types in the destination table. |
| Partition Configuration | The partition key fields for the destination table. |
| Index Configuration | Index settings for faster queries. See index options below. |

Table naming behavior

If a table with the same name already exists in Hologres, the behavior depends on the table type:

| Table type | Table already exists | Behavior |
| --- | --- | --- |
| Non-partitioned | Yes | Hologres deletes the existing table and all its data, then creates a new table. |
| Partitioned | Yes | Hologres keeps the existing table and data, and creates a new partition sub-table for the incoming partition value. |

An error occurs if the schema of the new table differs from that of the existing table.

Important

For non-partitioned tables, the existing data is permanently deleted before the new table is created. Verify the target table name carefully before running the task.
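The naming rules can be summarized as a small decision function. This is a sketch of the documented behavior only. The function name and the returned action labels are hypothetical, the "create" branch for a table that does not exist yet is an assumption, and so is applying the schema check to the partitioned case.

```python
def plan_destination_table(partitioned, exists, schema_matches=True):
    """Return the action taken for the destination table, per the rules above."""
    if not exists:
        # Assumption: with no name clash, a new table is simply created.
        return "create"
    if not partitioned:
        # Existing non-partitioned table: dropped with all its data, then recreated.
        return "drop_and_recreate"
    if not schema_matches:
        # Assumption: the schema check applies when reusing a partitioned table.
        raise ValueError("new table schema differs from the existing table")
    # Existing partitioned table: data is kept and a partition sub-table is added.
    return "add_partition_sub_table"
```

A pre-flight check built on such a function could ask for confirmation whenever the result is `"drop_and_recreate"`, since that path deletes data.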

Index options

Under Index Configuration, set the following properties:

| Option | Description |
| --- | --- |
| Storage Mode | The table storage format: row store, column store, or hybrid row-column store. Choose based on your query pattern. |
| Time to Live (TTL) (Seconds) | How long Hologres retains data before clearing it. The TTL starts from when data is first written. Default: Permanent. |
| Binlog | Whether to enable Binlog for this table. See Subscribe to Hologres binlogs. |
| Binlog Time to Live | Retention period for Binlog data. Default: Permanent. |
| Set Field Properties | Search for specific fields and configure their properties. |

For full details on creating indexes, see CREATE TABLE.
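For orientation, these options roughly correspond to Hologres table properties that can also be set in SQL with `set_table_property`. The sketch below builds such calls. The mapping from the UI options to the properties `orientation`, `time_to_live_in_seconds`, `binlog.level`, and `binlog.ttl` is an assumption made for illustration, and the function name and arguments are hypothetical. In Hologres, a property such as `orientation` is normally set in the same transaction as the CREATE TABLE statement.

```python
def build_table_properties(table, storage_mode="column", ttl_seconds=None,
                           enable_binlog=False, binlog_ttl_seconds=None):
    """Return set_table_property calls for the chosen index options."""
    # Assumed mapping from the UI storage modes to orientation values.
    orientations = {"row": "row", "column": "column", "hybrid": "row,column"}
    if storage_mode not in orientations:
        raise ValueError(f"unknown storage mode: {storage_mode}")

    def call(prop, value):
        return f"CALL set_table_property('{table}', '{prop}', '{value}');"

    calls = [call("orientation", orientations[storage_mode])]
    if ttl_seconds is not None:
        # Omitting the TTL leaves the default (permanent retention).
        calls.append(call("time_to_live_in_seconds", int(ttl_seconds)))
    if enable_binlog:
        calls.append(call("binlog.level", "replica"))
        if binlog_ttl_seconds is not None:
            calls.append(call("binlog.ttl", int(binlog_ttl_seconds)))
    return calls
```

For instance, `build_table_properties("public.orders", storage_mode="hybrid", ttl_seconds=86400)` returns one call for the orientation and one for the TTL.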

Step 5: Configure advanced settings

In the Advanced section:

| Parameter | Description |
| --- | --- |
| GUC Parameter | Configuration parameters required before importing data from MaxCompute. For supported parameters, see GUC parameters. Other SQL statements are not supported. |
| External Server | The default value is odps_server. |
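Because the GUC Parameter field accepts only parameter settings, it can help to check the text before pasting it in. The sketch below assumes that settings are written as `SET name = value;` statements, which is the usual PostgreSQL-style form; the source does not state the exact input format. The function name is hypothetical, and the splitting on `;` is naive, so values containing semicolons are not handled.

```python
import re

# A statement is accepted only if it begins with the SET keyword.
_SET_STATEMENT = re.compile(r"^set\s+\S", re.IGNORECASE)

def validate_guc_statements(text):
    """Split text on ';' and ensure every statement is a SET statement."""
    statements = [s.strip() for s in text.split(";") if s.strip()]
    for statement in statements:
        if not _SET_STATEMENT.match(statement):
            raise ValueError(f"only SET statements are allowed: {statement!r}")
    return statements
```

For example, `validate_guc_statements("SET statement_timeout = '8h';")` passes, while text containing a `DROP TABLE` statement raises an error. `statement_timeout` is a standard PostgreSQL setting used here only as a placeholder; check the GUC parameters reference to confirm which parameters this field supports.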

Step 6: Run the synchronization node

  1. In the Run Configuration tab, set the following:

    | Parameter | Description |
    | --- | --- |
    | Compute Engine Instance | The Hologres compute resource you have bound. |
    | Resource Group | The resource group that passed the Connectivity Test when you bound the Hologres compute resource. |
    | Compute CU | The number of compute units (CUs) to allocate for the task. Default: 0.25. |
    | Parameter | If the filter condition contains ${ParameterName} variables, configure Parameter Name and Parameter Value here. The task replaces each variable with its actual value at runtime. For details, see Node scheduling configuration. |

  2. Click Save, then click Run.

What's next

  • Schedule the node: To run the synchronization on a recurring basis, set a Scheduling Policy in the Schedule tab. See Node scheduling configuration.

  • Deploy to production: Click the deployment icon to open the deployment dialog. After deployment, the node runs on the configured schedule in the production environment. See Deploy a node.

FAQ

The synchronization task fails with a field data type mismatch.

Check that the data types in the Hologres destination table match the corresponding fields in the MaxCompute source table. A mismatch at configuration time causes the task to fail at runtime.
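A quick local check can flag mismatched fields before the task runs. The sketch below compares source and destination field types against a small, partial mapping of MaxCompute types to Hologres types. The mapping is an assumption that covers only a few common types and should be verified against the official type mapping reference. The function names are hypothetical, and type aliases such as INT8 for BIGINT are not recognized.

```python
import re

# Partial, assumed mapping of MaxCompute types to Hologres types.
EXPECTED_HOLOGRES_TYPE = {
    "BIGINT": "BIGINT",
    "STRING": "TEXT",
    "DOUBLE": "DOUBLE PRECISION",
    "BOOLEAN": "BOOLEAN",
    "DATETIME": "TIMESTAMPTZ",
    "DECIMAL": "NUMERIC",
}

def _base(type_name):
    """Upper-case a type name and drop any (precision, scale) suffix."""
    return re.sub(r"\(.*\)", "", type_name).strip().upper()

def find_type_mismatches(source_fields, dest_fields):
    """Return (field, source_type, dest_type, expected_type) for each mismatch."""
    mismatches = []
    for name, src_type in source_fields.items():
        expected = EXPECTED_HOLOGRES_TYPE.get(_base(src_type))
        if expected is None:
            continue  # Source type is not covered by this partial mapping.
        actual = _base(dest_fields.get(name, "<missing>"))
        if actual != expected:
            mismatches.append((name, _base(src_type), actual, expected))
    return mismatches
```

Both arguments are dictionaries that map field names to type names. An empty result means that no mismatch was found among the types the mapping covers.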

Data is inconsistent after synchronizing a single partition.

Check the filter condition for the source partition. An incorrect filter condition is the most common cause of partial or mismatched partition data.