This topic describes how to use DataWorks data integration to synchronize data from a database to Hologres in real time.
When your source database (such as MySQL, Oracle, or PolarDB) needs to feed a Hologres analytics environment with up-to-date data, DataWorks data integration provides three real-time synchronization methods. Choose the method that matches your use case, then configure the source, destination, and task in sequence.
Prerequisites
Before you begin, ensure that you have:
- Activated DataWorks. See Overview.
- Activated ApsaraDB.
- Verified network connectivity if you plan to sync across regions. See Network connectivity solutions.
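As a rough preliminary check for the connectivity prerequisite, a plain TCP probe can show whether a database endpoint is reachable at all. The following is a minimal sketch, not a DataWorks feature: the helper name `can_reach` and the example host are hypothetical, and the probe only tests reachability from the machine that runs it, which is not necessarily the network path that the DataWorks resource group uses.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example call (hypothetical endpoint; replace with your own host and port):
#   can_reach("db.example.internal", 3306)
```

A `True` result only means the port accepts TCP connections. It does not validate credentials, whitelists, or the resource group configuration, so the linked network connectivity guide remains the reference for the actual setup.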
Background information
Hologres is a real-time interactive analytics service. It connects to the big data ecosystem and integrates closely with DataWorks, an intelligent R&D platform. Together, they support high-concurrency, low-latency data query and analysis. Common databases that support real-time data synchronization include Oracle, PolarDB, and PolarDB for MySQL.
Set up real-time sync
The setup involves three steps:
- Configure the source data source.
- Configure the Hologres destination data source.
- Choose a sync method and configure the task.
Step 1: Configure the source
Configure the database you want to sync from. For example, to sync from MySQL, add a MySQL data source in DataWorks. See Configure a data source for step-by-step instructions.
Step 2: Configure the Hologres destination
Add Hologres as the destination data source. See Configure a Hologres data source.
Note: The Hologres data source must use an exclusive resource group for data integration.
Step 3: Choose a sync method and configure the task
DataWorks data integration offers three real-time sync methods. Use the following table to choose the one that fits your use case.
| Sync method | When to use | Supported sources | Data source guide | Task guide |
|---|---|---|---|---|
| Single-table real-time synchronization | Sync incremental changes from selected tables | MySQL binary logging, DataHub, LogHub, Kafka, PolarDB, SQL Server | PolarDB source, MySQL source | Configure real-time synchronization of incremental data from a single table |
| Full database real-time synchronization | Sync incremental changes from all tables in the source database | PolarDB for MySQL, PolarDB, MySQL | PolarDB source, Oracle source, MySQL source | Configure and manage a real-time sync task |
| Synchronization solution | Handle multiple sync scenarios in one pipeline with one-click cloud migration — initial full load, real-time incremental writes, and scheduled merges into full-table partitions | PolarDB for MySQL, Oracle, MySQL, PolarDB-X, PostgreSQL | PolarDB source, Oracle source, MySQL source, PolarDB-X environment, PostgreSQL environment | One-click real-time synchronization to Hologres, Add or remove sync tables from a running task, O&M for full and incremental sync tasks |
To add extra fields to the Hologres destination table — for example, an update_time field — see Configure and manage a real-time sync task.
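The selection table in Step 3 can also be read as a simple lookup from source type to candidate sync methods. The sketch below transcribes the "Supported sources" column as written. The names `SYNC_METHODS` and `methods_for` are hypothetical and exist only for illustration; they are not part of any DataWorks API.

```python
# Supported sources per sync method, transcribed from the table in Step 3.
# "MySQL binary logging" in the table is recorded here as "MySQL".
SYNC_METHODS = {
    "Single-table real-time synchronization": [
        "MySQL", "DataHub", "LogHub", "Kafka", "PolarDB", "SQL Server",
    ],
    "Full database real-time synchronization": [
        "PolarDB for MySQL", "PolarDB", "MySQL",
    ],
    "Synchronization solution": [
        "PolarDB for MySQL", "Oracle", "MySQL", "PolarDB-X", "PostgreSQL",
    ],
}


def methods_for(source: str) -> list[str]:
    """Return the sync methods whose supported sources include `source`.

    Matching is case-insensitive and ignores surrounding whitespace.
    Methods are returned in the same order as the table rows.
    """
    wanted = source.strip().lower()
    return [
        method
        for method, sources in SYNC_METHODS.items()
        if wanted in (s.lower() for s in sources)
    ]


# Example calls:
#   methods_for("Kafka")       -> single-table synchronization only
#   methods_for("PostgreSQL")  -> synchronization solution only
```

When more than one method matches, the "When to use" column decides between them: selected tables, all tables in the database, or a combined full and incremental pipeline.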