This topic describes how to use the Data Integration service of DataWorks to synchronize data from Kafka to Hologres in real time.

Prerequisites

  • A Hologres instance is purchased and connected to HoloWeb. For more information, see HoloWeb quick start.
  • DataWorks is activated. For more information, see the DataWorks Overview topic.
  • The Kafka environment and data are prepared. For more information, see the Overview topic for your Kafka service.

Background information

Kafka is a distributed message queue service that features high throughput and scalability. It is widely used in scenarios such as log collection, monitoring data aggregation, streaming data processing, and online and offline analysis. Hologres seamlessly integrates with the big data ecosystem and DataWorks. You can use the Data Integration service of DataWorks to synchronize data from Kafka to Hologres in real time, and then query, analyze, and process the data with high concurrency and low latency. For more information, see Kafka Reader and Hologres Writer.

Synchronize data in a single table in real time

The Data Integration service of DataWorks reads data from Kafka by using the Kafka SDK for Java and writes the data to Hologres in real time. To set up synchronization for a single table, perform the following steps:
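To make the read-and-write path more concrete, the following Java sketch shows how one consumed Kafka message could be flattened into a row for a destination table. It is an illustration only and not the DataWorks implementation: the `Message` class, the `toRow` method, and every column name are hypothetical, and the sketch uses only the Java standard library rather than the Kafka SDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class KafkaRowMappingSketch {

    /** Hypothetical stand-in for one message consumed from a Kafka topic. */
    static final class Message {
        final String key;
        final String value;
        final int partition;
        final long offset;
        final long timestampMs;

        Message(String key, String value, int partition, long offset, long timestampMs) {
            this.key = key;
            this.value = value;
            this.partition = partition;
            this.offset = offset;
            this.timestampMs = timestampMs;
        }
    }

    /**
     * Flattens a message into an ordered column-to-value map.
     * The column names are illustrative; a real destination table
     * defines its own schema.
     */
    static Map<String, Object> toRow(Message m) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("msg_key", m.key);
        row.put("msg_value", m.value);
        row.put("kafka_partition", m.partition);
        row.put("kafka_offset", m.offset);
        row.put("kafka_timestamp", m.timestampMs);
        return row;
    }
}
```

Keeping the partition and offset next to the payload is a design choice in this sketch: it lets you trace each destination row back to the message it came from.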

  1. Configure data sources.
    Before you synchronize data, you must configure a Kafka data source as the source and a Hologres data source as the destination. For more information, see the DataWorks documentation on configuring Kafka and Hologres data sources.
  2. Configure a sync node.
    After the data sources are configured, you can configure a sync node to synchronize data from Kafka to Hologres in real time. For more information, see Kafka Reader and Hologres Writer.
  3. Query the synchronized data.
    After the sync node starts writing data, you can query the destination table in Hologres, for example by using HoloWeb.
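
Because Hologres is compatible with the PostgreSQL protocol, one way to check the synchronized data from an application is through JDBC with the PostgreSQL driver. The following is a minimal sketch under stated assumptions: the driver is on the classpath, and the endpoint, port, database, table name, and credentials are placeholders that you supply. The class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class HologresQuerySketch {

    /** Builds a PostgreSQL-style JDBC URL from placeholder connection details. */
    static String jdbcUrl(String endpoint, int port, String database) {
        return "jdbc:postgresql://" + endpoint + ":" + port + "/" + database;
    }

    /**
     * Builds a row-count query. The table name is concatenated into the SQL,
     * so it must come from a trusted source, never from user input.
     */
    static String countQuery(String table) {
        return "SELECT count(*) FROM " + table;
    }

    /** Returns the number of rows in the destination table. */
    static long countRows(String url, String user, String password, String table)
            throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(countQuery(table))) {
            // count(*) returns exactly one row; fall back to 0 defensively.
            return rs.next() ? rs.getLong(1) : 0L;
        }
    }
}
```

Running `countRows` several times while the sync node is active is a simple way to confirm that new Kafka messages keep arriving in the table.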