
Hologres: Synchronize data from Log Service to Hologres

Last Updated: May 29, 2023

You can write data from Log Service to Hologres in multiple ways. This topic describes how to synchronize data from Log Service to Hologres in real time by using Realtime Compute for Apache Flink or DataWorks.

Prerequisites

  • Log Service is activated, and a project and a Logstore are created. For more information, see Getting Started.

  • A Hologres instance is purchased and is connected to a development tool. For more information, see Procedure to use Hologres.

  • Realtime Compute for Apache Flink is activated, and a project is created if you want to write data from Log Service to Hologres by using Realtime Compute for Apache Flink. For more information, see Activate fully managed Flink and Create and manage a namespace.

  • DataWorks is activated, and a workspace is created if you want to write data from Log Service to Hologres by using DataWorks. For more information, see Activate DataWorks and Create a workspace.

Background information

Log Service is a cloud-native observability and analytics platform that provides large-scale, low-cost, and real-time services to process multiple types of data such as logs, metrics, and traces. Log Service allows you to collect, process, query, analyze, visualize, consume, and deliver data. You can configure alerts in the Log Service console. Log Service helps enterprises improve their digital capabilities in terms of R&D, O&M, and data security.

Hologres is designed to provide a real-time compute engine that features high performance, high reliability, cost efficiency, and high scalability. Hologres provides real-time data warehouse solutions that help manage large amounts of data and interactive query services that can respond in sub-seconds. Hologres is widely used in scenarios such as the construction of real-time data mid-ends, fine-grained analysis, self-service analysis, marketing profile setup, audience grouping, and real-time risk control. You can write data from Log Service to Hologres in a quick manner for real-time analysis and query. This helps exploit data for your business.

Write data from Log Service to Hologres by using Realtime Compute for Apache Flink

  1. Prepare data in Log Service.

    Simulated data provided by Log Service is used as the source data in this example. The data simulates logon and consumption logs of a gaming platform. You can also use your own business data.

    1. Log on to the Log Service console.

    2. In the Import Data section, click the Simulated Data Import tab.

    3. On the Simulated Data Import tab, click Simulation in the Game Operation Logs section.

    4. In the Specify Logstore step, select the project and Logstore, and click Next.

    5. In the Simulated Data Import step, configure the parameters related to the time range and frequency, and click Import.

    6. The following figure shows the simulated fields and the data that you can query. For more information, see Query and analyze logs.

      The content field is of the JSON data type. (Figure: Simulation data)
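
      For example, after you configure field indexes with statistics enabled for the Logstore, a query similar to the following returns the distribution of operation types. This is a minimal sketch; the operation field name follows the simulated data shown above, so adjust the query to your own logs.

      * | SELECT operation, COUNT(*) AS cnt GROUP BY operation ORDER BY cnt DESC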

  2. Create a table in Hologres.

    Create a table that is used to receive data in Hologres. You can create indexes for fields in the table based on your query requirements. This helps improve query efficiency. For more information about indexes, see Overview. The following DDL statement is used to create the table:

    CREATE TABLE sls_flink_holo (
        content JSONB,
        operation TEXT,
        uid TEXT,
        topic TEXT,
        source TEXT,
        c__timestamp TIMESTAMPTZ,
        receive_time BIGINT,
        PRIMARY KEY (uid)
      );
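
    For example, you can set index-related table properties by using set_table_property calls in the same transaction as the CREATE TABLE statement. The following variant of the preceding DDL is a minimal sketch: the choice of uid as the distribution key, operation as a bitmap column, and c__timestamp as the event time column is an assumption based on the sample fields, so adjust these properties to your actual query patterns.

    BEGIN;
    CREATE TABLE sls_flink_holo (
        content JSONB,
        operation TEXT,
        uid TEXT,
        topic TEXT,
        source TEXT,
        c__timestamp TIMESTAMPTZ,
        receive_time BIGINT,
        PRIMARY KEY (uid)
      );
    -- Route records with the same uid to the same shard (assumption: uid is a frequent filter or join key).
    call set_table_property('sls_flink_holo', 'distribution_key', 'uid');
    -- Build a bitmap index on operation to speed up equality filters (assumption: operation has low cardinality).
    call set_table_property('sls_flink_holo', 'bitmap_columns', 'operation');
    -- Use the log timestamp as the event time column to prune data in time-range queries.
    call set_table_property('sls_flink_holo', 'event_time_column', 'c__timestamp');
    COMMIT;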
  3. Use Realtime Compute for Apache Flink to write data to Hologres.

    Perform the following steps to write data from Log Service to Hologres by using Realtime Compute for Apache Flink:

    1. Read data from Log Service by using Realtime Compute for Apache Flink. For more information, see Create a Log Service source table.

    2. Write data to Hologres by using Realtime Compute for Apache Flink. For more information, see Create a Hologres result table.

    The following sample SQL job shows how to write data from Log Service to Hologres by using Realtime Compute for Apache Flink. The JSON-formatted content field is declared as a VARCHAR (STRING) field in Realtime Compute for Apache Flink and is written directly to the JSONB column of the Hologres table.

    Note
    • For more information about how to create and run SQL jobs in Realtime Compute for Apache Flink, see Develop an SQL draft and Start a deployment.

    • If the data that is read from Log Service contains JSON-formatted fields, you can parse the fields in Realtime Compute for Apache Flink before you write the data to Hologres. You can also write the JSON-formatted data from Log Service to Hologres without parsing it.

    CREATE TEMPORARY TABLE sls_input (
        content STRING,
        operation STRING,
        uid STRING,
        `__topic__` STRING METADATA VIRTUAL,
        `__source__` STRING METADATA VIRTUAL,
        `__timestamp__` BIGINT METADATA VIRTUAL,
        `__tag__` MAP<VARCHAR, VARCHAR> METADATA VIRTUAL
      )
    WITH (
        'connector' = 'sls',
        'endpoint' = 'Internal endpoint of the Log Service project', -- The internal endpoint that is used to access the Log Service project.
        'accessid' = 'AccessKey ID of your Alibaba Cloud account', -- The AccessKey ID of your Alibaba Cloud account.
        'accesskey' = 'AccessKey secret of your Alibaba Cloud account', -- The AccessKey secret of your Alibaba Cloud account.
        'starttime' = 'Start time for log consumption', -- The time at which logs start to be consumed.
        'project' = 'Project name', -- The name of the Log Service project.
        'logstore' = 'Logstore name' -- The name of the Logstore.
      );
    
    CREATE TEMPORARY TABLE hologres_sink (
        content VARCHAR,
        operation VARCHAR,
        uid VARCHAR,
        topic STRING,
        source STRING,
        c__timestamp TIMESTAMP,
        receive_time BIGINT
      )
    WITH (
        'connector' = 'hologres',
        'dbname' = 'Name of the Hologres database', -- The name of the Hologres database.
        'tablename' = 'Name of the Hologres table', -- The name of the Hologres table to which you write data.
        'username' = 'AccessKey ID of your Alibaba Cloud account', -- The AccessKey ID of your Alibaba Cloud account.
        'password' = 'AccessKey secret of your Alibaba Cloud account', -- The AccessKey secret of your Alibaba Cloud account.
        'endpoint' = 'VPC endpoint of your Hologres instance' -- The VPC endpoint of your Hologres instance.
      );
    
    INSERT INTO hologres_sink
    SELECT
        content,
        operation,
        uid,
        `__topic__`,
        `__source__`,
        CAST(FROM_UNIXTIME(`__timestamp__`) AS TIMESTAMP),
        CAST(`__tag__`['__receive_time__'] AS BIGINT) AS receive_time
    FROM sls_input;
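
    As described in the preceding note, you can also parse JSON-formatted fields in Realtime Compute for Apache Flink before you write them to Hologres. The following statement is a minimal sketch: it assumes that the content field contains a key named action (a hypothetical key), that your Flink version provides the built-in JSON_VALUE function, and that a hypothetical sink table with an additional action column exists.

    -- Extract a single key from the JSON-formatted content field before writing.
    -- The key name 'action' and the table hologres_sink_parsed are hypothetical.
    INSERT INTO hologres_sink_parsed
    SELECT
        content,
        JSON_VALUE(content, '$.action') AS action, -- returns NULL if the key does not exist
        operation,
        uid,
        `__topic__`,
        `__source__`,
        CAST(FROM_UNIXTIME(`__timestamp__`) AS TIMESTAMP),
        CAST(`__tag__`['__receive_time__'] AS BIGINT) AS receive_time
    FROM sls_input;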
  4. Query data in Hologres.

    You can query the data that is written from Log Service to Hologres by using Realtime Compute for Apache Flink, and process it further based on your business requirements. (Figure: Query data in Hologres)
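
    For example, the following queries are a minimal sketch of how you might explore the synchronized data in Hologres. The JSON key used with the ->> operator is a placeholder, so replace it with a key that exists in your content field.

    -- Count records per operation type.
    SELECT operation, COUNT(*) AS cnt
    FROM sls_flink_holo
    GROUP BY operation
    ORDER BY cnt DESC;

    -- Extract a field from the JSONB content column (the key name is a placeholder).
    SELECT uid, content ->> 'some_key' AS some_value
    FROM sls_flink_holo
    LIMIT 10;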

Write data from Log Service to Hologres by using DataWorks

  1. Prepare data in Log Service.

    Simulated data provided by Log Service is used as the source data in this example. The data simulates logon and consumption logs of a gaming platform. You can also use your own business data.

    1. Log on to the Log Service console.

    2. In the Import Data section, click the Simulated Data Import tab.

    3. On the Simulated Data Import tab, click Simulation in the Game Operation Logs section.

    4. In the Specify Logstore step, select the project and Logstore, and click Next.

    5. In the Simulated Data Import step, configure the parameters related to the time range and frequency, and click Import.

    6. The following figure shows the simulated fields and the data that you can query. For more information, see Query and analyze logs.

      The content field is of the JSON data type. (Figure: Simulation data)

  2. Create a table in Hologres.

    Create a table that is used to receive data in Hologres. You can create indexes for fields in the table based on your query requirements. This helps improve query efficiency. For more information about indexes, see Overview. The following DDL statement is used to create the table:

    Note
    • In this example, the uid field is configured as the primary key that is used to uniquely identify data. You can configure the primary key based on your business requirements.

    • The uid field is also configured as the distribution key. This way, data records that have the same value of the uid field are written to the same shard. This helps improve query performance.

    BEGIN;
    CREATE TABLE sls_dw_holo (
        content JSONB,
        operation TEXT,
        uid TEXT,
        C_Topic TEXT,
        C_Source TEXT,
        "timestamp" BIGINT,
        PRIMARY KEY (uid)
      );
    call set_table_property('sls_dw_holo', 'distribution_key', 'uid');
    call set_table_property('sls_dw_holo', 'event_time_column', 'timestamp');
    COMMIT;
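
    Because uid is both the primary key and the distribution key, a query that filters on uid is served from a single shard. The following query is a minimal sketch, and the uid value is a placeholder.

    -- Point lookup on the distribution key (the uid value is a placeholder).
    SELECT operation, C_Topic, "timestamp"
    FROM sls_dw_holo
    WHERE uid = 'some_uid';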
                            
  3. Configure the data source.

    Before you write data to Hologres by using the Data Integration service of DataWorks, you must add a Log Service data source and a Hologres data source to the DataWorks workspace.

  4. Synchronize data from Log Service to Hologres in real time.

    Create a real-time synchronization node and run the node in Data Integration. For more information, see Create a real-time synchronization node to synchronize incremental data from a single table and O&M for real-time synchronization nodes.

    In this example, Log Service LogHub is configured as the source and Hologres is configured as the destination of the real-time synchronization node. Field mappings are configured to synchronize data from the source to the destination. The following figure shows an example of the field mappings. (Figure: Field mappings)

  5. Query data in Hologres.

    After the real-time synchronization node is run, you can query the data that the Data Integration service of DataWorks writes to Hologres. (Figure: Query data in Hologres)
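
    For example, the following query is a minimal sketch that converts the BIGINT timestamp column (assumed to contain Unix timestamps in seconds) into dates and counts operations per day. Adjust the conversion if your timestamps use a different unit.

    -- Convert the Unix timestamp (assumed to be in seconds) and aggregate by day.
    SELECT to_timestamp("timestamp")::date AS log_date,
           operation,
           COUNT(*) AS cnt
    FROM sls_dw_holo
    GROUP BY 1, 2
    ORDER BY 1, 3 DESC;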