All Products
Search
Document Center

Hologres:Set up binlog replication for Hologres

Last Updated:Mar 26, 2026

Hologres binary logging (binlog) records every data change event at the table level, letting you replicate data to downstream systems, synchronize databases, or feed change events into stream-processing pipelines. When combined with Flink, binlog consumption enables end-to-end real-time data transformation across warehouse layers—from the operation data store (ODS) to the data warehouse dimension (DWD) layer and beyond—at lower latency than traditional batch-based approaches.

This topic covers how to enable binlogs on a table, understand the binlog record structure, query binlog data directly, and consume binlog events in real time.

How binlogs work in Hologres

Hologres binary logging is modeled after MySQL's binlog mechanism but adapted for a distributed real-time data warehouse:

  • Table-level control: Enable or disable binlogs per table, with an independent Time to Live (TTL) per table.

  • DML-only recording: Binlogs capture INSERT, UPDATE, and DELETE events. DDL operations are not logged.

  • Distributed binlogs: Because Hologres is a distributed system, binlog data is spread across shards. The log sequence number (LSN) increases monotonically within a shard but is not guaranteed to be continuous, unique, or ordered across shards. See Distribution Key.

  • Directly queryable: Query binlog data with SQL—useful for auditing, debugging, and offset inspection.

  • Not used for high availability: Unlike traditional databases, Hologres binlogs are not used for primary-replica synchronization or data restoration.

Prerequisites

Before you begin, make sure that:

  • Your Hologres instance is V0.9 or later. If it is earlier, contact Hologres support. See How do I get more online support?.

  • Your account has the permissions required to consume binlogs:

Version Method Required permission
V1.3.14 / V1.1.82 and later Flink SELECT on the target table
V1.3.14 / V1.1.82 and later Java Database Connectivity (JDBC) Replication Role
Earlier versions Any Superuser only. Accounts with fewer permissions encounter: permission denied for table hg_replication_slot_properties

Limitations

Limitation Details
Existing tables (V0.9 and V0.10) Cannot enable binlogs on existing tables by modifying properties. Recreate the table. In V1.1 and later, you can enable binlogs on existing tables without recreating them.
Partitioned tables Consume binlogs from child tables or standard (non-partitioned) tables only. Consumption from a parent table is not supported. In V1.3.24 and later, you can set a binlog TTL per child table. A child table with no TTL set inherits the parent table's TTL.
TTL precision The TTL is not precise. The system does not delete expired binlogs immediately—deletion occurs at some point after expiration.
Foreign tables Only Hologres internal tables support binlogs. Foreign tables do not.
Blink data types When using Realtime Compute (Blink) to consume binlogs, the TIMESTAMP type is not supported. Use TIMESTAMPTZ when creating tables. SMALLINT is also not supported.
Column-oriented tables with frequent updates Enabling binlogs on a column-oriented table has higher overhead than on a row-oriented table for write-heavy workloads. For tables that also serve online analytical processing (OLAP) queries, use the row-column hybrid storage format. See Table storage formats.

Binlog support matrix

Consumer Row-oriented table Column-oriented table Row-column hybrid table (V1.1+)
Realtime Compute (Blink) Supported Supported Supported
Fully Managed Flink Supported Supported Supported
Open source Flink Not supported Not supported Not supported
JDBC Supported (V1.1+) Supported (V1.1+) Supported (V1.1+)

Enable binlogs

Binlogs are disabled by default. Enable them by setting the binlog.level and binlog.ttl table properties when you create a table, or on an existing table in V1.1 and later.

Create a table with binlogs enabled

V2.1 and later (property names updated to binlog_level and binlog_ttl):

CREATE TABLE test_message_src (
    id int PRIMARY KEY,
    title text NOT NULL,
    body text
)
WITH (
    orientation = 'row',
    clustering_key = 'id',
    binlog_level = 'replica',
    binlog_ttl = '86400'  -- TTL in seconds (24 hours)
);

All versions:

BEGIN;
CREATE TABLE test_message_src (
    id int PRIMARY KEY,
    title text NOT NULL,
    body text
);
CALL set_table_property('test_message_src', 'orientation', 'row');       -- Row-oriented storage
CALL set_table_property('test_message_src', 'clustering_key', 'id');    -- Clustered index on id
CALL set_table_property('test_message_src', 'binlog.level', 'replica'); -- Enable binlogs
CALL set_table_property('test_message_src', 'binlog.ttl', '86400');     -- TTL in seconds (24 hours)
COMMIT;

Parameters:

Parameter Description
binlog_level / binlog.level Whether to enable binlogs. replica = enabled; none = disabled.
binlog_ttl / binlog.ttl Retention period for binlog data, in seconds. Default: 2,592,000 seconds (30 days).

For full CREATE TABLE syntax, see CREATE TABLE.

Enable or disable binlogs on an existing table

These operations require V1.1 or later.

Enable binlogs:

-- Enable binlogs
BEGIN;
CALL set_table_property('<table_name>', 'binlog.level', 'replica');
COMMIT;

-- Set binlog TTL (in seconds)
BEGIN;
CALL set_table_property('<table_name>', 'binlog.ttl', '2592000');
COMMIT;

Disable binlogs:

BEGIN;
CALL set_table_property('<table_name>', 'binlog.level', 'none');
COMMIT;

Update binlog TTL:

CALL set_table_property('<table_name>', 'binlog.ttl', '8640000');  -- Unit: seconds
In V1.3.24 and later, you can set a TTL on a child table of a partitioned table. A child table with no TTL set inherits the parent table's TTL.

Replace <table_name> with the actual table name.

Understand the binlog record format

Each binlog record contains three system fields followed by all user-defined table columns, in the same column order as the original table's DDL.

Field Type Description
hg_binlog_lsn BIGINT Binlog serial number. Increases monotonically within a shard. Not guaranteed to be continuous, unique, or ordered across shards.
hg_binlog_event_type BIGINT Type of change event. See event type values below.
hg_binlog_timestamp_us BIGINT System timestamp of the change event, in microseconds.
User table columns User-defined All columns from the source table, in DDL order.

hg_binlog_event_type values

Value Name Meaning
5 INSERT A row was inserted.
2 DELETE A row was deleted.
3 BEFORE_UPDATE The row state before an UPDATE.
7 AFTER_UPDATE The row state after an UPDATE.

How UPDATE events are recorded

An UPDATE operation produces two consecutive binlog records: the BEFORE_UPDATE record comes first, immediately followed by the AFTER_UPDATE record. The binlog subscription feature guarantees these two records are consecutive.

BEFORE_UPDATE (3) and AFTER_UPDATE (7) appear only when an UPDATE is executed as a plain SQL statement. When UPDATE is performed through a Hologres connector—such as Holo Client, Flink Connector, or Data Integration—the connector translates BEFORE_UPDATE to DELETE (2) and AFTER_UPDATE to INSERT (5) to ensure eventual consistency.

Structurally, a Hologres binlog behaves like a row-oriented table where hg_binlog_lsn is the key, and hg_binlog_event_type, hg_binlog_timestamp_us, and all original table columns form the value. Because binlog fields are fixed and strongly typed, row-oriented or row-column hybrid tables deliver better read performance when binlogs are enabled.

Query binlogs

Query binlog records directly

Combine the three system fields with the table's original columns in a SELECT statement:

SELECT hg_binlog_lsn, hg_binlog_event_type, hg_binlog_timestamp_us, *
FROM test_message_src;

The following figure shows a sample result.

image

Query the earliest or latest binlog on a shard

Use hg_get_binlog_cursor to retrieve the first or last binlog entry on a specific shard:

-- Earliest binlog on shard 0
SELECT * FROM hg_get_binlog_cursor('test_message_src', 'OLDEST', 0);

-- Latest binlog on shard 0
SELECT * FROM hg_get_binlog_cursor('test_message_src', 'LATEST', 0);

Syntax:

SELECT * FROM hg_get_binlog_cursor('<table_name>', 'OLDEST' | 'LATEST', <shard_id>);

The following figure shows a sample result for the OLDEST query.

image

Query binlog position by LSN

Use hg_get_binlog_cursor_by_lsn to find the first binlog record whose LSN is greater than or equal to the specified value. If no such record exists, hg_binlog_timestamp_us in the result returns the current time.

SELECT * FROM hg_get_binlog_cursor_by_lsn('test_message_src', 152, 0);

Syntax:

SELECT * FROM hg_get_binlog_cursor_by_lsn('<table_name>', <lsn>, <shard_id>);
-- <lsn>: BIGINT

The following figure shows a sample result.

image

Query binlog position by timestamp

Use hg_get_binlog_cursor_by_timestamp to find the first binlog record with a timestamp greater than or equal to the specified time. If the specified time is later than the latest binlog entry, hg_binlog_timestamp_us returns the current time and hg_binlog_lsn returns the LSN that will be assigned to the next inserted row.

If the provided timestamp is later than the current time returned by now(), the query throws a Get binlog cursor in future time exception.
SELECT *, to_timestamp(hg_binlog_timestamp_us / 1000000.0)
FROM hg_get_binlog_cursor_by_timestamp('test_message_src', '2024-05-20 19:34:53.791+08', 0);

Syntax:

SELECT * FROM hg_get_binlog_cursor_by_timestamp('<table_name>', <timestamp>, <shard_id>);

The following figure shows a sample result.

image

Consume binlogs in real time

Choose a consumption method based on your environment:

Method When to use
Flink or Realtime Compute (Blink) Stream-processing pipelines. Supported for row-oriented, column-oriented, and row-column hybrid tables.
JDBC (including Holo Client) Custom consumers and application-level change data capture (CDC). Requires V1.1+ and the Replication Role.

Manage binlog tables

View tables with binlogs enabled

Run the following query to list all tables with binlogs enabled in the current database:

SELECT *
FROM hologres.hg_table_properties
WHERE property_key = 'binlog.level'
  AND property_value = 'replica';

The following figure shows a sample result.

Tables with binlogs enabled

View binlog storage size

Suppress binlog generation for a session

To run DML statements without generating binlog records, set the following Grand Unified Configuration (GUC) parameter at the session level before executing the DML statement:

SET hg_experimental_generate_binlog = off;

This setting applies only to the current session and must be set before each DML statement for which you want to suppress binlog output.