All Products
Search
Document Center

Hologres:Mirroring acceleration

Last Updated:Oct 23, 2025

Hologres V4.0 and later lets you mirror MaxCompute tables. After you enable mirroring, Hologres synchronizes data to its local storage in near-real time or at specified intervals. If the metadata or data in the external data source changes, Hologres automatically updates the mirrored data. You can mirror an entire table or specific partitions of a partitioned table. You can also use parameters to limit the data range for mirroring. This topic describes how to use the mirroring feature.

Features

You can use the ALTER EXTERNAL TABLE command to manually enable data mirroring for MaxCompute tables. This feature supports the following capabilities:

  • Enable mirroring for standard MaxCompute tables. This feature does not support MaxCompute tables of the Append2.0, atomicity, consistency, isolation, and durability (ACID) 1.0, or Delta Table types.

  • Mirror entire partitioned and non-partitioned tables, or mirror specific partitions of a partitioned table.

  • When you enable data mirroring, you can set hg_experimental_maxcompute_sdk_split_file_num_limit = 1 to ensure split stability.

  • This feature supports schema evolution for foreign tables. Supported operations include adding, deleting, reordering, and renaming columns. For unsupported operations, Hologres automatically falls back to directly reading the foreign table. If a schema evolution occurs, the mirroring process automatically restarts.

  • Automatically discover and synchronize partitions of MaxCompute tables.

  • You can set common indexes on mirrored lake tables to improve query performance. Indexes help SQL queries hit data faster, reduce I/O, and use fewer computing resources. For more information, see CREATE TABLE. The supported indexes include the following:

    • data_mirroring_clustering_key

    • data_mirroring_bitmap_columns

    • data_mirroring_dictionary_encoding_columns

  • You can query mirrored data using primary and secondary instances, compute group instances, and Serverless resources.

  • You can view the progress of mirror synchronization.

  • You can view the storage usage of mirrored data at the instance, DB, and table levels.

  • You can modify the data mirroring configuration for a specific table.

  • You can disable the data mirroring feature for a specific table.

Prerequisites

Notes

Important
  • Only foreign tables whose metadata is mapped using an External Database are supported. Foreign tables created using Import Foreign Schema or Create Foreign Table are not supported.

  • Mirroring is not supported for MaxCompute tables of the Append2.0, ACID 1.0, or Delta Table types.

  • The MaxCompute table cannot contain data types that Hologres does not support. Otherwise, the mirroring operation fails. For more information about the data types that are supported by Hologres, see Data types.

Syntax

The SQL command is as follows:

alter external table <ext_db.ext_schema.ext_table> 
set(
   -- Enable mirroring for the foreign table
  data_mirroring_speed_up_enable = 'on|off', 
  [data_mirroring_partition_num='1~N|all',] 
  [data_mirroring_partition_list='pt1,pt2...',] 
  
  -- Specify the resources used for the mirror build process
  [data_mirroring_guc_hg_computing_resource='[serverless | local]',]
  [data_mirroring_guc_hg_experimental_serverless_computing_required_cores='<num>',] 

  -- Set indexes
  [data_mirroring_clustering_key='[columnName{:asc]} [,...]]',]
  [data_mirroring_dictionary_encoding_columns='[columnName [,...]]',]
  [data_mirroring_bitmap_columns='[columnName [,...]]',]
  );

Parameters

Parameter

Description

Required

Details

data_mirroring_speed_up_enable

Enables or disables the mirroring feature for the foreign table.

Yes

ON: Enables mirroring.

OFF: Disables mirroring.

data_mirroring_partition_num

Specifies the n most recent partitions to mirror. This parameter applies only to partitioned tables.

No

ALL: All partitions.

1~N: n partitions.

data_mirroring_partition_list

Specifies a list of partitions to mirror. This parameter applies only to partitioned tables.

No

Partition name

data_mirroring_guc_hg_computing_resource

Specifies the resources used for the mirror build process.

No

Serverless: Specifies that the data mirroring uses a Serverless task.

Local: Default value. Specifies that the resources of the current instance are used.

data_mirroring_guc_hg_experimental_serverless_computing_required_cores

Specifies the number of cores required for the Serverless resource.

No

Data type: positive integer. This parameter is required only when data_mirroring_guc_hg_computing_resource is set to Serverless.

Usage

Create an external database

You can create an external database to map the metadata of MaxCompute tables to Hologres. For more information, see CREATE EXTERNAL DATABASE.

The SQL command is as follows:

-- Globally enable the bigmeta link.

alter role  all set hg_experimental_enable_big_meta_on_maxcompute_sdk = on;

-- Globally enable the CommonTable direct read link. You must manually enable this for instances upgraded to version 4.0. This is not required for new instances because it is enabled by default.

alter role  all set hg_experimental_external_catalog_routing = 'odps:common_table';

-- Generate stable data shards.

alter role  all set hg_experimental_maxcompute_sdk_split_file_num_limit = 1;

-- Create an external database.

  CREATE EXTERNAL DATABASE ext_db WITH
    metastore_type 'maxcompute'
    mc_project '<mc_project_name>'
  ;

Enable mirroring for a foreign table

Non-partitioned table

The SQL command is as follows:

alter external table <ext_db.ext_schema.ext_table> 
set(
   -- Enable mirroring for the foreign table
  data_mirroring_speed_up_enable = 'on', 
   
  -- Specify the resources used for the mirror build process
  data_mirroring_guc_hg_computing_resource='local',
 
  -- Set indexes
  data_mirroring_clustering_key='c_int:asc,c_char:desc',
  data_mirroring_dictionary_encoding_columns='c_int:on,c_char:auto',
  data_mirroring_bitmap_columns='c_int:on'
  );

Partitioned table

  • Specify the number of partitions

The SQL command is as follows:

alter external table <ext_db.ext_schema.ext_table> 
set(
   -- Enable mirroring for the foreign table
  data_mirroring_speed_up_enable = 'on', 
  data_mirroring_partition_num='7', 
  
  -- Specify the resources used for the mirror build process
  data_mirroring_guc_hg_computing_resource='serverless',
  data_mirroring_guc_hg_experimental_serverless_computing_required_cores='16', 

  -- Set indexes
  data_mirroring_clustering_key='c_int:asc,c_char:desc',
  data_mirroring_dictionary_encoding_columns='c_int:on,c_char:auto',
  data_mirroring_bitmap_columns='c_int:on'
  );
  • Specify all partitions

The following is the SQL command:

alter external table <ext_db.ext_schema.ext_table> 
set(
   -- Enable mirroring for the foreign table
  data_mirroring_speed_up_enable = 'on', 
  data_mirroring_partition_num='all', 
  
  -- Specify the resources used for the mirror build process
  data_mirroring_guc_hg_computing_resource='local',

  -- Set indexes
  data_mirroring_clustering_key='c_int:asc,c_char:desc',
  data_mirroring_dictionary_encoding_columns='c_int:on,c_char:auto',
  data_mirroring_bitmap_columns='c_int:on'
  );
  • You can specify a list of partitions.

The SQL command is as follows:

alter external table <ext_db.ext_schema.ext_table> 
set(
   -- Enable mirroring for the foreign table
  data_mirroring_speed_up_enable = 'on', 
  data_mirroring_partition_list='pt1,pt2...', 
  
  -- Specify the resources used for the mirror build process
  data_mirroring_guc_hg_computing_resource='local',

  -- Set indexes
  data_mirroring_clustering_key='c_int:asc,c_char:desc',
  data_mirroring_dictionary_encoding_columns='c_int:on,c_char:auto',
  data_mirroring_bitmap_columns='c_int:on'
  );
  • You can specify the number of partitions or the specific partitions to retain as historical mirrored partitions.

The SQL command is as follows:

alter external table <ext_db.ext_schema.ext_table> 
set(
   -- Enable mirroring for the foreign table
  data_mirroring_speed_up_enable = 'on',
  data_mirroring_partition_num='7', 
  data_mirroring_partition_list='pt1', 
  
  -- Specify the resources used for the mirror build process
  data_mirroring_guc_hg_computing_resource='local',

  -- Set indexes
  data_mirroring_clustering_key='c_int:asc,c_char:desc',
  data_mirroring_dictionary_encoding_columns='c_int:on,c_char:auto',
  data_mirroring_bitmap_columns='c_int:on'
  );

View the mirror build status

The SQL command is as follows:

select * from hologres.hg_datalake_get_mirror_status();

The following result is returned:

image.png

Parameters

Parameter

Description

external_db_name

The external database where the table for the mirror build task resides.

external_schema_name

The external schema where the table for the mirror build task resides.

external_table_name

The external table for the mirror build task.

partition

The partition in the external table for the mirror build task.

mirror_data_size

The size of the mirror file for the table or partition.

mirror_start_time

The start time of the mirror build task.

mirror_last_update_time

The last update time of the mirrored data.

total_file_count

The total number of files.

mirrored_file_count

The number of mirrored files.

Modify the mirror configuration

The SQL command is as follows:

alter external table <ext_db.ext_schema.ext_table> 
set(
   -- Enable mirroring for the foreign table
  data_mirroring_speed_up_enable = 'on', 
  data_mirroring_partition_num='12'
  );

Check whether a query uses mirror files

You can use explain analyze to view the execution plan. In the results, check the values of Meta mirror table count and Data mirror file count to check whether the mirror was hit.

The SQL command is as follows:

explain analyze  select xx from <ext_db.ext_schema.ext_table>;

The following result is returned:

QUERY PLAN
Gather  (cost=0.00..5.80 rows=1000 width=116)
[128:1 id=100002 dop=1 time=26/26/26ms rows=107995(107995/107995/107995) mem=0/0/0B open=0/0/0ms get_next=26/26/26ms * ]
  ->  Local Gather  (cost=0.00..5.01 rows=1000 width=116)
      [id=2 dop=128 time=13/0/0ms rows=107995(97688/843/0) mem=0/0/0B open=0/0/0ms get_next=13/0/0ms local_dop=0/0/0]
        ->  Seq Scan on sls_ads_table  (cost=0.00..5.01 rows=1000 width=116)
              Foreign Table Type: DLF
            [id=1 split_count=128 time=13/0/0ms rows=107995(97688/843/0) mem=37/5/4KB open=0/0/0ms get_next=13/0/0ms physical_reads=154(140/77/14) scan_rows=107995(97688/8999/671)]

ADVICE: 
"[node id : 1] Table default."sls_ads_table" Miss Stats! please run 'analyze default."sls_ads_table"'. "

Query id:[1002007457093288810]
QE version: 2.0
Query Queue: init_warehouse.default_queue
======================cost======================
Total cost:[150] ms
Fast statistics cost:[0] ms
Optimizer cost:[73] ms
Build execution plan cost:[10] ms
Init execution plan cost:[9] ms
Start query cost:[13] ms
- Queue cost: [0] ms
- Wait schema cost:[0] ms
- Lock query cost:[0] ms
- Create dataset reader cost:[0] ms
- Create split reader cost:[3] ms
Get result cost:[45] ms
- Get the first block cost:[1] ms
====================resource====================
Memory: total 7 MB. Worker stats: max 3 MB, avg 3 MB, min 3 MB, max memory worker id: 1889346452478863373.
CPU time: total 96 ms. Worker stats: max 82 ms, avg 48 ms, min 14 ms, max CPU time worker id: 1889346412692090879.
DAG CPU time stats: max 77 ms, avg 38 ms, min 0 ms, cnt 2, max CPU time dag id: 1.
Fragment CPU time stats: max 66 ms, avg 25 ms, min 0 ms, cnt 3, max CPU time fragment id: 2.
Ec wait time: total 139 ms. Worker stats: max 79 ms, max(max) 2 ms, avg 69 ms, min 60 ms, max ec wait time worker id: 1889346452478863373, max(max) ec wait time worker id: 1889346412692090879.
Physical read bytes: total 1 MB. Worker stats: max 1 MB, avg 0 MB, min 0 MB, max physical read bytes worker id: 1889346412692090879.
Read bytes: total 7 MB. Worker stats: max 7 MB, avg 3 MB, min 0 MB, max read bytes worker id: 1889346412692090879.
DAG instance count: total 3. Worker stats: max 2, avg 1, min 1, max DAG instance count worker id: 1889346452478863373.
Fragment instance count: total 257. Worker stats: max 129, avg 128, min 128, max fragment instance count worker id: 1889346452478863373.
Meta mirror table count: use 1 miss 0.
Data mirror file count: use 12 miss 0.

Disable mirroring for a foreign table

The SQL command is as follows:

alter external table <ext_db.ext_schema.ext_table> 
set(
  data_mirroring_speed_up_enable = 'off'
  );
Note

After you disable mirroring, the system no longer synchronizes data to Hologres. Queries on the foreign table directly access the data in the data lake. The mirrored data files within Hologres are asynchronously deleted after 30 minutes.

View the mirror configuration

The SQL command is as follows:

SELECT * FROM hologres.hg_datalake_get_mirror_config();

View the mirror progress

You can use the following SQL command to view the progress. In the command, external_database_name is the name of the External Database.

The SQL command is as follows:

SELECT * 
FROM hologres.hg_user_datalake_mirror_cron_tasks 
WHERE command::jsonb->>'external_db_name'='<ext_db>'
ORDER BY start_time DESC;