This article addresses data corruption issues — missing data and duplicate primary keys — that affect logical partitioned tables in Hologres V4.0.0 through V4.0.8.
Am I affected?
This bug affects your instance only if all three conditions are true:
-
Your Hologres instance version is V4.0.0 to V4.0.8.
-
The table is a logical partitioned table with column-oriented or hybrid row-columnar storage.
-
The table has an explicitly defined clustering key.
If all three apply, continue to the sections below.
What's happening
The multi-partition flush optimization in V4.0.0–V4.0.8 has a bug in its clustering index implementation. The bug corrupts AliORC data files (data files of column-oriented tables), causing two symptoms:
-
Missing data: Queries fail to locate existing records because the clustering index points to the wrong location.
-
Duplicate primary keys: During updates, Hologres cannot find the existing record and inserts a new one instead, violating the primary key constraint. This surfaces as a
duplicate key value violates unique constrainterror.
After upgrading to V4.0.9 or later, Hologres detects corrupt index files at query time and returns:
clustered_index size [x] incorrect.
Solution
Step 1: Upgrade your instance
Upgrade to V4.0.9 or later. V4.0.9 prevents queries from silently reading corrupt data — affected queries now fail with the error above, making it safe to identify and repair the corrupted tables.
Step 2: Find tables with corrupted indexes
Run the following SQL against each database. It generates one SELECT ... GROUP BY ... HAVING COUNT(1) > 1 statement per partition.
WITH logical_partition_tables AS (
SELECT DISTINCT
p1.table_namespace,
p1.table_name
FROM hologres.hg_table_properties p1
WHERE p1.property_key = 'logical_partition_columns'
AND EXISTS (
SELECT 1
FROM hologres.hg_table_properties p2
WHERE p2.table_namespace = p1.table_namespace
AND p2.table_name = p1.table_name
AND p2.property_key = 'orientation'
AND p2.property_value IN ('column', 'row,column', 'column,row')
)
),
tables_with_pk AS (
SELECT
l.table_namespace,
l.table_name,
p.property_value AS pk_expr
FROM logical_partition_tables l
JOIN hologres.hg_table_properties p
ON l.table_namespace = p.table_namespace
AND l.table_name = p.table_name
WHERE p.property_key = 'primary_key'
),
partition_cols AS (
SELECT
l.table_namespace,
l.table_name,
p.property_value AS part_expr
FROM logical_partition_tables l
JOIN hologres.hg_table_properties p
ON l.table_namespace = p.table_namespace
AND l.table_name = p.table_name
WHERE p.property_key = 'logical_partition_columns'
),
all_partitions AS (
SELECT
t.table_namespace,
t.table_name,
t.pk_expr,
pc.part_expr,
part.partition
FROM tables_with_pk t
JOIN partition_cols pc
ON t.table_namespace = pc.table_namespace
AND t.table_name = pc.table_name
CROSS JOIN LATERAL (
SELECT partition
FROM hologres.hg_list_logical_partition(
(t.table_namespace || '.' || t.table_name)::TEXT
)
) AS part
),
split_kv AS (
SELECT
table_namespace,
table_name,
pk_expr,
partition,
TRIM(SPLIT_PART(kv, '=', 1)) AS col_name,
TRIM(SPLIT_PART(kv, '=', 2)) AS col_val
FROM all_partitions,
UNNEST(STRING_TO_ARRAY(partition, '/')) AS kv
),
final_where AS (
SELECT
table_namespace,
table_name,
pk_expr,
partition,
STRING_AGG(
FORMAT('%I = %L', col_name, col_val),
' AND '
ORDER BY col_name
) AS where_clause
FROM split_kv
GROUP BY table_namespace, table_name, pk_expr, partition
)
SELECT
FORMAT(
'-- Check PK duplicates in %I.%I, partition: %s' || E'\n' ||
'SELECT %s, COUNT(1) FROM %I.%I WHERE %s GROUP BY %s HAVING COUNT(1) > 1;',
table_namespace, table_name, partition,
pk_expr,
table_namespace, table_name,
where_clause,
pk_expr
) AS detection_sql
FROM final_where
ORDER BY table_namespace, table_name, partition;
This query returns one check statement per partition. Sample output:
Run the generated statements. Two outcomes are possible:
-
No error: The index is intact for that partition.
-
clustere_index size mismatcherror: The index is corrupted. Run a full compaction on the affected table to rebuild the index:SELECT hologres.hg_full_compact_table('schema.table_name','max_file_size_mb=1, reclaim_deleted_data_space = false');
After compaction, re-run the check statements for the repaired table. If they complete without error, the index is correct.
Step 3: Find and remove duplicate primary keys
Run the following SQL to generate duplicate-key check statements for each partition.
WITH logical_partition_tables AS (
SELECT DISTINCT
p1.table_namespace,
p1.table_name
FROM hologres.hg_table_properties p1
WHERE p1.property_key = 'logical_partition_columns'
AND EXISTS (
SELECT 1
FROM hologres.hg_table_properties p2
WHERE p2.table_namespace = p1.table_namespace
AND p2.table_name = p1.table_name
AND p2.property_key = 'orientation'
AND p2.property_value IN ('column', 'row,column')
)
),
tables_with_pk AS (
SELECT
l.table_namespace,
l.table_name,
p.property_value AS pk_expr
FROM logical_partition_tables l
JOIN hologres.hg_table_properties p
ON l.table_namespace = p.table_namespace
AND l.table_name = p.table_name
WHERE p.property_key = 'primary_key'
),
partition_cols AS (
SELECT
l.table_namespace,
l.table_name,
p.property_value AS part_expr
FROM logical_partition_tables l
JOIN hologres.hg_table_properties p
ON l.table_namespace = p.table_namespace
AND l.table_name = p.table_name
WHERE p.property_key = 'logical_partition_columns'
),
all_partitions AS (
SELECT
t.table_namespace,
t.table_name,
t.pk_expr,
pc.part_expr,
part.partition
FROM tables_with_pk t
JOIN partition_cols pc
ON t.table_namespace = pc.table_namespace
AND t.table_name = pc.table_name
CROSS JOIN LATERAL (
SELECT partition
FROM hologres.hg_list_logical_partition(
(t.table_namespace || '.' || t.table_name)::TEXT
)
) AS part
),
split_kv AS (
SELECT
table_namespace,
table_name,
pk_expr,
partition,
TRIM(SPLIT_PART(kv, '=', 1)) AS col_name,
TRIM(SPLIT_PART(kv, '=', 2)) AS col_val
FROM all_partitions,
UNNEST(STRING_TO_ARRAY(partition, '/')) AS kv
),
final_where AS (
SELECT
table_namespace,
table_name,
pk_expr,
partition,
STRING_AGG(
FORMAT('%I = %L', col_name, col_val),
' AND '
ORDER BY col_name
) AS where_clause
FROM split_kv
GROUP BY table_namespace, table_name, pk_expr, partition
)
SELECT
FORMAT(
'-- Check PK duplicates in %I.%I, partition: %s' || E'\n' ||
'SELECT %s, COUNT(1) FROM %I.%I WHERE %s GROUP BY %s HAVING COUNT(1) > 1;',
table_namespace, table_name, partition,
pk_expr,
table_namespace, table_name,
where_clause,
pk_expr
) AS detection_sql
FROM final_where
ORDER BY table_namespace, table_name, partition;
Run the generated statements. Sample output:
Two outcomes are possible:
-
No rows returned: No duplicate primary keys exist in that partition.
-
Rows returned: Duplicate primary keys are present. Run a full compaction followed by the deduplication procedure:
-- Step 1: compact the table select hologres.hg_full_compact_table('schema.table_name','max_file_size_mb=1'); -- Step 2: remove duplicate primary keys call public.hg_remove_duplicated_pk('schema.table_name','max_file_size_mb=1');If step 2 returns a
Query exceed memory limiterror, the data volume is too large to process the whole table at once. Remove duplicates partition by partition. The SQL comments in the generated statements from the previous step identify which partitions contain duplicate data.call public.hg_remove_duplicated_pk('schema.table_name_1', 'dt_int=1'); call public.hg_remove_duplicated_pk('schema.table_name_2', 'dt_text=''A''');
After deduplication, re-run the check statements for the affected table. No rows returned means the duplicates are resolved.
Step 4: Compact tables without a clustering key (optional)
Tables with logical partitions but no clustering key are not directly affected by the index corruption bug, but they can still experience duplicate primary keys and cluster index size mismatch errors during real-time write-backs.
Run the following to generate full compaction commands for these tables:
SELECT
FORMAT(
'SELECT hologres.hg_full_compact_table(%L, ''max_file_size_mb=1''); -- Table: %I.%I (has logical partition but NO clustering_key)',
table_namespace || '.' || table_name,
table_namespace,
table_name
) AS compact_sql
FROM (
SELECT DISTINCT
p1.table_namespace,
p1.table_name
FROM hologres.hg_table_properties p1
WHERE p1.property_key = 'logical_partition_columns'
AND EXISTS (
SELECT 1
FROM hologres.hg_table_properties p2
WHERE p2.table_namespace = p1.table_namespace
AND p2.table_name = p1.table_name
AND p2.property_key = 'orientation'
AND p2.property_value IN ('column', 'row,column')
)
) AS tables_missing_ck
ORDER BY table_namespace, table_name;
Sample output:
Run the generated compaction commands during off-peak hours.