Hologres: Compaction (beta)

Last Updated: Dec 29, 2023

After you import a large amount of data offline or run many DELETE or UPDATE operations, data files can become fragmented, which degrades read and write performance. Compaction addresses this by merging multiple small data files into larger ones and reorganizing the data storage structure, which improves read and write efficiency. This topic describes how to perform compaction in Hologres.

Background information

Hologres writes data using a model similar to a log-structured merge-tree (LSM-tree). Data is written to storage in append-only mode, which turns random writes into sequential writes and improves write throughput. Because append-only writes keep producing new data files, compaction is required to merge the written files into larger data files.

Hologres supports two types of compaction operations:

  • Auto Compaction

    Auto compaction in Hologres is hierarchical, with up to five levels. When the number of files at a level reaches five, compaction is triggered, and the merged file is placed at the next level. For example, when the number of files at Level 0 reaches five, compaction merges those five files. By default, the maximum size of a merged file is 64 MB, so if the merged data exceeds 64 MB, compaction generates multiple files. The merged files are placed at Level 1, as shown in the following figure.

    [Figure: auto compaction merges the files at Level 0 and places the merged files at Level 1.]

  • Full Compaction

    Auto compaction takes effect only within a specific level and does not merge files across levels. Full compaction merges all files at all levels. By default, the maximum size of a merged file is 64 MB. The merged files are placed at the last level.
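
The two compaction types described above can be illustrated with a small simulation. This is a minimal sketch, not the Hologres implementation: the function names are hypothetical, file sizes are plain integers in MB, the trigger of five files follows the Level 0 example, and the last level is simply never merged further, which is an assumption made for simplicity.

```python
MAX_LEVELS = 5        # up to five levels are allowed
FILES_PER_LEVEL = 5   # assumed trigger, taken from the Level 0 example
MAX_FILE_MB = 64      # default maximum size of a merged file


def split_into_files(total_mb, max_file_mb=MAX_FILE_MB):
    """Split merged data into files no larger than max_file_mb."""
    files = []
    remaining = total_mb
    while remaining > 0:
        size = min(remaining, max_file_mb)
        files.append(size)
        remaining -= size
    return files


def auto_compact(levels, trigger=FILES_PER_LEVEL):
    """Merge every level that reached the trigger into the next level.

    Files are only merged within one level; the result moves one level down.
    """
    for i in range(len(levels) - 1):
        if len(levels[i]) >= trigger:
            merged = split_into_files(sum(levels[i]))
            levels[i] = []
            levels[i + 1].extend(merged)
    return levels


def full_compact(levels, max_file_mb=MAX_FILE_MB):
    """Merge all files at all levels and place the result at the last level."""
    total = sum(sum(level) for level in levels)
    result = [[] for _ in levels]
    result[-1] = split_into_files(total, max_file_mb)
    return result


levels = [[] for _ in range(MAX_LEVELS)]
for _ in range(5):
    levels[0].append(20)   # each write flushes a hypothetical 20 MB file
    auto_compact(levels)
print(levels)              # [[], [64, 36], [], [], []]

levels[0].append(20)       # one more small file arrives at Level 0
print(full_compact(levels))  # [[], [], [], [], [64, 56]]
```

In the first printout, the five 20 MB files at Level 0 add up to 100 MB, which exceeds the 64 MB limit, so two files land at Level 1. The second printout shows full compaction collapsing every level into the last one.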

Limits

  • Only Hologres V2.1 and later allow you to manually trigger full compaction. If your Hologres instance version is earlier than V2.1, manually upgrade your Hologres instance in the Hologres console or join a Hologres DingTalk group to apply for an instance upgrade. For more information about how to manually upgrade a Hologres instance, see Instance upgrades. For more information about how to join a Hologres DingTalk group, see Obtain online support for Hologres.

  • You can trigger full compaction only for column-oriented tables and row-column hybrid tables.

  • If you perform full compaction on row-column hybrid tables, full compaction is performed only on data that is stored in column-oriented storage mode.
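
The limits above can be expressed as a small pre-check. This is a sketch under stated assumptions: the helper names are hypothetical, the version is passed in as a string such as "V2.1", and the storage mode is assumed to be given as "column", "row", or "row,column". How you obtain these two values from your instance is outside the scope of this sketch.

```python
def parse_version(text):
    """Turn 'V2.1' or '2.1.15' into a tuple of ints such as (2, 1) or (2, 1, 15)."""
    cleaned = text.strip().lstrip("Vv")
    return tuple(int(part) for part in cleaned.split("."))


def can_full_compact(version, orientation):
    """Return (allowed, reason) for manually triggered full compaction.

    orientation is an assumed representation of the table storage mode:
    'column', 'row', or 'row,column' for a row-column hybrid table.
    """
    if parse_version(version) < (2, 1):
        return False, "instance version is earlier than V2.1"
    modes = {mode.strip() for mode in orientation.lower().split(",")}
    if "column" not in modes:
        return False, "table is not column-oriented or row-column hybrid"
    if "row" in modes:
        return True, "only the column-oriented data is compacted"
    return True, "table is column-oriented"


print(can_full_compact("V2.1", "column"))
print(can_full_compact("V2.0", "column"))
print(can_full_compact("V2.2", "row,column"))
```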

Usage notes

  • Scenarios

    You can trigger full compaction in the following scenarios to merge small files and improve query efficiency:

    • A large amount of data is imported in offline mode.

    • A large number of DELETE or UPDATE operations are performed.

    Note

    Full compaction consumes a large amount of I/O and CPU resources. We recommend that you perform full compaction during off-peak hours, when write traffic is low. In most cases, full compaction takes 10 minutes or longer.

  • Syntax

    SELECT hologres.hg_full_compact_table(
      '<schema_name.table_name>'
      [,'max_file_size_mb=<value>']
    );

  • Parameters

    | Parameter | Description | Required | Default value |
    | --- | --- | --- | --- |
    | schema_name.table_name | The name of the table on which you want to perform full compaction. | Yes | No default value |
    | max_file_size_mb | The maximum size of a file that is generated by full compaction. The value must be a positive integer, in MB. We recommend that you retain the default value. If you decrease the value of this parameter, the number of generated data files increases, and data queries slow down. | No | 64 |

  • Examples

    • Perform full compaction on the table named public.lineitem:

      SELECT hologres.hg_full_compact_table('public.lineitem');

    • Perform full compaction on the table named public.lineitem, and set the maximum merged file size to 256 MB:

      SELECT hologres.hg_full_compact_table(
       'public.lineitem',
       'max_file_size_mb=256'
      );
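
If you issue the statement from a script, the syntax and the max_file_size_mb trade-off described above can be sketched as follows. The helper names build_full_compact_sql and estimate_file_count are hypothetical and not part of Hologres. The file-count estimate is a rough lower bound that assumes every generated file is filled up to the limit, and it ignores how data is distributed inside the instance.

```python
import math
import re

# Plain identifiers only, so that the generated statement cannot be broken by quotes.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_full_compact_sql(schema, table, max_file_size_mb=None):
    """Build the hg_full_compact_table call shown in the Syntax section."""
    for name in (schema, table):
        if not _IDENT.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    args = [f"'{schema}.{table}'"]
    if max_file_size_mb is not None:
        is_int = isinstance(max_file_size_mb, int) and not isinstance(max_file_size_mb, bool)
        if not is_int or max_file_size_mb <= 0:
            raise ValueError("max_file_size_mb must be a positive integer")
        args.append(f"'max_file_size_mb={max_file_size_mb}'")
    return f"SELECT hologres.hg_full_compact_table({', '.join(args)});"


def estimate_file_count(total_mb, max_file_size_mb=64):
    """Rough lower bound on the number of files after full compaction."""
    return math.ceil(total_mb / max_file_size_mb)


print(build_full_compact_sql("public", "lineitem"))
print(build_full_compact_sql("public", "lineitem", 256))

# For a hypothetical 1024 MB of column-oriented data:
for limit in (16, 64, 256):
    print(limit, estimate_file_count(1024, limit))
```

Under these assumptions, lowering the limit from 64 MB to 16 MB raises the estimate from 16 files to 64, which is consistent with the note that a smaller value produces more files and slower queries.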