This topic describes the basic principles, scenarios, policies, operation guide, and FAQ of specifying the table group and shard count in Hologres for optimized query performance.

Usage notes

When you specify a table group, we recommend that you comply with the following rules:
  • If unnecessary, do not create a table group.
  • If the data amount is large, create an independent table group with a large shard count.
  • Create an independent table group with a small shard count for a group of tables with a small data amount. This reduces the query startup overhead.
  • Try best to add the tables involved in local join operations to the same table group.

Background information

Hologres is a distributed real-time data warehouse engine that has high performance and separates computing from storage. It stores data on shards located in the underlying storage system. Hologres allows users to specify table groups and set the shard count as required. Compared with similar services, Hologres is more flexible and easy to use because users can customize schemas based on their specific business scenarios. With a good understanding of their own business, users can make full use of Hologres to achieve high performance.

In Hologres, a database contains zero or more table groups. Each table group contains multiple tables, and each table can belong to only one table group. A table group matches a group of shards, which are responsible for storing data to and querying data in the tables in this table group. The number of shards for a table group is called the shard count. This number cannot be changed once a table group is created.

You can improve the data storage and computing efficiency by specifying an appropriate table group and shard count. Inappropriate settings for the table group and shard count result in degraded performance, which cannot be adjusted to optimal.

Note For more information about basic terms such as table group and shard count, see Terms.

Scenarios

By default, one database has one table group in a Hologres instance, and table groups for different instance types are configured with different shard counts. For more information, see Instance types. The default shard counts are the best parameters that have been verified by large-scale experiments and can meet the requirements in most scenarios. If you have no special requirements, we recommend that you do not create a table group or modify the shard count.

To perform this best practice, you must have a good understanding of the implementation and architecture of Hologres and be experienced in programming. This best practice is applicable to the following scenarios of modifying the table group and shard count:

  1. If you have a large amount of data to be written, the query startup overhead for a small amount of data is too large, or the query shuffle overhead is too large, we recommend that you create a table group. The query startup overhead is positively related to the shard count.
  2. You can greatly improve the join efficiency and performance by adding two or more tables with the join key as the distribution key to a table group and performing local join operations in the table group.

Table group planning

You can determine whether to create a table group and how to select a table group for your tables based on the following basic principles:

  • Plan 1: Use the default table group

    If you use Hologres to meet the following conditions, we recommend that you directly use the default table group:

    • Data amount

      If the shard count of the default table group meets the requirement of the data amount, you can use the default table group to directly create tables.

    • Overall size

      The total data amount of all tables is controllable and predictable. The usage mode does not significantly change.

    • Local join

      You need to perform efficient local join operations on tables in the default table group.

  • Plan 2: Create a table group

    If the default table group cannot meet your requirements, you may need multiple table groups. Typically, multiple table groups may be required in your instance under the following conditions:

    • Data amount

      If the estimated data amount of the table to be created is small, a table group that has a small shard count is appropriate for the table. Otherwise, a table group that has a large shard count is appropriate for the table. This is the most common reason why you need multiple table groups.

    • Independent load

      Existing table groups contain a large number of tables, and data needs to be simultaneously written to most of the tables. Consequently, the load of the instance is high. In addition, the table to be created requires high query and write throughput. In this case, multiple table groups can make data write and query of this table independent from other tables to some extent. Alternatively, if you troubleshoot the issue and determine that the existing table groups cannot meet the write and query requirements, multiple table groups are also required.

    • Table correlation

      If a set of tables in your business have a unique data write or query pattern, have or will have local join requirements, and have few or no correlation with tables in existing table groups, you can create multiple independent table groups for the set of tables. Local join operations can be performed only on tables with the join key as the distribution key and in the same table group. In other words, you can create multiple table groups for a set of tables that have a strong correlation with each other but have a low correlation and a low probability of local join with tables in existing table groups.

    • Instance resource scaling

      If the instance has been scaled in or out more than five times, the original shard count may no longer meet the requirements. In this case, you can replace the default table group.

  • Plan 3: Add tables to multiple table groups

    If you need to plan multiple table groups, we recommend that you plan the roles and significance of the table groups and the table group to which each table belongs before stress testing and production. You can consider the following factors during planning:

    • Data amount

      The shard count is determined by the amount of data to be stored in the table. Specifically, a table group that has a large shard count is appropriate for large tables, and a table group that has a small shard count is appropriate for large and medium tables.

    • Required write performance

      The shard count has a positive correlation with data write performance. The write capability of a single shard has an upper limit. More shards indicate higher write concurrency and throughput. If you need to write data to a table at a high records per second (RPS), a larger shard count may be required. If the CPU utilization for a single core is 100%, a single shard in Hologres writes data at an RPS of 3,000 to 5,000 (1 KB per record). You can estimate the required shard count based on your required RPS. Each shard also needs to perform read operations such as data query. Therefore, the CPU utilization for data write cannot reach 100%. A shard that uses a 1/3 CPU core writes data at an RPS of 1,000 (1 KB per record). For example, if you want to write data at an RPS of 60,000, and the size of each record is 1KB, the shard count must be about 60, which can be fine-tuned.

    • Load of each table group

      When you create a table group, you must consider the number of tables to be added to this table group. If a large number of tables will be added to this table group in the future, and most of the tables are queried frequently, a small shard count may fail to support highly concurrent queries.

Shard count planning

If a table group has a large shard count, data can be written, queried, and analyzed at high concurrency. Therefore, increasing the shard count can speed up data write, query, and analysis to some extent. However, a larger shard count does not necessarily bring better performance. More communications and computing resources and a larger memory space are required for increased shards. From this perspective, if resources are insufficient or only a small amount of data is queried, increasing the shard count may lead to the opposite effect.

In Hologres, the lower limit of the shard count is 1. If the data amount is only hundreds or thousands of records, you can set the shard count to 1. In principle, the upper limit of the shard count is the total number of computing cores of the instance. This ensures that each shard can occupy at least one core for computing. The total number of computing cores of the instance is about 60% of the total number of cores of the instance, and some resources are used in processes such as frontend request processing, foreign table query, cluster management, and metadata management. If the shard count exceeds the number of computing cores, some shards cannot be allocated CPU resources all the time during the query. This may cause long tails and failover overheads. From the perspective of a disk, for the same table, if more shards exist, the data is more scattered, small files are more likely to appear, and the number of files is greater. If large numbers of tables and shards exist, the number of files is very large. Consequently, more overheads are required during query and failover, query I/O is increased, an amount of recovery time is larger.

Similar to the shard count, the number of table groups must be set to an appropriate value. A shard occupies specific memory space regardless of whether it is in use, to store information such as memory tables and schemas. If data is written to the tables, the shard occupies more memory space. The total number of shards in the instance increases with the number of table groups, which results in larger memory space. In addition, if you want to perform local join operations on multiple tables that are related to each other, these tables must be located in the same table group.

Estimate the shard count based on the data amount

In practice, you can estimate the data amount and want to determine the most suitable range of the shard count. The most suitable shard count is not only related to the amount of data to be stored, but also related to the actual access frequency, the actual amount of accessed data, the computing load type, such as point query or analysis, the write throughput, and the number of tables in the table group. Therefore, you cannot obtain the accurate value for the most suitable shard count. The following table describes the recommended shard counts and instance types that correspond to specific data amount ranges. You can select suitable settings based on the estimated data amount and recommended configurations.
Data amount Recommended shard count Recommended instance type
Less than 40 million rows 10 to 20 More than 32 cores
40 million to 0.4 billion rows 20 to 40 More than 64 cores
0.4 billion to 4 billion rows 40 to 80 More than 128 cores
4 billion to 40 billion rows 80 to 240 More than 256 cores
40 billion to 400 billion rows 200 to 400 More than 512 cores
Note The recommended shard counts and instance types that correspond to specific data amount ranges in the preceding table are not the only criteria. A table whose data amount is small can also be added to a table group that has a large shard count. A table whose data amount is large can also be added to a table group that has a single shard. You must select an appropriate shard count based on your business scenario to implement high concurrency, high computing efficiency, and high data concentration, and prevent unnecessary shuffle overheads.

Operation guide

This section describes operations related to table groups in Hologres.

  1. Query the metadata of table groups when you use the table groups for the first time.
    1. Query the default table group.
      SELECT * FROM hologres.hg_table_group_properties
      WHERE tablegroup_name IN (
        SELECT tablegroup_name FROM hologres.hg_table_group_properties
        WHERE property_key = 'is_default_tg' AND property_value = '1'
      );
      Result:
      Note In the result, the is_default_tg property indicates that the table group is the default table group, and the shard_count property indicates the shard count of the table group.
      tablegroup_name |   property_key   | property_value
      -----------------+------------------+----------------
       test_tg_default | tg_version       | 1
       test_tg_default | table_num        | 1               
       test_tg_default | is_default_tg    | 1               
       test_tg_default | shard_count      | 3
       test_tg_default | replica_count    | 1
       test_tg_default | created_manually | 0
      (6 rows)
    2. Query table groups in the current database.
      SELECT tablegroup_name
      FROM hologres.hg_table_group_properties GROUP BY tablegroup_name;
      Result:
      tablegroup_name
      -----------------
       test_tg_default
      (1 row)
    3. Query the shard count of a table group.
      SELECT property_value AS shard_count
      FROM hologres.hg_table_group_properties
      WHERE property_key = 'shard_count' AND tablegroup_name ='<tg_name>';
      Result:
      shard_count
      -------------
       3
      (1 row)
    4. Query tables in a table group.
      SELECT table_namespace AS schema_name, table_name
      FROM hologres.hg_table_properties
      WHERE property_key = 'table_group' AND property_value = '<tg_name>';
      Result:
      schema_name | table_name
      -------------+------------
       public      | a
      (1 row)
    5. Query the table group to which a table belongs.
      SELECT property_value AS table_group_name
      FROM hologres.hg_table_properties
      WHERE property_key = 'table_group' AND table_name = '<table_name>';
      Result:
      table_group_name
      ------------------
       test_tg_default
      (1 row)
  2. Create a table group.
    If you have new business and need to create a table group, you can execute the following statement:
    CALL HG_CREATE_TABLE_GROUP ('<new_tg_name>', <shard_count>);
    The following table describes the parameters in the preceding statement.
    Parameter Type Description
    new_tg_name Text The name of the table group to be created.
    shard_count INT4 The shard count for the table group.
    Example:
    -- Create a table group that is named tg_8 and has eight shards.
    CALL HG_CREATE_TABLE_GROUP ('tg_8', 8);
  3. Change the default table group.
    If the database contains multiple table groups, and you want to change the default table group to store newly created tables in the new table group, you can execute the following statement as required:
    • In Hologres V0.9 or later, execute the following statement to change the default table group:
      CALL HG_UPDATE_DATABASE_PROPERTY ('default_table_group', '<tg_name>');
      The following table describes the parameter in the preceding statement.
      Parameter Type Description
      tg_name TEXT The name of the default table group. After setting, its shard count is the number of shards of the specified table group.
      Example:
      -- In Hologres V0.9 or later, set a new table group as the default table group. By default, subsequently created tables are stored in this table group.
      CALL HG_UPDATE_DATABASE_PROPERTY ('default_table_group', 'tg_8');
    • In Hologres of a version earlier than V0.9, create a table group to change the default table group. Execute the following statement to create a table group that has a specified shard count and set the table group as the default table group:
      CALL HG_UPDATE_DATABASE_PROPERTY ('shard_count', '<shard_count>');
      The following table describes the parameter in the preceding statement.
      Parameter Type Description
      shard_count TEXT The shard count for the new table group. The value must be a number.
      Example:
      -- In Hologres of a version earlier than V0.9, create a table group and set the table group as the default table group.
      CALL HG_UPDATE_DATABASE_PROPERTY ('shard_count', '8');
  4. Create a table and add it to a table group.
    You can execute the following statement to create a table and add it to a specified table group:
    BEGIN;
    CREATE TABLE table_name (
        col type,
      ....
    );
    CALL SET_TABLE_PROPERTY('table_name', 'table_group', '<tg_name>');
    COMMIT;
    The following table describes the parameters in the preceding statement.
    Parameter Type Description
    table_name TEXT The name of the table to be created.
    tg_name TEXT The name of the default table group. After setting, its shard count is the number of shards of the specified table group.
    Example:
    -- Create a table named tbl1 and add it to the table group named tg_8.
    BEGIN;
    CREATE TABLE tbl1 (
        col1 text
    );
    CALL SET_TABLE_PROPERTY('tbl1', 'table_group', 'tg_8');
    COMMIT;
  5. In Hologres V0.10 or later, migrate some business tables to a new table group.
    You can execute the following statement to migrate some business tables to a new table group:
    -- Create a table group that is named tg_8 and has eight shards.
    CALL HG_CREATE_TABLE_GROUP('tg_8', 8);
    
    -- Migrate a table to the created table group.
    CALL HG_UPDATE_TABLE_SHARD_COUNT('TABLE_NAME', 'tg_8');
  6. Delete a table group.
    You can execute the following statement to delete an empty table group. If a table exists in a table group, you cannot delete the table group.
    CALL HG_DROP_TABLE_GROUP('<tg_name>');
    Example:
    -- Delete the table group named tg_8.
    CALL HG_DROP_TABLE_GROUP('tg_8');

FAQ

  • I have an instance that has 512 cores and use it to perform online analytical processing (OLAP) on a real-time event table. The table contains about 20 billion to 40 billion rows. In this case, how do I specify the table group and shard count?

    A single computing load is involved so that you can use one table group. The default shard count for an instance that has 512 cores is 160. If the event table contains a large number of columns, such as hundreds of columns, you can appropriately increase the shard count to improve OLAP concurrency. For example, change the shard count of the default table group in the database to 200 or more to store the event table.

  • I have an instance that has 256 cores and a large number of column-oriented tables, and use the instance to perform fast OLAP in milliseconds. Each table contains tens of millions of rows. I want to group multiple fields, and query details based on specific conditions. How do I specify the table group and shard count?

    A single computing load is involved so that you can use one table group. The default shard count for an instance that has 256 cores is 120. For the table that contains tens of millions of rows, we recommend that you specify 10 to 20 shards. Especially for aggregation operations such as grouping, more shards cause more shuffle overheads. Consequently, analysis in milliseconds cannot be implemented. Therefore, the default table group may not be able to meet your requirements. To achieve a better effect, you can change the shard count of the default table group in the database to a value between 16 and 40 based on the specific situation, and perform a stress test.

  • How do I check whether slow query is caused by an inappropriate shard count?
    For an inappropriate shard count, the shard count may be too large or too small.
    • If the shard count is too large, the query startup overhead is large, which can be learned from the start query cost line in the execution result of the EXPLAIN ANALYZE statement; or the shuffle overhead is large, which can be determined based on the Max_GetNext_Time size of Redistribution Motion in the execution result of the EXPLAIN ANALYZE statement. In Hologres V0.10 or later, you can view these overheads of historical queries in slow query logs.
    • If the shard count is too small, the CPU utilization cannot reach 100% during long-term computing; or the overhead of scanning data is large because of insufficient concurrency, which can be determined based on the Max_GetNext_Time size of Scan Node in the execution result of the EXPLAIN ANALYZE statement; or the data write performance is poor. If the CPU utilization for a single core is 100%, a single shard in Hologres writes data at an RPS of 3,000 to 5,000. You can compare the actual RPS with this range to assess whether the shard count is too small.
  • Are queries per second (QPS) are not high enough in point query scenarios due to insufficient shards?

    First, determine whether another cause exists. For example, analysis query rather than point query is performed, no indexes are used, shards are not split, or the CPU utilization reaches 100%. After the troubleshooting is complete, if other possible causes do not result in the issue, a single SQL statement achieves the highest performance, and the QPS still does not meet your requirement, you can increase the shard count to increase the backend concurrency of the point query.

  • How do I troubleshoot data skew of a shard?
    Hologres provides an internal field hg_shard_id, which indicates the number of the shard where data is located. You can execute an SQL statement to check whether data skew exists in a shard.
    SELECT hg_shard_id, COUNT(1) FROM tbl1
    GROUP BY hg_shard_id ORDER BY COUNT(1) DESC;
    If the amount of data in a shard is significantly larger than that in other shards, data skew exists, and you may need to adjust the distribution key.