Hologres provides a system view named hologres.hg_worker_info. You can use this view to query the numbers of workers, table groups, and shards in the current database, as well as how shards are allocated to workers. This helps you determine whether resources are unevenly allocated. This topic describes how to use hologres.hg_worker_info to query the resource allocation status.
Background information
For information about the concepts of workers, table groups, and shards in Hologres and the relationships among them, see Architecture and Basic concepts. The shard count in a table group should be a multiple of the number of workers. Otherwise, the shard counts allocated to the workers differ, which causes load imbalance among workers and leads to inefficient use of resources. Worker-level monitoring metrics are displayed in the Hologres console. To help you check whether resources are unevenly allocated, Hologres V1.3 and later provide the hologres.hg_worker_info system view, which you can use to query the numbers of workers, table groups, and shards and the shard allocation in the current database.
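The effect of a shard count that is not a multiple of the worker count can be sketched as follows. This is an illustrative model only, not Hologres internals: the round-robin assignment policy and all names are assumptions.

```python
# Illustrative sketch (not Hologres internals): assign shard IDs to
# workers round-robin and count shards per worker. A shard count that
# is not a multiple of the worker count leaves some workers with one
# extra shard, which skews the load.
from collections import Counter

def shards_per_worker(shard_count: int, worker_count: int) -> Counter:
    """Assign shard IDs 0..shard_count-1 to workers round-robin."""
    return Counter(shard % worker_count for shard in range(shard_count))

# 16 shards over 4 workers: perfectly even, 4 shards each.
print(shards_per_worker(16, 4))  # Counter({0: 4, 1: 4, 2: 4, 3: 4})

# 18 shards over 4 workers: workers 0 and 1 carry one extra shard.
print(shards_per_worker(18, 4))  # Counter({0: 5, 1: 5, 2: 4, 3: 4})
```

The extra shards translate directly into extra load on the affected workers, which is why the shard count is recommended to be a multiple of the worker count.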
Limits
- Only Hologres V1.3.23 and later support the hologres.hg_worker_info system view. You can view the version of your Hologres instance on the instance details page in the Hologres console. If the version of your Hologres instance is earlier than V1.3.23, upgrade the instance before you use this view.
- hologres.hg_worker_info shows the real-time allocation of shards among workers. You cannot query historical data.
- A new table group obtains a `worker_id` with a delay of 10 to 20 seconds. If you query the system view immediately after you create a table group, the `worker_id` field may be empty.
- If no table exists in a table group, resources cannot be allocated to workers. In this case, the `worker_id` field in the query result is displayed as `id/0`.
- You can query only the information about workers, table groups, and shards in the current database. You cannot query information across databases.
Description
Field | Type | Description |
---|---|---|
worker_id | TEXT | The ID of the worker to which the current database belongs. |
table_group_name | TEXT | The name of the table group in the current database. |
shard_id | BIGINT | The ID of the shard in the table group. |
Execute the following statement to query the view:

```sql
select * from hologres.hg_worker_info;
```

The following sample code shows the query result. `xx_tg_internal` is the built-in table group of an instance, which is used to manage metadata. You can ignore this table group.

```
 worker_id  | table_group_name | shard_id
------------+------------------+----------
 bca90a00ef | tg1              |        0
 ea405b4a9c | tg1              |        1
 bca90a00ef | tg1              |        2
 ea405b4a9c | tg1              |        3
 bca90a00ef | db2_tg_default   |        0
 ea405b4a9c | db2_tg_default   |        1
 bca90a00ef | db2_tg_default   |        2
 ea405b4a9c | db2_tg_default   |        3
 ea405b4a9c | db2_tg_internal  |        3
```
Best practice: Troubleshoot the issue of uneven allocation of computing resources (load imbalance among workers)

- Cause 1: Shards are unevenly allocated to workers after a worker fails over. As described in Basic concepts, if a worker fails over due to reasons such as out of memory (OOM) errors, the system allocates its shards to other workers so that queries can recover. After the terminated worker recovers, the system allocates some shards back to it. This can leave shards unevenly allocated among workers. You can query the hologres.hg_worker_info system view for the shard count allocated to each worker in the current database to check whether computing resources are unevenly allocated. However, compute workers are shared by all databases in an instance. Therefore, to check the resource allocation of the instance, you must query the shard count allocated to each worker in every database, sum the results to obtain the total number of shards allocated to each worker, and then judge whether computing resources are unevenly allocated.
- Sample statement:

```sql
select worker_id, count(1) as shard_count from hologres.hg_worker_info group by worker_id;
```
- Sample result:

```
-- Assume that the instance has one database.
 worker_id | shard_count
-----------+-------------
 bca90a    |           4
 ea405b    |           4
 tkn4vc    |           4
 bqw5cq    |           3
 mbbrf6    |           3
 hsx66f    |           1
(6 rows)
```
- Result interpretation:
The instance has six workers, and the shard count allocated to the six workers is different. When you view the monitoring metrics in the Hologres console, you can find that the CPU utilization of workers with fewer shards is lower than that of other workers. This indicates that the computing resources of the instance are unevenly allocated.
- Solution:
Restart the instance so that shards are reallocated evenly among compute workers. If you do not restart the instance, the system preferentially allocates shards to the idle workers the next time a worker fails over.
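A quick client-side way to judge imbalance from the per-worker shard counts returned by the sample query above. The helper and its one-shard tolerance are illustrative assumptions, not a Hologres API:

```python
# Illustrative check (not a Hologres API): given per-worker shard counts,
# flag imbalance when the spread between the most and least loaded
# workers exceeds one shard. The one-shard tolerance is an assumption.
def is_unbalanced(shard_counts: dict[str, int]) -> bool:
    counts = shard_counts.values()
    return max(counts) - min(counts) > 1

# Per-worker shard counts from the sample result above.
sample = {"bca90a": 4, "ea405b": 4, "tkn4vc": 4,
          "bqw5cq": 3, "mbbrf6": 3, "hsx66f": 1}
print(is_unbalanced(sample))  # True: worker hsx66f holds far fewer shards
```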
- Cause 2: Computing resources are unevenly allocated due to data skew. If business data is severely skewed, the data is concentrated on a few fixed shards. Queries then repeatedly hit the workers that host these shards, which causes CPU load imbalance among workers. You can use the hologres.hg_worker_info and hologres.hg_table_properties system views to query the allocation of shards among workers and use the `worker_id` that corresponds to the skewed data to check whether computing resources are unevenly allocated due to data skew. To do so, perform the following steps:
  - Query the data skew. Execute the following SQL statement to check whether the table contains skewed data. If the count value of a shard is much larger than that of other shards, data skew exists.

    ```sql
    select hg_shard_id, count(1) from <table_name> group by hg_shard_id;
    ```

    Sample result. The count value of shard 39 is much larger than that of the other shards, which indicates data skew:

    ```
     hg_shard_id | count
    -------------+--------
              53 |  29130
              65 |  28628
              66 |  26970
              70 |  28767
              77 |  28753
              24 |  30310
              15 |  29550
              39 | 164983
    ```
  - Query the `worker_id` that corresponds to the shard specified by the `hg_shard_id` field. After you identify the skewed shard in the previous step, join the hologres.hg_worker_info and hologres.hg_table_properties system views to query the `worker_id` that corresponds to that shard. The following statement provides an example:

    ```sql
    SELECT distinct b.table_name, a.worker_id, a.table_group_name, a.shard_id
    from hologres.hg_worker_info a
    join (
        SELECT property_value, table_name
        FROM hologres.hg_table_properties
        WHERE property_key = 'table_group'
    ) b
    on a.table_group_name = b.property_value
    and b.table_name = '<tablename>'
    and shard_id = <shard_id>;
    ```

    Sample result:

    ```
     table_name | worker_id  | table_group_name | shard_id
    ------------+------------+------------------+----------
     table03    | bca90a00ef | db2_tg_default   |       39
    ```

    Compare the `worker_id` in the query result with the CPU Utilization (%) metric of Worker Nodes in the Hologres console. If the CPU utilization of this worker is much higher than that of other workers, computing resources are unevenly allocated due to data skew.
  Solution:
  - Set a proper distribution key to distribute data evenly among shards. For more information, see Optimize the performance of internal table queries.
  - If business data is inherently skewed, split the table into multiple tables. For example, in a table of orders placed during live streaming, the gross merchandise volume (GMV) may differ significantly among streamers.
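The skew check in the first step above can also be automated on the client side. The sketch below is illustrative: the function names and the two-times-median threshold are my assumptions, not part of Hologres. It flags shards whose row count is far above the median shard.

```python
# Illustrative post-processing of the hg_shard_id counts (the 2x-median
# threshold is an assumption): flag a shard as skewed when its row count
# is far above the median shard's row count.
from statistics import median

def skewed_shards(counts: dict[int, int], factor: float = 2.0) -> list[int]:
    med = median(counts.values())
    return [shard for shard, n in counts.items() if n > factor * med]

# Per-shard row counts from the sample result above.
rows = {53: 29130, 65: 28628, 66: 26970, 70: 28767,
        77: 28753, 24: 30310, 15: 29550, 39: 164983}
print(skewed_shards(rows))  # [39]
```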
- Cause 3: The shard count is not set properly. We recommend that you set the shard count to a multiple of the number of workers to balance the loads among workers. If the shard count is not set properly, the loads among workers may be unbalanced. You can use the hologres.hg_worker_info system view to check whether the shard count of each table group in the current database is set properly.
- Sample statement:

```sql
select table_group_name, worker_id, count(1) as shard_count from hologres.hg_worker_info group by table_group_name, worker_id order by table_group_name desc;
```
- Sample result:

```
 table_group_name | worker_id  | shard_count
------------------+------------+-------------
 tg2              | ea405b4a9c |           1
 tg2              | bca90a00ef |           2
 tg1              | ea405b4a9c |           5
 tg1              | bca90a00ef |           6
 db2_tg_default   | bca90a00ef |           4
 db2_tg_default   | ea405b4a9c |           4
 db2_tg_internal  | bca90a00ef |           1
(7 rows)
```
- Result interpretation (assume that the instance has two workers):
  - The `tg2` table group has three shards. One worker is assigned one shard less than the other. If the performance does not meet your expectations, we recommend that you balance the shard count among workers or scale up the instance.
  - The `tg1` table group has 11 shards. One worker is assigned one shard less than the other. If the performance does not meet your expectations, we recommend that you balance the shard count among workers or scale up the instance.
  - The default table group `db2_tg_default` contains eight shards. The shards are evenly allocated to the two workers.
- Solution:
If load imbalance among workers is caused by inappropriate shard allocation, you can evaluate the shard count setting based on your business requirements and set the shard count to a multiple of the number of workers.
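Assuming you know the worker count, rounding a desired shard count up to the nearest multiple can be expressed as follows. This helper is an illustrative sketch, not a Hologres utility:

```python
# Illustrative helper (not a Hologres utility): round a desired shard
# count up to the nearest multiple of the worker count so that shards
# divide evenly among workers.
def recommended_shard_count(desired: int, worker_count: int) -> int:
    # Ceiling division, then scale back up to a multiple of worker_count.
    return -(-desired // worker_count) * worker_count

print(recommended_shard_count(11, 2))  # 12: fixes the tg1 case above
print(recommended_shard_count(3, 2))   # 4: fixes the tg2 case above
print(recommended_shard_count(8, 2))   # 8: already a multiple, unchanged
```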