Generate snapshot points - PolarDB - Alibaba Cloud Documentation Center

This topic describes how to generate snapshot points for column-store snapshots and how these snapshot points impact the latency of data synchronization between a read-only columnar instance and the primary instance after snapshot points are generated.

Supported versions

Three-digit version number	Compute node version	Recommended version	Data node database engine version
5.4.20	All versions	5.4.20-20250328 or later (column-store snapshot optimized)	MySQL 5.7 or MySQL 8.0
5.4.19	5.4.19-20250305 or later	-	MySQL 5.7 or MySQL 8.0

Note

For information about the instance version naming rules, see Release notes.
For information about how to view the version of an instance, see View and update the version of an instance.

Manually generate snapshot points

Execute the following function to generate instance-level snapshot points.
```
CALL polardbx.columnar_flush();
```
Note
The CALL polardbx.columnar_flush() function generates a snapshot point and returns a value of the unsigned long type which represents the version of the snapshot point (a timestamp independent of the time zone).
Execute the following function to generate a snapshot point for the specified CCI:
```
CALL polardbx.columnar_flush(schema_name, table_name, index_name);
-- For example, you can add the actual suffix to the index name or not (you can view the suffix by using the SHOW COLUMNAR STATUS statement).
CALL polardbx.columnar_flush('db1', 'tb1', 'cci');
CALL polardbx.columnar_flush('db1', 'tb1', 'cci_$09a9');
```
Note
- index_name is the name of the CCI. For more information about how to obtain the index_name value, see SHOW COLUMNAR STATUS.
- A CCI-level snapshot is a consistent snapshot of the entire instance, but not of a single table. It is equivalent to an instance-level snapshot point.

Automatically generate snapshot points

Execute the following function to periodically generate snapshot points:

-- The interval at which data is generated. Unit: minutes.
CALL polardbx.columnar_config(cci_id, 'auto_gen_columnar_snapshot_interval', 10);

You can also execute the following statements to schedule the creation of instance-level snapshot points based on a CRON expression.

Note

Version requirements:

Three-part version number	Compute node version	Data node component version (engine version)
5.4.20	5.4.20-20250328 or later	MySQL 5.7, MySQL 8.0
5.4.19	5.4.19-20250305 or later	MySQL 5.7, MySQL 8.0

# Schedule a snapshot to be created every hour.
CALL polardbx.columnar_auto_snapshot_config('ENABLE', '0 0 * * * ?', '+08:00');

# View the scheduled snapshot configuration.
CALL polardbx.columnar_auto_snapshot_config('SHOW');

# Disable the scheduled snapshot feature.
CALL polardbx.columnar_auto_snapshot_config('DISABLE');

Query snapshot points

Execute the following statement to query the INFORMATION_SCHEMA.COLUMNAR_SNAPSHOTS view to obtain the available snapshot points (including manually and automatically generated snapshot points):

SELECT * FROM INFORMATION_SCHEMA.COLUMNAR_SNAPSHOTS;

Sample result:

+------------------------+------------+------------+---------------------+
| SCHEMA_NAME            | TABLE_NAME | INDEX_NAME | TSO                 |
+------------------------+------------+------------+---------------------+
| test_columnar_snapshot | tb1        | cci_$09a9  | 7249374192586457152 |
| polardbx               | __global__ | __global__ | 7249332223843762240 |
| polardbx               | __global__ | __global__ | 7249374271451955264 |
+------------------------+------------+------------+---------------------+

The following table describes the parameters.

Parameter	Description
`SCHEMA_NAME`	The name of the schema. `polardbx` indicates an instance-level snapshot point. If the `SCHEMA_NAME` parameter is set to another value, it is a CCI-level snapshot point.
`INDEX_NAME`	The physical name of the CCI, which has the `_$09a9` random suffix compared to the logical name specified when the index is created.
`TABLE_NAME`	The name of the logical table that corresponds to the CCI.

Note

You can also uses the SHOW COLUMNAR STATUS statement to obtain the SCHEMA_NAME, TABLE_NAME, INDEX_NAME, and TSO values.
If a large number of snapshot points are generated, we recommend that you use the SCHEMA_NAME, TABLE_NAME, INDEX_NAME, and TSO columns to filter them. Otherwise, the performance of the instance may be affected.

Convert a timestamp into a TSO value

Algorithm:

SELECT ((UNIX_TIMESTAMP(@your_timestamp) * 1000) << 22)

Note

The operations of multiplying by 1000 and left-shifting by 22 bits are fixed components of the algorithm for converting a timestamp into a TSO value.

Example:

Query all snapshot points for a specific columnar index within a given time range

--Query all snapshot points for a specific columnar index that fall between 14:00 and 18:00 on 1 July 2024.
SELECT * FROM INFORMATION_SCHEMA.COLUMNAR_SNAPSHOTS 
WHERE SCHEMA_NAME = '<schema>' 
AND TABLE_NAME = '<table>'
AND INDEX_NAME = '<snapshot_cci_name>'
AND TSO >= ((UNIX_TIMESTAMP('2024-07-01 14:00:00') * 1000) << 22)
AND TSO <= ((UNIX_TIMESTAMP('2024-07-01 18:00:00') * 1000) << 22);
+-------------+------------+------------------------+---------------------+
| SCHEMA_NAME | TABLE_NAME | INDEX_NAME             | TSO                 |
+-------------+------------+------------------------+---------------------+
| polardbx    | __global__ | __global__             | 7213421061734400000 |
| your_schema | your_table | your_snapshot_cci_name | 7213481459712000000 |
+-------------+------------+------------------------+---------------------+

Test performance impacts

Note

If you execute the CALL polardbx.columnar_flush() function to generate snapshot points, the read and write operations on the row store instance are not affected. However, if you frequently call the CALL polardbx.columnar_flush() function, the latency of data synchronization between the read-only columnar instance and the primary instance may increase.

Test environment

Size of test data
The test is performed based on 1,000 warehouses. The following list describes the amount of data in each major table:
- The bmsql_order_line table contains 300 million rows of data.
- The bmsql_stock table contains 100 million rows of data.
- The bmsql_customer, bmsql_history, and bmsql_oorder tables each contain 30 million rows of data.
Instance specification for the tests
Node name
Node specification
Number of nodes
Compute nodes
4 cores, 16 GB memory
2
Data node
4 cores, 16 GB memory
4
Columnar engine
4 cores, 32 GB memory
2
ECS instance specifications
ecs.hfg6.4xlarge (16 cores, 64 GB memory) For more information about how to purchase an ECS instance, see Create an instance on the Custom Launch tab in the ECS console and manage the instance.

Test procedure

Set up the environment for TPC-C tests. For more information, see TPC-C tests.

Execute the following statements to create column-store snapshots separately for the bmsql_stock, bmsql_customer, and bmsql_order_line tables.

CREATE CLUSTERED COLUMNAR INDEX cci ON `bmsql_stock`(s_i_id) 
PARTITION BY KEY(s_w_id) 
COLUMNAR_OPTIONS='{
  "type":"snapshot",
  "snapshot_retention_days":"30",
  "auto_gen_columnar_snapshot_interval":"-1"
}';

CREATE CLUSTERED COLUMNAR INDEX cci ON `bmsql_customer`(c_id) 
PARTITION BY KEY(c_w_id) 
COLUMNAR_OPTIONS='{
  "type":"snapshot",
  "snapshot_retention_days":"7",
  "auto_gen_columnar_snapshot_interval":"-1"
}';

CREATE CLUSTERED COLUMNAR INDEX cci ON `bmsql_order_line`(ol_i_id) 
PARTITION BY KEY(ol_w_id) 
columnar_options='{
  "type":"snapshot",
  "snapshot_retention_days":"365",
  "auto_gen_columnar_snapshot_interval":"-1"
}';

When the CPU utilization of compute nodes and data nodes is close to 50%, execute the CALL polardbx.columnar_flush() function at different frequencies to generate snapshot points and observe the latency of data synchronization between the read-only columnar instance and the primary instance.

Test results

The following figure shows the latency of data synchronization between the read-only columnar instance and the primary instance when you execute the CALL polardbx.columnar_flush() function 60 times per minute to generate snapshot points.
The following figure shows the latency of data synchronization between the read-only columnar instance and the primary instance when you execute the CALL polardbx.columnar_flush() function 480 times per minute to generate snapshot points.
Note
The latency of data synchronization remains at about 1 second and the maximum is about 4 seconds.
The following figure shows the latency of data synchronization between the read-only columnar instance and the primary instance when you execute the CALL polardbx.columnar_flush() function 960 times per minute to generate snapshot points.
Note
In this case, the latency of data synchronization between the read-only columnar instance and the primary instance increases significantly and the maximum is 50 seconds.

Conclusions

In the preceding test, when you execute the CALL polardbx.columnar_flush() function at a frequency of less than 500 times per minute, the latency of data synchronization between the read-only columnar instance and the primary instance does not increase significantly. However, when you execute the CALL polardbx.columnar_flush() function at a frequency of about 1000 times per minute, the latency of data synchronization between the read-only columnar instance and the primary instance increases significantly. In actual business environments, how the execution frequency of the CALL polardbx.columnar_flush() function affects the latency of data synchronization between the read-only columnar instance and the primary instance is closely related to the actual business traffic and the number of column-store snapshots. Therefore, we recommend that you set the execution frequency of the CALL polardbx.columnar_flush() function to no more than once per second. If you need to call the function for more than once per second, you must perform sufficient tests.

Node name	Node specification	Number of nodes
Compute nodes	4 cores, 16 GB memory	2
Data node	4 cores, 16 GB memory	4
Columnar engine	4 cores, 32 GB memory	2