Hologres is a one-stop real-time data warehouse engine that is compatible with the PostgreSQL protocol. It supports real-time writes, updates, and analysis of massive amounts of data. Hologres handles both Online Analytical Processing (OLAP) and ad hoc analysis on petabytes of data. It also supports online data services with high concurrency and low latency. This topic describes how to perform stress tests for data writes, updates, and point queries in Hologres and provides the results.
Test plan overview
Test scenarios
This topic covers the following test scenarios.
-
Data write scenario: Tests the performance of writing data to row-oriented, column-oriented, and row-column hybrid tables in Hologres.
-
Data update scenario: Tests the performance of updating data in row-oriented, column-oriented, and row-column hybrid tables that have a primary key. This includes both global and partial updates.
-
Point query scenario: Tests the performance of point queries that filter by primary key on row-oriented and row-column hybrid tables.
Test tool
This topic uses holo-e2e-performance-tool, an open source test tool developed by Hologres. For more information, see holo-e2e-performance-tool.
The holo-e2e-performance-tool has the following advantages:
-
Simple test process: The holo-e2e-performance-tool integrates all modules, including table creation, test data generation, and performance testing. You do not need to prepare test data in advance.
-
Flexible parameter settings: You can customize parameters, such as the number of columns, column size, and number of rows, by making simple adjustments to configuration items. This lets you run performance tests that closely match your actual business scenarios.
-
Accurate test results: The primary key for the test data is a series of consecutive, incrementing integers. This design ensures that all primary keys hit their targets in update and point query scenarios by simply adjusting the data volume in the configuration items.
Test procedure
The procedure for using holo-e2e-performance-tool to test data writes, updates, and point queries is as follows:
-
Run a data write test. After setting the basic parameters, test the write performance for a specific number of rows.
-
Keep the test data table from the write scenario. Run the test again with the same parameters to test the performance of data updates.
-
Continue to use the test data table from the data update scenario. Configure parameters, such as the primary key range to query, and then run the point query performance test.
Preparations
Prepare the basic environment
Prepare the basic environment for the test. The requirements are as follows:
To reduce variables that might affect test results, use a newly created instance for each test. Do not use instances that have been upgraded or downgraded.
-
Create an ECS instance.
Log on to Alibaba Cloud and create an Elastic Compute Service (ECS) instance to use as the client for testing. The recommended ECS specifications are as follows:
-
Instance type: ecs.g6.4xlarge.
-
Operating system: Alibaba Cloud Linux 3.2104 LTS 64-bit.
-
Storage: enterprise SSD (ESSD).
-
The ECS instance and the Hologres instance must be in the same region, use the same Virtual Private Cloud (VPC), and be in the same zone.
NoteThe preceding specifications are not mandatory. During the test, monitor the CPU and bandwidth of the ECS instance to ensure that resources are not fully consumed. This confirms that the ECS instance is not a performance bottleneck.
For more information about creating an ECS instance, see Create an instance.
-
-
Create a Hologres instance.
This test uses a dedicated (pay-as-you-go) instance. Because the instance is used only for testing, the computing resources are set to
64 vCPUs and 256 GB of memory. You can select computing resources based on your needs.For more information about creating a Hologres instance, see Purchase a Hologres instance.
-
Create a test database.
Log on to the Hologres instance that you created and create a database. For more information, see Create a database.
Prepare the test tool
-
Install Java Development Kit (JDK) 11 on the ECS instance. For more information, see Manually deploy OpenJDK.
-
Download the holo-e2e-performance-tool.
-
Upload the test tool to the ECS instance. For more information, see Use Workbench to upload or download files.
Performance testing
Data write scenario
How it works
-
Write modes: Data can be written in Fixed Copy mode or Insert mode.
-
Fixed Copy mode: Writes data using the COPY statement and optimizes SQL execution with Fixed Plan. For more information, see Use Fixed Plan to accelerate SQL execution.
-
Insert mode: Writes data using the INSERT statement and optimizes SQL execution with Fixed Plan.
-
-
Special configurations: To ensure compatibility across test scenarios and to streamline testing and validation, the test tool adds the following columns to the table in addition to the columns you configure:
-
Primary key column
id: Serves as the primary key and distribution key for the table. The value starts at 1 and increments with each row during the test. -
Time column
ts: Serves as the segment key for the table. The current time is written to this column during the test.
-
-
Data write principle: During the test, the tool writes data by incrementing the primary key
idstarting from 1. The current time is written to the time column. For other configured TEXT columns, a string of the target length is written with the primary keyidappended. The test stops after a target time or a target number of rows is reached, and the results are calculated.
Procedure
-
Create a test configuration file on the ECS instance.
-
Run the following command to create a file named test_insert.conf.
vim test_insert.conf -
Press
ito enter edit mode and add the following sample content.NoteThe following example uses a row-oriented table. To test column-oriented or row-column hybrid tables, change the value of the
orientationparameter tocolumnorrow,column.# Connection configuration holoClient.jdbcUrl=jdbc:hologres://<ENDPOINT>:<PORT>/<DBNAME> holoClient.username=<AccessKey_ID> holoClient.password=<AccessKey_Secret> holoClient.writeThreadSize=100 # Write configuration put.threadSize=8 put.testByTime=false put.rowNumber=200000000 put.testTime=600000 # Table configuration put.tableName=kv_test put.columnCount=20 put.columnSize=20 put.orientation=row # Other configurations put.createTableBeforeRun=true put.deleteTableAfterDone=false put.vacuumTableBeforeRun=falseThe following table describes the parameters.
Module
Parameter
Description
Notes
Connection configuration
jdbcUrl
The Java Database Connectivity (JDBC) connection string for Hologres. The format is
jdbc:hologres://<ENDPOINT>:<PORT>/<DBNAME>.For
ENDPOINT, enter the VPC domain name of the Hologres instance. You can obtain the domain name from the Network Information section on the Instance Details page in the Hologres console.username
The AccessKey ID of your Alibaba Cloud account.
To obtain an AccessKey ID, go to the AccessKey Management page.
password
The AccessKey secret of your Alibaba Cloud account.
To obtain an AccessKey secret, go to the AccessKey Management page.
writeThreadSize
The number of write threads started by each Holo Client. Each Holo Client write thread occupies one connection. This parameter is effective only in Insert mode.
For more information about Holo Client, see Holo Client.
Write configuration
threadSize
The number of threads for data generation.
-
In FIXED_COPY mode, each thread occupies one connection. The total number of connections is equal to the number of threads.
-
In INSERT mode, a single Holo Client is shared by default. The total number of connections is
writeThreadSize.
For the 64-vCPU instance used in this test, we recommend setting the number of threads to 8 in FIXED_COPY mode. Adjust this parameter based on your instance type and data features.
testByTime
Specifies whether to run the test for a specified duration or for a specified number of rows.
Valid values:
-
true: The test runs at the specified time.
-
false: Runs the test for a specified number of rows.
rowNumber
The target number of rows for the test. This parameter is effective only when testByTime is set to false.
N/A
testTime
The target duration for the test, in milliseconds. This parameter is effective only when testByTime is set to true.
N/A
Table configuration
tableName
The name of the target table for the test.
N/A
columnCount
The number of columns in the table. The data type of each column is TEXT.
N/A
columnSize
The character length of each column in the table.
N/A
orientation
The storage class of the table.
Valid values:
-
row: row-oriented table.
-
column: column-oriented table.
-
row,column: row-column hybrid table.
Other configurations
createTableBeforeRun
Specifies whether to create the table before the test starts.
Valid values:
-
true: Creates the table.
-
false: Does not create the table.
If you set this to true, the test tool first deletes any existing table with the same name and then creates a new one. Make sure the target table name is not the same as any other table in the instance.
deleteTableAfterDone
Specifies whether to delete the table after the test is complete.
Valid values:
-
true: Deletes the table.
-
false: Does not delete the table.
To perform tests such as updates or point queries on the same table after a write test, you must set this parameter to true.
vacuumTableBeforeRun
Specifies whether to run the VACUUM operation before the test starts.
Valid values:
-
true: Runs the VACUUM operation.
-
false: Does not run the VACUUM operation.
Running VACUUM forces a compaction operation. This affects data update tests only in INSERT mode and does not affect FIXED_COPY mode.
-
-
Press the Esc key, enter
:wq, and press Enter to save and close the file.
-
-
Run the data write test.
Run the following statement to start the test.
# Run the data write test in Fixed Copy mode java -jar holo-e2e-performance-tool-1.0.0.jar test_insert.conf FIXED_COPY-
The test tool saves the results to a file named
result.csvin the root directory by default. The mode parameter for data write and update scenarios has two possible values:FIXED_COPYfor Fixed Copy mode andINSERTfor Insert mode.
-
-
View the test results.
Run the following command to view the test results.
cat result.csvThe
result.csvfile contains the following fields.Field
Description
start
The test start time.
end
The test end time.
count
The number of rows of test data.
qps1
The average QPS in the last 1 minute.
qps5
The average QPS in the last 5 minutes.
qps15
The average QPS in the last 15 minutes.
latencyMean
The average latency.
Collected in INSERT and GET modes.
latencyP99
The P99 latency.
Collected in INSERT and GET modes.
latencyP999
The P999 latency.
Collected in INSERT and GET modes.
version
The instance version.
Data update scenario
How it works
-
During the test, the tool updates data by incrementing the primary key
idstarting from 1. The time column is updated to the current time. For other configured TEXT columns that need to be updated (you can configure global or partial updates), a string of the target length is rewritten with the primary keyidappended. The test stops after a target time or a target number of rows is reached, and the results are calculated. -
You must first write data to the table and set deleteTableAfterDone to false. This ensures that the table contains data to be updated.
Procedure for global updates
-
Create a test configuration file on the ECS instance.
-
Run the following command to create a file named test_update.conf.
vim test_update.conf -
Press
ito enter edit mode and add the following sample content.Note-
The following example uses a row-oriented table. To test column-oriented or row-column hybrid tables, change the value of the
orientationparameter tocolumnorrow,column. -
Compared to the data write scenario, you only need to change the value of the
createTableBeforeRunparameter fromtruetofalse. Keep all other parameters the same to start the data update test.
# Connection configuration holoClient.jdbcUrl=jdbc:hologres://<ENDPOINT>:<PORT>/<DBNAME> holoClient.username=<AccessKey_ID> holoClient.password=<AccessKey_Secret> holoClient.writeThreadSize=100 # Write configuration put.threadSize=8 put.testByTime=false put.rowNumber=200000000 put.testTime=600000 # Table configuration put.tableName=kv_test put.columnCount=20 put.columnSize=20 put.orientation=row # Other configurations put.createTableBeforeRun=false put.deleteTableAfterDone=false put.vacuumTableBeforeRun=falseFor parameter descriptions, see the parameter description table in the Data write scenario section.
-
-
Press the Esc key, enter
:wq, and press Enter to save and close the file.
-
-
Run the global update test.
Run the following statement to start the test.
# Run the data write test in Fixed Copy mode java -jar holo-e2e-performance-tool-1.0.0.jar test_update.conf FIXED_COPY -
View the test results.
Run the following command to view the test results.
cat result.csvThe
result.csvfile contains the following fields.Field
Description
start
The test start time.
end
The test end time.
count
The number of rows of test data.
qps1
The average QPS in the last 1 minute.
qps5
The average QPS in the last 5 minutes.
qps15
The average QPS in the last 15 minutes.
latencyMean
The average latency.
Collected in INSERT and GET modes.
latencyP99
The P99 latency.
Collected in INSERT and GET modes.
latencyP999
The P999 latency.
Collected in INSERT and GET modes.
version
The instance version.
Procedure for partial updates
-
Create a test configuration file on the ECS instance.
-
Run the following command to create a file named test_update_part.conf.
vim test_update_part.conf -
Press
ito enter edit mode and add the following content.Note-
The following example uses a row-oriented table. To test column-oriented or row-column hybrid tables, change the value of the
orientationparameter tocolumnorrow,column. -
Compared to the global update scenario, the partial update scenario requires the
writeColumnCountparameter. This parameter defines how many of the total TEXT columns are written to. In this topic, this parameter is set to 50% of the table'scolumnCount.
# Connection configuration holoClient.jdbcUrl=jdbc:hologres://<ENDPOINT>:<PORT>/<DBNAME> holoClient.username=<AccessKey_ID> holoClient.password=<AccessKey_Secret> holoClient.writeThreadSize=100 # Write configuration put.threadSize=8 put.testByTime=false put.rowNumber=200000000 put.testTime=600000 # Table configuration put.tableName=kv_test put.columnCount=20 put.columnSize=20 put.writeColumnCount=10 put.orientation=row # Other configurations put.createTableBeforeRun=false put.deleteTableAfterDone=false put.vacuumTableBeforeRun=falseFor parameter descriptions, see the parameter description table in the global update scenario.
-
-
Press the Esc key, enter
:wq, and press Enter to save and close the file.
-
-
Run the partial update test.
Run the following command to start the test.
# Run the data write test in Fixed Copy mode java -jar holo-e2e-performance-tool-1.0.0.jar test_update_part.conf FIXED_COPY -
View the test results.
Run the following command to view the test results.
cat result.csvThe
result.csvfile contains the following fields.Field
Description
start
The test start time.
end
The test end time.
count
The number of rows of test data.
qps1
The average QPS in the last 1 minute.
qps5
The average QPS in the last 5 minutes.
qps15
The average QPS in the last 15 minutes.
latencyMean
The average latency.
Collected in INSERT and GET modes.
latencyP99
The P99 latency.
Collected in INSERT and GET modes.
latencyP999
The P999 latency.
Collected in INSERT and GET modes.
version
The instance version.
Point query scenario
How it works
-
Point query modes:
-
Synchronous mode: The point query API is blocking. The call waits for the actual request to complete before returning. For a single worker thread, each point query request corresponds to one SQL statement from start to finish. The next request can start only after the current one is complete. Synchronous mode is suitable for scenarios that are sensitive to latency but not to throughput.
-
Asynchronous mode: The point query API is non-blocking. The call returns immediately without waiting for the actual request to complete. After multiple point query requests are submitted asynchronously, a worker thread in the client batches them into a single SQL statement for processing when a batch size or submission interval is met. Asynchronous mode is suitable for scenarios that require high throughput and are less sensitive to latency, such as using Flink to consume real-time data for high-throughput table joins.
-
-
Point query test principle: During the test, the tool randomly generates a target
idwithin the configured primary key range and performs a synchronous or asynchronous point query. The test stops after a target time is reached, and the results are calculated.
Procedure
-
If the automatically generated data does not meet your business needs for point queries, you can write your own business data. Create the table with properties such as a primary key, distribution key, and segment key, and then follow the steps below to run the point query performance test.
-
The custom table must contain a single-column primary key of the INT or BIGINT data type. The primary key values must be consecutive.
-
Create a test configuration file on the ECS instance.
-
Run the following command to create a file named test.conf.
vim test.conf -
Press
ito enter edit mode and add the following content.NoteThe following example uses a row-oriented table. To test a row-column hybrid table, change the value of the
orientationparameter torow,column.# Connection configuration holoClient.jdbcUrl=jdbc:hologres://<ENDPOINT>:<PORT>/<DBNAME> holoClient.username=<AccessKey_ID> holoClient.password=<AccessKey_Secret> holoClient.readThreadSize=32 # Test configuration get.threadSize=8 get.testTime=300000 get.tableName=kv_test get.async=true get.vacuumTableBeforeRun=true get.keyRangeParams=L1-200000000 # Table initialization configuration (effective only in PREPARE_GET_DATA mode) prepareGetData.rowNumber=200000000 prepareGetData.orientation=row put.columnCount=20 put.columnSize=20The following table describes the configuration parameters.
Module
Parameter
Description
Notes
Connection configuration
jdbcUrl
The JDBC connection string for Hologres. The format is
jdbc:hologres://<ENDPOINT>:<PORT>/<DBNAME>.For
ENDPOINT, enter the VPC domain name of the Hologres instance. You can obtain the domain name from the Network Information section on the Instance Details page in the Hologres console.username
The AccessKey ID of your Alibaba Cloud account.
To obtain an AccessKey ID, visit the AccessKey Management page.
password
The AccessKey secret of your Alibaba Cloud account.
To obtain an AccessKey secret, visit the AccessKey Management page.
readThreadSize
The number of connections for the query scenario.
-
In asynchronous mode, to improve batching efficiency, we recommend setting this to 2 to 4 times the number of threads (
threadSize). -
In synchronous mode, to ensure full utilization of computing resources, we recommend increasing the number of connections. In this topic, the number of connections is set to
100.
Test configuration
threadSize
The number of threads for generating request conditions.
For the 64-vCPU instance used in this test:
-
In asynchronous point query mode, we recommend setting the number of threads to 8.
-
In synchronous point query mode, we recommend setting the number of threads to 500.
Adjust this parameter based on your instance type and data features.
testTime
The target duration for the test, in milliseconds.
N/A.
tableName
The name of the target table for the test.
N/A.
async
Specifies whether the point query test is asynchronous.
Valid values:
-
true: asynchronous.
-
false: synchronous.
vacuumTableBeforeRun
Specifies whether to run the vacuum operation before the test starts.
Valid values:
-
true: Runs the VACUUM operation.
-
false: Does not run the VACUUM operation.
Running VACUUM forces a compaction operation.
keyRangeParams
The range of primary key parameters for the point query. The format is
<I/L><Start>-<End>.Parameter description:
-
I/L: I indicates the INT type, and L indicates the BIGINT type. -
Start: The starting value of the primary key. -
End: The ending value of the primary key.
During the point query test, the tool randomly generates a target primary key within the configured range for querying.
Table initialization configuration (effective only in PREPARE_GET_DATA mode)
rowNumber
The number of data rows to pre-generate for the table.
N/A.
orientation
The storage class of the table to pre-generate.
Valid values:
-
row: row-oriented table.
-
column: column-oriented table.
-
row,column: A table with rows and columns.
columnCount
The number of columns in the table to pre-generate. The data type of each column is TEXT.
N/A.
columnSize
The character length of each column in the table to pre-generate.
N/A.
-
-
Press the Esc key, enter
:wq, and press Enter to save and close the file.
-
-
Run the following statement to start the test:
NoteThe mode parameter for the point query scenario has two possible values:
PREPARE_GET_DATAfor the data preparation mode before a point query, andGETfor the point query mode. If you have already written business data or have completed the data write and update tests and retained the test table, you do not need to use the PREPARE_GET_DATA mode. You can use the GET mode directly for point queries.# Use PREPARE_GET_DATA mode to prepare data java -jar holo-e2e-performance-tool-1.0.0.jar test.conf PREPARE_GET_DATA # Use GET mode to run the point query test java -jar holo-e2e-performance-tool-1.0.0.jar test.conf GET -
View the test results.
Run the following command to view the test results.
cat result.csvThe
result.csvfile contains the following fields.Field
Description
start
The test start time.
end
The test end time.
count
The number of rows of test data.
qps1
The average QPS in the last 1 minute.
qps5
The average QPS in the last 5 minutes.
qps15
The average QPS in the last 15 minutes.
latencyMean
The average latency.
Collected in INSERT and GET modes.
latencyP99
The P99 latency.
Collected in INSERT and GET modes.
latencyP999
The P999 latency.
Collected in INSERT and GET modes.
version
The instance version.
Test results
This topic tests multiple scenarios. For data write and update scenarios, three storage classes are tested: row-oriented, column-oriented, and row-column hybrid. Because column-oriented tables are not suitable for point queries, only row-oriented and row-column hybrid storage classes are tested for the point query scenario.
The performance test was configured as follows: The number of columns in the table was set to 20, 50, and 100, and the size of each column was 20 characters. This test used a Hologres V3.0.22 instance (with JDK 11) with 64 vCPUs. The recommended total data volume for this instance type is between 40 million and 400 million rows. Therefore, the data volume for each scenario in this test was 200 million rows.
-
Parameters such as instance type, table schema, data volume, and concurrency can significantly affect performance. In your actual tests, first determine the Hologres instance type, table schema, and data volume, and then adjust the concurrency for multiple test runs.
-
If your goal is to test the optimal performance of Hologres, monitor the CPU utilization of the instance after each test. If CPU resources are not fully utilized, increase the concurrency and continue testing to obtain the most accurate and optimal performance results.
-
20 columns
The specific values for QPS and latency (for point queries), are shown below. The QPS is based on the
qps1metric and the latency is based on thelatencyMeanmetric in the test result file.Storage format
Test scenario
Concurrency configuration
QPS (in 10,000s)
Latency (ms)
Row-oriented
Write (Static copy)
threadSize=8
88.3
N/A
Global update (Fixed copy)
threadSize=8
91.1
N/A
Partial update (Fixed copy)
threadSize=8
136.7
N/A
Point query (asynchronous)
threadSize=8
readThreadSize=32
43.6
8.60
Point query (synchronous)
threadSize=100
readThreadSize=100
26.1
7.88
Column-oriented
Write (Fixed copy)
threadSize=8
29.9
N/A
Global update (Fixed copy)
threadSize=8
18.5
N/A
Partial update (Fixed copy)
threadSize=8
12.8
N/A
Row-column hybrid
Write (Fixed copy)
threadSize=8
25.0
N/A
Global update (Fixed copy)
threadSize=8
17.3
N/A
Partial update (Fixed copy)
threadSize=8
17.5
N/A
Point Query (Asynchronous)
threadSize=8
readThreadSize=32
40.4
8.75
Point Query (Synchronous)
threadSize=100
readThreadSize=100
27.0
8.12
-
50 columns
The following table shows the specific QPS and latency (for point queries) values from the test results. The QPS value is taken from the
qps1metric, and the latency value is taken from thelatencyMeanmetric in the result file.Storage format
Test scenario
Concurrency configuration
QPS (in 10,000s)
Latency (ms)
Row-oriented
Write (Fixed copy)
threadSize=8
41.9
N/A
Global update (Fixed copy)
threadSize=8
38.2
N/A
Partial update (Fixed copy)
threadSize=8
69.8
N/A
Point query (asynchronous)
threadSize=8
readThreadSize=32
35.7
10.20
Point Query (Synchronous)
threadSize=100
readThreadSize=100
24.0
9.28
Column-oriented
Write (Fixed copy)
threadSize=8
15.5
N/A
Global update (Fixed copy)
threadSize=8
11.7
N/A
Partial update (Fixed copy)
threadSize=8
6.4
N/A
Row-column hybrid
Write (Fixed copy)
threadSize=8
13.7
N/A
Global update (Fixed copy)
threadSize=8
10.5
N/A
Partial update (Fixed copy)
threadSize=8
11.6
N/A
Point Query (Asynchronous)
threadSize=8
readThreadSize=32
31.5
12.29
Point query (sync)
threadSize=100
readThreadSize=100
24.2
10.00
-
100 columns
The following table shows the specific QPS and latency (for point queries) values from the test results. The QPS value is taken from the
qps1metric, and the latency value is taken from thelatencyMeanmetric in the result file.Storage format
Test scenario
Concurrency configuration
QPS (in 10,000s)
Latency (ms)
Row-oriented
Write (Fixed copy)
threadSize=8
24.4
N/A
Global update (Fixed copy)
threadSize=8
22.5
N/A
Partial update (Fixed copy)
threadSize=8
38.1
N/A
Point query (asynchronous)
threadSize=8
readThreadSize=32
26.6
14.34
Point query (synchronous)
threadSize=100
readThreadSize=100
21.4
12.12
Column-oriented
Write (Fixed copy)
threadSize=8
8.4
N/A
Global update (Fixed copy)
threadSize=8
6.9
N/A
Partial update (Fixed copy)
threadSize=8
3.1
N/A
Row-column hybrid
Write (Fixed copy)
threadSize=8
7.6
N/A
Global update (Fixed copy)
threadSize=8
5.3
N/A
Partial update (Fixed copy)
threadSize=8
6.8
N/A
Point query (asynchronous)
threadSize=8
readThreadSize=32
25.0
15.66
Point query (synchronous)
threadSize=100
readThreadSize=100
21.4
13.79