This topic describes how to use VectorDBBench to benchmark the vector retrieval performance of PolarDB over the MySQL protocol and presents test results for various datasets and concurrency levels.
Test environment
PolarDB cluster specifications
Node role | Specification | Configuration | Quantity |
read-write node (RW) | polar.mysql.x4.large | 4 vCPU, 16 GB | 1 |
read-only node (hot standby) | polar.mysql.x4.large | 4 vCPU, 16 GB | 1 |
columnar index read-only node | polar.mysql.x4.4xlarge | 32 vCPU, 128 GB | 1 |
The vector data is stored in an InnoDB table that is created with COMMENT 'COLUMNAR=1'. The columnar index read-only node automatically synchronizes the data and builds an HNSW_FLAT vector index. Vector retrieval requests are sent to the columnar index read-only node.
Test client (ECS)
The test client is an ecs.g9i.4xlarge (16 vCPU / 64 GB memory) ECS instance. It is located in the same availability zone as the PolarDB cluster.
Test tool and datasets
VectorDBBench is an open-source vector database benchmark tool from Zilliz. It supports end-to-end performance evaluation for mainstream vector databases, covering data ingestion, index building, and vector retrieval. VectorDBBench provides the following subcommands for PolarDB, each mapped to a different vector index type:
polardbhnswflat: runs tests using the HNSW_FLAT index type.
polardbhnswpq: runs tests using the HNSW_PQ index type.
polardbhnswsq: runs tests using the HNSW_SQ index type.
This topic uses polardbhnswflat (HNSW_FLAT index) to run tests on the following three datasets:
Dataset | Number of vectors | Dimension | Case type |
Cohere 768D 1M | 1,000,000 | 768 | Performance768D1M |
Cohere 768D 10M | 10,000,000 | 768 | Performance768D10M |
OpenAI 1536D 5M | 5,000,000 | 1536 | Performance1536D5M |
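To get a feel for the scale of each dataset, the following minimal sketch estimates the raw size of the vectors alone. It assumes each dimension is stored as a 4-byte float32 value, which is an assumption for illustration and not a documented detail of how PolarDB stores vectors. Index overhead and other columns are not included.

```python
# Rough size of the raw vectors in each dataset.
# ASSUMPTION: float32 storage (4 bytes per dimension). This is illustrative
# only; HNSW graph overhead and row metadata are not counted.
BYTES_PER_FLOAT32 = 4

# (number of vectors, dimension) taken from the dataset table above.
datasets = {
    "Cohere 768D 1M": (1_000_000, 768),
    "Cohere 768D 10M": (10_000_000, 768),
    "OpenAI 1536D 5M": (5_000_000, 1536),
}


def raw_vector_gb(num_vectors: int, dim: int) -> float:
    """Return the raw vector payload in GB (10^9 bytes)."""
    return num_vectors * dim * BYTES_PER_FLOAT32 / 1e9


sizes = {name: raw_vector_gb(n, d) for name, (n, d) in datasets.items()}
for name, gb in sizes.items():
    print(f"{name}: {gb:.2f} GB")
```

Under this assumption, the 10M Cohere dataset and the 5M OpenAI dataset carry the same raw payload, because halving the vector count offsets doubling the dimension.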
Test steps
Install VectorDBBench
On the test client, run the following commands to install VectorDBBench:
git clone https://github.com/zilliztech/VectorDBBench.git
cd VectorDBBench
# Create and activate a venv
python3 -m venv .venv
source .venv/bin/activate
# Upgrade pip
pip install --upgrade pip
# Install VectorDBBench and the PolarDB dependencies
pip install -e '.[polardb]'
Run the benchmark
The following example command runs a benchmark on the Cohere 768D 1M dataset:
DATASET_LOCAL_DIR=/root/ \
DATASET_SOURCE=AliyunOSS \
NUM_PER_BATCH=64 \
vectordbbench polardbhnswflat \
--case-type Performance768D1M \
--username <user_name> \
--password <password> \
--host <host> \
--port 3306 \
--m 16 \
--ef-construction 256 \
--ef-search 256 \
--insert-workers 64 \
--num-concurrency '20,40,60,80,100' \
--concurrency-duration 60 \
--task-label test_ecs \
--db-label ecs_test \
--post-load-index
Parameter descriptions
Parameter | Description |
NUM_PER_BATCH | The batch size for a single INSERT statement during data ingestion. |
--case-type | Specifies the test dataset. Valid values: Performance768D1M, Performance768D10M, and Performance1536D5M. |
--m, --ef-construction | HNSW graph construction parameters. |
--ef-search | The search width during retrieval. A larger value usually increases recall but also increases latency. |
--insert-workers | The number of concurrent threads for data import. |
--num-concurrency | A comma-separated list of concurrency levels for the retrieval phase. |
--concurrency-duration | The duration in seconds for each concurrency level. |
--post-load-index | Imports all data first, then builds the vector index in a single operation. |
To test other datasets, change the --case-type value and adjust the --m, --ef-construction, and --ef-search parameters based on the dataset's characteristics.
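As an illustration of switching datasets, the following sketch assembles the benchmark command for a given case type. The per-dataset HNSW values are the ones reported in the test results of this topic, and the flags are the ones shown in the example command. The helper name build_command and the CASE_PARAMS dictionary are hypothetical and not part of VectorDBBench. The environment variables from the example command (DATASET_LOCAL_DIR, DATASET_SOURCE, and NUM_PER_BATCH) still need to be set separately.

```python
import shlex

# Per-dataset HNSW settings, copied from the test results in this topic.
CASE_PARAMS = {
    "Performance768D1M": {"m": 16, "ef_construction": 256, "ef_search": 256},
    "Performance768D10M": {"m": 16, "ef_construction": 500, "ef_search": 300},
    "Performance1536D5M": {"m": 16, "ef_construction": 256, "ef_search": 256},
}


def build_command(case_type, host, user, password, port=3306):
    """Hypothetical helper: return the vectordbbench argument list for one dataset."""
    p = CASE_PARAMS[case_type]
    return [
        "vectordbbench", "polardbhnswflat",
        "--case-type", case_type,
        "--username", user,
        "--password", password,
        "--host", host,
        "--port", str(port),
        "--m", str(p["m"]),
        "--ef-construction", str(p["ef_construction"]),
        "--ef-search", str(p["ef_search"]),
        "--insert-workers", "64",
        "--num-concurrency", "20,40,60,80,100",
        "--concurrency-duration", "60",
        "--task-label", "test_ecs",
        "--db-label", "ecs_test",
        "--post-load-index",
    ]


# Placeholders are kept as-is; replace them with real connection details.
cmd = build_command("Performance768D10M", "<host>", "<user_name>", "<password>")
print(shlex.join(cmd))
```

Returning a list rather than a single string keeps each argument separate, so the same list can be passed to subprocess.run without additional quoting.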
Test results
Cohere 768D 1M dataset
Test parameters:
--m 16 --ef-construction 256 --ef-search 256
Key metrics:
Metric | Value |
Recall@100 | 0.9612 |
Single-thread average latency | 2.6 ms |
Single-thread p95 latency | 3.0 ms |
Single-thread p99 latency | 3.3 ms |
Index building time (optimize) | 76.80 s |
Peak QPS (concurrency=100) | 13060.41 |
Performance at different concurrency levels:
Concurrency | QPS | Average latency (ms) | p95 latency (ms) | p99 latency (ms) |
20 | 5771.52 | 3.46 | 4.80 | 6.48 |
40 | 8901.77 | 4.48 | 7.29 | 9.55 |
60 | 11932.17 | 5.01 | 9.19 | 12.51 |
80 | 12589.50 | 6.33 | 11.97 | 15.96 |
100 | 13060.41 | 7.62 | 14.33 | 19.50 |
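The QPS and average latency columns can be cross-checked against each other. In a closed-loop test where each client sends its next request as soon as the previous one returns, Little's law says that QPS multiplied by average latency should roughly equal the number of concurrent clients. The following minimal sketch applies that check to the Cohere 768D 1M results. The closed-loop behavior is an assumption about how the load is generated, and the function name in_flight is only illustrative.

```python
# (concurrency, QPS, average latency in ms) for Cohere 768D 1M, from the results above.
rows = [
    (20, 5771.52, 3.46),
    (40, 8901.77, 4.48),
    (60, 11932.17, 5.01),
    (80, 12589.50, 6.33),
    (100, 13060.41, 7.62),
]


def in_flight(qps: float, avg_latency_ms: float) -> float:
    """Little's law: average requests in flight = throughput x latency."""
    return qps * avg_latency_ms / 1000.0


estimates = [(c, in_flight(q, lat)) for c, q, lat in rows]
for c, est in estimates:
    print(f"concurrency={c:3d}  QPS x latency = {est:6.2f}")
```

Each estimate lands within 1% of the configured concurrency, which suggests that the reported throughput and latency figures are mutually consistent.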
Cohere 768D 10M dataset
Test parameters:
--m 16 --ef-construction 500 --ef-search 300
Key metrics:
Metric | Value |
Recall@100 | 0.9551 |
Single-thread average latency | 3.1 ms |
Single-thread p95 latency | 3.7 ms |
Single-thread p99 latency | 4.2 ms |
Index building time (optimize) | 1625.95 s |
Peak QPS (concurrency=100) | 10174.35 |
Performance at different concurrency levels:
Concurrency | QPS | Average latency (ms) | p95 latency (ms) | p99 latency (ms) |
20 | 4859.82 | 4.11 | 5.78 | 7.72 |
40 | 6556.74 | 6.09 | 9.95 | 12.55 |
60 | 9293.53 | 6.44 | 12.25 | 16.45 |
80 | 10063.69 | 7.93 | 15.65 | 21.15 |
100 | 10174.35 | 9.80 | 18.99 | 26.17 |
OpenAI 1536D 5M dataset
Test parameters:
--m 16 --ef-construction 256 --ef-search 256
Key metrics:
Metric | Value |
Recall@100 | 0.9676 |
Single-thread average latency | 3.4 ms |
Single-thread p95 latency | 3.9 ms |
Single-thread p99 latency | 4.4 ms |
Index building time (optimize) | 1508.41 s |
Peak QPS (concurrency=100) | 9130.94 |
Performance at different concurrency levels:
Concurrency | QPS | Average latency (ms) | p95 latency (ms) | p99 latency (ms) |
20 | 4597.60 | 4.34 | 5.84 | 7.49 |
40 | 6539.36 | 6.11 | 9.70 | 12.34 |
60 | 8438.65 | 7.09 | 13.17 | 17.72 |
80 | 8996.89 | 8.86 | 17.42 | 23.38 |
100 | 9130.94 | 10.91 | 21.08 | 28.63 |
Summary and comparison
Dataset | Number of vectors | Vector dimension | Recall@100 | Single-thread p99 latency (ms) | Peak QPS (concurrency=100) | Index building time (s) |
Cohere 768D 1M | 1,000,000 | 768 | 0.9612 | 3.3 | 13060.41 | 76.80 |
Cohere 768D 10M | 10,000,000 | 768 | 0.9551 | 4.2 | 10174.35 | 1625.95 |
OpenAI 1536D 5M | 5,000,000 | 1536 | 0.9676 | 4.4 | 9130.94 | 1508.41 |
Conclusion
High recall with low latency
Across all three datasets, Recall@100 exceeds 0.95, with single-thread p99 latency between 3.3 ms and 4.4 ms. This demonstrates that the system achieves millisecond-level response times without sacrificing high recall.
Excellent concurrency scalability
As concurrency increases from 20 to 100, QPS rises at every step on all three datasets, although the gains taper off above 60 concurrency. With the Cohere 768D 1M dataset, QPS grows about 2.3 times, from 5,771.52 at 20 concurrency to 13,060.41 at 100 concurrency, while p99 latency stays at 19.50 ms.
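One way to quantify how throughput scales is to compare the measured QPS gain with the gain that perfectly linear scaling would deliver. The following minimal sketch does this for the Cohere 768D 1M results, using 20 concurrency as the baseline. The metric name scaling_efficiency is an illustrative choice and not something VectorDBBench reports.

```python
# QPS per concurrency level for Cohere 768D 1M, from the test results.
qps_by_concurrency = {
    20: 5771.52,
    40: 8901.77,
    60: 11932.17,
    80: 12589.50,
    100: 13060.41,
}

BASE_CONCURRENCY = 20
base_qps = qps_by_concurrency[BASE_CONCURRENCY]


def scaling_efficiency(concurrency: int) -> float:
    """Measured speedup divided by the ideal (linear) speedup over the baseline."""
    speedup = qps_by_concurrency[concurrency] / base_qps
    ideal = concurrency / BASE_CONCURRENCY
    return speedup / ideal


efficiency = {c: scaling_efficiency(c) for c in qps_by_concurrency}
for c, eff in efficiency.items():
    print(f"concurrency={c:3d}  efficiency vs. linear = {eff:.0%}")
```

An efficiency of 100% would mean QPS grows in proportion to concurrency. The value falls as concurrency rises, which is the expected pattern once the serving node approaches saturation.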
High throughput for high-dimensional, large-scale workloads
For the OpenAI 1536D 5M dataset (1536 dimensions, 5 million vectors), the benchmark reaches 9,130.94 QPS at 100 concurrency with a p99 latency of 28.63 ms. This indicates that PolarDB delivers stable retrieval performance in high-dimensional, large-scale scenarios.
Efficient index building
Building a vector index for 1 million vectors takes only 76.80 s. For 10 million vectors, it takes 1,625.95 s (about 27 minutes). This index building speed meets the demands of production environments.
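To compare index building speed across datasets of different sizes, the build times can be converted to vectors indexed per second. The following minimal sketch uses the vector counts and index building times reported in this topic. The derived rates are simple arithmetic on those figures, not separately measured values.

```python
# (number of vectors, index building time in seconds) from the summary table.
builds = {
    "Cohere 768D 1M": (1_000_000, 76.80),
    "Cohere 768D 10M": (10_000_000, 1625.95),
    "OpenAI 1536D 5M": (5_000_000, 1508.41),
}

# Vectors indexed per second for each dataset.
rate = {name: n / secs for name, (n, secs) in builds.items()}

for name, (n, secs) in builds.items():
    print(f"{name}: {secs / 60:.1f} min, {rate[name]:,.0f} vectors/s")
```

The per-vector cost is higher for the larger and higher-dimensional datasets. Note that the 10M run also uses a larger --ef-construction value (500 instead of 256), so the rates are not a like-for-like comparison.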
Engineering-friendly design
PolarDB uses a columnar index read-only node to automatically synchronize data and build the vector index. You do not need to deploy a standalone vector database. Vector retrieval over the standard MySQL protocol reduces operational and integration overhead.