Preparations
We recommend performing testing in a Virtual Private Cloud (VPC) environment.
For more information about VPCs, see What is a VPC.
For more information about how to perform stress testing over the Internet, see Configure the public access whitelist.
Download the gist-960-euclidean dataset:


Decompress the package and use the gist_base.fvecs file.
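To make the layout of gist_base.fvecs concrete, here is a minimal sketch of a reader. It relies on the commonly documented .fvecs layout, in which each vector is stored as a 4-byte little-endian integer giving the dimension, followed by that many 4-byte floats. The function name iter_fvecs is hypothetical and is not part of prepare_data.py. The sketch uses only the standard library, so it runs before the libraries listed below are installed.

```python
import struct


def iter_fvecs(path):
    """Yield each vector in a .fvecs file as a list of floats.

    Assumed record layout: <int32 dim><float32 * dim>, little-endian.
    """
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break  # clean end of file
            if len(header) < 4:
                raise ValueError("truncated dimension header")
            (dim,) = struct.unpack("<i", header)
            payload = f.read(4 * dim)
            if len(payload) != 4 * dim:
                raise ValueError("truncated vector payload")
            yield list(struct.unpack(f"<{dim}f", payload))
```

For example, `next(iter_fvecs("gist_base.fvecs"))` would return the first vector, which for this dataset should have 960 components.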
Install Python 3 and related libraries.
h5py
json
numpy
sklearn
alibabacloud_ha3engine_vector
Generate data
Run prepare_data.py. The script supports vector data in .hdf5, .fvecs, .bvecs, and .ivecs formats. This example uses .hdf5.
python3 prepare_data.py -i ./gist-960-euclidean.hdf5
View the data/ subdirectory in the script directory. The gist-960-euclidean.hdf5.data file is generated.
Check the number of data rows.
wc -l data/gist-960-euclidean.hdf5.data
1000000 gist-960-euclidean.hdf5.data
Purchase an OpenSearch Vector Search Edition instance
For more information, see Purchase an OpenSearch Vector Search Edition instance.
Create a table
References:
Generate queries
Run the prepare_query.py script to randomly generate queries from the raw data.
python3 prepare_query.py -i gist-960-euclidean.hdf5 -c 10000 -t gist
Obtain the query.data file that is generated in the data/ subdirectory.
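The idea behind generating queries from the raw data can be sketched as follows. This is not the code of prepare_query.py. The function name build_queries, the JSON field names tableName, vector, and topK, and the reading of -c as the query count and -t as the table name are all assumptions made for illustration. The real script and request format may differ.

```python
import json
import random


def build_queries(vectors, count, table_name, top_k=10, seed=None):
    """Randomly pick up to `count` vectors and serialize one JSON body per query.

    The body fields (tableName, vector, topK) are assumed, not confirmed.
    """
    rng = random.Random(seed)
    picked = rng.sample(vectors, min(count, len(vectors)))
    return [
        json.dumps({"tableName": table_name, "vector": vec, "topK": top_k})
        for vec in picked
    ]
```

Each returned string could then be written as one line of a query file, so that a load generator can pick one line per request.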
Stress testing using wrk
wrk is an open source HTTP request stress testing tool.
Download wrk from https://github.com/wg/wrk.
git clone https://github.com/wg/wrk.git
Run search.lua for stress testing.
Copy the script to the wrk/scripts/ directory.
cp search.lua wrk/scripts/
Calculate the signature and modify headers["Authorization"] in the request method.
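The placeholder value begins with "Basic", which suggests the standard HTTP Basic scheme: the Base64 encoding of "username:password". The sketch below computes such a value under that assumption. The function name basic_auth_header is hypothetical, and whether the instance expects exactly this form should be checked against the instance's access settings.

```python
import base64


def basic_auth_header(username, password):
    """Return an HTTP Basic Authorization header value for the given credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"
```

The returned string would take the place of the "Basic xxxx" placeholder in the script.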
-- During execution, wrk randomly selects queries to construct specific requests
request = function ()
    local query = query_table[count]
    count = (count + 1) % query_count
    local headers = {}
    headers["Authorization"] = "Basic xxxx" -- Signature information
    headers["Content-Type"] = "application/json"
    return wrk.format("POST", nil, headers, query)
end
Perform stress testing.
-c: the number of concurrent connections.
-t: the number of threads for sending the requests.
-d: the specified duration for stress testing.
-s: the specified script.
--latency: displays the detailed stress testing results.
./wrk -c24 -d100s -t8 -s scripts/search.lua http://ha-cn-xxxxxx.ha.aliyuncs.com/vector-service/query --latency
View metrics
View metrics, such as the recall rate and response time.
For more information, see Authorize RAM users to view instance monitoring metrics.
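To check the recall rate offline against the dataset's ground truth, a common definition of recall@k is the fraction of the true k nearest neighbors that appear among the k returned results, averaged over all queries. The sketch below implements that definition. The function name recall_at_k and the input shapes (one list of IDs per query) are assumptions, and the console may compute its metric differently.

```python
def recall_at_k(retrieved, groundtruth, k=10):
    """Average fraction of the true top-k IDs found in the returned top-k IDs.

    retrieved and groundtruth each hold one list of IDs per query.
    """
    if len(retrieved) != len(groundtruth):
        raise ValueError("retrieved and groundtruth must cover the same queries")
    if not retrieved:
        raise ValueError("at least one query is required")
    hits = 0
    for got, truth in zip(retrieved, groundtruth):
        hits += len(set(got[:k]) & set(truth[:k]))
    return hits / (k * len(retrieved))
```

Under this definition, a value of 0.95 with k=10 means that, on average, 9.5 of the 10 true nearest neighbors are returned per query.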
Script download links
Performance testing data
The following table shows the results of Vector Search Edition on 8th-generation machines with the ANN_GIST1M 960-dimensional dataset.
Product | OpenSearch (stress testing with the 1000 queries included in gist; ef adjusted according to groundtruth) | |
Test dataset | ANN_GIST1M 960-dimensional (http://corpus-texmex.irisa.fr/) | |
Product version | OpenSearch-Vector Search Edition 2024.11 vector_service_1.4.0_test_202411081507 | OpenSearch-Vector Search Edition 2024.11 vector_service_1.3.0_202410081048 |
Machine specifications | 16 cores, 64 GB, ecs.g8i.4xlarge, 8th-generation machine | |
Test tool | wrk (https://github.com/wg/wrk); stress test parameters: Threads: 10, Connections: 30 | wrk (https://github.com/wg/wrk); stress test parameters: Threads: 10, Connections: 40 |
Parameters | m: 100, ef_construction: 500 | |
Vector algorithm | HNSW | QGRAPH (HNSW+quantization) |
top10 recall@95 | Query parameters: ef=60; QPS: 4486.06; Latency (avg): 6.7 ms; CPU load: 95.8% | Query parameters: ef=40 (recall@94): QPS: 4957.48; Latency (avg): 7.6 ms; CPU load: 93%. ef=80 (recall@95): QPS: 3700.74; Latency (avg): 8.19 ms; CPU load: 92% |
top10 recall@99 | Query parameters: ef=170; QPS: 2868.77; Latency (avg): 10.44 ms; CPU load: 94% | Maximum recall is 95.8, cannot reach 99 |
top10 recall@99.5 | Query parameters: ef=300, Threads: 10, Connections: 20; QPS: 2050.66; Latency (avg): 9.73 ms; CPU load: 95% | |
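As a rough consistency check on the HNSW figures, a closed-loop load generator such as wrk keeps one request in flight per connection, so throughput is approximately the number of connections divided by the average latency (Little's law). The sketch below applies that estimate. The function name expected_qps is hypothetical, and the estimate ignores client-side overhead, so it should be read only as an approximation.

```python
def expected_qps(connections, avg_latency_ms):
    """Estimate closed-loop throughput as connections / average latency."""
    if avg_latency_ms <= 0:
        raise ValueError("average latency must be positive")
    return connections / (avg_latency_ms / 1000.0)


# (connections, average latency in ms, reported QPS) from the HNSW column
hnsw_rows = [(30, 6.7, 4486.06), (30, 10.44, 2868.77), (20, 9.73, 2050.66)]
for connections, latency_ms, reported in hnsw_rows:
    estimate = expected_qps(connections, latency_ms)
    print(f"reported {reported:.2f}, estimated {estimate:.2f}")
```

For the three HNSW rows the estimates come out at roughly 4478, 2874, and 2055 QPS, each within about 0.3% of the reported values. The QGRAPH figures deviate more from this estimate, so the check does not transfer directly to that column.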