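AnalyticDB for PostgreSQL: Vector analysis performance testing

Last updated: Dec 29, 2023

This topic describes performance testing of vector analysis in AnalyticDB for PostgreSQL.

Test environment

The AnalyticDB for PostgreSQL instance and the client ECS instance should be in the same VPC, so that network fluctuations do not skew the results.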

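AnalyticDB for PostgreSQL server specifications

| Engine version | Node specifications (High-performance Edition) | Number of compute nodes | Storage per compute node | Storage type of compute nodes |
| --- | --- | --- | --- | --- |
| v6.6.1.0 | 16C64G | 4 | 1000 GB | ESSD cloud disk PL1 |
| v6.6.1.0 | 8C32G | 4 | 1000 GB | ESSD cloud disk PL1 |
| v6.6.1.0 | 4C16G | 4 | 1000 GB | ESSD cloud disk PL1 |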

Client ECS specifications

| CPU | Memory | Disk |
| --- | --- | --- |
| 16 cores | 32 GB | 2 TB |

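Preparations

Prepare the test environment

  1. Install Python 3.8 or later on your local machine.

  2. Download the ann-benchmark test tool adapted for AnalyticDB for PostgreSQL to your local machine. For the download link, see ann-benchmark.

  3. Run the following command to install the dependencies of the test tool.

    pip install -r requirements.txt

  4. Install Docker 20 or later. For more information, see the official Docker installation guide.

  5. Run the following command to build the test image.

    python install.py --proc 4 --algorithm adbpg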

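Prepare the test datasets

Download the required datasets and place them in the data directory of the ann-benchmarks project.

| Dataset | Dimensions | Number of samples | Metric | dataset parameter | Download link |
| --- | --- | --- | --- | --- | --- |
| GIST | 960 | 1,000,000 | L2 (Euclidean) distance | gist-960-euclidean | GIST |
| SIFT-10M | 128 | 10,000,000 | L2 (Euclidean) distance | sift-128-euclidean | SIFT-10M |
| SIFT-100M | 128 | 100,000,000 | L2 (Euclidean) distance | sift100m-128-euclidean | SIFT-100M |
| Deep | 96 | 10,000,000 | Cosine similarity | deep-image-96-angular | Deep |
| Cohere | 768 | 1,000,000 | L2 (Euclidean) distance | cohere-768-euclidean | Cohere |
| Dbpedia | 1536 | 1,000,000 | Cosine similarity | dbpedia-openai-1000k-angular | Dbpedia |
| Glove | 200 | 1,180,000 | Cosine similarity | glove-200-angular | Glove |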
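Before starting a run, it can help to confirm that the dataset files are where the tool will look for them. The following is a minimal sketch, assuming the adapted tool follows the upstream ann-benchmarks convention of storing each dataset as data/<dataset>.hdf5. The helper name missing_datasets is hypothetical and is not part of the test tool.

```python
from pathlib import Path

# Values of the dataset parameter, as listed in the dataset table.
DATASETS = [
    "gist-960-euclidean",
    "sift-128-euclidean",
    "sift100m-128-euclidean",
    "deep-image-96-angular",
    "cohere-768-euclidean",
    "dbpedia-openai-1000k-angular",
    "glove-200-angular",
]


def missing_datasets(data_dir, names=DATASETS):
    """Return the dataset names that have no <name>.hdf5 file in data_dir.

    Assumption: files are named <dataset>.hdf5, as in upstream ann-benchmarks.
    """
    root = Path(data_dir)
    return [name for name in names if not (root / f"{name}.hdf5").is_file()]
```

For example, calling missing_datasets("data", ["gist-960-euclidean"]) from the project root returns an empty list once the GIST file is in place.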

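Test procedure

Step 1: Configure the connection information of the test tool

Edit the ann_benchmarks/algorithms/adbpg/module.py file of the test tool and fill in the configuration based on your environment:

# Internal endpoint of the AnalyticDB for PostgreSQL instance.
self._host = 'gp-bp10ofhzg2z****-master.gpdb.rds.aliyuncs.com'

# Port number of the AnalyticDB for PostgreSQL instance.
self._port = 5432

# Database name of the AnalyticDB for PostgreSQL instance.
self._dbname = '<database_name>'

# Database account of the AnalyticDB for PostgreSQL instance.
self._user = '<user_name>'

# Password of the database account.
self._password = '<your_password>'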
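Because the test uses the internal endpoint, a failed run is often just a network reachability problem between the client ECS instance and the database instance. The sketch below is one way to check this with the Python standard library before launching a benchmark. The function name can_reach is illustrative, and the check only confirms that a TCP connection can be opened; it does not validate the database name, account, or password.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Call it with the same host and port that you put in module.py, for example can_reach(host, 5432).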

Step 2: Configure the test parameters

Edit the ann_benchmarks/algorithms/adbpg/config.yml file of the test tool based on the test dataset.

float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 12}]
        query_args: [[ 
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 1},
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 5},
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 10},
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 15}, 
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 20},
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 25},
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 30}, 
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 50}]]

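arg_groups: parameters for index creation. For information about how to create a vector index, see Create a vector index.

| Parameter | Description |
| --- | --- |
| M | The M parameter of the HNSW index. A larger M makes the build slower and the built index more accurate. |
| efConstruction | The HNSW index parameter that controls search quality during index construction. |
| parallel_build | The degree of parallelism for index building. Typically set to the number of CPU cores of a compute node. |
| external_storage | How the index is cached. 1: cache the index with mmap. 0: cache the index in shared_buffer. |
| pq_enable | Whether to enable product quantization (PQ). 1: enable PQ. 0: disable PQ. |
| pq_segments | The number of segments that PQ splits each vector into. Typically set to the vector dimension divided by 8 (dim/8). |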
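The dim/8 rule of thumb for pq_segments can be written as a one-line helper. This is a minimal sketch; the function name is hypothetical, and rounding down for dimensions that are not a multiple of 8 is an assumption, because the source only gives the rule for dimensions that divide evenly.

```python
def suggest_pq_segments(dim):
    """Return the rule-of-thumb PQ segment count (dim / 8) for a vector dimension.

    Assumption: dimensions that are not a multiple of 8 are rounded down.
    """
    if dim < 8:
        raise ValueError("the dim/8 rule needs a dimension of at least 8")
    return dim // 8


# Dimensions of the datasets used in this topic.
for dim in (96, 128, 200, 768, 960, 1536):
    print(dim, suggest_pq_segments(dim))
```

For the dimensions used in this topic, the results (12, 16, 25, 96, 120, and 192) are the same as the pq_segments values in the reference configurations.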

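query_args: parameters for retrieval.

| Parameter | Description |
| --- | --- |
| ef_search | The number of candidate nearest neighbors kept during an HNSW search. |
| max_scan_points | The maximum number of samples that the index scans during a search. |
| pq_amp | The retrieval amplification factor when PQ is enabled. Has no effect when PQ is disabled. |
| parallel | The number of concurrent queries. Takes effect only in Batch mode. |

During testing, fine-tune the parameters above so that recall reaches 95%. For the test datasets listed above, AnalyticDB for PostgreSQL provides the following reference configurations. Choose the configuration that matches your dataset.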

# for sift 128 10m
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 16}]
        query_args: [[ {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 16}]
        query_args: [[ {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 16}]
        query_args: [[ {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 50}]]



# for gist 960
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 120}]
        query_args: [[ {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 120}]
        query_args: [[ {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 120}]
        query_args: [[ {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 50}]]


# for deep 96
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 12}]
        query_args: [[ {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 12}]
        query_args: [[ {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 12}]
        query_args: [[ {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 50}]]


# for sift 128 100M
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 16}]
        query_args: [[ {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 16}]
        query_args: [[ {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 16}]
        query_args: [[ {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 50}]]


# for glove 200
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 25}]
        query_args: [[ {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 25}]
        query_args: [[ {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 25}]
        query_args: [[ {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 50}]]


# for cohere 768
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 96}]
        query_args: [[ {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 96}]
        query_args: [[ {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 96}]
        query_args: [[ {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 50}]]


# for dbpedia 1536
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 192}]
        query_args: [[ {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 192}]
        query_args: [[ {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 192}]
        query_args: [[ {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 50}]]
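In the reference configurations, each query_args list repeats the same search settings and varies only parallel. When you tune max_scan_points for your own data, generating the list may be less error-prone than editing eight entries by hand. The following sketch is an illustration only: build_query_args and PARALLEL_LEVELS are hypothetical names, not part of the test tool.

```python
import json

# Concurrency levels used in the reference configurations.
PARALLEL_LEVELS = [1, 5, 10, 15, 20, 25, 30, 50]


def build_query_args(ef_search, max_scan_points, pq_amp=10, parallels=PARALLEL_LEVELS):
    """Build the nested query_args list: one dict per concurrency level."""
    return [[
        {
            "ef_search": ef_search,
            "max_scan_points": max_scan_points,
            "pq_amp": pq_amp,
            "parallel": p,
        }
        for p in parallels
    ]]


# Example: the GIST search settings (ef_search 400, max_scan_points 2000).
print(json.dumps(build_query_args(400, 2000)))
```

The printed JSON uses the same flow-style brackets and braces as the query_args lines in config.yml, with quoted keys, so it should be usable as the value of query_args. Check the result against your YAML parser before relying on it.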

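Step 3: Test the retrieval recall

After the parameters are configured, run the following command to test recall:

nohup python run.py --algorithm adbpg --dataset <dataset> --runs 1 --timeout 990000 > annbenchmark_deep.log 2>&1 &

Note

<dataset>: replace with the dataset parameter of the dataset under test.

After the test finishes, run the following command to view the recall results:

python plot.py --dataset <dataset> --recompute

Sample output:

0:    ADBPG(m=64, ef_construction=600, ef_search=400, max_scan_point=500, pq_amp=10)        recall: 0.963       qps: 126.200
1:   ADBPG(m=64, ef_construction=600, ef_search=400, max_scan_point=1000, pq_amp=10)        recall: 0.992       qps: 122.665

Check whether the recall meets expectations. If it does not, adjust the parameters and run the test again.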
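The recall check can be scripted when many parameter combinations are compared. The following is a minimal sketch that parses lines in the format of the sample output and keeps the configurations that reach a recall target. The function names are illustrative, and the regular expression assumes the output format shown in this topic, which may differ between versions of the tool.

```python
import re

# Matches lines such as:
# 0:    ADBPG(m=64, ...)        recall: 0.963       qps: 126.200
LINE_RE = re.compile(
    r"^\s*\d+:\s+(?P<config>.+?)\s+recall:\s*(?P<recall>\d+(?:\.\d+)?)"
    r"\s+qps:\s*(?P<qps>\d+(?:\.\d+)?)\s*$"
)


def parse_results(text):
    """Return a list of (config, recall, qps) tuples found in the plot.py output."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append((m.group("config"), float(m.group("recall")), float(m.group("qps"))))
    return rows


def meets_target(rows, target=0.95):
    """Keep only the rows whose recall is at least the target."""
    return [row for row in rows if row[1] >= target]
```

Among the rows that meet the target, the one with the highest QPS is usually the configuration to carry into the performance test.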

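Step 4: Test the retrieval performance

After recall is tuned, run the performance test. The procedure is similar to the recall test, except that Batch mode must be enabled to measure concurrent performance:

nohup python run.py --algorithm adbpg --dataset <dataset> --runs 1 --timeout 990000 --batch > annbenchmark_deep.log 2>&1 &

After the test finishes, open the output file annbenchmark_deep.log to view the QPS, mean RT, and P99 RT at each concurrency level: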

2023-12-20 17:31:39,297 - INFO - query using 25 parallel
worker 0 cost 9.50 s, qps 315.92, mean rt 0.00317, p99 rt 0.00951
2023-12-20 17:31:49,097 - INFO - QPS: 7653.155
2023-12-20 17:31:49,113 - INFO - query using 30 parallel
worker 0 cost 13.87 s, qps 216.36, mean rt 0.00462, p99 rt 0.04298
2023-12-20 17:32:03,260 - INFO - QPS: 6361.819
2023-12-20 17:32:03,281 - INFO - query using 50 parallel
worker 0 cost 20.78 s, qps 144.36, mean rt 0.00693, p99 rt 0.02735
2023-12-20 17:32:24,385 - INFO - QPS: 7107.920
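To collect the aggregate QPS for each concurrency level from the log, a small parser is usually enough. The sketch below assumes the log format shown in the sample above: a "query using N parallel" line followed later by an "INFO - QPS:" line. The function name is hypothetical, and the per-worker lines, which report qps in lowercase, are deliberately ignored.

```python
import re

PARALLEL_RE = re.compile(r"query using (\d+) parallel")
QPS_RE = re.compile(r"- INFO - QPS: (\d+(?:\.\d+)?)")


def qps_by_concurrency(log_text):
    """Map each concurrency level to the aggregate QPS reported after it."""
    result = {}
    current = None
    for line in log_text.splitlines():
        m = PARALLEL_RE.search(line)
        if m:
            current = int(m.group(1))
            continue
        m = QPS_RE.search(line)
        if m and current is not None:
            result[current] = float(m.group(1))
            current = None
    return result
```

Applied to the sample log above, this gives one entry each for concurrency levels 25, 30, and 50.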

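Test results

The following sections list the performance of different datasets on different AnalyticDB for PostgreSQL vector database configurations. In every case, recall was tuned to 95% and each query retrieved the top 10 results. The index build modes are described below:

| Index build mode | Description | Scenario |
| --- | --- | --- |
| PQ + mmap | Uses mmap for caching and persistent storage of the vector index, and uses PQ to compress vector encodings and accelerate vector computation. | More than 500,000 rows, and data does not need to be updated or deleted. |
| PQ + shared_buffer | Uses the native PostgreSQL shared_buffer mechanism to cache the vector index, and uses PQ to compress vector encodings and accelerate vector computation. | More than 500,000 rows, and data needs to be updated or deleted. |
| noPQ + shared_buffer | Uses the native PostgreSQL shared_buffer mechanism to cache the vector index, without PQ compression or PQ-accelerated computation. | Fewer than 500,000 rows. |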
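The scenario guidance above can be expressed as a small decision function that returns the matching external_storage and pq_enable settings. This is a sketch under stated assumptions: the function name is hypothetical, and the source does not say which mode applies at exactly 500,000 rows, so that case is treated here like the small-data case.

```python
def choose_index_mode(row_count, needs_update_delete):
    """Pick index build settings that follow the scenario guidance.

    Assumption: exactly 500,000 rows is treated like the small-data case,
    because the guidance does not cover that boundary.
    """
    if row_count <= 500_000:
        # noPQ + shared_buffer
        return {"external_storage": 0, "pq_enable": 0}
    if needs_update_delete:
        # PQ + shared_buffer
        return {"external_storage": 0, "pq_enable": 1}
    # PQ + mmap
    return {"external_storage": 1, "pq_enable": 1}
```

For example, a table of 10,000,000 rows that is loaded once and never modified maps to PQ + mmap.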

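Instance specifications: 16C64G * 4 segments

Dataset: GIST L2 (960 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 120

Search parameters: ef_search: 400, max_scan_points: 2000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 261 | 1 | 242 | 3 | 4 |
| 261 | 5 | 1068 | 4 | 6 |
| 261 | 10 | 1673 | 5 | 7 |
| 261 | 15 | 2158 | 6 | 7 |
| 261 | 20 | 2492 | 7 | 13 |
| 261 | 25 | 2405 | 9 | 19 |
| 261 | 30 | 2423 | 11 | 24 |
| 261 | 50 | 2453 | 19 | 43 |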
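As a rough cross-check when reading a results table, throughput in a closed-loop test is approximately the concurrency divided by the mean response time (Little's law). The sketch below assumes that each concurrent client issues its next query as soon as the previous one returns, which the source does not state explicitly. Because mean RT is rounded to whole milliseconds in the tables, only approximate agreement with the reported QPS should be expected. The function name is illustrative.

```python
def approx_qps(concurrency, mean_rt_ms):
    """Estimate closed-loop throughput as concurrency / mean response time."""
    if mean_rt_ms <= 0:
        raise ValueError("mean_rt_ms must be positive")
    return concurrency * 1000.0 / mean_rt_ms


# GIST, PQ + mmap, 16C64G: concurrency 5 with a mean RT of 4 ms.
print(approx_qps(5, 4))
```

The estimate for this row is 1250 queries per second, against a measured 1068, which is within what the millisecond rounding of mean RT allows.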

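Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 120

Search parameters: ef_search: 400, max_scan_points: 2000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 505 | 1 | 161 | 5 | 7 |
| 505 | 5 | 771 | 5 | 8 |
| 505 | 10 | 1111 | 8 | 10 |
| 505 | 15 | 1380 | 10 | 12 |
| 505 | 20 | 1395 | 13 | 23 |
| 505 | 25 | 1422 | 16 | 31 |
| 505 | 30 | 1412 | 20 | 39 |
| 505 | 50 | 1450 | 34 | 70 |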

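Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 2000

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1102 | 1 | 130 | 7 | 10 |
| 1102 | 5 | 425 | 11 | 14 |
| 1102 | 10 | 678 | 14 | 16 |
| 1102 | 15 | 901 | 16 | 19 |
| 1102 | 20 | 963 | 20 | 31 |
| 1102 | 25 | 979 | 24 | 38 |
| 1102 | 30 | 978 | 30 | 48 |
| 1102 | 50 | 969 | 51 | 98 |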

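Dataset: Deep1B cosine (96 * 10M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1200, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1515 | 1 | 451 | 2 | 2 |
| 1515 | 5 | 2279 | 2 | 3 |
| 1515 | 10 | 4059 | 2 | 5 |
| 1515 | 15 | 6274 | 2 | 5 |
| 1515 | 20 | 8198 | 2 | 5 |
| 1515 | 25 | 8347 | 3 | 6 |
| 1515 | 30 | 8411 | 3 | 8 |
| 1515 | 50 | 8178 | 6 | 22 |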

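Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1200, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 2824 | 1 | 361 | 2 | 3 |
| 2824 | 5 | 1439 | 3 | 4 |
| 2824 | 10 | 2261 | 4 | 5 |
| 2824 | 15 | 2958 | 5 | 6 |
| 2824 | 20 | 3164 | 6 | 11 |
| 2824 | 25 | 3168 | 7 | 15 |
| 2824 | 30 | 3170 | 9 | 19 |
| 2824 | 50 | 3156 | 15 | 36 |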

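Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 1200

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 3141 | 1 | 423 | 2 | 2 |
| 3141 | 5 | 1240 | 3 | 4 |
| 3141 | 10 | 2453 | 4 | 4 |
| 3141 | 15 | 3144 | 4 | 6 |
| 3141 | 20 | 3322 | 6 | 11 |
| 3141 | 25 | 3332 | 7 | 15 |
| 3141 | 30 | 3350 | 8 | 19 |
| 3141 | 50 | 3347 | 14 | 35 |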

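Dataset: SIFT L2 (128 * 100M)

Index build mode: PQ + mmap

Index build parameters: M: 16, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 16

Search parameters: ef_search: 400, max_scan_points: 2100, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 8805 | 1 | 441 | 2 | 6 |
| 8805 | 5 | 2222 | 2 | 6 |
| 8805 | 10 | 3528 | 2 | 7 |
| 8805 | 15 | 4679 | 3 | 8 |
| 8805 | 20 | 5358 | 3 | 9 |
| 8805 | 25 | 5426 | 4 | 10 |
| 8805 | 30 | 5527 | 5 | 15 |
| 8805 | 50 | 5904 | 8 | 30 |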

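Index build mode: PQ + shared_buffer

Index build parameters: M: 16, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 16

Search parameters: ef_search: 400, max_scan_points: 2100, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 14763 | 1 | 282 | 3 | 4 |
| 14763 | 5 | 1180 | 4 | 5 |
| 14763 | 10 | 1769 | 5 | 7 |
| 14763 | 15 | 2298 | 6 | 8 |
| 14763 | 20 | 2438 | 8 | 14 |
| 14763 | 25 | 2510 | 9 | 18 |
| 14763 | 30 | 2472 | 12 | 22 |
| 14763 | 50 | 2467 | 20 | 42 |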

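Dataset: Glove cosine (200 * 1.18M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 378 | 1 | 174 | 5 | 8 |
| 378 | 5 | 905 | 5 | 7 |
| 378 | 10 | 1431 | 6 | 9 |
| 378 | 15 | 1891 | 7 | 9 |
| 378 | 20 | 1921 | 10 | 21 |
| 378 | 25 | 1880 | 13 | 26 |
| 378 | 30 | 1862 | 15 | 32 |
| 378 | 50 | 1998 | 24 | 50 |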

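Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 715 | 1 | 83 | 11 | 14 |
| 715 | 5 | 351 | 14 | 20 |
| 715 | 10 | 495 | 20 | 25 |
| 715 | 15 | 635 | 23 | 29 |
| 715 | 20 | 628 | 31 | 50 |
| 715 | 25 | 613 | 40 | 67 |
| 715 | 30 | 605 | 49 | 83 |
| 715 | 50 | 576 | 86 | 151 |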

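Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 891 | 1 | 52 | 18 | 24 |
| 891 | 5 | 233 | 21 | 30 |
| 891 | 10 | 335 | 29 | 45 |
| 891 | 15 | 430 | 34 | 48 |
| 891 | 20 | 437 | 45 | 77 |
| 891 | 25 | 426 | 58 | 101 |
| 891 | 30 | 416 | 71 | 124 |
| 891 | 50 | 405 | 123 | 215 |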

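Dataset: Cohere L2 (768 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 285 | 1 | 317 | 2 | 3 |
| 285 | 5 | 1450 | 2 | 4 |
| 285 | 10 | 2177 | 4 | 5 |
| 285 | 15 | 2812 | 4 | 6 |
| 285 | 20 | 3275 | 5 | 9 |
| 285 | 25 | 3485 | 6 | 14 |
| 285 | 30 | 3657 | 7 | 18 |
| 285 | 50 | 3619 | 13 | 35 |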

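Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 518 | 1 | 223 | 4 | 4 |
| 518 | 5 | 996 | 4 | 6 |
| 518 | 10 | 1548 | 6 | 7 |
| 518 | 15 | 2006 | 7 | 8 |
| 518 | 20 | 2141 | 8 | 15 |
| 518 | 25 | 2081 | 11 | 22 |
| 518 | 30 | 2160 | 13 | 27 |
| 518 | 50 | 2039 | 24 | 51 |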

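Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 701 | 1 | 357 | 2 | 3 |
| 701 | 5 | 641 | 7 | 9 |
| 701 | 10 | 1019 | 9 | 10 |
| 701 | 15 | 1330 | 10 | 13 |
| 701 | 20 | 1431 | 13 | 20 |
| 701 | 25 | 1437 | 16 | 26 |
| 701 | 30 | 1449 | 20 | 33 |
| 701 | 50 | 1445 | 34 | 66 |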

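Dataset: Dbpedia cosine (1536 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1120 | 1 | 122 | 7 | 8 |
| 1120 | 5 | 548 | 8 | 14 |
| 1120 | 10 | 727 | 12 | 16 |
| 1120 | 15 | 940 | 15 | 16 |
| 1120 | 20 | 1011 | 18 | 30 |
| 1120 | 25 | 1022 | 23 | 38 |
| 1120 | 30 | 1019 | 28 | 50 |
| 1120 | 50 | 1022 | 47 | 93 |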

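Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1642 | 1 | 110 | 8 | 9 |
| 1642 | 5 | 512 | 9 | 16 |
| 1642 | 10 | 666 | 14 | 18 |
| 1642 | 15 | 830 | 17 | 18 |
| 1642 | 20 | 911 | 21 | 32 |
| 1642 | 25 | 919 | 26 | 41 |
| 1642 | 30 | 922 | 31 | 52 |
| 1642 | 50 | 915 | 53 | 97 |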

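Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1130 | 1 | 308 | 2 | 6 |
| 1130 | 5 | 575 | 7 | 9 |
| 1130 | 10 | 939 | 10 | 11 |
| 1130 | 15 | 1197 | 11 | 13 |
| 1130 | 20 | 1295 | 14 | 22 |
| 1130 | 25 | 1323 | 17 | 28 |
| 1130 | 30 | 1328 | 21 | 36 |
| 1130 | 50 | 1315 | 37 | 68 |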

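Instance specifications: 8C32G * 4 segments

Dataset: GIST L2 (960 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 1, pq_enable: 1, pq_segments: 120

Search parameters: ef_search: 400, max_scan_points: 2000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 681 | 1 | 252 | 3 | 5 |
| 681 | 5 | 868 | 5 | 7 |
| 681 | 10 | 1312 | 7 | 11 |
| 681 | 15 | 1305 | 11 | 22 |
| 681 | 20 | 1343 | 14 | 30 |
| 681 | 25 | 1323 | 18 | 37 |
| 681 | 30 | 1352 | 21 | 47 |
| 681 | 50 | 1305 | 37 | 70 |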

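Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 1, pq_segments: 120

Search parameters: ef_search: 400, max_scan_points: 2000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1349 | 1 | 123 | 7 | 10 |
| 1349 | 5 | 423 | 11 | 13 |
| 1349 | 10 | 582 | 16 | 27 |
| 1349 | 15 | 611 | 24 | 44 |
| 1349 | 20 | 603 | 32 | 59 |
| 1349 | 25 | 605 | 40 | 76 |
| 1349 | 30 | 596 | 49 | 99 |
| 1349 | 50 | 594 | 83 | 151 |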

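Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 2000

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 2158 | 1 | 92 | 10 | 11 |
| 2158 | 5 | 365 | 13 | 15 |
| 2158 | 10 | 532 | 18 | 28 |
| 2158 | 15 | 546 | 26 | 44 |
| 2158 | 20 | 558 | 35 | 63 |
| 2158 | 25 | 554 | 44 | 78 |
| 2158 | 30 | 556 | 53 | 94 |
| 2158 | 50 | 543 | 91 | 170 |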

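Dataset: Deep1B cosine (96 * 10M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 1, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1200, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 2781 | 1 | 559 | 1 | 2 |
| 2781 | 5 | 2340 | 2 | 3 |
| 2781 | 10 | 3973 | 2 | 4 |
| 2781 | 15 | 4087 | 3 | 8 |
| 2781 | 20 | 3705 | 5 | 18 |
| 2781 | 25 | 3966 | 6 | 18 |
| 2781 | 30 | 4083 | 7 | 20 |
| 2781 | 50 | 4163 | 11 | 39 |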

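Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1200, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 5658 | 1 | 333 | 2 | 3 |
| 5658 | 5 | 1196 | 4 | 5 |
| 5658 | 10 | 1685 | 5 | 11 |
| 5658 | 15 | 1740 | 8 | 18 |
| 5658 | 20 | 1745 | 11 | 25 |
| 5658 | 25 | 1754 | 14 | 30 |
| 5658 | 30 | 1755 | 17 | 38 |
| 5658 | 50 | 1733 | 28 | 64 |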

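Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 1200

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 6456 | 1 | 354 | 2 | 3 |
| 6456 | 5 | 1292 | 3 | 4 |
| 6456 | 10 | 1758 | 5 | 11 |
| 6456 | 15 | 1779 | 8 | 18 |
| 6456 | 20 | 1767 | 11 | 26 |
| 6456 | 25 | 1778 | 13 | 33 |
| 6456 | 30 | 1768 | 16 | 40 |
| 6456 | 50 | 1770 | 28 | 64 |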

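Dataset: SIFT L2 (128 * 10M)

Index build mode: PQ + mmap

Index build parameters: M: 16, efConstruction: 600, parallel_build: 8, external_storage: 1, pq_enable: 1, pq_segments: 16

Search parameters: ef_search: 400, max_scan_points: 1000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1613 | 1 | 523 | 1 | 2 |
| 1613 | 5 | 2568 | 1 | 3 |
| 1613 | 10 | 4500 | 2 | 4 |
| 1613 | 15 | 5174 | 2 | 6 |
| 1613 | 20 | 5045 | 3 | 11 |
| 1613 | 25 | 4846 | 5 | 16 |
| 1613 | 30 | 4873 | 6 | 23 |
| 1613 | 50 | 4445 | 11 | 32 |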

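Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 4023 | 1 | 345 | 2 | 3 |
| 4023 | 5 | 1169 | 4 | 5 |
| 4023 | 10 | 1601 | 6 | 12 |
| 4023 | 15 | 1632 | 9 | 21 |
| 4023 | 20 | 1630 | 12 | 27 |
| 4023 | 25 | 1604 | 15 | 33 |
| 4023 | 30 | 1579 | 18 | 45 |
| 4023 | 50 | 1536 | 32 | 53 |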

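Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 1000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 5164 | 1 | 371 | 2 | 3 |
| 5164 | 5 | 1238 | 3 | 5 |
| 5164 | 10 | 1605 | 6 | 12 |
| 5164 | 15 | 1615 | 9 | 20 |
| 5164 | 20 | 1531 | 12 | 31 |
| 5164 | 25 | 1585 | 15 | 36 |
| 5164 | 30 | 1531 | 19 | 45 |
| 5164 | 50 | 1517 | 32 | 71 |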

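Dataset: Glove cosine (200 * 1.18M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 1, pq_enable: 1, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 884 | 1 | 150 | 6 | 7 |
| 884 | 5 | 638 | 7 | 9 |
| 884 | 10 | 894 | 11 | 20 |
| 884 | 15 | 898 | 16 | 31 |
| 884 | 20 | 890 | 22 | 45 |
| 884 | 25 | 894 | 27 | 54 |
| 884 | 30 | 877 | 34 | 66 |
| 884 | 50 | 874 | 57 | 110 |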

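Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 1, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1537 | 1 | 75 | 13 | 16 |
| 1537 | 5 | 250 | 19 | 25 |
| 1537 | 10 | 321 | 30 | 51 |
| 1537 | 15 | 310 | 48 | 83 |
| 1537 | 20 | 301 | 66 | 115 |
| 1537 | 25 | 289 | 86 | 153 |
| 1537 | 30 | 274 | 109 | 192 |
| 1537 | 50 | 252 | 197 | 346 |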

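Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 0, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1804 | 1 | 41 | 23 | 34 |
| 1804 | 5 | 164 | 30 | 47 |
| 1804 | 10 | 216 | 46 | 82 |
| 1804 | 15 | 211 | 70 | 123 |
| 1804 | 20 | 209 | 95 | 168 |
| 1804 | 25 | 209 | 119 | 207 |
| 1804 | 30 | 207 | 144 | 254 |
| 1804 | 50 | 200 | 248 | 434 |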

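Dataset: Cohere L2 (768 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 1, pq_enable: 1, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 688 | 1 | 323 | 2 | 3 |
| 688 | 5 | 1153 | 3 | 5 |
| 688 | 10 | 1767 | 5 | 9 |
| 688 | 15 | 1801 | 7 | 16 |
| 688 | 20 | 1921 | 9 | 22 |
| 688 | 25 | 1886 | 12 | 33 |
| 688 | 30 | 1820 | 16 | 38 |
| 688 | 50 | 1900 | 25 | 72 |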

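Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 1, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1224 | 1 | 198 | 5 | 7 |
| 1224 | 5 | 561 | 8 | 10 |
| 1224 | 10 | 838 | 11 | 19 |
| 1224 | 15 | 819 | 17 | 33 |
| 1224 | 20 | 835 | 23 | 42 |
| 1224 | 25 | 839 | 29 | 52 |
| 1224 | 30 | 835 | 35 | 68 |
| 1224 | 50 | 834 | 59 | 107 |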

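Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 0, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1411 | 1 | 130 | 7 | 8 |
| 1411 | 5 | 532 | 8 | 10 |
| 1411 | 10 | 777 | 12 | 20 |
| 1411 | 15 | 795 | 18 | 32 |
| 1411 | 20 | 798 | 24 | 46 |
| 1411 | 25 | 810 | 30 | 57 |
| 1411 | 30 | 812 | 36 | 67 |
| 1411 | 50 | 815 | 61 | 108 |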

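Dataset: Dbpedia cosine (1536 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 1, pq_enable: 1, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 2214 | 1 | 125 | 7 | 9 |
| 2214 | 5 | 374 | 12 | 15 |
| 2214 | 10 | 506 | 18 | 30 |
| 2214 | 15 | 504 | 28 | 51 |
| 2214 | 20 | 509 | 38 | 69 |
| 2214 | 25 | 500 | 49 | 90 |
| 2214 | 30 | 507 | 58 | 104 |
| 2214 | 50 | 509 | 97 | 169 |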

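Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 1, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 7651 | 1 | 87 | 10 | 12 |
| 7651 | 5 | 314 | 14 | 18 |
| 7651 | 10 | 443 | 21 | 33 |
| 7651 | 15 | 448 | 32 | 53 |
| 7651 | 20 | 445 | 44 | 76 |
| 7651 | 25 | 444 | 55 | 99 |
| 7651 | 30 | 447 | 66 | 119 |
| 7651 | 50 | 450 | 110 | 192 |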

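Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 0, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 2206 | 1 | 128 | 7 | 8 |
| 2206 | 5 | 484 | 9 | 11 |
| 2206 | 10 | 729 | 12 | 19 |
| 2206 | 15 | 748 | 19 | 34 |
| 2206 | 20 | 782 | 24 | 44 |
| 2206 | 25 | 783 | 31 | 57 |
| 2206 | 30 | 794 | 37 | 65 |
| 2206 | 50 | 787 | 62 | 122 |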

Instance specification: 4C16G * 4 segments

Dataset: GIST L2 (960 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 1, pq_enable: 1, pq_segments: 120

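Search parameters: ef_search: 400, max_scan_points: 2000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 2014 | 1 | 237 | 3 | 5 |
| 2014 | 5 | 630 | 7 | 13 |
| 2014 | 10 | 645 | 15 | 27 |
| 2014 | 15 | 650 | 22 | 44 |
| 2014 | 20 | 642 | 30 | 57 |
| 2014 | 25 | 634 | 39 | 70 |
| 2014 | 30 | 622 | 47 | 98 |
| 2014 | 50 | 621 | 79 | 147 |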

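Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 1, pq_segments: 120

Search parameters: ef_search: 400, max_scan_points: 2000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 2934 | 1 | 119 | 7 | 11 |
| 2934 | 5 | 275 | 17 | 29 |
| 2934 | 10 | 292 | 33 | 60 |
| 2934 | 15 | 282 | 52 | 93 |
| 2934 | 20 | 286 | 69 | 121 |
| 2934 | 25 | 284 | 87 | 154 |
| 2934 | 30 | 280 | 106 | 188 |
| 2934 | 50 | 276 | 180 | 328 |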

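Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 2000

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 4468 | 1 | 87 | 11 | 13 |
| 4468 | 5 | 244 | 20 | 32 |
| 4468 | 10 | 263 | 37 | 64 |
| 4468 | 15 | 261 | 57 | 100 |
| 4468 | 20 | 269 | 73 | 130 |
| 4468 | 25 | 268 | 92 | 163 |
| 4468 | 30 | 267 | 111 | 196 |
| 4468 | 50 | 266 | 188 | 327 |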

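Dataset: Deep1B cosine (96 * 10M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 1, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1200, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 6696 | 1 | 520 | 1 | 2 |
| 6696 | 5 | 1679 | 2 | 5 |
| 6696 | 10 | 1832 | 5 | 12 |
| 6696 | 15 | 1908 | 7 | 18 |
| 6696 | 20 | 1899 | 10 | 28 |
| 6696 | 25 | 1920 | 12 | 33 |
| 6696 | 30 | 1910 | 15 | 39 |
| 6696 | 50 | 1904 | 26 | 66 |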

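Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1200, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 11677 | 1 | 302 | 3 | 4 |
| 11677 | 5 | 814 | 6 | 11 |
| 11677 | 10 | 858 | 11 | 23 |
| 11677 | 15 | 870 | 17 | 34 |
| 11677 | 20 | 870 | 22 | 48 |
| 11677 | 25 | 864 | 28 | 57 |
| 11677 | 30 | 866 | 34 | 70 |
| 11677 | 50 | 848 | 58 | 119 |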

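Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 1200

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 14657 | 1 | 301 | 3 | 4 |
| 14657 | 5 | 821 | 6 | 11 |
| 14657 | 10 | 885 | 11 | 22 |
| 14657 | 15 | 890 | 16 | 34 |
| 14657 | 20 | 898 | 22 | 45 |
| 14657 | 25 | 896 | 27 | 56 |
| 14657 | 30 | 891 | 33 | 67 |
| 14657 | 50 | 885 | 56 | 112 |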

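Dataset: SIFT L2 (128 * 10M)

Index build mode: PQ + mmap

Index build parameters: M: 16, efConstruction: 600, parallel_build: 4, external_storage: 1, pq_enable: 1, pq_segments: 16

Search parameters: ef_search: 400, max_scan_points: 1000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 3004 | 1 | 577 | 1 | 2 |
| 3004 | 5 | 1846 | 2 | 5 |
| 3004 | 10 | 2085 | 4 | 12 |
| 3004 | 15 | 2523 | 5 | 14 |
| 3004 | 20 | 2599 | 7 | 18 |
| 3004 | 25 | 2573 | 9 | 24 |
| 3004 | 30 | 2349 | 12 | 35 |
| 3004 | 50 | 2437 | 20 | 53 |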

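Index build mode: PQ + shared_buffer

Index build parameters: M: 16, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 1, pq_segments: 16

Search parameters: ef_search: 400, max_scan_points: 1000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 6409 | 1 | 356 | 2 | 3 |
| 6409 | 5 | 966 | 5 | 10 |
| 6409 | 10 | 1025 | 9 | 19 |
| 6409 | 15 | 974 | 15 | 32 |
| 6409 | 20 | 1041 | 19 | 39 |
| 6409 | 25 | 1032 | 24 | 49 |
| 6409 | 30 | 981 | 30 | 69 |
| 6409 | 50 | 987 | 50 | 104 |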

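Index build mode: noPQ + shared_buffer

Index build parameters: M: 16, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 1000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 9058 | 1 | 365 | 2 | 4 |
| 9058 | 5 | 960 | 5 | 10 |
| 9058 | 10 | 938 | 10 | 21 |
| 9058 | 15 | 961 | 15 | 32 |
| 9058 | 20 | 1034 | 19 | 40 |
| 9058 | 25 | 1005 | 24 | 52 |
| 9058 | 30 | 1013 | 29 | 59 |
| 9058 | 50 | 986 | 52 | 101 |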

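Dataset: Glove cosine (200 * 1.18M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 1, pq_enable: 1, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1810 | 1 | 127 | 7 | 11 |
| 1810 | 5 | 340 | 14 | 26 |
| 1810 | 10 | 341 | 29 | 54 |
| 1810 | 15 | 343 | 43 | 83 |
| 1810 | 20 | 340 | 58 | 109 |
| 1810 | 25 | 341 | 73 | 141 |
| 1810 | 30 | 342 | 87 | 165 |
| 1810 | 50 | 343 | 145 | 265 |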

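Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 1, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 3234 | 1 | 72 | 13 | 19 |
| 3234 | 5 | 167 | 29 | 52 |
| 3234 | 10 | 158 | 62 | 115 |
| 3234 | 15 | 172 | 82 | 167 |
| 3234 | 20 | 147 | 135 | 237 |
| 3234 | 25 | 143 | 174 | 306 |
| 3234 | 30 | 140 | 213 | 374 |
| 3234 | 50 | 128 | 390 | 663 |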

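Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 0, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 3571 | 1 | 40 | 24 | 35 |
| 3571 | 5 | 104 | 47 | 87 |
| 3571 | 10 | 98 | 101 | 186 |
| 3571 | 15 | 95 | 156 | 279 |
| 3571 | 20 | 96 | 210 | 368 |
| 3571 | 25 | 96 | 260 | 458 |
| 3571 | 30 | 96 | 310 | 544 |
| 3571 | 50 | 96 | 542 | 933 |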

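Dataset: Cohere L2 (768 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 1, pq_enable: 1, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1603 | 1 | 208 | 4 | 7 |
| 1603 | 5 | 504 | 9 | 17 |
| 1603 | 10 | 543 | 17 | 32 |
| 1603 | 15 | 543 | 27 | 50 |
| 1603 | 20 | 522 | 37 | 76 |
| 1603 | 25 | 540 | 45 | 83 |
| 1603 | 30 | 542 | 54 | 97 |
| 1603 | 50 | 535 | 92 | 166 |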

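Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 1, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 3679 | 1 | 199 | 4 | 6 |
| 3679 | 5 | 501 | 9 | 18 |
| 3679 | 10 | 578 | 16 | 31 |
| 3679 | 15 | 555 | 26 | 52 |
| 3679 | 20 | 548 | 35 | 73 |
| 3679 | 25 | 555 | 44 | 85 |
| 3679 | 30 | 559 | 53 | 97 |
| 3679 | 50 | 555 | 89 | 163 |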

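Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 0, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 2768 | 1 | 131 | 7 | 9 |
| 2768 | 5 | 369 | 13 | 21 |
| 2768 | 10 | 406 | 24 | 43 |
| 2768 | 15 | 410 | 36 | 65 |
| 2768 | 20 | 413 | 48 | 84 |
| 2768 | 25 | 408 | 60 | 107 |
| 2768 | 30 | 408 | 73 | 136 |
| 2768 | 50 | 409 | 121 | 213 |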

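Dataset: Dbpedia cosine (1536 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 1, pq_enable: 1, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 4700 | 1 | 111 | 8 | 14 |
| 4700 | 5 | 249 | 19 | 31 |
| 4700 | 10 | 230 | 42 | 77 |
| 4700 | 15 | 240 | 61 | 104 |
| 4700 | 20 | 236 | 83 | 150 |
| 4700 | 25 | 241 | 102 | 173 |
| 4700 | 30 | 238 | 124 | 214 |
| 4700 | 50 | 239 | 208 | 376 |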

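Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 1, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 15958 | 1 | 101 | 9 | 14 |
| 15958 | 5 | 209 | 22 | 36 |
| 15958 | 10 | 217 | 45 | 77 |
| 15958 | 15 | 210 | 70 | 127 |
| 15958 | 20 | 211 | 93 | 168 |
| 15958 | 25 | 216 | 114 | 191 |
| 15958 | 30 | 215 | 138 | 244 |
| 15958 | 50 | 214 | 232 | 411 |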

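Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 0, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 4541 | 1 | 230 | 3 | 8 |
| 4541 | 5 | 383 | 12 | 19 |
| 4541 | 10 | 393 | 24 | 43 |
| 4541 | 15 | 389 | 37 | 66 |
| 4541 | 20 | 386 | 51 | 89 |
| 4541 | 25 | 381 | 64 | 115 |
| 4541 | 30 | 385 | 77 | 138 |
| 4541 | 50 | 389 | 127 | 229 |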
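In most of the tables above, QPS stops growing after a certain query concurrency while RT keeps rising. One way to locate that point is to take the lowest concurrency whose QPS is within a small margin of the peak. The sketch below applies this to the QPS column of the table directly above. The function name `saturation_point` and the 5% margin are assumptions chosen for illustration, not part of the test tool or of the published methodology.

```python
# (query concurrency, QPS) pairs copied from the table above.
rows = [(1, 230), (5, 383), (10, 393), (15, 389),
        (20, 386), (25, 381), (30, 385), (50, 389)]

def saturation_point(rows, tolerance=0.05):
    """Lowest concurrency whose QPS is within `tolerance` of the peak QPS."""
    peak = max(qps for _, qps in rows)
    for concurrency, qps in sorted(rows):
        if qps >= peak * (1 - tolerance):
            return concurrency

print(saturation_point(rows))
```

Adding concurrency beyond this point mainly increases mean and P99 RT, so it is a reasonable starting value when sizing client connection pools.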