PolarDB for MySQL supports efficient vector similarity search using SQL through its built-in vector data type, distance functions, and vector indexes. This feature is suitable for applications such as recommendation systems, chatbots, and image retrieval. The vector computation capability is integrated into the database kernel, supports atomicity, consistency, isolation, and durability (ACID) transactions, and uses the In-Memory Column Index (IMCI) and algorithms such as HNSW to provide both exact k-nearest neighbor (KNN) search with 100% recall and high-performance approximate nearest neighbor (ANN) search.
Scope
Before you use this feature, ensure that your cluster meets the following version requirements: The kernel version must be MySQL 8.0.2, and the revision version must be 8.0.2.2.30 or later.
Version update history
Features vary by kernel version. We recommend that you upgrade your cluster to the latest version.
Feature | Minimum kernel version requirement | Description |
Basic vector features and the HNSW index algorithm. | 8.0.2.2.30 | Provides the basic capabilities for vector retrieval. |
Adds the | 8.0.2.2.31 | Introduces index algorithms based on the FAISS implementation. |
Modify or delete vector indexes. | 8.0.2.2.32 | Lets you dynamically manage vector indexes using |
Vector type
Type definition
You can use the CREATE TABLE or ALTER TABLE statements with the VECTOR(N) type to define a vector column.
Nrepresents the vector dimension, which can range from 1 to 16,383.Each dimension of the vector is a single-precision floating-point number (4 bytes).
The vector type supports equality comparison only with other vector types and cannot be compared with any other type.
A vector column cannot be a primary key, foreign key, unique key, or partition key.
Example
-- The following example defines the vector dimension of column v1 in table t1 as 4.
CREATE TABLE t1 (id INT PRIMARY KEY, v1 VECTOR(4));If you define a dimension greater than 16,383, the error Data size (xxx Bytes, xxx dimensions) exceeds VECTOR max (65532 Bytes, 16383 dimensions) for column: 'xxx' is returned.
Type conversion
PolarDB provides the STRING_TO_VECTOR and VECTOR_TO_STRING functions to convert between string and binary formats.
Convert a string to binary (STRING_TO_VECTOR)
This function converts a vector from text format to the internal binary format of the database.
Input format: A sequence of floating-point numbers enclosed in square brackets (
[]) and separated by commas (,), such as'[1.2, 3.4, 5.6]'.NoteIf the input string has an incorrect format, the error
Data cannot be converted to a valid vector: 'xxx'is returned.Byte order: The converted binary data uses the little-endian byte order. For example, the floating-point number
1.0, which has a hexadecimal representation of0x3F800000, is stored in the database with a binaryHEXvalue of0000803F. Take note of this when you serialize data across systems.
Example
SELECT STRING_TO_VECTOR('[1,2,3,4]');
+-------------------------------+
| STRING_TO_VECTOR('[1,2,3,4]') |
+-------------------------------+
| ? @ @@ @ |
+-------------------------------+SELECT HEX(STRING_TO_VECTOR('[1,2,3,4]'));
+------------------------------------+
| HEX(STRING_TO_VECTOR('[1,2,3,4]')) |
+------------------------------------+
| 0000803F000000400000404000008040 |
+------------------------------------+Convert binary to a string (VECTOR_TO_STRING)
This function converts binary vector data stored in the database to a string format.
Input format: Every 4 bytes correspond to the binary representation of a floating-point number for one dimension of the vector, using the little-endian byte order. For example, the floating-point number 1.0 has a hexadecimal representation of 0x3F800000. When stored in the database, its binary HEX value is 0000803F. Take note of this when you serialize data across systems.
If the input binary has an incorrect format, the error Data cannot be converted to a valid vector: '' is returned.
Example
SELECT VECTOR_TO_STRING(0x0000803F000000400000404000008040);
+------------------------------------------------------+
| VECTOR_TO_STRING(0x0000803F000000400000404000008040) |
+------------------------------------------------------+
| [1.00000e+00,2.00000e+00,3.00000e+00,4.00000e+00] |
+------------------------------------------------------+Related operation examples
Create a new table that contains a vector column
CREATE TABLE t1 (id INT PRIMARY KEY, v1 VECTOR(4));Add a vector column to an existing table
ALTER TABLE t1 ADD COLUMN v1 VECTOR(4);Insert a vector
Insert a vector as a string using the
STRING_TO_VECTORfunction.INSERT INTO t1 (id, v1) VALUES (1, STRING_TO_VECTOR('[1.0, 2.0, 3.0, 4.0]'));Insert a vector in binary format.
INSERT INTO t1 VALUES (2, 0x0000803F000000400000404000008040);
Insert or update a vector
Insert or update a vector as a string using STRING_TO_VECTOR.
INSERT INTO t1 VALUES (1, STRING_TO_VECTOR('[1.0, 2.0, 3.0, 4.0]')) ON DUPLICATE KEY UPDATE v1 = STRING_TO_VECTOR('[1.0, 2.0, 3.0, 4.0]');Update a vector
Update a vector as a string using STRING_TO_VECTOR. You can also update the vector directly in binary format.
UPDATE t1 SET v1 = STRING_TO_VECTOR('[1.0, 2.0, 3.0, 4.0]') WHERE id = 1;Delete a vector
Delete a vector as a string using STRING_TO_VECTOR. You can also delete the vector directly in binary format.
DELETE FROM t1 WHERE v1 = STRING_TO_VECTOR('[1.0, 2.0, 3.0, 4.0]');Using vector indexes
PolarDB creates vector indexes by defining a specially formatted string in a column's COMMENT.
Prerequisites
Before you can use vector indexes, you must add a read-only node for the In-Memory Column Index (IMCI) and create an IMCI for the column or the entire table. Only IMCI read-only nodes can build vector indexes to support vector search.
Create a vector index
You can define a vector index by modifying a column's COMMENT in a CREATE TABLE or ALTER TABLE statement.
Syntax
COMMENT 'imci_vector_index=vector_index_algorithm(index_parameter_list) [other_comments]'imci_vector_index=: A fixed prefix that declares a vector index definition.vector_index_algorithm: The index algorithm, such asHNSW,FAISS_HNSW_FLAT, orFAISS_HNSW_PQ.index_parameter_list: The index building parameters, enclosed in parentheses.You can separate multiple parameters with spaces, commas, or semicolons. To avoid confusion, we recommend that you use commas.
The format for a single parameter is
parameter_name=parameter_valueorparameter_name:parameter_value. For example:HNSW(metric=COSINE,max_degree=16,ef_construction=300)If all parameters use their default values, you must still include the parentheses. For example:
COMMENT "imci_vector_index=HNSW() other_comments"
Index algorithms
Algorithm | Scenarios |
| Implemented based on VSAG. This algorithm provides high precision but consumes a large amount of memory. |
| Implemented based on FAISS. This algorithm provides high precision but consumes a large amount of memory. |
| Implemented based on FAISS. This algorithm uses product quantization (PQ) to compress vectors. This reduces memory usage but causes some precision loss. |
Index parameters
Parameter | Available alias | Applicable algorithms | Description | Valid values/Range | Default value |
|
|
| The vector similarity measure. |
Note Only uppercase values are supported. |
|
|
|
| The maximum number of connections for each node in the graph. | A positive integer. A larger value creates a denser graph, which slows down index building and increases memory usage, but may improve query precision. The recommended range is 16 to 64. | 16 |
| - |
| The number of neighbor nodes to search during index building. | A positive integer. A larger value improves index quality but increases the build time. The recommended range is 100 to 500. | 200 |
| - |
| The number of subspaces for product quantization (PQ). | A positive integer that is a divisor of the vector dimension. The vector dimension must be divisible by | 1 |
| - |
| The number of bits used for the quantized representation of each subspace in product quantization (PQ). | A positive integer less than or equal to 24. This parameter is typically set to 8, which means each subspace has 256 centroids. | 8 |
Example 1: Define a vector index when creating a table
-- Method 1: Enable columnstore for the entire table, and then define a vector index on the v1 column.
CREATE TABLE t1 (
id INT PRIMARY KEY,
v1 VECTOR(4) COMMENT 'imci_vector_index=HNSW(metric=COSINE, max_degree=16)'
) COMMENT 'COLUMNAR=1';
-- Method 2: Enable columnstore and a vector index only for the v1 column.
CREATE TABLE t1 (
id INT PRIMARY KEY,
description TEXT,
v1 VECTOR(4) COMMENT 'COLUMNAR=1, imci_vector_index=HNSW(metric=COSINE)'
);Example 2: Add a vector index to an existing table
-- Assume that the existing table is t1.
CREATE TABLE t1 (
id int auto_increment PRIMARY KEY,
v1 vector(4)
);
-- Method 1: Add a table-level columnstore index and modify the COMMENT of the VECTOR column to specify the vector index algorithm and parameters.
ALTER TABLE t1 comment "COLUMNAR=1", MODIFY COLUMN v1 vector(4) COMMENT "imci_vector_index=HNSW(metric=COSINE,max_degree=16,ef_construction=300)";
-- Method 2: Create a columnstore index only for the v1 column. When the columnstore index is initialized, a vector index is automatically created based on the COMMENT of the v1 column.
ALTER TABLE t1 MODIFY COLUMN v1 vector(4) COMMENT "COLUMNAR=1 imci_vector_index=HNSW(metric=COSINE,max_degree=16,ef_construction=300)";
Parameters
Parameter | Description |
imci_enable_inline_vector_index | Specifies whether to allow the creation of vector indexes.
|
imci_vector_index_dump_rows_threshold | Controls the incremental write size for vector indexes. A background task periodically checks the increment between the current vector index snapshot offset and the IMCI snapshot offset. When the number of incremental rows exceeds this threshold, the background task appends the new data rows to the vector index.
|
Modify a vector index
Versions 8.0.2.2.32 and later support modifying the algorithm and parameters of a vector index by changing the comment on a vector column.
Modify the vector index algorithm.
CREATE TABLE t1 ( id int auto_increment PRIMARY KEY, v1 vector(4) COMMENT "imci_vector_index=HNSW(metric=COSINE,max_degree=16,ef_construction=300)" ) COMMENT 'COLUMNAR=1'; -- Modify the index algorithm from HNSW to the FAISS-based HNSW algorithm. ALTER TABLE t1 modify column v1 vector(4) COMMENT "imci_vector_index=FAISS_HNSW_FLAT(metric=COSINE,max_degree=16,ef_construction=300)";Modify the vector index parameters.
CREATE TABLE t1 ( id int auto_increment PRIMARY KEY, v1 vector(4) COMMENT "imci_vector_index=HNSW(metric=COSINE,max_degree=16,ef_construction=300)" ) COMMENT 'COLUMNAR=1'; -- Modify the index parameters. Change the max_degree of HNSW from 16 to 32. ALTER TABLE t1 modify column v1 vector(4) COMMENT "imci_vector_index=HNSW(metric=COSINE,max_degree=32,ef_construction=300)";
Delete a vector index
Delete the vector index when deleting the columnstore index.
CREATE TABLE t1 ( id int auto_increment PRIMARY KEY, v1 vector(4) COMMENT "imci_vector_index=HNSW(metric=COSINE,max_degree=16,ef_construction=300)" ) COMMENT 'COLUMNAR=1'; -- A vector index cannot exist independently of a columnstore index and is deleted by cascade delete. ALTER TABLE t1 COMMENT 'COLUMNAR=0';Delete only the vector index.
NoteVersions 8.0.2.2.32 and later support deleting a vector index by deleting the comment on a vector column.
CREATE TABLE t1 ( id int auto_increment PRIMARY KEY, v1 vector(4) COMMENT "imci_vector_index=HNSW(metric=COSINE,max_degree=16,ef_construction=300)" ) COMMENT 'COLUMNAR=1'; -- Delete the vector index by deleting the comment on the vector type column. ALTER TABLE t1 modify column v1 vector(4) COMMENT "";
Vector queries
Approximate Nearest Neighbor (ANN) search
Approximate Nearest Neighbor (ANN) search uses vector indexes to accelerate queries.
Limits
The optimizer uses a vector index only when it detects a specific SQL pattern. The query must meet all of the following conditions:
The query must include both the
ORDER BYandLIMITclauses.The first expression in the
ORDER BYclause must beDISTANCE(...).The sorting direction for
ORDER BY DISTANCE(...)must beASC.The field in the
DISTANCE(...)expression must have an active vector index.The
metricparameter in theDISTANCE(...)expression must be the same as themetricparameter used when you created the vector index. For example, if the vector index was built with theCOSINEdistance,DISTANCE(...)must also specifyCOSINE.If the query includes a
JOIN, the execution plan must use a Right Deep Join Tree structure. The field referenced in theDISTANCE(...)expression must be from the driving table of the JOIN.The query cannot contain a
GROUP BYclause. To perform aggregation, you can complete the vector retrieval in a subquery.
Steps to enable ANN search
Enable vector index acceleration.
Execute the following command in the session to allow the optimizer to use vector indexes.SET imci_enable_vector_search = ON;Force queries to use a column-based execution plan.
To ensure that queries can access vector indexes built on the column store, you must force the query to use the column store path.SET cost_threshold_for_imci = 0;ImportantSET cost_threshold_for_imci = 0;is a session-level setting that forces all queries in the session to use a columnstore index. This can degrade the performance of queries that would otherwise run efficiently on the row store, such as point queries that use a primary key. Therefore, use this setting only in sessions that perform vector retrieval. Do not set it as a global parameter.Write a query that meets the ANN trigger conditions.
Execute an SQL statement that meets the constraints to perform an ANN search with a vector index.-- Find the top 5 most similar records SELECT id, v1 FROM t1 WHERE category_id = 123 -- Additional conditions can be added ORDER BY DISTANCE(v1, STRING_TO_VECTOR('[...]'), 'COSINE') ASC LIMIT 5;Correcting a non-compliant query (using
GROUP BYas an example):-- Incorrect: GROUP BY conflicts with ORDER BY DISTANCE() -- SELECT category_id, COUNT(*) FROM t1 GROUP BY category_id ORDER BY DISTANCE(v1, STRING_TO_VECTOR('[...]'), 'COSINE') ASC LIMIT 5; -- Correct: First perform ANN search in a subquery, then aggregate the results SELECT category_id, COUNT(*) FROM ( SELECT id, category_id FROM t1 ORDER BY DISTANCE(v1, STRING_TO_VECTOR('[...]'), 'COSINE') ASC LIMIT 100 -- First retrieve 100 nearest neighbors ) AS nearest_neighbors GROUP BY category_id;You can use the
EXPLAINstatement to check whether vector index acceleration for ANN search is active for your SQL statement. If the execution plan containsVector Search, the feature is active.EXPLAIN SELECT * FROM t1 ORDER BY DISTANCE(v1, STRING_TO_VECTOR('[1.2, 2.3, 3.4, 4.5]'), 'COSINE') LIMIT 2;Sample output:
+----+---------------------------+------+--------+--------+---------------------------------------------------------------------------------------+ | ID | Operator | Name | E-Rows | E-Cost | Extra Info | +----+---------------------------+------+--------+--------+---------------------------------------------------------------------------------------+ | 1 | Select Statement | | | | IMCI Execution Plan (max_dop = 2, max_query_mem = 429496729) | | 2 | └─Compute Scalar | | 2 | 0.00 | | | 3 | └─Limit | | 2 | 0.00 | Offset=0 Limit=2 | | 4 | └─Sort | | 2 | 0.00 | Sort Key: VECTOR_DISTANCE(t1.v1,"[1.200000,2.300000,3.400000,4.500000]","COSINE") ASC | | 5 | └─Vector Search | t1 | 2 | 0.00 | | +----+---------------------------+------+--------+--------+---------------------------------------------------------------------------------------+
Parameter descriptions
Parameter | Description |
imci_enable_vector_search | A switch that controls whether to use vector indexes to accelerate ANN search.
|
imci_hnswpq_k_factor | The recall expansion factor for vector retrieval when the vector index uses the
Note When performing a vector query to retrieve K approximate nearest neighbors (ANNs), the executor retrieves |
View vectors
SELECT VECTOR_TO_STRING(v1) FROM t1;Distance calculation
You can use the DISTANCE function to calculate the similarity between two vectors in a specified way.
DISTANCE(vector1, vector2, '<metric>')metric parameter
Parameter | Description |
| Cosine similarity. Measures the directional similarity between two vectors. The result is the cosine of the angle between the two vectors. A smaller value indicates greater similarity. |
| Euclidean distance. Measures the straight-line distance between two vectors or points in Euclidean space. A smaller value indicates greater similarity. |
| Dot product. The result of multiplying the corresponding components of two vectors and then adding them up. A smaller value indicates greater similarity. |
Example
SELECT DISTANCE(v1, STRING_TO_VECTOR('[1.2,2.3,3.4,4.5]'), 'COSINE') FROM t1;Exact Nearest Neighbor (KNN) search
An exact search traverses all data to calculate distances, which guarantees a 100% recall rate. It does not rely on vector indexes and is suitable for scenarios with small datasets.
-- Find the 10 records most similar to the target
SELECT id, VECTOR_TO_STRING(v1)
FROM t1
ORDER BY DISTANCE(v1, STRING_TO_VECTOR('[...]'), 'COSINE')
LIMIT 10;Example
SELECT id, VECTOR_TO_STRING(v1) FROM t1 ORDER BY DISTANCE(v1, STRING_TO_VECTOR('[1.2,2.3,3.4,4.5]'), 'COSINE') LIMIT 1;Vector queries with predicates
You can use additional columns as predicates to filter vector query results. For example:
SELECT * FROM t1 WHERE id < 10 ORDER BY DISTANCE(v1, STRING_TO_VECTOR('[1.2, 2.3, 3.4, 4.5]'), 'COSINE') LIMIT 2;For vector queries that include predicates, you can adjust specific parameters to control the order of vector index retrieval and predicate filtering.
Execution strategies
Vector search often involves filtering by other properties. The optimizer intelligently selects an execution strategy based on the selectivity of the filter condition:
Prefilter (prefilter): When a filter condition can significantly reduce the amount of data (for example, when the number of matching rows is less than
imci_vector_search_prefilter_rows, which defaults to 10,000), the system first applies a predicate filter to the entire table to retrieve matching records, sorts these records, and then returns the number of records specified by the LIMIT clause.Post-filtering (postfilter): When the filter condition has poor selectivity (for example, if the match rate is higher than
imci_vector_search_filter_pct, which is 20% by default), the system first uses the vector index to recall results and then performs predicate filtering on the recalled results. If the number of filtered results is less than the number specified by the LIMIT clause, this procedure is repeated.Inline filter: If the optimizer does not select the prefilter or postfilter strategy, it uses the inline filter strategy. The system first filters the entire table using the predicate and builds a
bitmapfilter from the RowIDs of the matching records. Then, it uses the vector index for retrieval. During the vector index retrieval process, the system uses thebitmapfilter to ensure that the most similar records returned meet the predicate condition.
Adaptive execution optimization
The optimizer's estimate of the number of rows matched by a predicate filter can have a bias. This can lead to a suboptimal choice of execution strategy. An adaptive execution mechanism dynamically switches the execution strategy during runtime based on the actual number of matching rows.
When running a vector query with predicates using the inline filter strategy, if the actual number of matching records is less than imci_vector_search_prefilter_rows after the predicate is applied to the entire table, the system dynamically switches to the prefilter strategy.
Partition optimization
Vector queries often include predicate conditions such as tenant IDs or tags. For these queries, you can use the LIST DEFAULT HASH partitioning method. You can create separate LIST partitions for tenants or tags with large data volumes, and DEFAULT HASH partitions for other tenants or tags. The optimizer can detect these queries. Based on the results of partition pruning, the optimizer selects the postfilter strategy for vector queries on large tenants or tags. For other tenants, it uses the prefilter or inline filter strategy. This helps select the optimal execution strategy for tenants or tags of different sizes.
Parameter descriptions
Parameter | Description |
imci_vector_search_filter_pct | If the estimated selectivity of the predicate filter is greater than or equal to this value, the system prioritizes vector index retrieval.
The default value of 20 means that if the estimated selectivity of the predicate is 20% or higher, the system first retrieves results using the vector index and then applies the predicate filter. |
imci_enable_vector_search_inline_filter | Controls whether to use the inline filter strategy for vector queries that include predicates.
|
imci_vector_search_prefilter_rows | If the estimated number of matching rows from the predicate filter is less than this value, the system prioritizes the prefilter retrieval method.
The default value of 10,000 means that if the estimated number of matching rows from the predicate filter is less than 10,000, the system prioritizes the prefilter method. It retrieves matching records, sorts them, and then returns the number of rows specified by the LIMIT clause. |
Monitor vector index status
Building vector index data and advancing the columnstore index snapshot are separate background tasks. With continuous writes, the vector index snapshot offset might lag behind the columnstore data snapshot. Before you run a vector search, you must confirm that the vector index is available. The vector index status is shown in system views. This includes the snapshot offset and the status of the index on the vector column.
Vector index snapshot offset
The VECTOR_ROWS column in the INFORMATION_SCHEMA.IMCI_INDEX_STATS view shows the number of vectors in the snapshot loaded by the columnstore index. A value of 0 means no vector index is available on the current columnstore index. This can happen if the table has no vector index, or if a new index has not yet generated a valid snapshot.
You can also compare this value with the ROW_ID column in the INFORMATION_SCHEMA.IMCI_INDEXES view to estimate the build latency for the vector index data. The vector index works even with latency. During a search, the system automatically handles any vector data that has not been written to the index.
Physical vector index status
The INFORMATION_SCHEMA.IMCI_VECTOR_INDEX_STATS system view shows the status of each physical vector index inside a vector index snapshot. The table schema is as follows:
Column name | Description |
| DB name. |
| Table name. |
| Vector column name. |
| The actual number of vectors in the vector index. This value is usually smaller than VECTOR_ROWS because NULL values and rows that were deleted during the build are excluded. |
| Memory usage. |
| Size of the persistent file. |
| The index type. Currently, only HNSW is supported. |
| The vector index build parameters in JSON format. |
This view outputs the memory status and statistics for the vector index on each column where a vector index is enabled. This information is not directly related to the availability of the vector index.
Approximate vector searches cannot be performed on any column that is not included in the query results of this view.
In this view, a value of 0 in the
VECTORScolumn does not mean the vector index is unavailable. This usually happens after a cluster restart when an incremental build or an approximate query has not yet run. The persistent data for the vector index has not been loaded, so the exact number of vectors is temporarily unavailable. For accurate vector snapshot information, use theVECTOR_ROWScolumn in theINFORMATION_SCHEMA.IMCI_INDEX_STATSsystem view.
Examples
To ensure that vector indexes can be built normally with a small amount of data, you must set the
imci_vector_index_dump_rows_thresholdparameter to 1 in the PolarDB console.Create a test table and insert data:
CREATE TABLE t1 ( id int auto_increment PRIMARY KEY, v1 vector(4) COMMENT "imci_vector_index=HNSW(metric=COSINE,max_degree=16,ef_construction=300)" ) comment "COLUMNAR=1"; INSERT INTO t1 VALUES(1, STRING_TO_VECTOR('[1.1,1.2,1.3,1.4]')), (2, STRING_TO_VECTOR('[2,2.2,2.4,2.6]'));Check the index snapshot status and the vector index status:
SELECT schema_name,table_name,row_id FROM information_schema.imci_indexes; +-------------+------------+--------+ | schema_name | table_name | row_id | +-------------+------------+--------+ | test | t1 | 2 | +-------------+------------+--------+ 1 row in set (0.00 sec) SELECT schema_name,table_name,vector_rows FROM information_schema.imci_index_stats; +-------------+------------+-------------+ | schema_name | table_name | vector_rows | +-------------+------------+-------------+ | test | t1 | 2 | +-------------+------------+-------------+ 1 row in set (0.00 sec) SELECT schema_name,table_name,column_name,vectors FROM information_schema.imci_vector_index_stats; +-------------+------------+-------------+---------+ | schema_name | table_name | column_name | vectors | +-------------+------------+-------------+---------+ | test | t1 | v1 | 2 | +-------------+------------+-------------+---------+ 1 row in set (0.00 sec)Insert more data:
INSERT INTO t1 VALUES(3, STRING_TO_VECTOR("[8.8,8.88,8.888,8.8888]"));The value of the
VECTORSfield in the query result forINFORMATION_SCHEMA.IMCI_VECTOR_INDEX_STATSis now 3, which is an increase of 1. This indicates that the background task wrote the inserted data to the vector index:SELECT schema_name,table_name,column_name,vectors FROM information_schema.imci_vector_index_stats; +-------------+------------+-------------+---------+ | schema_name | table_name | column_name | vectors | +-------------+------------+-------------+---------+ | test | t1 | v1 | 3 | +-------------+------------+-------------+---------+ 1 row in set (0.00 sec)Restart the columnstore index read-only node and check the index status:
ImportantRestarting the columnstore index read-only node makes the current node unavailable for a short time.
SELECT schema_name,table_name,row_id,state FROM information_schema.imci_indexes; +-------------+------------+--------+-----------+ | schema_name | table_name | row_id | state | +-------------+------------+--------+-----------+ | test | t1 | 3 | COMMITTED | +-------------+------------+--------+-----------+ 1 row in set (0.00 sec) SELECT schema_name,table_name,vector_rows FROM information_schema.imci_index_stats; +-------------+------------+-------------+ | schema_name | table_name | vector_rows | +-------------+------------+-------------+ | test | t1 | 3 | +-------------+------------+-------------+ 1 row in set (0.00 sec) SELECT schema_name,table_name,column_name,vectors FROM information_schema.imci_vector_index_stats; +-------------+------------+-------------+---------+ | schema_name | table_name | column_name | vectors | +-------------+------------+-------------+---------+ | test | t1 | v1 | 0 | +-------------+------------+-------------+---------+ 1 row in set (0.00 sec)The results show that
VECTORSis 0 in theIMCI_VECTOR_INDEX_STATSsystem view, while the vector index snapshot offset is 3. You can use the execution plan to confirm that the vector index is available. Then, you can run an approximate query and check the vector index status again:SET cost_threshold_for_imci = 0; EXPLAIN SELECT * FROM t1 ORDER BY DISTANCE(v1, STRING_TO_VECTOR("[1.2, 2.3, 3.4, 4.5]"), "COSINE") LIMIT 1; +----+-----------------------+------+--------+--------+---------------------------------------------------------------------------------------+ | ID | Operator | Name | E-Rows | E-Cost | Extra Info | +----+-----------------------+------+--------+--------+---------------------------------------------------------------------------------------+ | 1 | Select Statement | | | | IMCI Execution Plan (max_dop = 4, max_query_mem = 858993459) | | 2 | +-Compute Scalar | | | | | | 3 | +-Limit | | | | Offset=0 Limit=1 | | 4 | +-Sort | | | | Sort Key: VECTOR_DISTANCE(t1.v1,"[1.200000,2.300000,3.400000,4.500000]","COSINE") ASC | | 5 | +-Vector Search | t1 | | | | +----+-----------------------+------+--------+--------+---------------------------------------------------------------------------------------+ 5 rows in set (0.00 sec) SELECT id, vector_to_string(v1) FROM t1 ORDER BY DISTANCE(v1, STRING_TO_VECTOR("[1.2, 2.3, 3.4, 4.5]"), "COSINE") LIMIT 1; +----+---------------------------------------------------+ | id | vector_to_string(v1) | +----+---------------------------------------------------+ | 2 | [2.00000e+00,2.20000e+00,2.40000e+00,2.60000e+00] | +----+---------------------------------------------------+ 1 row in set (0.00 sec) SELECT schema_name,table_name,column_name,vectors FROM information_schema.imci_vector_index_stats; +-------------+------------+-------------+---------+ | schema_name | table_name | column_name | vectors | +-------------+------------+-------------+---------+ | test | t1 | v1 | 3 | +-------------+------------+-------------+---------+ 1 row in set (0.00 sec)After the approximate query runs, the vector index is loaded into memory, and the vector count is restored to 3.