You can use the k-nearest neighbor (KNN) vector query feature to perform an approximate nearest neighbor search based on vectors. This way, you can find the data items in a large-scale dataset that are most similar to the vector that you want to query.
Prerequisites
An OTSClient instance is initialized. For more information, see Initialize an OTSClient instance.
A data table is created and data is written to the data table. For more information, see Create a data table and Write data.
A search index is created and the vector field is configured on the data table. For more information, see Create a search index.
Usage notes
The KNN vector query feature is supported by Tablestore SDK for Node.js of version 5.5.0 or later. Make sure that a supported version of Tablestore SDK for Node.js is installed.
Note: For information about the version history of Tablestore SDK for Node.js, see Version history of Tablestore SDK for Node.js.
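The version requirement above can be checked at startup. The following is a minimal sketch, not part of the SDK: the helper names isAtLeast and checkSdkVersion are hypothetical, and the code assumes that the SDK is installed under the npm package name tablestore and that its package.json file can be loaded with require.

```javascript
// Compare dotted numeric versions such as "5.5.0".
// Pre-release suffixes such as "-beta" are ignored in this sketch.
function isAtLeast(version, minimum) {
  const parse = (v) => v.split('-')[0].split('.').map((n) => parseInt(n, 10) || 0);
  const a = parse(version);
  const b = parse(minimum);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] || 0;
    const y = b[i] || 0;
    if (x !== y) return x > y;
  }
  return true; // The versions are equal.
}

// Assumption: the SDK is installed as the "tablestore" npm package and
// exposes its package.json to require().
function checkSdkVersion() {
  try {
    const { version } = require('tablestore/package.json');
    return isAtLeast(version, '5.5.0');
  } catch (e) {
    return false; // The package is not installed or package.json cannot be resolved.
  }
}
```

If checkSdkVersion() returns false, upgrade the SDK before you send a KNN vector query.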
Parameters
Parameter | Required | Description |
fieldName | Yes | The name of the vector field. |
topK | Yes | The number of results to return, which are the K results that have the highest similarity to the vector that you want to query. For information about the maximum value of the topK parameter, see Search index limits. |
float32QueryVector | Yes | The vector for which you want to query the similarity. |
filter | No | The filter. You can use a combination of query conditions that are not KNN vector query conditions. |
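The required and optional parameters in the preceding table can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: validateKnnQuery is a hypothetical helper and is not part of Tablestore SDK for Node.js, and it checks only the presence and basic shape of each parameter, not the server-side limits.

```javascript
// Hypothetical helper: returns a list of problems found in a KNN vector query object.
// An empty list means that all required parameters are present.
function validateKnnQuery(query) {
  const errors = [];
  if (typeof query.fieldName !== 'string' || query.fieldName.length === 0) {
    errors.push('fieldName is required');
  }
  if (query.topK === undefined || query.topK === null) {
    errors.push('topK is required');
  }
  const vector = query.float32QueryVector;
  if (!Array.isArray(vector) || vector.length === 0) {
    errors.push('float32QueryVector is required');
  } else if (!vector.every((x) => typeof x === 'number' && Number.isFinite(x))) {
    errors.push('float32QueryVector must contain only finite numbers');
  }
  // filter is optional, so it is not checked here.
  return errors;
}
```

For example, validateKnnQuery({ fieldName: "col_vector" }) reports that topK and float32QueryVector are missing.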
Example
The following sample code shows how to query the ten nearest neighbors of a specific vector in a data table. In this example, the value of the col_keyword column of each nearest neighbor must be equal to "0" and the value of the col_long column must be between 0 and 50, inclusive.
const TableStore = require('tablestore');

// `client` is the OTSClient instance that is initialized as described in Prerequisites.
const tableName = "<TABLE_NAME>";
const indexName = "<SEARCH_INDEX_NAME>";

async function knnVectorQuery() {
    return new Promise(function (resolve, reject) {
        let params = {
            tableName: tableName,
            indexName: indexName,
            searchQuery: {
                offset: 0,
                limit: 10,
                query: {
                    queryType: TableStore.QueryType.KNN_VECTOR_QUERY,
                    query: {
                        fieldName: "col_vector",
                        topK: TableStore.Long.fromNumber(10), // Return the ten nearest neighbors.
                        float32QueryVector: [1.0, 1.1, 1.2, -1.3],
                        // Keep only the rows that meet both of the following conditions.
                        filter: {
                            queryType: TableStore.QueryType.BOOL_QUERY,
                            query: {
                                mustQueries: [
                                    {
                                        // 0 <= col_long <= 50
                                        queryType: TableStore.QueryType.RANGE_QUERY,
                                        query: {
                                            fieldName: "col_long",
                                            rangeFrom: TableStore.Long.fromNumber(0),
                                            includeLower: true,
                                            rangeTo: TableStore.Long.fromNumber(50),
                                            includeUpper: true,
                                        }
                                    },
                                    {
                                        // col_keyword == "0"
                                        queryType: TableStore.QueryType.TERM_QUERY,
                                        query: {
                                            fieldName: "col_keyword",
                                            term: "0",
                                        }
                                    },
                                ],
                            }
                        },
                    },
                },
                sort: {
                    sorters: [
                        {
                            scoreSort: {
                                order: TableStore.SortOrder.SORT_ORDER_DESC // Sort the query results based on the scores in descending order.
                            }
                        }
                    ],
                },
                getTotalCount: false,
            },
            columnToGet: {
                returnType: TableStore.ColumnReturnType.RETURN_SPECIFIED,
                returnNames: ["col_long", "col_keyword"]
            },
            timeoutMs: 10000,
        }
        client.search(params, function (err, data) {
            if (err) {
                console.log('search error:', err.toString());
                reject(err);
            } else {
                console.log('RequestId:', data.RequestId);
                for (let i = 0; i < data.searchHits.length; i++) {
                    let hit = data.searchHits[i]
                    console.log('Score:', hit.score, 'Row:', hit.row);
                }
                resolve(data)
            }
        });
    })
}

knnVectorQuery().catch(function () {
    // The error is already logged in the search callback.
});
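The preceding example prints each hit as a nested object. The following is a minimal sketch of how a hit could be flattened into a plain object for further processing. The helper name hitToObject is hypothetical. The sketch assumes that hit.row contains a primaryKey array of { name, value } entries and an attributes array of { columnName, columnValue } entries. Verify the shape against the output of your SDK version before you rely on it.

```javascript
// Hypothetical helper: flattens one search hit into { score, <column>: <value>, ... }.
// Assumption: row.primaryKey is [{ name, value }] and
// row.attributes is [{ columnName, columnValue }].
function hitToObject(hit) {
  const row = hit.row || {};
  const result = { score: hit.score };
  for (const pk of row.primaryKey || []) {
    result[pk.name] = pk.value;
  }
  for (const attr of row.attributes || []) {
    result[attr.columnName] = attr.columnValue;
  }
  return result;
}
```

With this helper, the loop in the search callback could call hitToObject(data.searchHits[i]) to read the col_long and col_keyword values by name.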
References
The following query types are supported by search indexes: term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, Boolean query, geo query, nested query, vector query, and exists query. You can select a query type to query data based on your business requirements.
If you want to sort or paginate the rows that meet the query conditions, you can use the sorting and paging feature. For more information, see Sorting and paging.
If you want to collapse the result set based on a specific column, you can use the collapse (distinct) feature. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).
If you want to analyze data in a data table, such as obtaining the extreme values, sum, and total number of rows, you can perform aggregation operations or execute SQL statements. For more information, see Aggregation and SQL query.
If you want to quickly obtain all rows that meet the query conditions without the need to sort the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.
FAQ
How do I optimize the performance of Tablestore KNN vector query?