Scenario description
This article shows how to use the static_bm25 and text_relevance functions in the Vector Search Edition to compute text relevance scores and rank documents by them in the rough sort and fine sort phases.
Pack index configuration
The following steps configure a pack index over two fields, vector_source_text and cate_id:
First, configure the two fields that the pack index will cover. Set the type of each field to text and give both fields the same analysis method.
Next, define the index, select PACK as the index type, and add the two fields to it:
Note: The fields in the pack index must be listed in the same order as in the field configuration. For example, if vector_source_text is configured before cate_id, keep vector_source_text first and cate_id second in the pack index; otherwise errors occur during save and publish operations.
Then, open the advanced configuration settings of the pack index.
Example of pack index configuration:
{
  "index_name": "pack_index",
  "index_type": "PACK",
  "index_fields": [
    {
      "boost": 1,
      "field_name": "vector_source_text"
    },
    {
      "boost": 1,
      "field_name": "cate_id"
    }
  ],
  "doc_payload_flag": 1,
  "has_section_attribute": true,
  "position_payload_flag": 1,
  "term_frequency_bitmap": 0,
  "position_list_flag": 1,
  "term_payload_flag": 1,
  "term_frequency_flag": 1,
  "section_attribute_config": {
    "has_field_id": true,
    "has_section_weight": true
  }
}
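The field-order rule from the note above can be checked before saving the configuration. The following Python sketch is only an illustration: the schema_fields list and the helper pack_fields_in_schema_order are hypothetical names, not part of the product, and the pack index is reduced to the keys the check needs.

```python
# Hypothetical: field names in the order they appear in the field configuration.
schema_fields = ["vector_source_text", "cate_id"]

# Trimmed copy of the pack index definition shown above.
pack_index = {
    "index_name": "pack_index",
    "index_type": "PACK",
    "index_fields": [
        {"boost": 1, "field_name": "vector_source_text"},
        {"boost": 1, "field_name": "cate_id"},
    ],
}


def pack_fields_in_schema_order(index, fields):
    """Return True if index_fields follow the order of the field configuration.

    Raises ValueError if the index names a field that is not configured.
    """
    names = [f["field_name"] for f in index["index_fields"]]
    positions = [fields.index(name) for name in names]
    return positions == sorted(positions)


if not pack_fields_in_schema_order(pack_index, schema_fields):
    raise SystemExit("pack index fields are out of order")
```

Running a check like this locally only mirrors the rule stated in the note; the console performs its own validation on save and publish.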
Query configuration
Construct your query using the following syntax:
query=pack_index:'water kettle'&&cluster=general&&config=start:0,hit:10,format:json&&kvpairs=first_formula:static_bm25(),formula:text_relevance(vector_source_text)&&sort=-RANK
● first_formula specifies the rough sort expression, here static_bm25()
● formula specifies the fine sort expression, here text_relevance(vector_source_text)
● sort=-RANK sorts the documents by text relevance score in descending order
To view scoring details, add rank_trace:all to the config clause, for example config=start:0,hit:10,format:json,rank_trace:all.
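The clauses described above can also be assembled in code. The Python sketch below is a minimal illustration, assuming the clause layout of the example query; build_query is a hypothetical helper, not an SDK function, and URL encoding is left out.

```python
def build_query(index, keywords, first_formula, formula,
                cluster="general", start=0, hit=10, rank_trace=None):
    """Join the query, cluster, config, kvpairs and sort clauses with '&&'."""
    config = [f"start:{start}", f"hit:{hit}", "format:json"]
    if rank_trace:
        # Optional: ask the engine to return scoring details.
        config.append(f"rank_trace:{rank_trace}")
    clauses = [
        f"query={index}:'{keywords}'",
        f"cluster={cluster}",
        "config=" + ",".join(config),
        # first_formula is the rough sort expression, formula the fine sort one.
        f"kvpairs=first_formula:{first_formula},formula:{formula}",
        "sort=-RANK",
    ]
    return "&&".join(clauses)


query = build_query("pack_index", "water kettle",
                    "static_bm25()", "text_relevance(vector_source_text)")
traced = build_query("pack_index", "water kettle",
                     "static_bm25()", "text_relevance(vector_source_text)",
                     rank_trace="all")
print(query)
print(traced)
```

Keeping rank_trace as an optional argument makes it easy to switch tracing on only while debugging.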
FAQ
User feedback:
The rough sort configuration takes effect, but the score is always 10000:
To troubleshoot, add the rank_trace:all parameter to the config clause and inspect the trace in the response:
"tracerInfo": "begin first formula trace:\nexpression[ipvuv], result[8819].\nexpression[ipvuv*0.1], result[881.900024].\nexpression[score], result[97.910000].\nexpression[score*100], result[9791.000000].\nexpression[score*100+ipvuv*0.1], result[10672.900024].\nscore [10000.000000] is larger than max_first_score [10000.000000], adjust it to max_first_score.\nend first formula trace.\n"
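This occurs because the rough sort score is capped at max_first_score, which is 10000. In the trace, the rough sort expression score*100+ipvuv*0.1 evaluates to 10672.900024, which exceeds the cap, so the score is adjusted down to 10000. Every document whose rough sort expression exceeds 10000 therefore receives the same score of 10000.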
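The arithmetic in the trace can be reproduced with a few lines of Python. This is only a sketch of the capping behavior as reported by the trace, not the engine's implementation; clamp_first_score is a hypothetical name, and the input values are taken from the trace above.

```python
MAX_FIRST_SCORE = 10000.0  # cap reported in the trace as max_first_score


def clamp_first_score(raw_score, cap=MAX_FIRST_SCORE):
    """Return the rough sort score after applying the cap."""
    return min(raw_score, cap)


# Values from the trace: expression[ipvuv] and expression[score].
ipvuv = 8819
score = 97.91

raw = score * 100 + ipvuv * 0.1   # about 10672.9, as in the trace
final = clamp_first_score(raw)    # capped to 10000.0

print(round(raw, 1), final)
```

Because the capped documents all share the same rough sort score, an expression whose typical values stay below the cap keeps them distinguishable.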