OpenSearch: Text relevance ranking by pack index

Last Updated: Sep 02, 2024

Scenario description

This article shows how to use the static_bm25 and text_relevance functions in Vector Search Edition to compute text relevance scores and rank documents by them during the rough sort and fine sort phases.

Pack index configuration

The following steps configure a pack index over two fields, vector_source_text and cate_id.

First, configure the two fields that the pack index will cover. Set the type of both fields to text and use the same analysis method for both.

Next, define the index, select PACK as the index type, and add the two fields to it.

Note: The fields in the pack index must appear in the same order as in the field configuration. If vector_source_text is defined first and cate_id second, keep that order in the pack index fields; otherwise saving and publishing the configuration fails with an error.

Then open the advanced configuration of the pack index and fill it in as shown in the example below.

Example of pack index configuration:

{
  "index_name": "pack_index",
  "index_type": "PACK",
  "index_fields": [
    {
      "boost": 1,
      "field_name": "vector_source_text"
    },
    {
      "boost": 1,
      "field_name": "cate_id"
    }
  ],
  "doc_payload_flag": 1,
  "has_section_attribute": true,
  "position_payload_flag": 1,
  "term_frequency_bitmap": 0,
  "position_list_flag": 1,
  "term_payload_flag": 1,
  "term_frequency_flag": 1,
  "section_attribute_config": {
    "has_field_id": true,
    "has_section_weight": true
  }
}
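
The field-order rule in the note above can be checked before saving. The following is a minimal sketch, not part of OpenSearch: the helper name pack_field_order_matches and the schema_fields list are hypothetical, and the check assumes that "same order" means the pack index fields keep the relative order they have in the field configuration.

```python
def pack_field_order_matches(schema_fields, pack_index):
    """Return True if the pack index lists its fields in the same
    relative order as the field configuration (schema_fields).

    Hypothetical helper for illustration; raises ValueError if the
    pack index names a field that is not in schema_fields.
    """
    names = [f["field_name"] for f in pack_index["index_fields"]]
    positions = [schema_fields.index(name) for name in names]
    return positions == sorted(positions)


# Only the keys needed for the check are reproduced here.
pack_index = {
    "index_name": "pack_index",
    "index_type": "PACK",
    "index_fields": [
        {"boost": 1, "field_name": "vector_source_text"},
        {"boost": 1, "field_name": "cate_id"},
    ],
}

# Field configuration order assumed from the note above.
schema_fields = ["vector_source_text", "cate_id"]
order_ok = pack_field_order_matches(schema_fields, pack_index)
```

Here order_ok is True; reversing schema_fields makes the check return False, which corresponds to the case the note warns about.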

Query configuration

Construct your query using the following syntax:

query=pack_index:'water kettle'&&cluster=general&&config=start:0,hit:10,format:json
&&kvpairs=first_formula:static_bm25(),
formula: text_relevance(vector_source_text)&&sort=-RANK

● first_formula specifies the rough sort expression, here static_bm25()

● formula specifies the fine sort expression, here text_relevance(vector_source_text)

● sort=-RANK sorts the documents by their relevance score in descending order

To view scoring details, add rank_trace:all to the config clause, for example config=start:0,hit:10,format:json,rank_trace:all.
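
The clauses described above can also be assembled programmatically. The sketch below is only an illustration: build_query and its parameters are hypothetical names, not an OpenSearch SDK call. It emits the clauses on a single line joined by &&, without extra whitespace, and does no URL encoding, which a real HTTP request would likely need.

```python
def build_query(index, keywords, first_formula, formula,
                start=0, hit=10, cluster="general", rank_trace=None):
    """Join the query, cluster, config, kvpairs and sort clauses with '&&'.

    Hypothetical helper; rank_trace (e.g. "all") is appended to the
    config clause only when it is given.
    """
    config = f"start:{start},hit:{hit},format:json"
    if rank_trace:
        config += f",rank_trace:{rank_trace}"
    clauses = [
        f"query={index}:'{keywords}'",
        f"cluster={cluster}",
        f"config={config}",
        f"kvpairs=first_formula:{first_formula},formula:{formula}",
        "sort=-RANK",
    ]
    return "&&".join(clauses)


plain = build_query("pack_index", "water kettle",
                    "static_bm25()", "text_relevance(vector_source_text)")
traced = build_query("pack_index", "water kettle",
                     "static_bm25()", "text_relevance(vector_source_text)",
                     rank_trace="all")
```

The only difference between plain and traced is the extra rank_trace:all entry in the config clause, so tracing can be switched on for debugging without touching the ranking expressions.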

FAQ

User feedback:

The rough sort configuration takes effect, but the score of every document is always 10000.

To troubleshoot, add rank_trace:all to the config clause and inspect the tracerInfo field in the response:

"tracerInfo": "begin first formula trace:\nexpression[ipvuv], 
result[8819].\nexpression[ipvuv*0.1], result[881.900024].\nexpression[score], 
result[97.910000].\nexpression[score*100], 
result[9791.000000].\nexpression[score*100+ipvuv*0.1], 
result[10672.900024].\nscore [10000.000000] is larger 
than max_first_score [10000.000000], adjust it to max_first_score.\nend 
first formula trace.\n"

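This occurs because the rough sort score is capped at 10000 (max_first_score). In the trace above, the expression score*100+ipvuv*0.1 evaluates to 10672.900024, which exceeds the cap, so the score is adjusted down to 10000. Every document whose rough sort score exceeds the limit therefore reports the same value, 10000.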
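
To make the cap easier to spot, the trace can be parsed and the adjustment reproduced. This is a rough sketch: the regular expressions assume the trace format matches the sample above exactly, and the helper names are hypothetical, not part of any OpenSearch tooling.

```python
import re

# The tracerInfo string from the FAQ, with the display line wraps removed.
TRACER_INFO = (
    "begin first formula trace:\nexpression[ipvuv], result[8819].\n"
    "expression[ipvuv*0.1], result[881.900024].\n"
    "expression[score], result[97.910000].\n"
    "expression[score*100], result[9791.000000].\n"
    "expression[score*100+ipvuv*0.1], result[10672.900024].\n"
    "score [10000.000000] is larger than max_first_score [10000.000000], "
    "adjust it to max_first_score.\nend first formula trace.\n"
)


def parse_trace(trace):
    """Return a list of (expression, result) pairs found in the trace."""
    pairs = re.findall(r"expression\[(.+?)\], result\[([0-9.]+)\]", trace)
    return [(expr, float(value)) for expr, value in pairs]


def parse_cap(trace):
    """Return the max_first_score reported in the trace, or None."""
    match = re.search(r"max_first_score \[([0-9.]+)\]", trace)
    return float(match.group(1)) if match else None


steps = parse_trace(TRACER_INFO)
cap = parse_cap(TRACER_INFO)
raw_score = steps[-1][1]                 # value of the full expression
final_score = min(raw_score, cap)        # the cap applied to the raw value
```

With the sample trace, raw_score is 10672.900024 and final_score is 10000.0, matching the adjustment message. If the rough sort needs to distinguish such documents, one option to consider is scaling the expression so that typical values stay below the cap; whether that suits a given application is an assumption to verify.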