OpenSearch:TagMatch

Last Updated:Feb 06, 2023

Overview

The TagMatch class matches terms in a request against tags in documents and then assigns weights to the documents based on the match results. For example, after a user searches for a commodity, you may want to preferentially return the stores that the user likes. In this case, you can use the TagMatch class to match the store-related terms in the request against the store-related tags in documents and weight the documents accordingly.

With TagMatch, a set of key-value pairs is stored in an array field of each document. If a search request contains a kvpairs clause, the tag_match function matches the key-value pairs in the kvpairs clause against the key-value pairs in documents. The function scores each matched key and calculates a final score for each document based on all the matched keys. The final score can be used to sort documents by weight or to filter documents.

[Figure: how the TagMatch class works]

Functions

Function

Description

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, CString kvOperatorName, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv, int maxKvCount)

Creates a TagMatch object based on the detailed format of the field to be matched in documents.

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, CString kvOperatorName, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv)

Creates a TagMatch object based on the detailed format of the field to be matched in documents. The default value of the maxKvCount parameter is used.

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, CString kvOperatorName, CString mergeOperatorName, boolean hasDefaultValue)

Creates a TagMatch object based on the detailed format of the field to be matched in documents. The default value of the maxKvCount parameter is used. The field consists of key-value pairs.

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, CString kvOperatorName, CString mergeOperatorName)

Creates a TagMatch object based on the detailed format of the field to be matched in documents. The default value of the maxKvCount parameter is used. The field consists of key-value pairs and has no default value.

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, double kvResult, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv, int maxKvCount)

Creates a TagMatch object based on the detailed format of the field to be matched in documents. The match result of each key-value pair is a constant.

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, double kvResult, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv)

Creates a TagMatch object based on the detailed format of the field to be matched in documents. The match result of each key-value pair is a constant. The default value of the maxKvCount parameter is used.

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, double kvResult, CString mergeOperatorName, boolean hasDefaultValue)

Creates a TagMatch object based on the detailed format of the field to be matched in documents. The match result of each key-value pair is a constant. The default value of the maxKvCount parameter is used. The field consists of key-value pairs.

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, double kvResult, CString mergeOperatorName)

Creates a TagMatch object based on the detailed format of the field to be matched in documents. The match result of each key-value pair is a constant. The default value of the maxKvCount parameter is used. The field consists of key-value pairs and has no default value.

double evaluate(OpsScoreParams params)

Matches the terms in a request with the tags in a document and returns the resulting score.

Function details

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, CString kvOperatorName, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv, int maxKvCount)

Creates a TagMatch object based on the detailed format of the field to be matched in documents. For example, the hasDefaultValue parameter specifies whether the field has a default value. The fieldIsKv parameter specifies whether the field consists of key-value pairs. The maxKvCount parameter specifies the maximum number of key-value pairs in the field.

Scenarios:

Posts on a large, general-purpose forum carry different tags, such as funny, sports, news, music, and science. When you push documents to OpenSearch, you can assign an ID to each tag. For example, the IDs of the funny, sports, news, and music tags are 1, 5, 3, and 6. You can then use a tag field to store the IDs of the tags. You can also obtain the weight of each tag for each post through preprocessing. For example, a post has the funny, sports, and news tags, and their weights for the post are 0.5, 0.5, and 0.1. In this case, the value of the tag field is [1 0.5 5 0.5 3 0.1]. By analyzing the searches that forum members perform over a long period, you can identify the favorite post tags of each member.

For example, the nba_fans member is interested in sports and funny content, and the weights of the sports and funny tags for the member are 0.6 and 0.3. Then, you can use the kvpairs clause to define the tag-weight pairs as key-value pairs and pass the key-value pairs to the search request when the member searches for posts. If the field name defined in the kvpairs clause is user_tag, the value of the user_tag field for the nba_fans member is 5=0.6:1=0.3. You can define TagMatch("user_tag", "tag", "mul", "sum", false, true, 50) in the sort script. This way, posts in which forum members are interested are at the top of sort results.

For example, when the nba_fans member searches for the preceding post, both the funny and sports tags can be matched. You can set the kvOperatorName parameter to mul to obtain the product of the value of each key in the search request and that of each matched key in the document. In this example, the score of the sports tag is calculated by using the following formula: 0.5 × 0.6 = 0.3. The score of the funny tag is calculated by using the following formula: 0.5 × 0.3 = 0.15. You can set the mergeOperatorName parameter to sum to obtain the sum of the scores of two tags. The sum of two scores is calculated by using the following formula: 0.3 + 0.15 = 0.45. Then, the sum is added to the final sorting score. This way, you can sort the posts in which the member is interested by calculating weights.
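The arithmetic in this scenario can be reproduced with a short standalone Java sketch. It is not the TagMatch implementation: the class name TagMatchMulSumSketch and its helper methods are hypothetical, and the sketch only assumes the formats described here (a query value such as 5=0.6:1=0.3 and a document field laid out as key0 value0 key1 value1).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this class is not part of the Cava framework.
class TagMatchMulSumSketch {
    // Parses a query value such as "5=0.6:1=0.3" into key -> weight.
    static Map<Long, Double> parseQueryPairs(String value) {
        Map<Long, Double> pairs = new LinkedHashMap<>();
        for (String pair : value.split(":")) {
            String[] kv = pair.split("=");
            pairs.put(Long.parseLong(kv[0]), Double.parseDouble(kv[1]));
        }
        return pairs;
    }

    // Decodes a document field laid out as key0 value0 key1 value1 ...
    static Map<Long, Double> decodeDocField(double[] field) {
        Map<Long, Double> pairs = new LinkedHashMap<>();
        for (int i = 0; i + 1 < field.length; i += 2) {
            pairs.put((long) field[i], field[i + 1]);
        }
        return pairs;
    }

    // Mimics kvOperatorName = mul followed by mergeOperatorName = sum.
    static double mulSum(Map<Long, Double> query, Map<Long, Double> doc) {
        double total = 0.0;
        for (Map.Entry<Long, Double> q : query.entrySet()) {
            Double docWeight = doc.get(q.getKey());
            if (docWeight != null) {
                total += q.getValue() * docWeight; // product for one matched key
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Long, Double> user = parseQueryPairs("5=0.6:1=0.3");
        Map<Long, Double> post = decodeDocField(new double[] {1, 0.5, 5, 0.5, 3, 0.1});
        System.out.println(mulSum(user, post)); // approximately 0.45
    }
}
```

The news tag (ID 3) is present in the post but absent from the member's tags, so it contributes nothing to the sum.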

Parameters:

params: the parameters that are used for score calculation. For more information, see CategoryScore.

queryKey: a string constant that specifies the name of the field to be matched in query clauses. This field must be specified as a kvpairs clause. Separate the key and value of each key-value pair with an equal sign (=). Separate key-value pairs with colons (:). Example: kvpairs=query_tags:10=0.67:960=0.85:1=48. In this example, the query_tags field consists of three keys: 10, 960, and 1. Their corresponding values are 0.67, 0.85, and 48. You can also specify only a list of keys for the field. Example: kvpairs=cats:10:960:1.

fieldName: a string constant that specifies the name of the field to be matched in documents. The field must be an attribute field and the value must be an integer or a float array. If the value of the field is a float array, the key values are converted to 64-bit integers during matching. Odd positions in the array are occupied by keys and even positions are occupied by values. Format: key0 value0 key1 value1.

kvOperatorName: a string constant that specifies the operation to be performed on the values of the same key in the fields specified by the queryKey and fieldName parameters. To obtain the greater value of the key, set this parameter to max. To obtain the smaller value of the key, set this parameter to min. To obtain the average of the two values of the key, set this parameter to avg. To obtain the product of the values of the key, set this parameter to mul. To obtain the value of the key in the search request, set this parameter to query_value. To obtain the value of the key in the document, set this parameter to doc_value.

mergeOperatorName: a string constant that specifies the operation to be performed on all the calculation results of matched keys. To obtain the maximum calculation result, set this parameter to max. To obtain the minimum calculation result, set this parameter to min. To obtain the sum of all the calculation results, set this parameter to sum. To obtain the average of all the calculation results, set this parameter to avg. To obtain the calculation result of the first matched key and ignore the calculation results of the other matched keys, set this parameter to first_match.

hasDefaultValue: a boolean constant that specifies whether to use the first key-value pair in the field that is specified by the fieldName parameter as the default value of the field. If you set this parameter to false, the field has no default value. If you set this parameter to true, the first key-value pair of the field is used as the default value. In that case, the format of the field specified by the fieldName parameter is default_score k0 v0 k1 v1.

fieldIsKv: a boolean constant that specifies whether the field specified by the fieldName parameter consists of key-value pairs. If this parameter is set to false, the field specified by the fieldName parameter consists only of a list of keys. This is convenient when documents only need to carry tags and the tags have no weights.

maxKvCount: an integer constant that specifies the maximum number of key-value pairs that are to be matched in the field specified by the queryKey parameter. The value of the integer constant cannot exceed 5120.
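The kvOperatorName and mergeOperatorName values listed above can be read as two small functions: one combines the query-side and document-side values of a single matched key, and the other folds the per-key results into one score. The following Java sketch illustrates that reading. The class TagMatchOperatorsSketch is hypothetical, and two behaviors are assumptions that the text above does not specify: the score when no key matches (0 here) and the order that first_match follows (the order in which results are passed in).

```java
import java.util.Arrays;

// Hypothetical illustration only; this class is not part of the Cava framework.
class TagMatchOperatorsSketch {
    // Combines the query-side and document-side values of one matched key.
    static double applyKvOperator(String kvOperatorName, double queryValue, double docValue) {
        switch (kvOperatorName) {
            case "max": return Math.max(queryValue, docValue);
            case "min": return Math.min(queryValue, docValue);
            case "avg": return (queryValue + docValue) / 2.0;
            case "mul": return queryValue * docValue;
            case "query_value": return queryValue;
            case "doc_value": return docValue;
            default:
                throw new IllegalArgumentException("unknown kvOperatorName: " + kvOperatorName);
        }
    }

    // Merges the per-key results, given in match order, into one score.
    static double applyMergeOperator(String mergeOperatorName, double[] results) {
        if (results.length == 0) {
            return 0.0; // assumption: no matched key contributes no score
        }
        switch (mergeOperatorName) {
            case "max": return Arrays.stream(results).max().getAsDouble();
            case "min": return Arrays.stream(results).min().getAsDouble();
            case "sum": return Arrays.stream(results).sum();
            case "avg": return Arrays.stream(results).sum() / results.length;
            case "first_match": return results[0]; // assumption: first element is the first match
            default:
                throw new IllegalArgumentException("unknown mergeOperatorName: " + mergeOperatorName);
        }
    }

    public static void main(String[] args) {
        double[] results = {
            applyKvOperator("mul", 0.6, 0.5), // sports: query 0.6, document 0.5
            applyKvOperator("mul", 0.3, 0.5)  // funny: query 0.3, document 0.5
        };
        System.out.println(applyMergeOperator("max", results));
        System.out.println(applyMergeOperator("first_match", results));
    }
}
```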

Sample code:

package users.scorer;
import com.aliyun.opensearch.cava.framework.OpsScoreParams;
import com.aliyun.opensearch.cava.framework.OpsScorerInitParams;
import com.aliyun.opensearch.cava.framework.OpsRequest;
import com.aliyun.opensearch.cava.framework.OpsDoc;
import com.aliyun.opensearch.cava.features.TagMatch;

class BasicSimilarityScorer {
    TagMatch _tagMatch;
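    boolean init(OpsScorerInitParams params) {
        // Match the "tag_match_key" kvpairs field against the "multi_int8" document field.
        // query_value: each matched key scores its value from the search request.
        // first_match: only the result of the first matched key is returned.
        // hasDefaultValue = false, fieldIsKv = false (keys only), maxKvCount = 100.
        _tagMatch = TagMatch.create(params, "tag_match_key", "multi_int8", "query_value",
                                  "first_match", false, false, 100);
        return true;
    }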

    double score(OpsScoreParams params) {
        return _tagMatch.evaluate(params);
    }
}

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, CString kvOperatorName, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv)

Works in the same way as the TagMatch(CString queryKey, CString fieldName, CString kvOperatorName, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv, int maxKvCount) function. The maxKvCount parameter is set to 50.

Sample code:

package users.scorer;
import com.aliyun.opensearch.cava.framework.OpsScoreParams;
import com.aliyun.opensearch.cava.framework.OpsScorerInitParams;
import com.aliyun.opensearch.cava.framework.OpsRequest;
import com.aliyun.opensearch.cava.framework.OpsDoc;
import com.aliyun.opensearch.cava.features.TagMatch;

class BasicSimilarityScorer {
    TagMatch _tagMatch;
    boolean init(OpsScorerInitParams params) {
        _tagMatch = TagMatch.create(params, "tag_match_key", "multi_int8", "query_value",
                                  "first_match", false, false);
        return true;
    }

    double score(OpsScoreParams params) {
        return _tagMatch.evaluate(params);
    }
}

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, CString kvOperatorName, CString mergeOperatorName, boolean hasDefaultValue)

Works in the same way as the TagMatch(CString queryKey, CString fieldName, CString kvOperatorName, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv, int maxKvCount) function. The fieldIsKv parameter is set to true. The maxKvCount parameter is set to 50.

Sample code:

package users.scorer;
import com.aliyun.opensearch.cava.framework.OpsScoreParams;
import com.aliyun.opensearch.cava.framework.OpsScorerInitParams;
import com.aliyun.opensearch.cava.framework.OpsRequest;
import com.aliyun.opensearch.cava.framework.OpsDoc;
import com.aliyun.opensearch.cava.features.TagMatch;

class BasicSimilarityScorer {
    TagMatch _tagMatch;
    boolean init(OpsScorerInitParams params) {
        _tagMatch = TagMatch.create(params, "tag_match_key", "multi_int8", "query_value",
                                  "first_match", true);
        return true;
    }

    double score(OpsScoreParams params) {
        return _tagMatch.evaluate(params);
    }
}

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, CString kvOperatorName, CString mergeOperatorName)

Works in the same way as the TagMatch(CString queryKey, CString fieldName, CString kvOperatorName, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv, int maxKvCount) function. The hasDefaultValue parameter is set to false. The fieldIsKv parameter is set to true. The maxKvCount parameter is set to 50.

Sample code:

package users.scorer;
import com.aliyun.opensearch.cava.framework.OpsScoreParams;
import com.aliyun.opensearch.cava.framework.OpsScorerInitParams;
import com.aliyun.opensearch.cava.framework.OpsRequest;
import com.aliyun.opensearch.cava.framework.OpsDoc;
import com.aliyun.opensearch.cava.features.TagMatch;

class BasicSimilarityScorer {
    TagMatch _tagMatch;
    boolean init(OpsScorerInitParams params) {
        _tagMatch = TagMatch.create(params, "tag_match_key", "multi_int8", "query_value",
                                  "first_match");
        return true;
    }

    double score(OpsScoreParams params) {
        return _tagMatch.evaluate(params);
    }
}

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, double kvResult, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv, int maxKvCount)

Creates a TagMatch object based on the detailed format of the field to be matched in documents. For example, the hasDefaultValue parameter specifies whether the field has a default value. The fieldIsKv parameter specifies whether the field consists of key-value pairs. The maxKvCount parameter specifies the maximum number of key-value pairs in the field.

Scenarios:

Goods can have multiple attribute tags. For example, 1 indicates young (age), 2 indicates middle-aged (age), 3 indicates fresh (style), 4 indicates fashion (style), 5 indicates women (gender), and 6 indicates men (gender).

You may want to match tags without calculating tag weights for sorting. In this case, you can use the options field to store tags. If an item of clothing has the young, fashion, and women tags, the value of the options field is [1 4 5]. The field value consists only of keys. Users have attribute tags similar to those of goods. For example, a young female user has previously purchased fresh-style clothes. In this case, the user_options=1:3:5 field can be added to the query clause when this user searches for clothes. Note that the field defined in the kvpairs clause also consists only of keys.

If you want to rank goods that carry a user's favorite tags higher, you can use the TagMatch("user_options", "options", 10F, "sum", false, false) function in a sort script. In this function, user_options is the field that stores the tags in the query clause and options is the field that stores the tags in the document. The value 10 of the kvResult parameter indicates that each pair of matched keys scores 10. The value false of the hasDefaultValue parameter indicates that the initial score is not used. The value false of the fieldIsKv parameter indicates that the value of the field in the document consists only of keys.

When this user searches and the preceding item of clothing is evaluated, both the women and young tags match and each scores 10. After the sum operation specified by the mergeOperatorName parameter is applied to the two scores, the final score of the item is 20. This way, you can sort documents by weight even when tags carry no weight information.
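The constant-score case can be checked with a small standalone Java sketch. It assumes only what this scenario states: the query side and the document side both hold bare keys, each matched key scores the constant kvResult, and the results are summed. The class TagMatchConstantSketch and its method are hypothetical names, not part of the Cava framework.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; this class is not part of the Cava framework.
class TagMatchConstantSketch {
    // Scores kvResult for every query key that also appears in the document, then sums.
    static double constantSum(String queryKeys, long[] docKeys, double kvResult) {
        Set<Long> doc = new HashSet<>();
        for (long key : docKeys) {
            doc.add(key);
        }
        double total = 0.0;
        for (String key : queryKeys.split(":")) {
            if (doc.contains(Long.parseLong(key))) {
                total += kvResult; // constant score per matched key
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // User tags 1:3:5 (young, fresh, women); item tags [1 4 5] (young, fashion, women).
        System.out.println(constantSum("1:3:5", new long[] {1, 4, 5}, 10.0)); // prints 20.0
    }
}
```

Keys 1 and 5 match, while key 3 (fresh) has no counterpart in the item, which gives 10 + 10 = 20.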

Parameters:

params: the parameters that are used for score calculation. For more information, see CategoryScore.

queryKey: a string constant that specifies the name of the field to be matched in query clauses. This field must be specified as a kvpairs clause. Separate the key and value of each key-value pair with an equal sign (=). Separate key-value pairs with colons (:). Example: kvpairs=query_tags:10=0.67:960=0.85:1=48. In this example, the query_tags field consists of three keys: 10, 960, and 1. Their corresponding values are 0.67, 0.85, and 48. You can also specify only a list of keys for the field. Example: kvpairs=cats:10:960:1.

fieldName: a string constant that specifies the name of the field to be matched in documents. The field must be an attribute field and the value must be an integer or a float array. If the value of the field is a float array, the key values are converted to 64-bit integers during matching. Odd positions in the array are occupied by keys and even positions are occupied by values. Format: key0 value0 key1 value1.

kvResult: a floating-point constant to be returned when a key in the field specified by the queryKey parameter matches a key in the field specified by the fieldName parameter.

mergeOperatorName: a string constant that specifies the operation to be performed on all the calculation results of matched keys. To obtain the maximum calculation result, set this parameter to max. To obtain the minimum calculation result, set this parameter to min. To obtain the sum of all the calculation results, set this parameter to sum. To obtain the average of all the calculation results, set this parameter to avg. To obtain the calculation result of the first matched key and ignore the calculation results of the other matched keys, set this parameter to first_match.

hasDefaultValue: a boolean constant that specifies whether to use the first key-value pair in the field that is specified by the fieldName parameter as the default value of the field. If you set this parameter to false, the field has no default value. If you set this parameter to true, the first key-value pair of the field is used as the default value. In that case, the format of the field specified by the fieldName parameter is default_score k0 v0 k1 v1.

fieldIsKv: a boolean constant that specifies whether the field specified by the fieldName parameter consists of key-value pairs. If this parameter is set to false, the field specified by the fieldName parameter consists only of a list of keys. This is convenient when documents only need to carry tags and the tags have no weights.

maxKvCount: an integer constant that specifies the maximum number of key-value pairs that are to be matched in the field specified by the queryKey parameter. The value of the integer constant cannot exceed 5120.
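To make the two kvpairs formats accepted by queryKey concrete, the following Java sketch parses one field entry in either the key=value form or the keys-only form. It is an illustration under stated assumptions, not the OpenSearch parser: the class KvpairsFieldSketch is hypothetical, it handles a single field entry only, and it represents a key that has no value with null.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this class is not part of the Cava framework.
class KvpairsFieldSketch {
    final String name;              // field name, for example query_tags
    final Map<Long, Double> pairs;  // key -> value; the value is null in the keys-only form

    KvpairsFieldSketch(String name, Map<Long, Double> pairs) {
        this.name = name;
        this.pairs = pairs;
    }

    // Parses "kvpairs=name:k=v:k=v" or "kvpairs=name:k:k"; the "kvpairs=" prefix is optional.
    static KvpairsFieldSketch parse(String clause) {
        String prefix = "kvpairs=";
        String body = clause.startsWith(prefix) ? clause.substring(prefix.length()) : clause;
        String[] parts = body.split(":");
        Map<Long, Double> pairs = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq < 0) {
                pairs.put(Long.parseLong(parts[i]), null); // keys-only entry
            } else {
                long key = Long.parseLong(parts[i].substring(0, eq));
                double value = Double.parseDouble(parts[i].substring(eq + 1));
                pairs.put(key, value);
            }
        }
        return new KvpairsFieldSketch(parts[0], pairs);
    }

    public static void main(String[] args) {
        KvpairsFieldSketch weighted = parse("kvpairs=query_tags:10=0.67:960=0.85:1=48");
        KvpairsFieldSketch keysOnly = parse("kvpairs=cats:10:960:1");
        System.out.println(weighted.name + " " + weighted.pairs.keySet());
        System.out.println(keysOnly.name + " " + keysOnly.pairs.keySet());
    }
}
```

The keys-only form pairs naturally with the kvResult overloads, because every matched key then receives the same constant score and no query-side value is needed.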

Sample code:

package users.scorer;
import com.aliyun.opensearch.cava.framework.OpsScoreParams;
import com.aliyun.opensearch.cava.framework.OpsScorerInitParams;
import com.aliyun.opensearch.cava.framework.OpsRequest;
import com.aliyun.opensearch.cava.framework.OpsDoc;
import com.aliyun.opensearch.cava.features.TagMatch;

class BasicSimilarityScorer {
    TagMatch _tagMatch;
    boolean init(OpsScorerInitParams params) {
        _tagMatch = TagMatch.create(params, "tag_match_key", "multi_int8", 3.3D,
                                 "first_match", false, false, 100);
        return true;
    }

    double score(OpsScoreParams params) {
        return _tagMatch.evaluate(params);
    }
}

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, double kvResult, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv)

Works in the same way as the TagMatch(CString queryKey, CString fieldName, double kvResult, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv, int maxKvCount) function. The maxKvCount parameter is set to 50.

Sample code:

package users.scorer;
import com.aliyun.opensearch.cava.framework.OpsScoreParams;
import com.aliyun.opensearch.cava.framework.OpsScorerInitParams;
import com.aliyun.opensearch.cava.framework.OpsRequest;
import com.aliyun.opensearch.cava.framework.OpsDoc;
import com.aliyun.opensearch.cava.features.TagMatch;

class BasicSimilarityScorer {
    TagMatch _tagMatch;
    boolean init(OpsScorerInitParams params) {
        _tagMatch = TagMatch.create(params, "tag_match_key", "multi_int8", 3.3D,
                                 "first_match", false, false);
        return true;
    }

    double score(OpsScoreParams params) {
        return _tagMatch.evaluate(params);
    }
}

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, double kvResult, CString mergeOperatorName, boolean hasDefaultValue)

Works in the same way as the TagMatch(CString queryKey, CString fieldName, double kvResult, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv, int maxKvCount) function. The fieldIsKv parameter is set to true. The maxKvCount parameter is set to 50.

Sample code:

package users.scorer;
import com.aliyun.opensearch.cava.framework.OpsScoreParams;
import com.aliyun.opensearch.cava.framework.OpsScorerInitParams;
import com.aliyun.opensearch.cava.framework.OpsRequest;
import com.aliyun.opensearch.cava.framework.OpsDoc;
import com.aliyun.opensearch.cava.features.TagMatch;

class BasicSimilarityScorer {
    TagMatch _tagMatch;
    boolean init(OpsScorerInitParams params) {
        _tagMatch = TagMatch.create(params, "tag_match_key", "multi_int8", 3.3D,
                                 "first_match", true);
        return true;
    }

    double score(OpsScoreParams params) {
        return _tagMatch.evaluate(params);
    }
}

TagMatch create(OpsScorerInitParams params, CString queryKey, CString fieldName, double kvResult, CString mergeOperatorName)

Works in the same way as the TagMatch(CString queryKey, CString fieldName, double kvResult, CString mergeOperatorName, boolean hasDefaultValue, boolean fieldIsKv, int maxKvCount) function. The hasDefaultValue parameter is set to false. The fieldIsKv parameter is set to true. The maxKvCount parameter is set to 50.

Sample code:

package users.scorer;
import com.aliyun.opensearch.cava.framework.OpsScoreParams;
import com.aliyun.opensearch.cava.framework.OpsScorerInitParams;
import com.aliyun.opensearch.cava.framework.OpsRequest;
import com.aliyun.opensearch.cava.framework.OpsDoc;
import com.aliyun.opensearch.cava.features.TagMatch;

class BasicSimilarityScorer {
    TagMatch _tagMatch;
    boolean init(OpsScorerInitParams params) {
        _tagMatch = TagMatch.create(params, "tag_match_key", "multi_int8", 3.3D,
                                 "first_match");
        return true;
    }

    double score(OpsScoreParams params) {
        return _tagMatch.evaluate(params);
    }
}

double evaluate(OpsScoreParams params)

Matches terms in a request with tags in documents and then allocates weights to the documents based on match results.

Parameter:

params: the parameters that are used for score calculation. For more information, see OpsScoreParams.

Sample code:

package users.scorer;
import com.aliyun.opensearch.cava.framework.OpsScoreParams;
import com.aliyun.opensearch.cava.framework.OpsScorerInitParams;
import com.aliyun.opensearch.cava.framework.OpsRequest;
import com.aliyun.opensearch.cava.framework.OpsDoc;
import com.aliyun.opensearch.cava.features.TagMatch;

class BasicSimilarityScorer {
    TagMatch _tagMatch;
    boolean init(OpsScorerInitParams params) {
        _tagMatch = TagMatch.create(params, "tag_match_key", "multi_int8", "query_value",
                                  "first_match", false, false, 100);
        return true;
    }

    double score(OpsScoreParams params) {
        return _tagMatch.evaluate(params);
    }
}