Artificial Intelligence Recommendation: Engine configuration case for recommendation scenario

Last Updated: Nov 12, 2024

This topic walks through an engine configuration for a recommendation scenario. It uses the scenario as an example to introduce the basic configuration of recall (RecallConfs), filtering (FilterConfs), features (FeatureConfs), ranking, and re-ranking, and shows how to configure the online data sources Hologres and FeatureStore.

Code example:

{
    "HologresConfs": {
        "holo_info": {
            "DSN": "postgres://${AccessKey}:${AccessSecret}@hgpostcn-cn-xxxx-cn-shanghai-vpc-st.hologres.aliyuncs.com:80/test_db?sslmode=disable&connect_timeout=1"
        }
    },
    "FeatureStoreConfs": {
        "fs_info": {
            "RegionId": "cn-shanghai",
            "AccessId": "${AccessKey}",
            "AccessKey": "${AccessSecret}",
            "ProjectName": "projectName"
        }
    },
    "RecallConfs": [
        {
            "Name": "etrec_u2i2i_recall_v1",
            "RecallType": "UserCustomRecall",
            "RecallCount": 300,
            "DaoConf": {
                "AdapterType": "hologres",
                "HologresName": "holo_info",
                "HologresTableName": "home_feed_etrec_u2i2i_score_holo_v1"
            }
        },
        {
            "Name": "user_global_hot_recall_v1",
            "RecallType": "UserGlobalHotRecall",
            "RecallCount": 200,
            "DaoConf": {
                "AdapterType": "hologres",
                "HologresName": "holo_info",
                "HologresTableName": "home_feed_global_hot_holo_v1"
            }
        }
    ],
    "FilterConfs": [
        {
            "Name": "UserExposureFilter",
            "FilterType": "User2ItemExposureFilter",
            "MaxItems": 50,
            "TimeInterval": 604800,
            "WriteLog": true,
            "DaoConf": {
                "AdapterType": "hologres",
                "HologresName": "holo_info",
                "HologresTableName": "exposure_history"
            }
        },
        {
            "Name": "ItemStateFilter",
            "FilterType": "ItemStateFilter",
            "ItemStateDaoConf": {
                "AdapterType": "hologres",
                "HologresName": "holo_info",
                "HologresTableName": "item_status_table_v1",
                "ItemFieldName": "item_id",
                "SelectFields": "is_online"
            },
            "FilterParams": [
                {
                    "Name": "is_online",
                    "Type": "int",
                    "Operator": "equal",
                    "Value": 1
                }
            ]
        }
    ],
    "AlgoConfs": [
        {
            "Name": "home_feed_dbmtl_v1",
            "Type": "EAS",
            "EasConf": {
                "Processor": "EasyRec",
                "ResponseFuncName": "easyrecMutValResponseFunc",
                "Url": "http://xxx.vpc.cn-shanghai.pai-eas.aliyuncs.com/api/predict/home_feed_dbmtl_v1",
                "EndpointType": "DIRECT",
                "Auth": "xxxxx"
            }
        }
    ],
    "SortConfs": [
        {
            "Name": "BoostScoreSortByAuthor",
            "SortType": "BoostScoreSort",
            "Debug": false,
            "BoostScoreConditions": [
                {
                    "Conditions": [
                        {
                            "Name": "author_status",
                            "Domain": "item",
                            "Type": "string",
                            "Value": "teacher",
                            "Operator": "equal"
                        }
                    ],
                    "Expression": "score * 10"
                }
            ]
        }
    ],
    "SortNames": {
        "home_feed": [
            "BoostScoreSortByAuthor"
        ]
    },
    "FilterNames": {
        "default": [
            "UniqueFilter",
            "UserExposureFilter"
        ]
    },

    "RankConf": {
        "home_feed": {
            "RankAlgoList": [
                "home_feed_dbmtl_v1"
            ],
            "RankScore": "${home_feed_dbmtl_v1_probs_is_click}+${home_feed_dbmtl_v1_probs_is_collect_like_comment}",
            "BatchCount": 100,
            "Processor": "EasyRec"
        }
    },
    "FeatureConfs": {
        "home_feed": {
            "AsynLoadFeature": true,
            "FeatureLoadConfs": [
                {
                    "FeatureDaoConf": {
                        "AdapterType": "hologres",
                        "HologresName": "holo_info",
                        "FeatureKey": "user:uid",
                        "UserFeatureKeyName": "user_id",
                        "HologresTableName": "user_table_v1_all_feature_v1_online",
                        "UserSelectFields": "*",
                        "FeatureStore": "user"
                    },
                    "Features": []
                },
                {
                    "FeatureDaoConf": {
                        "AdapterType": "hologres",
                        "HologresName": "holo_info",
                        "ItemFeatureKeyName": "item_id",
                        "FeatureKey": "item:id",
                        "HologresTableName": "item_table_v1_all_feature_v1_online",
                        "ItemSelectFields": "item_id,author",
                        "FeatureStore": "item"
                    },
                    "Features": []
                }
            ]
        }
    },
    "SceneConfs": {
        "home_feed": {
            "default": {
                "RecallNames": [
                    "etrec_u2i2i_recall_v1",
                    "user_global_hot_recall_v1"
                ]
            }
        }
    }
}

Logic of defining engine configurations: Define RecallConfs, FilterConfs, AlgoConfs, and SortConfs first, and then reference their entries by name. The Name of each entry is its unique identifier. FilterConfs entries are referenced in FilterNames, RecallConfs entries in SceneConfs, AlgoConfs entries in RankConf, and SortConfs entries in SortNames. In this example, home_feed is the scenario name. The scenario name in FilterNames, however, is default, which means that when no configuration is found for the requested scenario, the default one is used.
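The define-then-reference rule can be checked mechanically before a configuration is deployed. The following is a minimal sketch, not part of the engine: the function names are hypothetical, and it assumes that UniqueFilter, which the sample lists in FilterNames without defining it in FilterConfs, is a filter built into the engine.

```python
# Assumption: names listed here are built into the engine rather than
# defined in FilterConfs. The list itself is hypothetical.
ASSUMED_BUILTIN_FILTERS = {"UniqueFilter"}


def defined(config, section):
    """Names declared in a list-style section such as RecallConfs."""
    return {entry["Name"] for entry in config.get(section, [])}


def dangling_references(config):
    """Return (section, scenario, name) for each reference to an undefined name."""
    checks = []  # (referencing section, scenario, referenced names, defined names)
    recalls = defined(config, "RecallConfs")
    for scene, groups in config.get("SceneConfs", {}).items():
        for group in groups.values():
            checks.append(("SceneConfs", scene, group.get("RecallNames", []), recalls))
    filters = defined(config, "FilterConfs") | ASSUMED_BUILTIN_FILTERS
    for scene, names in config.get("FilterNames", {}).items():
        checks.append(("FilterNames", scene, names, filters))
    sorts = defined(config, "SortConfs")
    for scene, names in config.get("SortNames", {}).items():
        checks.append(("SortNames", scene, names, sorts))
    algos = defined(config, "AlgoConfs")
    for scene, rank in config.get("RankConf", {}).items():
        checks.append(("RankConf", scene, rank.get("RankAlgoList", []), algos))
    return [(section, scene, name)
            for section, scene, names, known in checks
            for name in names if name not in known]


def names_for_scenario(mapping, scenario):
    """Return the scenario's entry, or the "default" entry when it is absent."""
    return mapping.get(scenario, mapping.get("default", []))


# A trimmed copy of the sample configuration: only the names matter here.
config = {
    "RecallConfs": [{"Name": "etrec_u2i2i_recall_v1"},
                    {"Name": "user_global_hot_recall_v1"}],
    "FilterConfs": [{"Name": "UserExposureFilter"}, {"Name": "ItemStateFilter"}],
    "AlgoConfs": [{"Name": "home_feed_dbmtl_v1"}],
    "SortConfs": [{"Name": "BoostScoreSortByAuthor"}],
    "SortNames": {"home_feed": ["BoostScoreSortByAuthor"]},
    "FilterNames": {"default": ["UniqueFilter", "UserExposureFilter"]},
    "RankConf": {"home_feed": {"RankAlgoList": ["home_feed_dbmtl_v1"]}},
    "SceneConfs": {"home_feed": {"default": {"RecallNames": [
        "etrec_u2i2i_recall_v1", "user_global_hot_recall_v1"]}}},
}

print(dangling_references(config))
print(names_for_scenario(config["FilterNames"], "home_feed"))
```

The second call illustrates the fallback: FilterNames has no home_feed entry, so the default list is returned.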

The following table shows the details of the example:

Item

Description

Data source

In this case, two data sources are configured: Hologres and FeatureStore. They store the data used by online services: user and item features, hot recall and i2i recall data, and the IDs of items that a user has already viewed, which serve as the basis for exposure filtering.

If FeatureStore uses FeatureDB as the online data source, you must configure the username (FeatureDBUsername) and password (FeatureDBPassword) of FeatureDB. If FeatureStore uses another data source, these two parameters are not required. Leave the ${AccessKey} and ${AccessSecret} placeholders in both data sources as they are; the engine replaces them automatically.
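To make the effect of the placeholder replacement concrete, here is a small sketch. It only imitates the result with Python's string.Template; how the engine actually performs the replacement is not described in this topic, and the function name, host name, and credential values below are made up for illustration.

```python
from string import Template


def fill_credentials(value, access_key, access_secret):
    """Replace ${AccessKey} and ${AccessSecret}; leave any other text untouched."""
    return Template(value).safe_substitute(
        AccessKey=access_key, AccessSecret=access_secret
    )


# Hypothetical DSN with a placeholder host; the real sample uses a Hologres endpoint.
dsn = "postgres://${AccessKey}:${AccessSecret}@example-host:80/test_db?sslmode=disable&connect_timeout=1"

print(fill_credentials(dsn, "my-key", "my-secret"))
```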

Recall

This case configures two recalls in RecallConfs: U2I recall (etrec_u2i2i_recall_v1) and global hot recall (user_global_hot_recall_v1). Both read their data from Hologres. The recall parameters are as follows:

  • Name: The custom name of the recall, which is the unique identifier of the current recall configuration.

  • RecallType: The type of recall, an enumeration value. For the built-in recall types, see Configure recalls.

  • RecallCount: The number of items to recall.

  • AdapterType: The type of data source for recall data. Here we use Hologres as an example.

  • HologresName: The name of the configured data source.

  • HologresTableName: The table from which the recall data originates.

Filter

This case configures the following two filtering methods in FilterConfs:

  • Exposure filter: You need to create an exposure table in Hologres in advance. For more information, see User2ItemExposureFilter.

  • Status filter: You can filter based on the status of the item. For example, in the FilterParams part of the configuration, only items with is_online=1 will be retained, and others will be discarded.
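The status filter can be pictured as a predicate applied to each candidate item. The sketch below is an illustration only: the helper names are hypothetical, only the equal operator from the sample is handled, and dropping items that have no row in the state table is an assumption rather than documented engine behavior.

```python
import operator

# Only the operator and types that appear in the sample are supported here.
OPERATORS = {"equal": operator.eq}
CASTS = {"int": int, "string": str}


def passes(item_state, filter_params):
    """True when the item's state satisfies every entry in FilterParams."""
    for param in filter_params:
        cast = CASTS[param["Type"]]
        compare = OPERATORS[param["Operator"]]
        field = item_state.get(param["Name"])
        if field is None or not compare(cast(field), cast(param["Value"])):
            return False
    return True


def state_filter(item_ids, states, filter_params):
    """Keep the items whose state row passes; items without a row are dropped (assumption)."""
    return [i for i in item_ids if passes(states.get(i, {}), filter_params)]


filter_params = [{"Name": "is_online", "Type": "int", "Operator": "equal", "Value": 1}]
# Hypothetical rows from item_status_table_v1, keyed by item_id.
states = {"item_a": {"is_online": 1}, "item_b": {"is_online": 0}}

print(state_filter(["item_a", "item_b", "item_c"], states, filter_params))
```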

Explanations of some parameters:

  • Name: The custom name of the filter.

  • FilterType: The built-in filter type. For more information about the supported types, see Configure filters.

  • MaxItems: The maximum number of exposure records read during exposure filtering, equivalent to LIMIT in SQL.

  • TimeInterval: The time range for exposure filtering. Unit: seconds. For example, to filter exposures within the last hour, set it to 3600. The sample value 604800 corresponds to seven days.

  • WriteLog: Determines whether to write the list of items recommended this time into the exposure table during exposure filtering.

  • AdapterType: The type of data source. Here we use Hologres as an example.

  • HologresName: The name of the Hologres configuration, that is, the key defined under HologresConfs (holo_info, on the third line of the sample configuration).

  • HologresTableName: The table name of the exposure table.

  • ItemStateDaoConf: The data access parameters for status filtering. HologresTableName is the name of the state table, ItemFieldName is the primary key of the table, and SelectFields lists the fields to query.
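The interplay of TimeInterval and MaxItems in exposure filtering can be sketched as follows. This is an in-memory illustration, not the engine's query: the row field names (user_id, item_id, ts) and the newest-first ordering are assumptions.

```python
import time


def recent_exposures(rows, user_id, time_interval, max_items, now=None):
    """Item IDs shown to the user within the last time_interval seconds.

    Rows are taken newest first and capped at max_items, mirroring a
    time-bounded query with a LIMIT clause.
    """
    now = time.time() if now is None else now
    cutoff = now - time_interval
    hits = [r for r in rows if r["user_id"] == user_id and r["ts"] >= cutoff]
    hits.sort(key=lambda r: r["ts"], reverse=True)
    return [r["item_id"] for r in hits[:max_items]]


def exposure_filter(candidates, exposed_ids):
    """Drop candidates that the user has already been shown."""
    exposed = set(exposed_ids)
    return [c for c in candidates if c not in exposed]


NOW = 1_000_000_000  # fixed clock so the example is reproducible
rows = [
    {"user_id": "u1", "item_id": "i1", "ts": NOW - 60},         # one minute ago
    {"user_id": "u1", "item_id": "i2", "ts": NOW - 8 * 86400},  # eight days ago
    {"user_id": "u2", "item_id": "i3", "ts": NOW - 60},         # another user
]

exposed = recent_exposures(rows, "u1", time_interval=604800, max_items=50, now=NOW)
print(exposure_filter(["i1", "i2", "i3"], exposed))
```

With the sample settings, only i1 falls inside the seven-day window for user u1, so i2 and i3 remain eligible.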

Feature

Load user features and item features into the engine. User features are used for model scoring, and item features are mostly used for re-ranking.

  • AsynLoadFeature: Determines whether to load features asynchronously.

  • AdapterType: The type of data source. Here we use Hologres as an example.

  • HologresName: The name of the Hologres configuration, that is, the key defined under HologresConfs (holo_info, on the third line of the sample configuration).

  • FeatureKey: A fixed syntax. The default value user:uid means that the uid attribute of the user is used as the lookup value when querying the database.

  • UserFeatureKeyName: The primary key field in the user data table.

  • HologresTableName: The table name of the user data table.

  • UserSelectFields: The fields to query. Separate multiple fields with commas; * selects all fields.

  • FeatureStore: An enumeration value, user or item, indicating whether the queried features are stored on the user side or the item side.
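One way to read these parameters is as the pieces of a lookup query. The sketch below assembles such a query from a FeatureDaoConf entry. It is an assumption about how the fields fit together, not the SQL that the engine issues; the function names are hypothetical, and $1 stands for the value taken from the attribute named in FeatureKey.

```python
def parse_feature_key(feature_key):
    """Split "user:uid" into the domain ("user") and the attribute ("uid")."""
    domain, _, attribute = feature_key.partition(":")
    return domain, attribute


def build_feature_query(dao_conf):
    """Assemble a single-key lookup from a FeatureDaoConf entry (illustrative only)."""
    if dao_conf["FeatureStore"] == "user":
        key_column = dao_conf["UserFeatureKeyName"]
        fields = dao_conf["UserSelectFields"]
    else:
        key_column = dao_conf["ItemFeatureKeyName"]
        fields = dao_conf["ItemSelectFields"]
    table = dao_conf["HologresTableName"]
    return f"SELECT {fields} FROM {table} WHERE {key_column} = $1"


user_dao = {
    "FeatureKey": "user:uid",
    "UserFeatureKeyName": "user_id",
    "HologresTableName": "user_table_v1_all_feature_v1_online",
    "UserSelectFields": "*",
    "FeatureStore": "user",
}

print(parse_feature_key(user_dao["FeatureKey"]))
print(build_feature_query(user_dao))
```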

Ranking

The ranking service is first configured in AlgoConfs, and multiple ranking services can be defined. In RankConf, set a recommendation scenario to reference the ranking services defined in AlgoConfs, along with the scoring formula and weight adjustment. Multiple target prediction scores can be weighted and then added or multiplied.

AlgoConfs:

  • Name: Custom name.

  • Type: EAS, a static field. Most models are deployed on EAS.

  • Processor: EasyRec, a static field, the type of Processor.

  • ResponseFuncName: easyrecMutValResponseFunc, a static field.

  • Url: The address information of the model service.

  • EndpointType: Determines whether to use a direct connection. A direct connection has lower network overhead.

  • Auth: The token of the model service.

RankConf:

  • RankAlgoList: Determines which model services are used for scoring.

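  • RankScore: The scoring formula. Each variable is written as ${...} and consists of a model name from RankAlgoList followed by the name of a target that the model outputs, for example ${home_feed_dbmtl_v1_probs_is_click}.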

  • BatchCount: The number of items sent to the model service in each scoring batch.

  • Processor: EasyRec, a static field, the type of Processor.
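To show how a RankScore formula combines several targets, the following sketch substitutes the ${...} variables with per-item model outputs and evaluates the resulting arithmetic. It is an illustration under assumptions: the engine's own expression evaluator is not described here, the function names are hypothetical, and the output values are invented.

```python
import ast
import operator
import re

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Evaluate a parsed expression limited to numbers and + - * /."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return float(node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("unsupported expression")


def rank_score(formula, model_outputs):
    """Replace each ${name} with its value, then evaluate the arithmetic."""
    def lookup(match):
        return repr(float(model_outputs[match.group(1)]))

    expression = re.sub(r"\$\{(\w+)\}", lookup, formula)
    return _evaluate(ast.parse(expression, mode="eval"))


FORMULA = ("${home_feed_dbmtl_v1_probs_is_click}"
           "+${home_feed_dbmtl_v1_probs_is_collect_like_comment}")

# Invented per-item outputs of the home_feed_dbmtl_v1 model.
outputs = {
    "home_feed_dbmtl_v1_probs_is_click": 0.5,
    "home_feed_dbmtl_v1_probs_is_collect_like_comment": 0.25,
}

print(rank_score(FORMULA, outputs))
```

A weighted variant would follow the same pattern, for example by multiplying each variable by a constant before adding.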

Re-ranking

This case covers the following two re-ranking strategies in SortConfs. The sample configuration above includes only the boost strategy:

  • Boost strategy: You can boost items based on their attributes. For example, the BoostScoreSort configuration means that if the author_status attribute of an item is teacher, its model score is multiplied by 10.

  • Diversity strategy: You can spread out items that share an attribute value. For example, with a window length of 20 and the author field of the item as the dimension, each author can appear at most once within the window.
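The boost rule from the sample can be sketched in a few lines. This is a simplified illustration: the helper names are hypothetical, only the equal operator is handled, and the Expression parser accepts just the "score * number" form used in the sample rather than a general expression.

```python
def meets(item_properties, condition):
    """Check one condition; only the "equal" operator from the sample is handled."""
    if condition["Operator"] != "equal":
        raise ValueError("unsupported operator: " + condition["Operator"])
    return item_properties.get(condition["Name"]) == condition["Value"]


def parse_multiplier(expression):
    """Extract the factor from an expression of the form "score * <number>"."""
    left, _, right = expression.partition("*")
    if left.strip() != "score":
        raise ValueError("this sketch only handles 'score * <number>'")
    return float(right)


def boost(score, item_properties, boost_score_conditions):
    """Apply every rule whose conditions all hold for the item."""
    for rule in boost_score_conditions:
        if all(meets(item_properties, c) for c in rule["Conditions"]):
            score *= parse_multiplier(rule["Expression"])
    return score


rules = [{
    "Conditions": [{"Name": "author_status", "Domain": "item", "Type": "string",
                    "Value": "teacher", "Operator": "equal"}],
    "Expression": "score * 10",
}]

print(boost(0.5, {"author_status": "teacher"}, rules))
print(boost(0.5, {"author_status": "student"}, rules))
```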

Explanations of some configuration parameters:

  • Name: The custom name of the re-ranking strategy.

  • SortType: The built-in re-ranking strategy type. For the supported types, see Configure re-ranking.

  • Debug: Determines whether to enable debug mode. When enabled, more logs are printed to the console to facilitate troubleshooting.

  • Conditions: The conditions of the item. Items that meet this condition will be processed according to the expression in Expression.

  • DiversityRules: The diversity rules. Dimensions lists the item fields used as dimensions for diversification. WindowSize is the window size, and FrequencySize is the maximum number of times the same dimension value can appear within the window.
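The window rule can be illustrated with a greedy re-ordering. The sketch below is one possible interpretation, not the engine's algorithm: it handles a single dimension field, the function name is hypothetical, and falling back to the best remaining item when no candidate satisfies the rule is an assumption.

```python
def diversify(items, dimension, window_size, frequency_size):
    """Greedy re-ordering of a ranked list.

    For each position, take the highest-ranked remaining item whose dimension
    value occurs fewer than frequency_size times among the previous
    window_size - 1 picks. If no item qualifies, take the best remaining one.
    """
    remaining = list(items)
    result = []
    while remaining:
        if window_size > 1:
            recent = [picked[dimension] for picked in result[-(window_size - 1):]]
        else:
            recent = []
        index = next(
            (i for i, candidate in enumerate(remaining)
             if recent.count(candidate[dimension]) < frequency_size),
            0,
        )
        result.append(remaining.pop(index))
    return result


# Ranked candidates, best first; the author values are invented.
ranked = [
    {"item_id": 1, "author": "a"},
    {"item_id": 2, "author": "a"},
    {"item_id": 3, "author": "a"},
    {"item_id": 4, "author": "b"},
    {"item_id": 5, "author": "c"},
]

reordered = diversify(ranked, "author", window_size=3, frequency_size=1)
print([item["item_id"] for item in reordered])
```

In this small example the last position cannot satisfy the rule because only items by author a remain, so the fallback applies.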