This topic describes the use scenarios of Artificial Intelligence Recommendation (AIRec) and the event tracking data required for model training. The news industry is used as an example.
Use scenarios
The "you may also like" service type is suitable for homepage recommendations based on feed streams. This increases user clicks. The following figure shows an example.
Event tracking records the browsing behavior of users. AIRec uses this data to recommend content in feed streams that users are likely to click and read, which helps increase metrics such as browsing depth and stay duration.
Training data
Item table
An item table contains the items that you want to recommend to users. AIRec performs model training and recommends appropriate item data to each user. We recommend that you provide an item table that contains valid data. This can avoid noise caused by invalid data and improve the recommendation effect.
User table
A user table contains information related to users. AIRec trains models based on user preferences and recommends the content in which users are most interested.
Behavior table
A behavior table contains behavioral data related to recommendations and is the core input for model training in AIRec. Dirty data severely degrades the model training result. The following figure shows an example of behavior related to recommendations.
After a user performs the operations shown in the preceding figure, four recommendation-related behavior entries are generated: three exposures (one for each displayed item) and one click on Item A. All four behavior entries must be recorded and uploaded to AIRec as required.
Note: Record behavior only for items that the system recommended. Do not include in a behavior table the behavior entries that are generated on items that users found through search. Otherwise, the accuracy of the training result is affected.
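The four behavior entries in the example above could be assembled as follows. This is a minimal sketch, not the AIRec SDK: only the bhv_type field comes from this topic. The other field names (user_id, item_id, bhv_time), the values "expose" and "click", and all IDs are assumptions for illustration, so check them against the data specification of your instance.

```python
import time

def make_behavior(user_id, item_id, bhv_type, bhv_time=None):
    """Build one behavior entry as a plain dict (hypothetical field layout)."""
    return {
        "user_id": user_id,
        "item_id": item_id,
        "bhv_type": bhv_type,  # assumed values: "expose" or "click"
        "bhv_time": int(bhv_time if bhv_time is not None else time.time()),
    }

def behaviors_for_page(user_id, shown_items, clicked_items, now=None):
    """One exposure per recommended item shown, plus one click per item clicked."""
    entries = [make_behavior(user_id, i, "expose", now) for i in shown_items]
    entries += [make_behavior(user_id, i, "click", now) for i in clicked_items]
    return entries

# Three recommended items are displayed and the user clicks Item A.
entries = behaviors_for_page(
    "user_1", ["item_A", "item_B", "item_C"], ["item_A"], now=1700000000
)
```

Only items that came from a recommendation result should be passed to behaviors_for_page. Items reached through search are left out, as the note above requires.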
Data for event tracking
Exposures
What is an exposure?
Exposure is a behavior type specified by the bhv_type field in a behavior table. If a piece of data is displayed to a user, one exposure is counted.
Implementation of event tracking
Event tracking is performed on the client side. Example: As shown in the preceding figure, when a user browses the homepage, Item A and Item B are exposed. The exposure entries must be recorded.
Precautions
Take note of the following precautions on exposure event tracking:
The recommended content must actually be visible to the user on the screen.
The content must stay visible for a minimum duration, such as 1s.
If the same content is exposed to the same user multiple times in a short period of time, only one exposure entry is counted.
Counterexamples
1. Repeated exposures that are generated when a user flicks through the client homepage are not accumulated, in accordance with the last rule in the "Precautions" section.
As shown in the preceding figure, only one exposure entry is generated for each of the items A to E. That is, only five exposure entries in total are recorded and uploaded to AIRec.
2. Repeated exposures that are generated when a user clicks an item in a feed stream to open the item details page and then returns to the homepage are not accumulated, in accordance with the last rule in the "Precautions" section.
As shown in the preceding figure, only one exposure entry is generated for each of the items A to C, and one click entry is generated for Item A. In total, three exposure entries and one click entry are recorded and uploaded to AIRec.
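The three exposure precautions can be sketched as a client-side filter. The code below is an illustration under stated assumptions, not part of AIRec: the Impression type, its field names, and the 300-second deduplication window are hypothetical, because this topic does not define how long "a short period of time" is. The sample data assumes that the user flicks down through items A to E and then back up.

```python
from dataclasses import dataclass

MIN_STAY_SECONDS = 1.0      # example threshold from the precautions
DEDUP_WINDOW_SECONDS = 300  # assumption: length of "a short period of time"

@dataclass
class Impression:
    user_id: str
    item_id: str
    shown_at: float     # seconds, when the item entered the viewport
    visible_for: float  # seconds the item stayed visible
    on_screen: bool     # whether the user could actually see the item

def countable_exposures(impressions):
    """Return the impressions that should be recorded as exposure entries."""
    last_counted = {}  # (user_id, item_id) -> time of the last counted exposure
    result = []
    for imp in sorted(impressions, key=lambda i: i.shown_at):
        # Rules 1 and 2: the item must be visible and stay long enough.
        if not imp.on_screen or imp.visible_for < MIN_STAY_SECONDS:
            continue
        # Rule 3: repeated exposures within the window count once.
        key = (imp.user_id, imp.item_id)
        previous = last_counted.get(key)
        if previous is not None and imp.shown_at - previous < DEDUP_WINDOW_SECONDS:
            continue
        last_counted[key] = imp.shown_at
        result.append(imp)
    return result

down = [Impression("u1", item, t, 2.0, True) for t, item in enumerate("ABCDE")]
up = [Impression("u1", item, 10 + t, 2.0, True) for t, item in enumerate("EDCBA")]
exposures = countable_exposures(down + up)
```

With this data, exposures holds one entry for each of the items A to E, which matches the first counterexample.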
Clicks
What is a click?
If a user clicks an item to view details, one click is counted.
Implementation of event tracking
Event tracking is performed on the client side. Example:
One click entry is generated for Item A after a user performs Step 2 shown in the preceding figure.
Precautions
Take note of the following precautions on click event tracking:
1. Users perform valid clicks on the client.
2. The item reading duration after an item is clicked is not limited.
3. Multiple clicks by the same user in a short period of time can only be counted as one click.
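The third click precaution could be applied when click events are batched before upload. The following sketch treats a burst of clicks by the same user on the same item as one click. The tuple layout and the 5-second window are assumptions, because this topic does not specify the length of the period.

```python
CLICK_WINDOW_SECONDS = 5  # assumption: length of "a short period of time"

def collapse_clicks(clicks):
    """Collapse bursts of repeated clicks.

    clicks: iterable of (user_id, item_id, timestamp_in_seconds) tuples.
    A click is dropped if the same user clicked the same item less than
    CLICK_WINDOW_SECONDS before it. Each dropped click extends the burst.
    """
    kept = []
    last_seen = {}  # (user_id, item_id) -> timestamp of the latest click
    for user_id, item_id, ts in sorted(clicks, key=lambda c: c[2]):
        key = (user_id, item_id)
        is_repeat = key in last_seen and ts - last_seen[key] < CLICK_WINDOW_SECONDS
        last_seen[key] = ts
        if not is_repeat:
            kept.append((user_id, item_id, ts))
    return kept

clicks = [
    ("u1", "A", 100), ("u1", "A", 101), ("u1", "A", 103),
    ("u1", "B", 103), ("u1", "A", 200),
]
kept = collapse_clicks(clicks)
```

Here the three rapid clicks on Item A become one click entry, while the click on Item B and the later click on Item A are kept.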
Scene IDs
What is a scene?
Scenes are specified by the scene_id field in item and behavior tables. Different data is launched to different scenes. You can consider scenes as data categories. If only one scene requires data launching, the scene_id field can be left empty. In this case, the default value 1 is used. This field is required when you query test results.
Implementation of event tracking
Event tracking on items is performed on the server side to record the categories of items. Event tracking on behavioral data is performed on the client side to record the scenes to which data is launched. Example:
As shown in the preceding figure, three scenes whose IDs are 1, 2, and 3 are provided in AIRec. The scene IDs can be customized. The IDs of all scenes in which behavior entries and item data are generated must be recorded by using the scene_id field. The recorded IDs must be uploaded to AIRec.
Precautions
Scene IDs only identify the frontend scenes to which items are launched. They only need to distinguish one scene from another and do not represent parent-child relationships among scenes. For example, assume that a sports channel includes the following sections: latest scores, event forecast, and event commentary. To classify data at this level, we recommend that you configure categories instead of specifying scene IDs.
Usage notes
When you use SDKs to query recommendation results, you must specify scene_id to view the recommendation results in a specific scene. Some users have recommendation requirements on a homepage when the homepage does not provide item data based on the interests of users. The users also want the homepage to display item data aggregated in multiple scenes.
The scene ID specified when an item is recalled must be the same as the scene ID uploaded to AIRec.
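A simple pre-query check can catch a mismatch between the scene ID used in a query and the scene IDs that were uploaded. The sketch below does not call the AIRec SDK. The function names are hypothetical, and treating scene IDs as strings is an assumption. The fallback to 1 for an empty scene_id follows the default described in this topic.

```python
DEFAULT_SCENE_ID = "1"  # default applied when scene_id is left empty

def uploaded_scene_ids(entries):
    """Collect the scene IDs found in uploaded item or behavior entries."""
    ids = set()
    for entry in entries:
        value = entry.get("scene_id")
        ids.add(DEFAULT_SCENE_ID if value in (None, "") else str(value))
    return ids

def check_query_scene(scene_id, entries):
    """Return the scene ID as a string, or raise if it was never uploaded."""
    known = uploaded_scene_ids(entries)
    if str(scene_id) not in known:
        raise ValueError(
            f"scene_id {scene_id!r} was not uploaded; known IDs: {sorted(known)}"
        )
    return str(scene_id)

uploaded = [
    {"item_id": "item_A", "scene_id": "2"},
    {"item_id": "item_B", "scene_id": ""},  # empty, so the default 1 applies
]
query_scene = check_query_scene(2, uploaded)
```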
Tags
What is a tag?
Tags are specified by the tag field in user and item tables. Tags are the summarized text descriptions of features of item data. Multiple tags are separated by commas (,).
Implementation of event tracking
For example, tags such as SUV, Toyota, and Highlander are extracted from the description of an item. The tags are separated by commas (,) and imported to an item table.
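The tag string for the example above could be produced as follows. This is a small sketch: the helper name and the item_id field are hypothetical, and trimming whitespace and dropping duplicate tags are choices made here, not documented requirements.

```python
def join_tags(tags):
    """Join tags with commas after trimming whitespace and removing duplicates."""
    cleaned_tags = []
    for tag in tags:
        cleaned = tag.strip()
        if "," in cleaned:
            # A comma inside a tag would be read as a separator.
            raise ValueError(f"tag {cleaned!r} must not contain a comma")
        if cleaned and cleaned not in cleaned_tags:
            cleaned_tags.append(cleaned)
    return ",".join(cleaned_tags)

item = {
    "item_id": "item_A",
    "tag": join_tags(["SUV", "Toyota", " Highlander ", "SUV"]),
}
```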
Category levels
What is a category level?
Category levels are specified by the category_path and category_level fields in an item table. You can specify category levels to classify items that have the same theme and attributes. For more information, see Improve the diversity of recommendations by using instance operation rules.
Implementation of event tracking
Event tracking is performed on the server side to record the categories of item data. Example:
The preceding figure shows event tracking data of the items.
Usage notes
When you use the mixed sorting or discretization feature, query results are displayed by proportion of different categories of data. For example, if the discretization level is 2, the recommendation result in the scene is a mix of Item A, Item B, and Item C in the Guangzhou Evergrande Taobao Football Club, CBA, and Chinese Football Association Super League categories. If the discretization level is 1, the recommendation result is a mix of football and basketball.
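The effect of the discretization level can be illustrated with a small model. The code below is not the algorithm that AIRec uses. It assumes that the levels in category_path are joined with underscores, uses category names in place of real category IDs, and mixes categories with a plain round-robin. Item D is a hypothetical extra item that is added so that the two levels produce different orders.

```python
from itertools import zip_longest

SEPARATOR = "_"  # assumption: levels in category_path are joined with underscores

def category_at_level(category_path, level):
    """Return the category at a 1-based level, or the deepest one available."""
    if level < 1:
        raise ValueError("level must be 1 or greater")
    parts = category_path.split(SEPARATOR)
    return parts[min(level, len(parts)) - 1]

def mix_by_category(ranked_items, level):
    """Round-robin across the categories found at the given level."""
    groups = {}  # category -> item names, in ranking order
    for name, path in ranked_items:
        groups.setdefault(category_at_level(path, level), []).append(name)
    mixed = []
    for row in zip_longest(*groups.values()):
        mixed.extend(name for name in row if name is not None)
    return mixed

ranked = [
    ("Item A", "football_Guangzhou Evergrande Taobao Football Club"),
    ("Item C", "football_Chinese Football Association Super League"),
    ("Item D", "football_Chinese Football Association Super League"),
    ("Item B", "basketball_CBA"),
]
level_1 = mix_by_category(ranked, 1)  # mixes football and basketball
level_2 = mix_by_category(ranked, 2)  # mixes the three level-2 categories
```

At level 1, the basketball item is moved up between the football items. At level 2, each of the three level-2 categories contributes one item before Item D appears.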
Expiration time
What is expiration time?
Expiration time is specified by the expire_time field in an item table. Expiration time is the time when a data entry expires. If the current time on the server side is later than the value of the expire_time field, the corresponding data entry is not recommended.
Implementation of event tracking
Event tracking is performed on the server side. The expiration time is automatically configured based on a system rule or manually configured.
Precautions
1. If an item has no expiration time, the expire_time field is left empty.
2. The expiration time is specified by using a timestamp that is accurate to the second.
3. If all the data that is uploaded for the first time expires, the service cannot be started.
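The expiration rule and the precautions above can be checked before the first upload. The following sketch is hypothetical helper code, not part of AIRec. The threshold used to detect millisecond timestamps is a heuristic chosen here.

```python
import time

def is_expired(item, now=None):
    """Return True if the current time is later than expire_time (in seconds)."""
    expire_time = item.get("expire_time")
    if expire_time in (None, ""):
        return False  # an empty expire_time means the item has no expiration time
    expire_time = int(expire_time)
    if expire_time >= 10**11:
        # Heuristic: a value this large is almost certainly in milliseconds.
        raise ValueError("expire_time looks like milliseconds; use seconds")
    now = time.time() if now is None else now
    return now > expire_time

def check_first_upload(items, now=None):
    """Return the number of unexpired items, or raise if there are none."""
    live = [item for item in items if not is_expired(item, now)]
    if not live:
        raise ValueError("no unexpired items; the service cannot be started")
    return len(live)

now = 1700000000
items = [
    {"item_id": "item_A", "expire_time": 1699999999},  # already expired
    {"item_id": "item_B", "expire_time": ""},          # never expires
]
live_count = check_first_upload(items, now)
```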
Status
What is status?
Status is specified by the status field in an item table. This field specifies whether an item can be recommended. Valid values: 0 and 1. The value 0 indicates that the item cannot be recommended. The value 1 indicates that the item can be recommended.
Implementation of event tracking
Event tracking is performed on the server side. The status information is automatically configured based on a system rule or manually configured.
Precautions
1. The status field in an item table is required.
2. When data is uploaded for the first time, if the value of the status field is 0 for all items, the service cannot be started.
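Both status precautions can be verified in one pass over the items before the first upload. This is a sketch with a hypothetical function name. Because this topic does not state whether status is stored as a number or as a string, the check accepts both forms.

```python
def recommendable_count(items):
    """Validate the status field and return how many items can be recommended."""
    count = 0
    for item in items:
        status = str(item.get("status", "")).strip()
        if status not in ("0", "1"):
            raise ValueError(
                f"item {item.get('item_id')!r}: status is required and must be 0 or 1"
            )
        count += status == "1"
    if count == 0:
        raise ValueError("status is 0 for all items; the service cannot be started")
    return count

items = [
    {"item_id": "item_A", "status": 1},
    {"item_id": "item_B", "status": "0"},
    {"item_id": "item_C", "status": 1},
]
count = recommendable_count(items)
```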
Terms
feed stream: an information flow that continuously provides content to users.
browsing depth: the page views generated by each visit in a calculation cycle.
stay duration: the time that users spend browsing a page.
discretization: a feature that classifies the data in the results returned for a recommendation query into multiple categories.