This topic describes the fields used in video recommendation scenarios to help you build a comprehensive video recommendation system. By analyzing user features, content features, and user behaviors on video content, the system can provide personalized suggestions.
The following tables describe the recommended fields in the user, item, and behavior tables for video recommendation scenarios. In general, the more fields you configure, the better the recommendation results. You can also provide additional fields that are not listed in the following tables to further improve the results. Field names do not need to match the names used in the following tables.
User table
Field | Type | Required | Description |
user_id | Integer or string | Yes | The ID of the user, which is the unique identifier of the user. |
age | Integer | No | The age of the user. The age can be discretized into non-overlapping segments, such as 0 to 12, 13 to 18, 19 to 24, and 25 to 34, to convert it from a numerical feature into a categorical feature. |
gender | String | No | The gender of the user. For example, male, female, and other genders can be used as categorical features. You can also use integers 0, 1, and 2 to indicate the gender of a user. |
occupation | String | No | The occupation of the user. For example, student, teacher, engineer, and other occupations can be used as categorical features. |
education | String | No | The educational background of the user. For example, senior high school, undergraduate, and master can be used as categorical features. |
income | Integer or string | No | The income level of the user. For example, low, medium, and high income levels can be used as categorical features. |
user_level | Integer or string | No | The level or membership level of the user on the platform. |
register_time | Timestamp | No | The time when the user registered the account. Unit: seconds. After being segmented by year, month, and day, the time can be used as numerical features, or converted into categorical features by discretization. |
country | String | No | The country in which the user is located, which can be used as a categorical feature. |
province | String | No | The province in which the user is located, which can be used as a categorical feature. |
city | String | No | The city in which the user is located, which can be used as a categorical feature. |
active_time | Integer or string | No | The period of time during which the user is active on the platform. For example, the morning, afternoon, evening, and other periods of time can be used as categorical features. |
device_type | String | No | The type of device used by the user. For example, PC, mobile phone, tablet, and other devices can be used as categorical features. |
os | String | No | The operating system of the user device. For example, iOS, Android, Windows, and other operating systems can be used as categorical features. |
browser | String | No | The type of the browser used by the user. For example, Google Chrome, Firefox, Safari, and other browsers can be used as categorical features. |
language | String | No | The language preferred by the user. For example, English, Chinese, Spanish, and other languages can be used as categorical features. |
interests | String | No | The interests of the user. For example, sports, music, travel, and other interests can be used as tag features. |
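The discretization described for age and register_time can be sketched as follows. This is a minimal illustration, not a required implementation: the bucket edges, the labels, the function names, and the choice of UTC are all assumptions that you should adapt to your own data.

```python
from datetime import datetime, timezone

# Hypothetical bucket edges (inclusive upper bounds) and labels; tune them to your user base.
AGE_BUCKETS = [(12, "0-12"), (18, "13-18"), (24, "19-24"), (34, "25-34")]


def bucket_age(age):
    """Map a numeric age to a categorical segment label."""
    if age is None or age < 0:
        return "unknown"
    for upper, label in AGE_BUCKETS:
        if age <= upper:
            return label
    return "35+"


def split_register_time(ts_seconds):
    """Split a Unix timestamp in seconds into year, month, and day (UTC is assumed)."""
    dt = datetime.fromtimestamp(ts_seconds, tz=timezone.utc)
    return {"year": dt.year, "month": dt.month, "day": dt.day}
```

For example, bucket_age(30) falls into the "25-34" segment, and split_register_time(1700000000) yields the year 2023, month 11, and day 14.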
Item table
Field | Type | Required | Description |
item_id | Integer or string | Yes | The ID of the item, which is the unique identifier of the video. |
category | String | No | The main category to which the video belongs, which can be used as a categorical feature. |
leaf_category | String | No | The sub-category to which the video belongs, which can be used as a categorical feature. |
brand | String | No | The brand or producer of the video, which can be used as a categorical feature. |
video_type | String | No | The type of the video. For example, movie, TV series, documentary, short film, and other types can be used as categorical features. |
duration | Integer | No | The duration of the video. The duration can be discretized into categories, such as shorter than 10 minutes, 10 to 30 minutes, and longer than 30 minutes, and then used as a categorical feature. |
title | String | No | The title of the video. |
series_name | String | No | The series title of the video, such as Journey to the West. |
series_total_number | Integer | No | The total number of episodes for the video series. |
series_number | Integer | No | The current episode number of the video series. For example, 1 indicates the first episode. |
release_date | Timestamp | No | The release date of the video. Unit: seconds. The release date can be used as a numerical feature. |
director | String | No | The director of the video. |
actors | String | No | The main actors of the video, which are separated with commas (,). Multiple values can be used as tag features. |
rating | Float | No | The rating of the video. For example, IMDb, Douban, and other ratings can be used as numerical features. |
language | String | No | The original language of the video. For example, English, Chinese, Japanese, and other languages can be used as categorical features. |
has_subtitle | Integer | No | Specifies whether the subtitle service is provided. |
region | String | No | The region in which the video is produced. For example, Hollywood, Bollywood, Chinese mainland, and other regions can be used as categorical features. |
tags | String | No | The tags of the video, such as comedy, action, and romance. Multiple values can be used as tag features. |
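The duration categories and the comma-separated multi-value fields (such as actors and tags) can be prepared along the following lines. This is a sketch under stated assumptions: the table does not specify the unit of duration, so seconds are assumed here, and the category labels and function names are hypothetical.

```python
def bucket_duration(seconds):
    """Discretize a duration into the three categories named in the item table.

    Assumes the duration is stored in seconds; the labels are illustrative.
    """
    minutes = seconds / 60
    if minutes < 10:
        return "lt_10min"
    if minutes <= 30:
        return "10_30min"
    return "gt_30min"


def split_tags(value, sep=","):
    """Split a multi-value string into a list of trimmed, non-empty values."""
    if not value:
        return []
    return [part.strip() for part in value.split(sep) if part.strip()]
```

With these helpers, a 25-minute video (1500 seconds) maps to "10_30min", and a string such as "comedy, action" becomes a two-element list that can be fed to a tag feature.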
Behavior table
To capture all types of user behaviors, we recommend that you collect behaviors such as exposure and click across all scenarios, including the recommendation, hot items, and search scenarios. In search scenarios, also record the search queries.
User clicks and viewing behaviors in non-recommendation scenarios can also serve as sources of insights into user preferences.
Field | Type | Required | Description |
request_id | String | No | The ID of the request, which uniquely identifies each recommendation request. Omitting the request_id field affects sample accuracy and the addition of real-time features. A new recommendation scenario does not require the request_id field. However, after the recommendation scenario is created, you must add the request_id field and modify the training sample code before model training. |
user_id | Integer or string | Yes | The ID of the user, which is the unique identifier of the user. |
item_id | Integer or string | Yes | The ID of the item, which is the unique identifier of the video. |
event | String | Yes | The behavior the user performs on the video. For example, exposure, click, like, and other types of behaviors can be used as categorical features. |
event_value | Numeric | Yes | If you set the event field to |
timestamp | Timestamp | Yes | The time when the user performs the behavior. Unit: seconds. The time can be segmented by hour, day of the week, or holiday and used as categorical features. |
scene | String | Yes | The scenario in which the behavior occurs. This field is required in all scenarios. For example, home_feed indicates homepage recommendation, hot_items indicates popular items, and search indicates the search scenario, in which you must also configure the query field. |
query | String | No | The search query required in search scenarios. |
device_type | String | No | The type of device used by the user. For example, PC, mobile phone, tablet, and other devices can be used as categorical features. |
browser | String | No | The type of the browser used by the user. For example, Google Chrome, Firefox, Safari, and other browsers can be used as categorical features. |
mobile_brand | String | No | The brand of the mobile phone used by the user, which can be used as a categorical feature. |
os | String | No | The operating system of the user device. For example, iOS, Android, Windows, and other operating systems can be used as categorical features. |
ip | String | No | The IP address of the user, which can be used to locate the province and city of the user and can be used as a categorical feature. |
rating | Decimal | No | The average user rating of the video. For example, the video scores 8.5 out of 10. |
weather | String | No | The weather condition of the region in which the user lives. For example, sunny, rainy, snowy, and other weather conditions can be used as categorical features. |
holiday | Boolean | No | Specifies whether the user behavior takes place during a holiday, such as Spring Festival or National Day. The value can be used as a categorical feature. |
season | String | No | The season. For example, spring, summer, autumn, and winter can be used as categorical features. |
longitude | Float | No | The longitude of the location of the user, which can be used as a numerical feature, or as a categorical feature after discretization. |
latitude | Float | No | The latitude of the location of the user, which can be used as a numerical feature, or as a categorical feature after discretization. |
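Before behavior logs are used for training, it can help to check each record against the required fields listed above and to derive time features from the timestamp. The following sketch assumes that each record is a Python dictionary keyed by the field names in the table. The function names, the sample values, and the use of UTC are illustrative assumptions, and holiday detection is left out because it depends on a calendar that you supply.

```python
from datetime import datetime, timezone

# Fields marked as required in the behavior table.
REQUIRED_FIELDS = ("user_id", "item_id", "event", "event_value", "timestamp", "scene")


def validate_behavior(record):
    """Return a list of problems found in one behavior record (empty if none)."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if record.get(name) in (None, "")
    ]
    # The query field is needed only when the behavior comes from the search scenario.
    if record.get("scene") == "search" and not record.get("query"):
        problems.append("scene is 'search' but query is empty")
    return problems


def time_features(ts_seconds):
    """Derive the hour and day of the week from a Unix timestamp in seconds (UTC is assumed)."""
    dt = datetime.fromtimestamp(ts_seconds, tz=timezone.utc)
    return {"hour": dt.hour, "day_of_week": dt.weekday()}  # Monday is 0
```

A record that passes validate_behavior with an empty list contains every required field, and the output of time_features can be used directly as categorical features.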