Businesses across industries widely use recommendation systems to help users find what they want. However, existing recommendation systems still suffer from two problems:
Effective interaction between users and recommendation systems may mitigate both problems. For users, interaction gives a stronger sense of participation and lets them browse recommendation results actively. For recommendation systems, interaction provides a better understanding of users' preferences and therefore better results.
This article describes how users interact with the recommendation systems in the following sections.
Today, children prefer watching or playing on mobile phones and tablets to watching television. Unlike television, a mobile phone lets the user rewind or fast-forward a program to the part that interests them most, whereas interaction while watching television is restricted. Likewise, when they play Kings of Glory on mobile phones or tablets, the richer interaction gives them a stronger sense of participation and more fun. Although mainstream recommendation systems are good at judging users' preferences, for example, which products a user likes, interactive recommendation has not been explored much, either academically or in enterprise applications.
For this article, we consulted a large number of academic papers to understand the role of interaction in recommendation systems (hereinafter referred to as interactive recommendation). Among the papers published at the Knowledge Discovery and Data Mining (KDD) conference in 2018, we found a feasible framework in Q&R: A Two-Stage Approach Toward Interactive Recommendation. Figure 1 shows the framework diagram. (The picture is taken from Q&R: A Two-Stage Approach Toward Interactive Recommendation, KDD 2018.)
Figure 1 Interactive recommendation framework
According to this framework, the recommendation system asks a relevant question and then recommends an item (such as a product the user likes) based on the immediate interests extracted from the user's feedback. Interaction runs through the entire process of asking questions and collecting user feedback.
The following section describes each module in the framework:
Specifically, the model generates several keywords such as scarf, windbreaker, and hat, puts them in a card, and recommends them to users for clicking. Together, these keywords pose the question, "It is cold. Would you like to buy a scarf, a windbreaker, or a hat?" As this example shows, a question is generated through keyword-based recommendation. However, a new problem arises: how does the model obtain enough keyword options? We use the search terms in search logs as the options. These terms are entered by users, so many of them are meaningful words that express user needs. Search logs also supply enough terms to cover a wide range of requirements, so there are appropriate words to recommend to different users in different contexts, for example, in different seasons.
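As a rough illustration of how search logs could supply keyword options, the following minimal sketch counts normalized search terms and keeps the frequent ones. The function name, the thresholds, and the log entries are hypothetical and invented for this example; a production pipeline would add filtering that is not described in this article.

```python
from collections import Counter

def build_keyword_candidates(search_log, min_count=2, top_k=5):
    """Return the most frequent normalized search terms as keyword candidates."""
    counts = Counter(term.strip().lower() for term in search_log if term.strip())
    # Drop rare terms, which are more likely to be typos or one-off needs.
    frequent = [(term, n) for term, n in counts.items() if n >= min_count]
    # Sort by descending count, then alphabetically for a stable order.
    frequent.sort(key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in frequent[:top_k]]

# Hypothetical log entries, invented for illustration.
log = ["scarf", "Scarf", "windbreaker", "hat", "hat ", "hat", "umbrella", "windbreaker"]
candidates = build_keyword_candidates(log)
print(candidates)
```

Here "umbrella" is dropped because it appears only once, and the remaining terms are ordered by how often users searched for them.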
Figure 2 Question generation logic
Figure 3 User feedback and product recommendation logic
In essence, this process consists of a recommendation followed by a search: keywords are recommended first, and a search is then run on the keyword that the user clicks. A search task may result in more transactions than a recommendation task, but a recommendation task has lower interaction costs because it requires no user input. The framework returns satisfying products based on a clear user intention without requiring any typed input, so it combines the advantages of both search and recommendation.
Because the last two steps are converted to classic search tasks, existing search algorithms can be called directly to complete them (with some customization for interactive recommendation). Therefore, keyword-based recommendation in the first step is the optimization focus.
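The recommend-then-search flow can be sketched in a few lines. This is only a toy under stated assumptions: `recommend_keywords` and `search` are hypothetical stand-ins (the real system uses a learned ranking model and the existing search engine), and the scores and catalog are invented.

```python
def recommend_keywords(keyword_scores, card_size=3):
    """Stage 1: pick the highest-scoring keywords to show in one card."""
    ranked = sorted(keyword_scores, key=lambda k: (-keyword_scores[k], k))
    return ranked[:card_size]

def search(keyword, catalog):
    """Stage 2: stand-in for an existing search engine (simple title match)."""
    return [title for title in catalog if keyword in title.lower()]

# Hypothetical scores and catalog, invented for illustration.
keyword_scores = {"scarf": 0.8, "hat": 0.6, "windbreaker": 0.7, "sandals": 0.1}
catalog = ["Wool Scarf", "Silk Scarf", "Rain Hat", "Light Windbreaker"]

card = recommend_keywords(keyword_scores)  # the keywords shown to the user
clicked = card[0]                          # the user's feedback: one click, no typing
results = search(clicked, catalog)         # the click is turned into a search task
print(card, results)
```

The only user input is a click on the card, which is why the interaction cost stays close to that of a pure recommendation task.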
Note that the keyword-based recommendation is only a way to generate questions. Videos, guides, and items may also be used as materials to generate questions. Accordingly, user feedback and product recommendations might be in various forms. For example, user feedback on guides may be collected from user behaviors such as screen swipes, likes, comments, and stay duration on the guide's page. Recommended products are sorted based on videos, guides, or items.
Before diving into the keyword-based recommendation system, let's first discuss the product, which we call "weather vane" for convenience's sake. Weather vane has the following features:
(2) Demand-focused, interpretable, and highly perceptible to users
As shown in Figure 4, on the Taobao homepage (the first image), the user clicks a product among the items recommended based on predicted user preferences. On the product details page (the second image), the user swipes the screen and adds items to the shopping cart or favorites. When the user goes back to the previous page, the algorithm determines whether to generate a weather vane and a scenario-based text for it. For example, the third image shows the weather vane "Clothing for Boys" and its related keywords. After the user clicks a keyword, the second-hop search result page (the fourth image) appears.
Figure 4 Product form of weather vane
Figure 5 shows the overall technical framework. First, basic data tables are generated from search logs, product information, and knowledge graph data, followed by keyword recall and item sorting. During item sorting, we consider not only the user's historical preferences but also the user's information on the details page (terminal-based intelligence). Then, the presentation and control module adjusts the sorting result. After the user clicks a keyword, the search process is triggered. Finally, the generated logs flow back into the search logs. The dotted lines in Figure 5 indicate the modules to be optimized in the future.
Figure 5 Overall technical framework
Next, let's take a look at the four core modules.
Let's understand the entire recall process using a meta-path model. Specifically, users, products, keywords, scenarios, and categories are regarded as heterogeneous nodes in Heterogeneous Information Networks (HINs), as shown in Figure 6. Different recall policies are used to find the meta-paths that match each query. The recall type and recall score correspond to the meta-path type and the meta-path score, respectively. The objective is to determine, based on meta-paths in the HIN, whether a user and a keyword are related and how strong their correlation is. Finally, the words with the strongest correlation are recommended to the user. This article introduces three recall policies.
Figure 6 Meta-path model
Figure 7 Scenario-based query and i2score2q recall processes
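To make the meta-path idea concrete, here is a minimal sketch that scores keywords along a user -> item -> query path. It assumes that a path score is the product of its edge weights and that scores of parallel paths are summed; both choices, along with all names and weights, are our assumptions for illustration and are not taken from the actual recall policies.

```python
from collections import defaultdict

def recall_by_meta_path(user, user_items, item_queries, top_k=2):
    """Score queries reachable through user -> item -> query paths."""
    scores = defaultdict(float)
    for item, w_ui in user_items.get(user, {}).items():
        for query, w_iq in item_queries.get(item, {}).items():
            # Assumed path score: product of edge weights, summed over paths.
            scores[query] += w_ui * w_iq
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]

# Hypothetical edges: user-item interaction strength and item-query relevance.
user_items = {"u1": {"boys_jacket": 1.0, "boys_shoes": 0.5}}
item_queries = {
    "boys_jacket": {"clothing for boys": 0.9, "winter jacket": 0.4},
    "boys_shoes": {"clothing for boys": 0.2, "sneakers": 0.6},
}

recalled = recall_by_meta_path("u1", user_items, item_queries)
print(recalled)
```

In this toy graph, "clothing for boys" is reachable through both items, so it accumulates the highest score and is recalled first.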
At present, only keywords can be recalled. Other entities such as videos and guides will be recalled in the future. In addition, using the action data on the details page and the knowledge graph is likely to improve the recall effect.
Sorting is divided into two processes, fine-sorting and resorting. The fine-sorting process has two versions:
Theoretical research on Follow the Regularized Leader (FTRL) started more than ten years ago. The paper published by Google at KDD 2013 brought this theoretical model into practical engineering, allowing many enterprise-level online learning models to be implemented based on FTRL. XFTRL is a custom model developed by the XPS platform based on FTRL. It processes hundreds of billions of discrete features on Taobao. In an offline experiment, we use the data of the previous four days for training and the last day's data for testing. XFTRL achieves an AUC of 0.67 on this test set.
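For readers unfamiliar with FTRL, the following is a minimal sketch of the per-coordinate FTRL-Proximal update for logistic regression, as described in the KDD 2013 paper. It is not XFTRL, whose internals are not covered in this article; the class name, hyperparameter values, and toy training data are assumptions for illustration.

```python
import math

class FTRLProximal:
    """Per-coordinate FTRL-Proximal for logistic regression on sparse features."""

    def __init__(self, alpha=0.1, beta=1.0, l1=0.1, l2=1.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = {}  # accumulated adjusted gradients per feature
        self.n = {}  # accumulated squared gradients per feature

    def weight(self, i):
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0  # L1 regularization keeps weak features at exactly zero
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n.get(i, 0.0))) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, x):
        s = sum(self.weight(i) * v for i, v in x.items())
        s = max(min(s, 35.0), -35.0)  # clip to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of the log loss for feature i
            n_old = self.n.get(i, 0.0)
            w = self.weight(i)  # weight before this coordinate is updated
            sigma = (math.sqrt(n_old + g * g) - math.sqrt(n_old)) / self.alpha
            self.z[i] = self.z.get(i, 0.0) + g - sigma * w
            self.n[i] = n_old + g * g
        return p

# Hypothetical click data: "scarf" cards are clicked, "sandals" cards are not.
model = FTRLProximal()
for _ in range(50):
    model.update({"kw=scarf": 1.0}, 1)
    model.update({"kw=sandals": 1.0}, 0)
p_scarf = model.predict({"kw=scarf": 1.0})
p_sandals = model.predict({"kw=sandals": 1.0})
print(p_scarf, p_sandals)
```

Because each example updates only the features it contains, the model can learn online from a stream of sparse, discrete features, which is the property that makes FTRL attractive at this scale.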
(1) Motivation: XFTRL (FTRL) is effective and has mature engineering implementations. However, it has the following disadvantages:
A large amount of work has proved that users' short- and long-term interests and action sequences [4,5] help to improve the recommendation effect. However, XFTRL (FTRL) captures only a small amount of information about short- and long-term interests and action sequences through feature engineering.
XFTRL (FTRL) is essentially a linear model. Feature interaction can only be implemented by adding interactive features during feature engineering, which relies heavily on the algorithm engineers' understanding of and experience with the business. In general, algorithm engineers only consider second-order interactive features when performing feature engineering, so high-order interactions between features are not captured. In short, using feature engineering to capture feature interactions tends to introduce redundant or invalid interactive features while missing important ones.
To solve this problem, we propose a custom Attention_GRU model to improve the effect of keyword-based recommendations.
On the one hand, Attention_GRU has proven suitable for modeling sequence data (including historical action sequences). On the other hand, Attention_GRU is in essence a neural network model, so (high-order) interactions between features are captured by the nonlinear activation functions in the network. Interactive features with high confidence can also be fed into the neural network so that they are modeled explicitly. This custom Attention_GRU model is proposed mainly based on two papers published at IJCAI 2017 and IJCAI 2018 [4,5].
(2) Model Framework: This model consists of four groups of features: user-side non-real-time features, user-side real-time features, query-side features, and other features. We concatenate the four groups of features by using the Concat method and input them to a 3-layer neural network. The loss is then calculated from the outputs of the neural network and the labels.
Figure 8 Model framework of weather vane
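The concatenate-then-feed-forward part of the framework can be sketched as follows. This is a toy forward pass only: the feature values, layer sizes, ReLU activations, and the binary cross-entropy loss are our assumptions, since the article does not specify them, and no training step is shown.

```python
import math
import random

def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(values):
    return [max(0.0, v) for v in values]

def init_layer(rng, n_in, n_out):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(feature_groups, layers):
    x = [v for group in feature_groups for v in group]  # Concat of all groups
    for weights, bias in layers[:-1]:
        x = relu(dense(x, weights, bias))
    weights, bias = layers[-1]
    logit = dense(x, weights, bias)[0]
    return 1.0 / (1.0 + math.exp(-logit))  # predicted click probability

def log_loss(p, label, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

# Hypothetical feature vectors for the four groups named in the framework.
user_offline, user_realtime, query_side, other = [0.2, 0.7], [0.9], [0.1, 0.4], [1.0]
groups = [user_offline, user_realtime, query_side, other]

rng = random.Random(0)
layers = [init_layer(rng, 6, 4), init_layer(rng, 4, 3), init_layer(rng, 3, 1)]
prob = forward(groups, layers)
loss = log_loss(prob, label=1)
print(prob, loss)
```

In the real model, the user-side real-time group would come from the Attention_GRU sequence encoder rather than from a fixed vector.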
In the preceding formulas, Attend and Generate are functions. gm is a vector. The element j indicates the attention weight of the jth input. gm is called glimpse. Recurrence is a cyclic activation function. In the Attention_GRU model, the cyclic activation function is GRU.
In GRU implementation, x in the preceding formula corresponds to i in Figure 8.
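For reference, a single GRU update can be sketched as follows. This uses the standard GRU equations with the convention h' = (1 - z) * h + z * h_candidate (some papers swap the roles of z and 1 - z); the parameter names and the zero-valued example weights are hypothetical and only serve to make the arithmetic easy to check.

```python
import math

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def gru_step(x, h, p):
    """One GRU update. p maps names to weight matrices and bias vectors."""
    def gate(w_name, u_name, b_name, state, activation):
        pre = [a + c + d for a, c, d in
               zip(matvec(p[w_name], x), matvec(p[u_name], state), p[b_name])]
        return [activation(v) for v in pre]

    z = gate("Wz", "Uz", "bz", h, sigmoid)              # update gate
    r = gate("Wr", "Ur", "br", h, sigmoid)              # reset gate
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    h_cand = gate("Wh", "Uh", "bh", reset_h, math.tanh)  # candidate state
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, h_cand)]

# Hypothetical sizes: 3-dimensional input, 2-dimensional hidden state.
zeros = lambda rows, cols: [[0.0] * cols for _ in range(rows)]
params = {
    "Wz": zeros(2, 3), "Uz": zeros(2, 2), "bz": [0.0, 0.0],
    "Wr": zeros(2, 3), "Ur": zeros(2, 2), "br": [0.0, 0.0],
    "Wh": zeros(2, 3), "Uh": zeros(2, 2), "bh": [0.0, 0.0],
}
h_next = gru_step([1.0, 2.0, 3.0], [1.0, -2.0], params)
print(h_next)
```

With all-zero parameters both gates equal 0.5 and the candidate state is 0, so the new state is exactly half of the old one. This makes the gating behavior easy to verify by hand.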
We ran a series of experiments on the Attention_GRU model. In the first experiment, we recommend items based on the query-side category features only. We use coarse-grained category features rather than query IDs because the query IDs are too sparse.
The AUC value is 0.5685, higher than 0.5, indicating that the popularity of the keywords (categories) is helpful in keyword-based recommendation. When the product category features in the historical action sequence are added, the AUC increases to 0.6037, indicating that historical product categories and their sequence information are useful. Product categories are used instead of product IDs for the same reason: the IDs are too sparse.
On this basis, we add the text features such as the product title and keywords. The AUC value is increased to 0.6203, indicating that text information can further improve the effect of keyword-based recommendation. Finally, we add all the features in Figure 8. The AUC score reaches 0.6830, exceeding the benchmark obtained by using XFTRL. Table 1 lists the experiment results.
Table 1 also compares the offline experiment results of XFTRL, Attention_GRU, and improved Attention_GRU.
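Since every comparison here is expressed in AUC, a short sketch of the metric may help. AUC is the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, with ties counted as one half. The pairwise implementation below is only suitable for small samples, and the labels and scores are invented for illustration.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical click labels and model scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3]
value = auc(labels, scores)
print(value)
```

A model that scores all keywords identically gets 0.5, which is why an AUC of 0.5685 from category features alone already indicates some signal.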
(4) Improvement over Attention_GRU: To further improve the effect of keyword-based recommendation, we have made some innovative improvements to the Attention mechanism in Attention_GRU. The main motivations are as follows:
The more remote the user's historical action, the smaller its impact on keyword-based recommendation and the smaller the Attention weight. That is, the Attention weight declines with time.
Different action types have different impacts on the keyword-based recommendation. For example, the impact of the purchase behavior is larger than that of the click action, so the two action types have different Attention weights. Hence, the Attention weight varies with the action type.
The calculation diagram for formulas 1.1 and 1.2 is as follows:
Figure 9 Calculate the Attention weight
Figure 10 Time decay and action type considered during the Attention weight calculation
In Figure 10, each action type corresponds to one matrix. Assume that the dimension of is d. Then, is a matrix with d x d dimensions. The time interval Time_decay indicates the time difference between the occurrence of historical action and the occurrence of keyword-based recommendations. Time_decay is a monotonically decreasing function. As shown in Table 1, the AUC value reaches 0.6999 for improved Attention_GRU. In addition, if we convert MaxCompute data into TFRecord data and use it as the model input, the training speed will increase by 40%.
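The two motivations (weights that decline with time and vary with action type) can be illustrated with a small sketch. Because the exact formulas are not reproduced in this article, the scoring form here is our assumption: a bilinear score that uses one matrix per action type, multiplied by an exponential time decay, then normalized. All names, matrices, and values are hypothetical.

```python
import math

def bilinear(q, w, h):
    """Compute q^T W h for a d x d matrix W."""
    return sum(qi * sum(wij * hj for wij, hj in zip(row, h)) for qi, row in zip(q, w))

def attention_weights(query, actions, type_matrices, decay_rate=0.1):
    """actions: list of (hidden_vector, action_type, hours_since_action)."""
    raw = []
    for hidden, action_type, hours_ago in actions:
        score = bilinear(query, type_matrices[action_type], hidden)
        # Assumed decay: exponential, which is monotonically decreasing in time.
        raw.append(math.exp(score) * math.exp(-decay_rate * hours_ago))
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical d = 2 matrices: purchases are weighted more than clicks.
type_matrices = {
    "click": [[1.0, 0.0], [0.0, 1.0]],
    "purchase": [[2.0, 0.0], [0.0, 2.0]],
}
query = [1.0, 0.0]
actions = [
    ([1.0, 0.0], "click", 1.0),      # recent click
    ([1.0, 0.0], "purchase", 1.0),   # recent purchase
    ([1.0, 0.0], "click", 24.0),     # click from a day ago
]
weights = attention_weights(query, actions, type_matrices)
print(weights)
```

With identical hidden vectors, the recent purchase receives the largest weight and the day-old click the smallest, which matches the intended behavior.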
(5) Some Thoughts: XFTRL contains some features unavailable in Attention_GRU. However, XFTRL does not model sequences or long- and short-term interests, whereas Attention_GRU does. Therefore, XFTRL and Attention_GRU are complementary to each other. In the future, the two models may be integrated to further improve the recommendation effect.
There is still a lot of room for improvement in the fine-sorting process. For example, we need to consider how to sort items based on multiple entities, including keywords, videos, and guides. Real-time features such as user actions on the details page are very important for keyword-based recommendation, so we also need to consider how to build better models on these features and how to use the knowledge graph data to improve the modeling effect.
The resorting process is relatively simple. The focus is to generate more diversified keywords while ensuring the sorting effect.
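One simple way to trade ranking score against diversity is a greedy re-rank that penalizes keywords whose category has already been picked. The sketch below is a generic technique chosen for illustration, not the resorting logic actually used; the penalty value, categories, and scores are hypothetical.

```python
def diversify(scored_keywords, categories, penalty=0.2, top_k=3):
    """Greedy re-rank: subtract a penalty for each already-picked same-category keyword."""
    remaining = dict(scored_keywords)
    picked = []
    used = {}  # category -> number of picked keywords
    while remaining and len(picked) < top_k:
        best = max(
            remaining,
            key=lambda kw: remaining[kw] - penalty * used.get(categories[kw], 0),
        )
        picked.append(best)
        used[categories[best]] = used.get(categories[best], 0) + 1
        del remaining[best]
    return picked

# Hypothetical fine-sorting scores and keyword categories.
scored = {"boys jacket": 0.9, "boys coat": 0.85, "boys sweater": 0.8, "toy car": 0.7}
categories = {
    "boys jacket": "clothing", "boys coat": "clothing",
    "boys sweater": "clothing", "toy car": "toys",
}
reranked = diversify(scored, categories)
print(reranked)
```

Without the penalty, the top three keywords would all be clothing; with it, "toy car" moves up to the second position while the best clothing keyword stays first.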
The presentation and control module is composed of two parts:
Rich user action information on the product details page, including adding items to the shopping cart or favorites and the stay duration, is very important for capturing users' immediate interests. Adding terminal-based intelligence information to the model significantly improves the effect of keyword-based recommendations.
However, this model still has a lot of room for improvement:
Much terminal-based intelligence information, including screen-sliding tracks, product introduction viewing, and buyer comment viewing, has not yet been used for modeling. Adding more of this information is likely to improve the recommendation effect.
User actions on the product details page constitute sequences. Using Attention_GRU to model these action sequences may further improve the recommendation effect.
Currently, the recall, fine-sorting, resorting, and presentation and control processes are all completed on the server. Alternatively, each terminal could store its own model. If the preceding processes were completed on terminals, the models could be updated on the terminals in real time. In the server-side framework, one model serves all users. In the terminal-side framework, every user has a personalized model.
The weather vane worked well during the Double 11 Shopping Festival of 2018. All business indicators exceeded expectations. Based on user convergence and passive demands, we applied different control policies in different phases of the Double 11 Shopping Festival. In the first phase, users had clear shopping intentions. We promoted convergent queries accordingly. Passive demands increased in the decline phase. Then, we mainly promoted scenario-based queries. With the sharp increase in shopping demands at night, the number of convergent queries increased.
Figure 11 Weather vane control policies for the Double 11 Shopping Festival
Interactive recommendation is a promising research direction. Through weather vane, we made an innovative attempt to bring interactive recommendation to the homepage, where products are recommended based on user preferences. Through these efforts, weather vane achieved good results during the Double 11 Shopping Festival of 2018.
However, weather vane still has a lot of room for improvement. Some directions are as follows: