How to Improve User Participation: The Rise of Interactive Recommendations

This article describes how users interact with recommendation systems and demonstrates an implementation of interactive recommendation through the weather vane model.

Businesses across industries widely use recommendation systems to help users find what they want. However, existing recommendation systems still have two problems:

  1. The recommendations focus on only a few topics and allow users to only passively browse what the system recommends to them.
  2. Some recommendations aren't satisfying as the systems do not know the users well.

Effective interaction between users and recommendation systems may mitigate both problems. On the one hand, interactions give users a higher sense of participation and allow them to browse recommendation results actively. On the other hand, interactions enable recommendation systems to better understand users' preferences and therefore achieve better results.

This article describes how users interact with the recommendation systems in the following sections.


Nowadays, children prefer watching or playing on mobile phones and tablets to watching television. Unlike television, mobile phones allow users to rewind or fast-forward a program to the part that interests them more, whereas interaction while watching television is restricted. Likewise, when they play Kings of Glory on mobile phones or tablets, they interact more, which gives them a higher sense of participation and more fun. Although mainstream recommendation systems are good at judging users' preferences, for example, which products users like, interactive recommendation has not been explored much, either academically or in enterprise applications.

For this article, we consulted a large number of academic papers to understand the role of interaction in recommendation systems (hereinafter referred to as interactive recommendation). In the proceedings of Knowledge Discovery and Data Mining (KDD) 2018, we found a feasible framework in the paper Q&R: A Two-Stage Approach Toward Interactive Recommendation [1]. Figure 1 shows the framework diagram, which is taken from that paper.

Figure 1 Interactive recommendation framework

According to this framework, the recommendation system asks a relevant question and then recommends an item (such as a product the user likes) based on the immediate interests extracted from the user's feedback. The interaction runs through the entire process of asking questions and gathering user feedback.

The following section describes each module in the framework:

  • Question Generation Model: This model first determines the question form and the question generation method. Ideally, it would generate a complete question in human language, for example, "It is cold. Would you like to buy a scarf?" Although natural language processing has made great progress, it is still difficult to generate a large number of such high-quality questions based on user preferences. As a result, this model converts the question generation task into a keyword-based recommendation task, as shown in Figure 2.

Specifically, the model generates several keywords such as scarf, windbreaker, and hat, puts them in a card, and recommends them to users for clicking. These keywords stand for the question, "It is cold. Would you like to buy a scarf, a windbreaker, or a hat?" In this way, a question is generated through keyword-based recommendation. However, a new problem arises: how does the model obtain sufficient keyword options? We use the search terms in search logs as options, because these terms are entered by users and many of them are meaningful words that express user needs. Search logs also provide a large enough number of terms to cover a wide range of requirements. Therefore, there are appropriate words to recommend to different users in different environments, for example, in different seasons.
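As a rough illustration of using search logs as the source of keyword options, the sketch below keeps only the terms that users entered repeatedly. The function name, the normalization, and the `min_count` threshold are assumptions made for this example, not details of the production system.

```python
from collections import Counter


def build_keyword_pool(search_log, min_count=2):
    """Return search terms seen at least min_count times, most frequent first."""
    # Normalize case and surrounding whitespace so "Scarf" and "scarf " count together.
    counts = Counter(term.strip().lower() for term in search_log if term.strip())
    return [term for term, n in counts.most_common() if n >= min_count]


# Hypothetical log entries, for illustration only.
log = ["scarf", "Scarf", "hat", "windbreaker", "hat", "asdfgh", "scarf "]
pool = build_keyword_pool(log)
print(pool)
```

A frequency threshold is only one simple way to filter out meaningless terms. A real keyword pool would likely also need deduplication of near-synonyms and quality checks.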

Figure 2 Question generation logic

  • User Feedback: This module obtains user feedback information. In the previous example, if a user clicks the keyword scarf, windbreaker, or hat, we consider that the user replies "Yes, I want to buy a scarf, windbreaker, or hat." If the user does not click any keyword, we consider that the user does not want to buy any of these items. The user feedback may also contain other information, such as whether the user clicks scarf or windbreaker before clicking hat; whether the user browses items on the product details page, adds items to the shopping cart, or adds items to favorites; and the interval between question generation and user feedback. (A longer interval indicates weaker user demand.)
  • Item Recommendation Model: This model recommends products. If the user clicks a keyword, it implies that the user wants to buy the product represented by the keyword. This keyword is then used as the input to call a search algorithm to obtain the product list. As shown in Figure 3, the last two steps are classic search tasks.

Figure 3 User feedback and product recommendation logic

In essence, this process consists of recommendation followed by search: keywords are recommended first, and a search is then run on the keyword that the user clicks. A search task may result in more transactions than a recommendation task, but a recommendation task has lower interaction costs than a search task because it requires no user input. The framework returns satisfying products based on clear user intention without the need for user input. It combines the advantages of both search and recommendation.

Because the last two steps are converted to classic search tasks, existing search algorithms can be called directly to complete them, although some customization is required for interactive recommendation. Therefore, keyword-based recommendation in the first step is the optimization focus.
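The question, feedback, and recommendation steps can be sketched as a single round. This is a minimal sketch, not the actual system: `interactive_round`, `toy_search`, and the card size of four are hypothetical names and values chosen for illustration.

```python
def interactive_round(ranked_keywords, clicked_keyword, search, card_size=4):
    """Run one question / feedback / recommendation round.

    Returns (card, products). products is empty when the user clicks nothing
    on the card, which is treated as "not interested".
    """
    card = ranked_keywords[:card_size]      # the "question" shown to the user
    if clicked_keyword not in card:         # no click, or a click outside the card
        return card, []
    return card, search(clicked_keyword)    # classic search on the clicked keyword


def toy_search(keyword):
    """Hypothetical stand-in for a real search backend."""
    catalog = {"scarf": ["wool scarf", "silk scarf"], "hat": ["knit hat"]}
    return catalog.get(keyword, [])


card, products = interactive_round(["scarf", "windbreaker", "hat"], "scarf", toy_search)
print(card, products)
```

Passing the search function in as a parameter mirrors the point that the last two steps can reuse an existing search algorithm unchanged.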

Note that the keyword-based recommendation is only a way to generate questions. Videos, guides, and items may also be used as materials to generate questions. Accordingly, user feedback and product recommendations might be in various forms. For example, user feedback on guides may be collected from user behaviors such as screen swipes, likes, comments, and stay duration on the guide's page. Recommended products are sorted based on videos, guides, or items.

Product Overview

Before diving into the keyword-based recommendation system, let's first discuss the product, which we call "weather vane" for convenience's sake. Weather vane has the following features:

(1) Interactive
(2) Demand focused, interpretable, and strongly somatosensory
(3) Scenario-aware

As shown in Figure 4, on the Taobao homepage (the first image), the user clicks a product among the items recommended based on predicted user preferences. On the product details page (the second image), the user swipes the screen and adds items to the shopping cart or favorites. When the user goes back to the previous page, the algorithm determines whether to generate a weather vane and a scenario-based text for it, for example, "Clothing for Boys". The related keywords are shown in the third image. After the user clicks a keyword, the second-hop search result page (the fourth image) appears.

Figure 4 Product form of weather vane


Figure 5 shows the overall technical framework. First, basic data tables are generated from search logs, product information, and knowledge graph data, followed by keyword recall and item sorting. During item sorting, we consider not only the user's historical preferences but also the user's actions on the details page (terminal-based intelligence). Then, the presentation and control module adjusts the sorting result. After the user clicks a keyword, the search process is triggered. Finally, the generated logs flow back into the search logs. The dotted lines in Figure 5 indicate the modules to be optimized in the future.

Figure 5 Overall technical framework

Next, let's take a look at the four core modules.

1. Recall

Let's understand the entire recall process using a meta-path model[2]. To be specific, users, products, keywords, scenarios, and categories are regarded as heterogeneous nodes in Heterogeneous Information Networks (HINs), as shown in Figure 6. Use different recall policies to find a meta-path matching each query. The recall type and recall score correspond to the meta-path type and the meta-path score, respectively. The objective is to determine whether users and keywords are related and their correlation strength based on meta-paths in HINs. Finally, the words with the strongest correlation are recommended to the user. This article introduces three recall policies.

Figure 6 Meta-path model

  • u2i2q: The core idea of this recall policy is to obtain the product-to-query relationship (i2q) based on the query-to-product relationship (q2i). i2q establishes the relationship between the user-clicked product and the query. The more co-occurrences of query and product in search logs, the closer the relationship between them. In the meta-path model, this recall policy indicates that products are found based on a meta-path, precisely one which denotes the bidirectional relationships between query and product.
  • u2i2scene2q: As shown in Figure 7, this recall policy consists of two processes: i2scene (on the left) and scene2q (on the right). In the i2scene process, the corresponding scenario is activated after the user clicks a product. In the scene2q process, related keywords are generated based on the scenario.
  • u2i2c2q: This recall policy uses the knowledge graph and category information to improve the query and recall effects.

Figure 7 Scenario-based query and i2scene2q recall processes
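The u2i2q policy can be illustrated with a small sketch that inverts query-to-product click records into a product-to-query table and then scores keywords for a user. Using raw co-occurrence counts as the recall score, and the names `build_i2q` and `recall_u2i2q`, are assumptions made for this example. The text only states that more co-occurrences mean a closer relationship.

```python
from collections import Counter, defaultdict


def build_i2q(search_log):
    """Invert (query, clicked_item) pairs into item -> Counter of query counts."""
    i2q = defaultdict(Counter)
    for query, item in search_log:
        i2q[item][query] += 1
    return i2q


def recall_u2i2q(user_clicked_items, i2q, top_k=3):
    """Follow the path user -> item -> query and sum co-occurrence counts."""
    scores = Counter()
    for item in user_clicked_items:
        scores.update(i2q.get(item, {}))
    return scores.most_common(top_k)


# Hypothetical search log of (query, clicked product) pairs.
log = [
    ("boys jacket", "item_1"),
    ("boys jacket", "item_1"),
    ("winter coat", "item_1"),
    ("boys shoes", "item_2"),
]
i2q = build_i2q(log)
recalled = recall_u2i2q(["item_1"], i2q)
print(recalled)
```

The other two policies would follow the same pattern with an extra hop through a scenario or category table in place of the direct product-to-query table.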

At present, only keywords can be recalled. Other entities such as videos and guides will be recalled in the future. In addition, the action data on the details page and the knowledge graph are likely to improve the recall effect.

2. Sorting

Sorting is divided into two processes, fine-sorting and resorting. The fine-sorting process has two versions:

  • Version 1: XFTRL

Theoretical research on Follow the Regularized Leader (FTRL) started more than ten years ago. The paper published by Google at KDD 2013 [3] brought this theoretical model into practical engineering, allowing many enterprise-level online learning models to be implemented based on FTRL. XFTRL is a custom model developed by the XPS platform based on FTRL. It processes hundreds of billions of discrete features on Taobao. An offline experiment uses the data of the previous four days for training and the last day's data for testing. XFTRL achieves an AUC of 0.67 on this test set.
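For readers unfamiliar with FTRL, the sketch below shows the per-coordinate FTRL-Proximal update for logistic regression as described in [3]. It is not XFTRL, whose internals are not covered here. The class name, the hyperparameter values, and the feature names are assumptions for illustration.

```python
import math
from collections import defaultdict


class FTRLProximal:
    """Per-coordinate FTRL-Proximal for logistic regression on sparse features."""

    def __init__(self, alpha=0.1, beta=1.0, l1=0.1, l2=1.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = defaultdict(float)  # accumulated adjusted gradients
        self.n = defaultdict(float)  # accumulated squared gradients

    def _weight(self, i):
        z = self.z[i]
        if abs(z) <= self.l1:        # L1 regularization keeps the weight at exactly zero
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, x):
        s = sum(self._weight(i) * v for i, v in x.items())
        s = max(min(s, 35.0), -35.0)  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v           # gradient of the log loss for this coordinate
            w = self._weight(i)
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g
        return p


model = FTRLProximal()
for _ in range(200):                  # hypothetical click data
    model.update({"kw=scarf": 1.0}, 1)
    model.update({"kw=hat": 1.0}, 0)
p_scarf = model.predict({"kw=scarf": 1.0})
p_hat = model.predict({"kw=hat": 1.0})
```

Because the model is linear in its inputs, any interaction between two features has to be added by hand as a separate feature key.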

  • Version 2: Attention_GRU

(1) Motivation: XFTRL (FTRL) is effective and has mature engineering implementations. However, it has the following disadvantages:

A large amount of work has proved that users' short- and long-term interests and action sequences [4,5] help to improve the recommendation effect. However, XFTRL (FTRL) captures only a small amount of information about short- and long-term interests and action sequences through feature engineering.

XFTRL (FTRL) is essentially a linear model. Feature interaction can only be implemented by adding interactive features during feature engineering, which relies heavily on algorithm engineers' understanding and experience of the business. In general, algorithm engineers only consider second-order interactive features when performing feature engineering, so high-order interactions between features are not captured. In short, using feature engineering to capture feature interactions tends to introduce redundant or invalid interactive features and to miss important ones.

To solve these problems, we propose a custom Attention_GRU model to improve the effect of keyword-based recommendations.

On the one hand, Attention_GRU is proven to be suitable for modeling sequence data (including historical action sequences). On the other hand, Attention_GRU is in essence a neural network model, so the (high-order) interactions between features are captured by the nonlinear activation functions in the network. In addition, some interactive features with high confidence can be fed into the neural network to model them explicitly. This custom Attention_GRU model is proposed mainly based on two papers published at IJCAI 2017 and IJCAI 2018 [4,5].

(2) Model Framework: This model takes four groups of features: user-side non-real-time features, user-side real-time features, query-side features, and other features. We concatenate the four groups of features and feed the result into a 3-layer neural network. Then, we calculate the loss based on the outputs of the neural network and the labels.
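The concatenate-then-classify step can be sketched in NumPy as follows. The layer widths, the ReLU activation, the log loss, and the random inputs are all assumptions for illustration. The text only states that four feature groups are concatenated, passed through a 3-layer network, and compared with the labels.

```python
import numpy as np


def init_params(in_dim, hidden=(16, 8), seed=0):
    """Create weights for a 3-layer network: in_dim -> 16 -> 8 -> 1."""
    rng = np.random.default_rng(seed)
    dims = (in_dim, *hidden, 1)
    return [(rng.normal(0.0, 0.1, size=(a, b)), np.zeros(b))
            for a, b in zip(dims[:-1], dims[1:])]


def forward(feature_groups, params):
    """Concatenate the feature groups and return click probabilities."""
    x = np.concatenate(feature_groups, axis=-1)
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)           # hidden layers with ReLU
    W, b = params[-1]
    logit = (x @ W + b).squeeze(-1)
    return 1.0 / (1.0 + np.exp(-logit))          # sigmoid output


def log_loss(p, y, eps=1e-7):
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


rng = np.random.default_rng(1)
batch = 5
# Stand-ins for: user non-real-time, user real-time, query-side, other features.
groups = [rng.normal(size=(batch, d)) for d in (8, 6, 4, 2)]
labels = np.array([1.0, 0.0, 0.0, 1.0, 0.0])     # hypothetical click labels
params = init_params(in_dim=20)
probs = forward(groups, params)
loss = log_loss(probs, labels)
```

In the model described here, the user-side real-time group would be the output of the Attention_GRU over the action sequence rather than a random vector.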

Figure 8 Model framework of weather vane

(3) Attention_GRU:



In the preceding formulas, Attend and Generate are functions. gm is a vector. The element j indicates the attention weight of the jth input. gm is called glimpse. Recurrence is a recurrent activation function. In the Attention_GRU model, the recurrent activation function is GRU.

In GRU implementation, x in the preceding formula corresponds to i in Figure 8.
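For reference, the function names Attend, Generate, and Recurrence match the attention formulation of Chorowski et al. [6]. Assuming the model follows that formulation, the steps can be written as below. The step index m, the input representations h_j, the state s_m, the output y_m, and the attention weight vector \alpha_m are symbol names chosen here for illustration.

```latex
\begin{aligned}
\alpha_m &= \mathrm{Attend}(s_{m-1}, \alpha_{m-1}, h) \\
g_m &= \sum_{j=1}^{L} \alpha_{m,j}\, h_j \\
y_m &\sim \mathrm{Generate}(s_{m-1}, g_m) \\
s_m &= \mathrm{Recurrence}(s_{m-1}, g_m, y_m)
\end{aligned}
```

Under this reading, \alpha_{m,j} is the attention weight of the jth input, and the glimpse g_m is the weighted sum of the inputs.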

We ran a series of experiments on the Attention_GRU model. In the first experiment, we recommend items based on the query-side category features only. We use category features with large granularity rather than query IDs because the query IDs are too sparse.

The AUC value is 0.5685, higher than 0.5, indicating that the popularity of the keywords (categories) is helpful in keyword-based recommendation. When the product category features in the historical action sequence are added, the AUC score increases to 0.6037, indicating that historical product categories and their sequence information are useful. Product categories are used instead of product IDs for the same reason: the IDs are too sparse.

On this basis, we add the text features such as the product title and keywords. The AUC value is increased to 0.6203, indicating that text information can further improve the effect of keyword-based recommendation. Finally, we add all the features in Figure 8. The AUC score reaches 0.6830, exceeding the benchmark obtained by using XFTRL. Table 1 lists the experiment results.
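To make the AUC figures easier to interpret, the sketch below computes AUC directly from its definition: the probability that a randomly chosen clicked keyword is scored higher than a randomly chosen unclicked one, with ties counted as half. The labels and scores are made up for illustration and are unrelated to the experiments above.

```python
def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = clicked) and model scores.
example_auc = auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
print(example_auc)
```

A value of 0.5 corresponds to random ordering, which is why 0.5685 already indicates a useful signal. This quadratic pairwise form is fine for small samples, but large test sets are usually evaluated with a sort-based computation.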


Table 1 also compares the offline experiment results of XFTRL, Attention_GRU, and improved Attention_GRU.

(4) Improvement over Attention_GRU: To further improve the effect of keyword-based recommendation, we have made some innovative improvements to the Attention mechanism in Attention_GRU. The main motivations are as follows:

The more remote the user's historical action, the smaller its impact on keyword-based recommendation and the smaller the Attention weight. That is, the Attention weight declines with time.

Different action types have different impacts on the keyword-based recommendation. For example, the impact of the purchase behavior is larger than that of the click action, so the two action types have different Attention weights. Hence, the Attention weight varies with the action type.

The calculation diagram for formulas 1.1 and 1.2 is as follows:

Figure 9 Calculate the Attention weight


Figure 10 Time decay and action type considered during the Attention weight calculation

In Figure 10, each action type corresponds to its own weight matrix. If the vectors that this matrix is applied to have dimension d, the matrix has d x d dimensions. The time interval is the difference between the time of the historical action and the time of the keyword-based recommendation, and Time_decay is a monotonically decreasing function of this interval. As shown in Table 1, the AUC value reaches 0.6999 for improved Attention_GRU. In addition, if we convert MaxCompute data into TFRecord data and use it as the model input, the training speed increases by 40%.
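One plausible way to combine the two ideas is sketched below: each action type has its own d x d matrix for scoring, and the softmax weights are scaled by a decreasing function of the action's age before renormalizing. The bilinear score, the exponential decay with time constant `tau`, and all names are assumptions for illustration and may differ from the actual formulas.

```python
import numpy as np


def time_decay(dt_hours, tau=24.0):
    """Monotonically decreasing in the age of the action (exponential form assumed)."""
    return np.exp(-np.asarray(dt_hours, dtype=float) / tau)


def attention_weights(query, hidden, action_types, dt_hours, W_by_action):
    """Attention weights that depend on action type and decay with time."""
    # Bilinear score using the matrix that belongs to each action's type.
    scores = np.array([query @ W_by_action[a] @ h
                       for a, h in zip(action_types, hidden)])
    unnorm = np.exp(scores - scores.max()) * time_decay(dt_hours)
    return unnorm / unnorm.sum()


d = 4
h = np.ones(d)
W = {"click": np.eye(d), "purchase": 2.0 * np.eye(d)}   # hypothetical matrices

# Same action type, different ages: the recent action gets more weight.
by_time = attention_weights(h, [h, h], ["click", "click"], [1.0, 48.0], W)
# Same age, different types: the purchase gets more weight than the click.
by_type = attention_weights(h, [h, h], ["click", "purchase"], [1.0, 1.0], W)
```

In a trained model the per-type matrices would be learned, so the relative importance of purchases and clicks would come from the data rather than being fixed as in this example.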

(5) Some Thoughts: XFTRL contains some features unavailable in Attention_GRU. However, XFTRL does not model sequences or long- and short-term interests, which Attention_GRU does. Therefore, XFTRL and Attention_GRU are complementary to each other. In the future, the two models may be integrated to further improve the recommendation effect.

There is still a lot of room for improvement in the fine-sorting process. For example, we need to consider how to sort items across multiple entity types, including keywords, videos, and guides. Real-time features such as user actions on the details page are very important for keyword-based recommendation, so we also need to consider how to better model these features and how to use the knowledge graph data to effectively improve the modeling effect.

The resorting process is relatively simple. The focus is to generate more diversified keywords while ensuring the sorting effect.
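A simple way to trade off sorting quality against diversity is a greedy pass that caps how many keywords may come from the same category and backfills in rank order if the card is not full. This is only a sketch of the idea. The per-category cap, the card size, and the category mapping are assumptions, not the production resorting logic.

```python
def resort_diverse(ranked, category_of, k=4, max_per_category=1):
    """Greedy resort: keep rank order but limit keywords per category."""
    picked, skipped, used = [], [], {}
    for kw in ranked:
        cat = category_of[kw]
        if used.get(cat, 0) < max_per_category:
            picked.append(kw)
            used[cat] = used.get(cat, 0) + 1
        else:
            skipped.append(kw)      # kept as backfill candidates, in rank order
        if len(picked) == k:
            break
    return (picked + skipped)[:k]


# Hypothetical fine-sorting output and category mapping.
ranked = ["boys jacket", "boys coat", "boys shoes", "boys sneakers", "school bag"]
category_of = {
    "boys jacket": "outerwear", "boys coat": "outerwear",
    "boys shoes": "footwear", "boys sneakers": "footwear",
    "school bag": "bags",
}
card = resort_diverse(ranked, category_of)
print(card)
```

With these inputs the card covers three categories instead of two, at the cost of moving one outerwear keyword to the last slot.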

3. Presentation and Control

The presentation and control module is composed of two parts:

  • Location, time, and intention control: Location and time control mainly prevents the emergence of some bad cases. Intention control uses a model to identify user intentions and then determines whether to generate a weather vane.
  • Scenario and industry intervention: Sorting-generated words are highly relevant to requirements but are not scenario-aware or divergent enough. Therefore, we use a model to make the four words more scenario-centered and generate the corresponding scenario-based texts (for example, "Clothing for Boys" in Figure 4). In addition, we introduce industry intervention during the Double 11 Shopping Festival and Black Friday. On the one hand, we leverage industry knowledge to improve the weather vane effect. On the other hand, we complete some industry objectives, achieving win-win results.

4. Terminal-based Intelligence

Rich user action information on the product details page, including the actions of adding items to the shopping cart and favorites and the stay duration, is very important for capturing users' immediate interests. Adding terminal-based intelligence information to the model significantly improves the effect of keyword-based recommendations.

However, this model still has a lot of room for improvement:

Much terminal-based intelligence information, including screen-sliding tracks, product introduction viewing, and buyer comments viewing, has not yet been used for modeling. Adding more of this information is likely to improve the recommendation effect.

User actions on the product details page constitute sequences. Attention_GRU could be used to model these action sequences and further improve the recommendation effect.

Currently, the recall, fine-sorting, resorting, and presentation and control processes are all completed on the server. If these processes were completed on terminals instead, each terminal would store its own model, and the models could be updated on the terminals in real time. In the server-side framework, one model serves all users. In the terminal-side framework, every user has a personalized model.

Double 11 Shopping Festival

The weather vane worked well during the Double 11 Shopping Festival of 2018. All business indicators exceeded expectations. Based on user convergence and passive demands, we applied different control policies in different phases of the Double 11 Shopping Festival. In the first phase, users had clear shopping intentions. We promoted convergent queries accordingly. Passive demands increased in the decline phase. Then, we mainly promoted scenario-based queries. With the sharp increase in shopping demands at night, the number of convergent queries increased.

Figure 11 Weather vane control policies for the Double 11 Shopping Festival


Interactive recommendation is a promising research direction. With weather vane, we made an innovative attempt to bring interactive recommendation to the preference-based product recommendations on the homepage. Weather vane achieved good results in the Double 11 Shopping Festival of 2018.

However, weather vane still has a lot of room for improvement. Some of the areas are as follows:

  • The data on the product details page is limited. Only the actions of adding items to the shopping cart and favorites and the stay duration are used. In the future, we will obtain more details page data to optimize our model. This data includes the buttons clicked on the details page, sections viewed, screen-swipe tracking, and screen-swipe speed.
  • The recall logic of the scenario-based query is relatively simple now. We will use the knowledge graph information to improve the recall and fine-sorting effect of scenario-based queries.
  • The feature system and objective function need improvement. The current feature system mainly includes user-side features and query-side features. It does not make full use of trigger item information. The objective function currently mainly considers the first-hop click-through rate. However, the second-hop IPV and transaction amount are more important. Therefore, we can optimize the objective function by applying second-hop metrics to it.
  • Currently, the second-hop page is produced by calling the search API. However, the search API mainly considers the correlation between the second-hop result page and the keywords as well as user personalization. It does not take the first-hop trigger item into account. Therefore, we can optimize the second-hop algorithm or design a new one.


  • 1) Christakopoulou, Konstantina, Alex Beutel, Rui Li, Sagar Jain, and Ed H. Chi. Q&R: A Two-Stage Approach Toward Interactive Recommendation. KDD, pp. 139-148. ACM, 2018.
  • 2) Zhao, Huan, Quanming Yao, Jianda Li, Yangqiu Song, and Dik Lun Lee. Meta-Graph Based Recommendation Fusion over Heterogeneous Information Networks. KDD, pp. 635-644. ACM, 2017.
  • 3) McMahan, H. Brendan, Gary Holt, David Sculley, Michael Young, Dietmar Ebner, Julian Grady, Lan Nie et al. Ad Click Prediction: A View from the Trenches. KDD, pp. 1222-1230. ACM, 2013.
  • 4) Zhu, Yu, Hao Li, Yikang Liao, Beidou Wang, Ziyu Guan, Haifeng Liu, and Deng Cai. What to Do Next: Modeling User Behaviors by Time-LSTM. IJCAI, pp. 3602-3608. 2017.
  • 5) Zhu, Yu, Junxiong Zhu, Jie Hou, Yongliang Li, Beidou Wang, Ziyu Guan, and Deng Cai. A Brand-level Ranking System with the Customized Attention_GRU Model. IJCAI, pp. 3947-3953. 2018.
  • 6) Chorowski, Jan K., Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, and Yoshua Bengio. Attention-based Models for Speech Recognition. NIPS, pp. 577-585. 2015.
  • 7) Mnih, Volodymyr, Nicolas Heess, and Alex Graves. Recurrent Models of Visual Attention. NIPS, pp. 2204-2212. 2014.

Alibaba Clouder
