Important Practices of Edge Computing in the Taobao Recommendation System

1. Preface

1.1 Edge Computing vs. Cloud Computing

Over the past ten years, cloud computing has developed rapidly on the back of big data, but it now faces several problems. First, with the explosive growth of Internet applications and user scale, the spread of 5G and rising bandwidth will put pressure on cloud storage. Second, deploying large-scale neural networks in online systems has become increasingly common, which puts enormous pressure on cloud computation. Third, for applications with strict real-time requirements, the communication overhead with the cloud is a bottleneck for interaction and user experience. Finally, the "centralized" computing model brings operation and maintenance costs and failure risks.

The concept of edge computing was proposed long ago. The storage and computing capabilities of terminal devices have grown rapidly in recent years. Smartphone performance in particular (CPU and GPU benchmark scores, ever larger memory) has become a major selling point, yet this computing power currently appears to be far from fully utilized. The advantages of edge computing lie in the following four points: 1) data localization, which addresses cloud storage and privacy issues; 2) computing localization, which addresses cloud computing overload; 3) low communication cost, which addresses interaction and experience problems; 4) decentralized computing, which supports fault avoidance and extreme personalization.

1.2 Pain Points in Recommendation Systems

In the fully mobile era, more and more recommendation scenarios have emerged to address information overload, especially information-flow (feed) recommendation, which mainly takes the form of a list. Take the mobile Taobao feed as an example. Users who enter the "Guess You Like" scenario often have no clear product need when they start browsing and only gradually discover what they want to buy as they shop. While the user browses, the recommendation system delivers different types of products to the client for the user to choose from, captures changes in the user's interest, and then recommends products that better match that interest. But can the recommendation system respond immediately when the user's interest changes?

Previously, a client request triggered product ranking on the cloud server, the ranked products were sent back to the client, and the client displayed them in that order. This approach has two problems:

Delay in recommendation decisions: Because of the QPS limits of the cloud servers, feed recommendations are served through paged requests. The cloud recommendation system therefore has few opportunities to adjust the content shown to the user and cannot respond to interest changes in time. As shown in the figure below, the user's interaction with the fourth product indicates a dislike of "motorcycle", but because the next paging request happens only after 50 products, the other "motorcycle" products later on the current page cannot be adjusted in time.

Delay in real-time perception of user behavior: The recommendation system currently expresses personalization by using the user's interactions with products as features, but these interactions actually occur on the client. For the model to use them, the on-device data must first be sent to the server, which introduces delay. As shown in the figure below, the delay of user behavior data may reach 10 s to 1 min. In addition, because of network bandwidth and latency constraints, many finer-grained user behaviors (such as real-time product exposure and swipe gestures) cannot be modeled.

To sum up, the pain point of the current recommendation system is a timing mismatch: the user's preferences change faster than the system can perceive the change and adjust the content, so the recommended content is not what the user wants at that moment, and the user's willingness to browse and click declines.
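
The paging delay described above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, the page size of 50 and the feedback on the fourth item come from the example in the text, and the assumption that the user has seen six items so far is made up for the demonstration.

```python
def next_adjustable_position_cloud(feedback_pos: int, page_size: int) -> int:
    """First 1-based feed position whose content the cloud can still change.

    With paged requests, the items of the current page are already delivered,
    so the cloud can only react from the first item of the next page.
    """
    current_page = (feedback_pos - 1) // page_size
    return (current_page + 1) * page_size + 1


def next_adjustable_position_device(last_exposed_pos: int) -> int:
    """First 1-based feed position an on-device reranker could still change.

    Assumes every delivered item that has not been exposed yet may be reordered.
    """
    return last_exposed_pos + 1


page_size = 50      # page size from the example in the text
feedback_pos = 4    # the user signals dislike on the 4th item
last_exposed = 6    # hypothetical: items 1..6 have been shown so far

cloud_pos = next_adjustable_position_cloud(feedback_pos, page_size)
device_pos = next_adjustable_position_device(last_exposed)
print(cloud_pos, device_pos)  # 51 7
```

Under these assumptions, the cloud can first react at position 51, while a decision made on the device could already affect position 7.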

1.3 Edge Computing + Recommendation System

The advantage of edge computing is that the edge nodes (here, mobile devices) can "think independently": some decisions and computations no longer depend on the cloud, and the device can produce results faster and with more tailored strategies. On the subject of real time, the low latency of 5G greatly reduces the interaction time between the device and the cloud, but this does not diminish the value of on-device intelligence for lower-cost decision-making and quick response. Rather, it allows the device and the cloud to be integrated more closely. In addition, because user intent can be perceived within seconds and acted on directly on the device, the product fits the user more closely, which gives rise to more real-time product features. The question is no longer simply how to return new content, but how the product should provide content that matches a specific intent when the user expresses it.

The on-device recommendation system EdgeRec uses the real-time perception and real-time feedback capabilities of edge computing to address the shortcomings of the current client-server recommendation architecture in both areas. EdgeRec provides on-device user intent perception, on-device reranking, and on-device real-time card insertion. By perceiving user intent within seconds on the device, making decisions there, and providing feedback that matches the intent, it improves the user's willingness to click and browse and changes the overall experience of the waterfall feed.

2. On-Device Algorithm Model

2.1 Overview

As shown in Figure (a) below, the on-device recommendation algorithm model in EdgeRec consists of two modules: "on-device real-time user perception" and "on-device real-time reranking". "On-device real-time user perception" is modeled as Heterogeneous User Behavior Sequence Modeling, which has two parts: "Item Exposure (IE) Behavior Sequence Modeling" and "Item Page-View (IPV) Behavior Sequence Modeling". "On-device real-time reranking" is modeled as Reranking with Behavior Attention Networks (BAN). The two modules are introduced in detail below.

2.2 Real-time user perception on the device

2.2.1 Significance

First, in personalized search and recommendation, "a thousand faces for a thousand people" (a different result for every user) comes from personalized features, and personalization depends mainly on user behavior data. Works such as DIN[1] model the sequence of items the user recently interacted with and use it as input to the personalization model. However, previous work generally considers only "positive feedback" interactions between users and products (such as clicks and transactions) and rarely considers "negative feedback" interactions (such as exposures). Positive feedback features are indeed relatively clear and low in noise, but we believe that real-time negative feedback is also very important. An intuitive example: after a category of products has been exposed several times in real time, the click-through rate of that category drops significantly.

Second, previous work on personalized models generally considers only the features of the products that users interact with; the focus is on the "interacted product". However, the "interaction action" between the user and the product is also very important. For example, the user's behavior on the detail page after clicking a product reflects the real preference for it, and real data may contain "pseudo" clicks. Similarly, if the user does not click a product but pays close attention to it while it is exposed, that is, the exposure lasts a long time, then exposure without a click does not necessarily mean the user dislikes the product. This matters more and more now that product images in feed pages are getting larger, various keywords are shown on the card, and videos can even play automatically. For some users, a click may have become a rather "luxurious" form of positive feedback.

Finally, we believe that the user's real-time behavior in the recommendation scenario is also very important. For example, if the user clicks "dislike" as negative feedback in real time, or a category is exposed several times in real time without being clicked, this reflects the user's preference at that moment. The recommendation system therefore needs to model user preferences in real time and adjust promptly.

To sum up, the significance of on-device real-time user perception lies in the following five points:

2.2.2 Real-time behavior feature system

Based on the analysis above, compared with user perception modeling in current cloud recommendation algorithms, on-device real-time user perception must have the following characteristics: 1) move from "relying on positive feedback interactions" to "attending to both positive and negative feedback interactions"; 2) move from "which products were interacted with" to "how and to what degree the user interacted with them"; and 3) move from "quasi-real-time interaction" to "ultra-real-time interaction". These three characteristics must be reflected in on-device features. Based on them, we designed a real-time user behavior feature system for the feed recommendation system together with the Taobao client BehaviX team. As shown in the figure below, the on-device real-time user behavior features consist of two parts: "(a) product exposure behavior" and "(b) product detail page behavior".

2.2.3 Modeling Heterogeneous Behavioral Sequences

Heterogeneity here has two aspects. First, "user behavior (Action)" and "interacted product (Item)" are heterogeneous. Second, "waterfall exposure behavior (Item Exposure (IE) Behavior)" and "detail page browsing behavior (Item Page-View (IPV) Behavior)" are heterogeneous. The model input is organized as follows: 1) a user behavior is defined as a Pair <item, action>, and a behavior sequence is defined as a List (<item, action>); 2) in the Item Exposure (IE) Behavior Sequence, "item" is an exposed item and "action" is the user's interaction with that item in the waterfall, such as exposure duration, scrolling speed, and scrolling direction; 3) in the Item Page-View (IPV) Behavior Sequence, "item" is a clicked item and "action" is the user's interaction with that item on the detail page, such as dwell time, whether it was added to the cart, and whether it was added to favorites.
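
The input organization above can be sketched as plain data structures. This is only an illustration: the class and field names are hypothetical, and the fields shown are limited to the example actions named in the text, not the full BehaviX feature set.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class IEAction:
    """Action on an exposed item in the waterfall feed."""
    exposure_seconds: float
    scroll_speed: float
    scroll_direction: int  # hypothetical encoding: +1 = down, -1 = up


@dataclass
class IPVAction:
    """Action on a clicked item's detail page."""
    dwell_seconds: float
    added_to_cart: bool
    favorited: bool


@dataclass
class Behavior:
    """One user behavior: a pair of an item and an action on it."""
    item_id: str
    action: Union[IEAction, IPVAction]


# A behavior sequence is a list of such pairs.
BehaviorSequence = List[Behavior]

# The IE and IPV sequences are kept apart because they are heterogeneous.
ie_sequence: BehaviorSequence = [
    Behavior("item_101", IEAction(1.2, 0.8, 1)),
    Behavior("item_102", IEAction(4.5, 0.1, 1)),
]
ipv_sequence: BehaviorSequence = [
    Behavior("item_102", IPVAction(30.0, True, False)),
]
```

Keeping the two sequences as separate lists mirrors the distinction the text draws between exposure behavior and detail page behavior.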

Model diagram (a) above contains the network structure of Heterogeneous User Behavior Sequence Modeling. Two points deserve attention. 1) The "Item Exposure Behavior Sequence (IE Behavior Sequence)" and the "Item Page-View Behavior Sequence (IPV Behavior Sequence)" are modeled separately first and merged at the end only if required. The main consideration is that click behavior is generally sparse while exposure behavior is abundant; if the two were fused into a single sequence before modeling, the model would likely be dominated by exposure behavior. 2) The product features (Item) and the action features (Action) are encoded separately first and then fused. The main consideration is that product features and action features are heterogeneous inputs; if a downstream task needs to attend to specific products, attention is meaningful only between homogeneous inputs. We return to this point in the discussion of the on-device reranking model.

Here, the product feature sequences (IE Item Sequence and IPV Item Sequence) are encoded with a GRU network, and the action feature sequences (IE Action Sequence and IPV Action Sequence) are encoded directly with the identity function. The product sequence embeddings (IE Item Embedding and IPV Item Embedding) and the action sequence embeddings (IE Action Embedding and IPV Action Embedding) are fused with a simple Concat operation to obtain the behavior sequence embeddings (IE Behavior Embedding and IPV Behavior Embedding).
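
The encode-then-fuse step can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not the production model: the `GRUCell` class, its random untrained weights, the omission of bias terms, and all dimensions and feature values are assumptions made for the example.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class GRUCell:
    """Minimal GRU cell with random, untrained weights (biases omitted)."""

    def __init__(self, input_dim: int, hidden_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)

        def init(rows: int, cols: int) -> Matrix:
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.hidden_dim = hidden_dim
        # One input and one hidden weight matrix per gate:
        # z = update gate, r = reset gate, n = candidate state.
        self.w = {g: init(hidden_dim, input_dim) for g in "zrn"}
        self.u = {g: init(hidden_dim, hidden_dim) for g in "zrn"}

    def step(self, x: Vector, h: Vector) -> Vector:
        z = [sigmoid(a + b) for a, b in zip(matvec(self.w["z"], x), matvec(self.u["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.w["r"], x), matvec(self.u["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.w["n"], x), matvec(self.u["n"], rh))]
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def encode_behavior_sequence(
    items: List[Vector], actions: List[Vector], gru: GRUCell
) -> Tuple[List[Vector], List[Vector]]:
    """Return (item embeddings, behavior embeddings) for one behavior sequence."""
    h = [0.0] * gru.hidden_dim
    item_embs, behavior_embs = [], []
    for item_feat, action_feat in zip(items, actions):
        h = gru.step(item_feat, h)            # item features go through the GRU
        action_emb = list(action_feat)        # action features use the identity function
        item_embs.append(h)
        behavior_embs.append(h + action_emb)  # fusion by concatenation
    return item_embs, behavior_embs


gru = GRUCell(input_dim=3, hidden_dim=4)
ie_items = [[0.2, 0.1, 0.0], [0.9, 0.3, 0.5]]  # made-up item features
ie_actions = [[1.5, 0.3], [4.0, 0.1]]          # made-up action features
item_embs, behavior_embs = encode_behavior_sequence(ie_items, ie_actions, gru)
print(len(behavior_embs), len(behavior_embs[0]))  # 2 6
```

Since the IE and IPV sequences are modeled separately, one would call `encode_behavior_sequence` once per sequence. Whether the two share encoder weights is not stated in the text, so a separate `GRUCell` per sequence is only one possible choice.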

2.3 On-device reranking

2.3.1 Significance

On-device reranking is the basis of on-device recommendation: it gives the system the ability to change the recommended order of products in real time. It can be regarded as recommendation optimization in the user's local domain, that is, optimization within the recommendation results of the current page. On-device reranking relies on real-time user perception: based on real-time positive and negative feedback (exposures, detail page visits) and finer-grained user behavior features, it continually reorders the already ranked products in the feed, which truly achieves real-time perception plus real-time recommendation.
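
The idea of reordering only within the current page can be sketched as follows. The scoring rule here is a deliberately simple stand-in (a fixed penalty for a disliked category), not the BAN model described later; the item ids, cloud scores, and penalty value are all hypothetical.

```python
from typing import Callable, List, Sequence


def rerank_unexposed(
    feed: Sequence[str], num_exposed: int, score: Callable[[str], float]
) -> List[str]:
    """Keep the already exposed prefix fixed and reorder the rest by descending score."""
    exposed = list(feed[:num_exposed])
    remaining = sorted(feed[num_exposed:], key=score, reverse=True)
    return exposed + remaining


# Hypothetical cloud scores; the dict order is the order delivered by the cloud.
cloud_scores = {
    "moto_1": 0.9, "dress_1": 0.8, "phone_1": 0.7,
    "moto_2": 0.6, "dress_2": 0.5, "moto_3": 0.4, "phone_2": 0.3,
}
feed = list(cloud_scores)
disliked_categories = {"moto"}  # perceived on the device in real time


def score(item_id: str) -> float:
    category = item_id.split("_")[0]
    penalty = 0.5 if category in disliked_categories else 0.0
    return cloud_scores[item_id] - penalty


result = rerank_unexposed(feed, num_exposed=3, score=score)
print(result)
```

In this toy run the three exposed items stay in place and the remaining "moto" items move behind the other categories, without any new request to the cloud.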

Reranking has been studied extensively in both search and recommendation. The core idea of this prior work is context-aware ranking, where the context refers to the relationships among the items to be ranked. The context can be modeled in various ways, for example with an RNN, a Transformer, or hand-crafted global features fed into a DNN.

The on-device real-time reranking model, EdgeRerank, is also based on context-aware ranking, but its context includes not only the items to be ranked but also the user's real-time behavior (real-time item exposures, real-time item clicks, and user interaction actions). With this contextual information, EdgeRerank knows what has already been ranked and how the user behaved on it, and, given the set of items still to be ranked, decides how to order them for the best result. The following describes the model framework for on-device reranking, which we call Reranking with Behavior Attention Networks (BAN).

2.3.2 Reranking with Behavior Attention Networks

Model diagram (a) above contains the network structure of Reranking with Behavior Attention Networks. As mentioned above, EdgeRerank considers two kinds of context. For the context among the items to be ranked, we use a common sequence modeling approach and introduce a GRU network to encode the candidate set. For the context of real-time user behavior, we also use a common approach, Attention (sometimes called target attention). Recall the input of heterogeneous behavior sequence modeling in real-time user perception: a user behavior is defined as a Pair <item, action>, and a behavior sequence is defined as a List (<item, action>), where "item" is the product the user interacted with and "action" is how the user interacted with it. As the network diagram shows, Attention operates between the items to be ranked and the items in the behavior sequence, that is, between products. Readers familiar with Attention will know the triplet (Query, Key, Value). In this model, Query is the encoded candidate item (Candidate Item Embedding), Key is the encoded item in the behavior sequence (IE Item Embedding and IPV Item Embedding), and Value is the fused behavior embedding (IE Behavior Embedding and IPV Behavior Embedding). In plain terms, the motivation is: for each item in the candidate set, first look at the items the user has interacted with, focus on those with similar characteristics, and use how the user behaved on those similar items as a reference for ranking this item.
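
The (Query, Key, Value) roles can be illustrated with a small attention function. The text does not say which scoring function BAN uses, so the scaled dot product below is an assumption, as are the function names and the toy embeddings. Query and keys must have the same dimension, while values may be wider because they also carry the action features.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(xs: Vector) -> Vector:
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def behavior_attention(
    query: Vector, keys: List[Vector], values: List[Vector]
) -> Tuple[Vector, Vector]:
    """Attend from one candidate item over the user's behavior sequence.

    query  : candidate item embedding
    keys   : item embeddings of the behavior sequence
    values : fused behavior embeddings (item embedding + action embedding)
    Returns (context vector, attention weights).
    """
    scale = math.sqrt(len(query))
    logits = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(logits)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return context, weights


# Toy example: the candidate resembles the first behavior item more than the second.
candidate = [1.0, 0.0]
behavior_item_embs = [[1.0, 0.0], [0.0, 1.0]]
behavior_embs = [[1.0, 0.0, 5.0], [0.0, 1.0, 1.0]]  # last entry: a made-up action feature

context, weights = behavior_attention(candidate, behavior_item_embs, behavior_embs)
print(weights, context)
```

Because similarity is computed between item embeddings only (homogeneous inputs), the action features influence the result solely through the values, which matches the motivation given in the text.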

3. Experimental Results

3.1 Offline experiment

To verify the effectiveness of using on-device real-time user perception as context for on-device reranking, we first conducted offline experiments. The comparison methods and experimental results are shown in the table below:


Here, baseline means reranking without any on-device real-time user behavior context; w/ IE and w/ IPV mean that only item exposure behavior or only item page-view behavior is used as context, respectively; All means the complete model.

3.2 Online effect

On the day of Double Eleven, EdgeRec provided click-oriented and transaction-oriented on-device reranking and ran 500 million times in the "Guess You Like" feed on the Taobao homepage. Compared with not enabling EdgeRec, click-oriented on-device reranking increased product clicks by 10%, and transaction-oriented on-device reranking increased transaction value by 5%. EdgeRec improves the accuracy of product recommendations and responds to user intent more promptly. The clearest sign of this is that the click-through rate of cards near the end of each feed page increased significantly.

4. Summary

EdgeRec is a first attempt at applying recommendation algorithms in the direction of edge computing, and the business results suggest that there is a great deal of room for further development. By using on-device computing, deep models can run inference on the device, which compensates for the cloud's difficulty in obtaining real-time behavior and its weak ability to adjust strategies in real time. Beyond inference, on-device computing could also be used for on-device training, so that an individual model is trained for each user, which opens up more room for on-device intelligence.
