What is cold start?
Most recommendation systems generate candidates by using collaborative filtering, matrix factorization, or deep learning algorithms, which in most cases rely on the user-item interaction matrix. Real-world recommendation systems continually gain new users and new items. Because these users and items lack sufficient historical behavioral data, the system cannot retrieve accurate candidate items for new users or recommend new items to the appropriate users. This is the cold start problem. It is hard because traditional recommendation algorithms, whether they are used in the candidate generation, coarse ranking, or fine ranking module, depend heavily on behavioral data that the system has not yet collected for new users and new items. As a result, new items receive few impressions, and the interests of new users cannot be modeled accurately.
For some services, recommending new items promptly and giving them sufficient exposure is important for building a healthy platform ecosystem and for the long-term benefits of the platform. For example, news and information services are highly time-sensitive: if an article is not exposed in a timely manner, its news value diminishes significantly. On user-generated content (UGC) platforms, if newly published content does not reach enough users in time, the creator's enthusiasm suffers, which may in turn reduce the amount of high-quality content that the platform attracts in the future. If a dating platform cannot help new users receive adequate attention, it may fail to keep attracting new users and gradually lose its appeal.
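To make the dependence on the interaction matrix concrete, the following minimal sketch computes item-to-item cosine similarity, as an item-based collaborative filtering method might. The matrix values and the helper names `item_column` and `cosine` are made up for illustration. The last column stands for a newly added item that has no interactions yet.

```python
from math import sqrt

# Rows are users, columns are items; 1 means the user interacted with the item.
# The last column is a hypothetical new item with no interactions yet.
interactions = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
]

def item_column(matrix, j):
    """Return the interaction vector of item j across all users."""
    return [row[j] for row in matrix]

def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

new_item = 3
similarities = [
    cosine(item_column(interactions, new_item), item_column(interactions, j))
    for j in range(new_item)
]
warm_similarity = cosine(item_column(interactions, 0), item_column(interactions, 1))
```

Because the column of the new item is empty, its similarity to every existing item is zero, so a method that relies only on this matrix has no basis for recommending it. The two warm items, by contrast, share a user and get a nonzero similarity.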
In summary, the cold start problem is challenging in recommendation systems. How can we resolve this problem?
Solutions to the cold start problem
The algorithms and policies used to resolve the cold start problem in recommendation systems can be summarized by four keywords: generalized, quick, transferable, and few.
Generalized: New items can be generalized along attributes or categories. For example, a newly published item can be recommended to users who liked items of the same category, a newly launched short video to users who liked the creator of the video, and a newly released news article to users who liked articles on the same topic. Essentially, these methods are content-based recommendation algorithms. To achieve better results, recommendation systems sometimes need to provide concept-based or topic-based recommendations: a new item is recommended not only to users who liked items of the same category, but also to users who liked items of the same brand, shop, style, or color. Some items can be generalized without any algorithm, such as items whose attributes are configured by merchants at launch. Others can be generalized only by using algorithms, such as articles whose topics are not labeled when they are published.
In addition to attribute-based or topic-based generalization, a common approach is to use an algorithm to obtain embedding vectors for users and items and then match user interests with items based on the distance or similarity between these vectors. Matrix factorization and deep neural network models can generate embedding vectors for both users and items. However, training these traditional models still relies on behavioral data, so they cannot generate accurate embedding vectors for cold start users and items.
In essence, generalization uses the content or attributes of new items to compensate for their lack of historical behavioral data. For example, multimodal information of an item, such as its pictures or videos, can be used to generate recommendations for the item. A dating platform can score the appearance of a new user (the item to be recommended) and then recommend that user to other users (those who browse the recommendation list) who prefer such an appearance.
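The attribute-based targeting described above can be sketched as a simple overlap match. This is a minimal illustration, not a production design: the user histories, the attribute fields, and the function `target_users` are all hypothetical.

```python
# Hypothetical user histories: the attribute values of items each user has liked.
user_liked = {
    "u1": {("category", "sneakers"), ("brand", "acme")},
    "u2": {("category", "dresses"), ("color", "red")},
    "u3": {("brand", "acme"), ("style", "retro")},
}

# Attributes configured for a newly launched item (also hypothetical).
new_item = {"category": "sneakers", "brand": "acme", "color": "red"}

def target_users(item_attrs, histories, fields):
    """Rank users by how many of the item's attributes (limited to `fields`) they liked before."""
    wanted = {(f, item_attrs[f]) for f in fields if f in item_attrs}
    scores = {user: len(wanted & liked) for user, liked in histories.items()}
    # Keep only users with at least one match; break ties by user ID for determinism.
    return sorted((u for u, s in scores.items() if s > 0), key=lambda u: (-scores[u], u))

category_only = target_users(new_item, user_liked, ["category"])
broader = target_users(new_item, user_liked, ["category", "brand", "color"])
```

Matching on category alone reaches only `u1`, whereas adding brand and color also reaches `u2` and `u3`. This reflects the point that generalizing over more attributes widens the audience of a new item.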
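One simple way to combine both ideas is to give a new item a provisional embedding derived from its attributes. The sketch below averages the embeddings of warm items in the same category and then scores users by cosine similarity. This averaging heuristic is an assumption chosen for illustration, not a method prescribed here, and all embeddings, IDs, and function names are made up.

```python
from math import sqrt

# Hypothetical embeddings of warm items, as if learned from behavioral data.
item_embeddings = {
    "shoe_a": [0.9, 0.1],
    "shoe_b": [0.7, 0.3],
    "dress_a": [0.1, 0.9],
}
item_category = {"shoe_a": "shoes", "shoe_b": "shoes", "dress_a": "dresses"}
user_embeddings = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}

def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(len(vectors[0]))]

def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def cold_start_embedding(category):
    """Approximate a new item's embedding by the mean of warm items in its category."""
    peers = [item_embeddings[i] for i, c in item_category.items() if c == category]
    if not peers:
        raise ValueError(f"no warm items in category {category!r}")
    return mean_vector(peers)

new_item_vec = cold_start_embedding("shoes")
scores = {user: cosine(vec, new_item_vec) for user, vec in user_embeddings.items()}
best_user = max(scores, key=scores.get)
```

The new item needs no behavioral data of its own: its category is enough to place it near similar items in the embedding space, so `u1`, whose embedding points toward the shoe items, scores higher than `u2`.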
Quick: Cold start items are new items that lack historical behavioral data. A natural approach, therefore, is to quickly gather interaction data for these new items and promptly incorporate the data into recommendation systems. Traditional recommendation algorithm models and relevant features are updated on a daily basis, whereas online learning models and relevant features can be updated in minutes or even seconds. This type of approach is usually based on reinforcement learning or contextual bandit algorithms.
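As a rough illustration of learning from feedback as it arrives, the sketch below uses Thompson sampling over a Beta-Bernoulli model. Note that this is a plain multi-armed bandit, a simplification of the contextual bandits mentioned above, because it ignores user features. The class `BetaBernoulliBandit`, the item IDs, and the simulated click-through rates are all hypothetical.

```python
import random

class BetaBernoulliBandit:
    """Thompson sampling: each item keeps a Beta posterior over its click-through rate."""

    def __init__(self, item_ids, seed=0):
        self.rng = random.Random(seed)
        # [alpha, beta] = [1, 1] is a uniform prior: nothing is known about a new item.
        self.params = {item: [1.0, 1.0] for item in item_ids}

    def select(self):
        """Sample a plausible CTR for each item and show the item with the highest sample."""
        samples = {i: self.rng.betavariate(a, b) for i, (a, b) in self.params.items()}
        return max(samples, key=samples.get)

    def update(self, item, clicked):
        """Fold one impression into the posterior immediately (no daily batch job)."""
        self.params[item][0 if clicked else 1] += 1.0

    def impressions(self, item):
        a, b = self.params[item]
        return int(a + b - 2)

# Simulated environment with made-up true click-through rates.
true_ctr = {"new_a": 0.02, "new_b": 0.30, "new_c": 0.08}
bandit = BetaBernoulliBandit(true_ctr, seed=42)
env = random.Random(7)

for _ in range(2000):
    item = bandit.select()
    bandit.update(item, env.random() < true_ctr[item])

counts = {item: bandit.impressions(item) for item in true_ctr}
```

Every new item starts with the same uninformative prior and therefore gets some early exposure. As clicks accumulate, impressions shift toward the items that users respond to, which is the behavior that per-minute or per-second updates are meant to enable.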
Transferable: Transfer learning builds models by using data from different scenarios and migrates knowledge from a source domain to a target domain. For example, if a new service has only a small number of samples, data from other service scenarios can be used to train a model for it. In this case, the other scenarios are the source domain, and the new service scenario is the target domain. As another example, a cross-border e-commerce company operates different platforms in different countries, and a newly deployed platform in one country may have very little user behavioral data. In this case, a model can be trained on the behavioral data from mature platforms and then fine-tuned on the small number of samples from the newly deployed platform to achieve good cold start performance. When you apply transfer learning, make sure that the source domain and the target domain are sufficiently related. For example, the platforms in different countries mentioned above may sell a large number of identical items.
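The pretrain-then-fine-tune pattern can be sketched with a small logistic regression trained by stochastic gradient descent. Everything here is synthetic and assumed for illustration: the two domains are related but not identical (their decision thresholds differ slightly), and the function names, sample sizes, and learning rates are arbitrary choices.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def make_samples(rng, n, threshold):
    """Synthetic domain: the label is 1 when the two features sum above `threshold`."""
    data = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        data.append((x, 1 if x[0] + x[1] > threshold else 0))
    return data

def train(weights, data, epochs, lr):
    """Logistic regression via SGD, starting from `weights` = [w0, w1, bias]."""
    w = list(weights)
    for _ in range(epochs):
        for (x0, x1), y in data:
            err = sigmoid(w[0] * x0 + w[1] * x1 + w[2]) - y
            w[0] -= lr * err * x0
            w[1] -= lr * err * x1
            w[2] -= lr * err
    return w

def accuracy(w, data):
    hits = sum((w[0] * x0 + w[1] * x1 + w[2] > 0) == (y == 1) for (x0, x1), y in data)
    return hits / len(data)

rng = random.Random(0)
source = make_samples(rng, 500, threshold=0.0)       # mature platform: plenty of data
target_train = make_samples(rng, 20, threshold=0.2)  # new platform: only a few samples
target_test = make_samples(rng, 200, threshold=0.2)

pretrained = train([0.0, 0.0, 0.0], source, epochs=20, lr=0.1)
fine_tuned = train(pretrained, target_train, epochs=10, lr=0.05)    # start from source weights
from_scratch = train([0.0, 0.0, 0.0], target_train, epochs=10, lr=0.05)

results = {
    "pretrained_only": accuracy(pretrained, target_test),
    "fine_tuned": accuracy(fine_tuned, target_test),
    "from_scratch": accuracy(from_scratch, target_test),
}
```

Comparing the three entries in `results` shows how much the source data contributes. The fine-tuned model starts from weights that already encode the shared structure of the two domains, so the 20 target samples only need to adjust it. This benefit depends on the two domains being related, as noted above.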
Few: Few-shot learning, as the name implies, is a technique of training models by using a minimal amount of labeled data. A typical few-shot learning method is meta learning.
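As one concrete instance of meta learning, the sketch below uses Reptile, a first-order meta-learning algorithm, on a toy problem. This algorithm is chosen here only as an example. Each hypothetical task is a one-parameter regression `y = slope * x` with five labeled samples, and the goal is to learn an initial weight from which a few gradient steps adapt well to a new task.

```python
import random

def adapt(w, support, steps=5, lr=0.1):
    """Inner loop: a few gradient steps on mean squared error for the model y = w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in support) / len(support)
        w -= lr * grad
    return w

def sample_task(rng, shots=5):
    """Hypothetical task family: y = slope * x, with the slope drawn near 3."""
    slope = rng.uniform(2.5, 3.5)
    xs = [rng.uniform(-1, 1) for _ in range(shots)]
    return slope, [(x, slope * x) for x in xs]

def reptile(rng, iterations=1000, meta_lr=0.1):
    """Outer loop: nudge the shared initialization toward each task's adapted weight."""
    w_meta = 0.0
    for _ in range(iterations):
        _, support = sample_task(rng)
        w_task = adapt(w_meta, support)
        w_meta += meta_lr * (w_task - w_meta)
    return w_meta

rng = random.Random(0)
w_init = reptile(rng)

# A new task with only five labeled samples (the few-shot setting).
new_slope = 3.4
support = [(x, new_slope * x) for x in (-0.8, -0.3, 0.2, 0.5, 0.9)]
err_meta = abs(adapt(w_init, support) - new_slope)  # adapt from the meta-learned start
err_zero = abs(adapt(0.0, support) - new_slope)     # adapt from an uninformed start
```

With the same five samples and the same number of gradient steps, adapting from the meta-learned initialization ends closer to the true slope than adapting from zero. In a recommendation setting, the tasks could be individual users or items, so that a new one can be fitted from a handful of interactions.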