Generation and Analysis of Knowledge-Based Personalized Recommendation Reasons
1. Introduction
A text generation system that produces knowledge-based personalized recommendation reasons faces three main difficulties: 1. the generated copy must be grammatical and fluent; 2. the copy must use knowledge appropriate to the product's domain — a food recommendation reason, for example, cannot borrow expressions from other fields, such as "easy to assemble", which is commonly used to describe toys; 3. text generation in the recommendation scenario should be as personalized as possible, so that for the same product category (we plan to replace the current physical category with an interest category based on graph embeddings in the future), different user groups receive different recommendation reasons.
In the past, text generation was mostly based on the traditional RNN-based Seq2Seq + Attention framework (Sutskever et al., 2014; Bahdanau et al., 2014; Luong et al., 2015). This model was first used in machine translation and was later gradually extended to most text generation tasks. The field of machine translation has since shifted, and both academia and industry now generally favor the faster and stronger transformer architecture. We took the lead in introducing the transformer architecture (Vaswani et al., 2017) into the recommendation reason generation scenario and achieved better performance than the traditional RNN-based Seq2Seq at the baseline level. To address the two problems above, knowledge incorporation and personalization, we propose our new model, KOBE (KnOwledge-Based pErsonalized text generation system), built on the transformer architecture. The figure below briefly describes the task and how the model runs.
In the following, Section 2 defines the problem, Section 3 elaborates on the base model architecture and the specific implementation of our KOBE model, and Section 4 reports the experimental results and introduces the application in the Spring Festival cloud theme campaign.
2. Problem Definition
For our current recommendation scenario, the problem definition differs from before. To incorporate knowledge and personalize the recommendation reasons, we introduce attributes and knowledge. Attributes refer to certain facets of a product, such as appearance and quality. For each product, besides the title x, the input now also includes its corresponding attributes $a = (a_1, a_2, \ldots, a_l)$, where l is the number of attribute categories and each category contains multiple specific values. In the model implementation section below, we describe in detail where these attributes come from. They can, for instance, be the tags of users who frequently pay attention to similar products, so the system can use this information to produce more personalized copy.
In addition, we use an external knowledge base to link the input items with their related knowledge. Specifically, we use CN-DBpedia, in which each named entity has an associated plain-text description. For each product, we mine its corresponding knowledge w from this knowledge base.
On this basis, the problem becomes: given a product title x, its corresponding attributes a and relevant knowledge w, the system needs to generate a personalized, knowledge-grounded recommendation reason y for the product.
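As a rough illustration, the inputs and output of the task can be organized as follows. This is only a minimal sketch; the field names are hypothetical and not the system's actual schema.

```python
# Minimal sketch of the task's inputs and output; names are illustrative only.
from dataclasses import dataclass
from typing import List

@dataclass
class GenerationExample:
    title: List[str]       # product title tokens, x
    attributes: List[str]  # attribute values a (e.g., aspect, user category)
    knowledge: List[str]   # knowledge tokens w retrieved from the knowledge base
    reason: str = ""       # generated personalized recommendation reason, y
```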
3. KOBE Model Design
KOBE's architecture differs from the traditional RNN-based Seq2Seq: it uses the transformer framework, which has achieved state-of-the-art performance on machine translation tasks (Vaswani et al., 2017). Because traditional RNN-based Seq2Seq models capture long-distance dependencies less effectively than self-attention, we base our method on a transformer model built on the self-attention mechanism; in the baseline comparison, the transformer indeed outperformed the RNN-based Seq2Seq. In the following sections, we first introduce the transformer model, and then describe how we add personalization-related features and external knowledge on top of it to build the KOBE model. The figure below shows the basic framework of the KOBE model.
3.1. Transformer
The transformer is itself an encoder-decoder model and can also be regarded as a type of Seq2Seq. Unlike the RNN-based Seq2Seq, both its encoder and decoder are built on the self-attention mechanism, without any RNN architecture. Here we briefly describe the encoder and decoder.
The encoder consists of multiple self-attention layers (usually 6; the number can be increased for a larger model). The encoder takes a text sequence x as input and converts it into a word embedding sequence $e = (e_1, e_2, \ldots, e_n)$ through a word embedding layer. Unlike an RNN, the model cannot directly capture the order of the input, so we add positional encoding to e before sending it into the encoder's self-attention layers.
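The article does not specify which positional encoding is used; a minimal sketch, assuming the standard sinusoidal encoding from Vaswani et al. (2017), could look like this:

```python
import math
import torch

def sinusoidal_position_encoding(max_len: int, d_model: int) -> torch.Tensor:
    # Standard sinusoidal positional encoding (Vaswani et al., 2017); the result
    # is added element-wise to the word embedding sequence e.
    position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)   # (max_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2).float()
                         * (-math.log(10000.0) / d_model))             # (d_model/2,)
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe                                                          # (max_len, d_model)
```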
Each self-attention layer mainly consists of a self-attention mechanism and a fully connected feed-forward layer, with residual connections and layer normalization applied inside. Taking the first layer as an example, the attention mechanism can be expressed as:

$$\mathrm{Attention}(\mathbf{Q}, \mathbf{K}, \mathbf{V}) = \mathrm{softmax}\left(\frac{\mathbf{Q}\mathbf{K}^{\top}}{\sqrt{d_k}}\right)\mathbf{V}$$

where $\mathbf{Q}$, $\mathbf{K}$ and $\mathbf{V}$ denote the queries, keys and values of the attention mechanism. In self-attention, all three are linear transformations of the input e, i.e. $\mathbf{Q} = e\mathbf{W}^{Q}$, $\mathbf{K} = e\mathbf{W}^{K}$ and $\mathbf{V} = e\mathbf{W}^{V}$. Multi-head self-attention can then be expressed as:

$$\mathrm{head}_i = \mathrm{Attention}(e\mathbf{W}_i^{Q}, e\mathbf{W}_i^{K}, e\mathbf{W}_i^{V}), \quad \mathrm{MultiHead}(e) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\mathbf{W}^{O}$$

where $d_k$ denotes the hidden size of each head and $h$ the number of heads. The multi-head attention mechanism splits the input into multiple heads, performs the attention computation for each head separately, and then concatenates the outputs.
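As a minimal sketch of the scaled dot-product attention defined above (in self-attention, Q, K and V are simply projections of the same input e):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q, K, V, mask=None):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5        # (..., len_q, len_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)                  # attention weights
    return weights @ V                                   # (..., len_q, d_v)
```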
After the self-attention module, a fully connected feed-forward network (FFN) transforms the previous output; concretely:

$$\mathrm{FFN}(x) = \max(0, x\mathbf{W}_1 + \mathbf{b}_1)\mathbf{W}_2 + \mathbf{b}_2$$
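Putting the pieces together, a single encoder layer (self-attention plus FFN, each wrapped with a residual connection and layer normalization) can be sketched as below. The hyperparameters (model size 512, 8 heads, FFN size 2048) are the defaults from Vaswani et al. (2017), not values stated in this article.

```python
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    # One transformer encoder layer: multi-head self-attention + FFN,
    # each followed by a residual connection and layer normalization.
    def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads,
                                               dropout=dropout, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                 nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, key_padding_mask=None):
        # x: (batch, seq_len, d_model) — word embeddings plus positional encoding
        attn_out, _ = self.self_attn(x, x, x, key_padding_mask=key_padding_mask)
        x = self.norm1(x + self.dropout(attn_out))
        x = self.norm2(x + self.dropout(self.ffn(x)))
        return x
```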
The decoder is implemented similarly to the encoder and also stacks multiple self-attention layers. The difference is that the decoder adds a context attention connecting the encoder and decoder; its implementation mirrors self-attention, except that K and V are replaced with linear transformations of the encoder output h.
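A minimal sketch of this context (encoder-decoder) attention, reusing multi-head attention with the decoder states as queries and the encoder output h as keys and values; dimensions are illustrative:

```python
import torch.nn as nn

class ContextAttention(nn.Module):
    # Cross-attention connecting decoder and encoder: queries come from the
    # decoder states, while keys and values are projections of the encoder output h.
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, decoder_states, h, memory_key_padding_mask=None):
        # decoder_states: (batch, tgt_len, d_model); h: (batch, src_len, d_model)
        out, _ = self.attn(decoder_states, h, h,
                           key_padding_mask=memory_key_padding_mask)
        return out
```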
In the next section, we focus on our KOBE model, which adds an Attribute Fusion module and a Knowledge Incorporation module, and explain how personalization-related features and external knowledge are introduced into the original framework.
3.2. Attribute Fusion
Baseline models, whether the transformer or the RNN-based Seq2Seq, suffer from insufficient diversity in the generated text: they tend to fall back on broad, vague and repetitive expressions, and the content is often similar and monotonous. To alleviate this, we introduce attributes related to personalized recommendation, focusing on two of them. The first is the aspect, i.e., a specific facet of a product, such as appearance or quality. The second is the user category, i.e., the user group the product matches; for example, the user group corresponding to a mechanical keyboard might be technology enthusiasts. In this way, the model can generate text preferred by different user groups, in both style and content. Below we describe where these attributes come from and how the Attribute Fusion module integrates them into the model.
For the aspect, we want the model to generate recommendation reasons conditioned on a given aspect; for example, given "quality", the model should generate a reason describing the product's quality. However, because the training set is large, it is difficult to obtain a large number of recommendation reasons with well-annotated aspects. We therefore use a heuristic approach: we first train a CNN-based text classifier on a labeled dataset so that it can classify an input recommendation reason into its corresponding aspect, and then use this classifier to label a large amount of unlabeled data, so that every sample in the training set obtains an aspect (a sketch of such a classifier follows). As for the user category, we use the user preference and interest tags from the data supermarket and treat these tags as user categories; using implicit feedback such as click behavior and dwell time, each sample is assigned its corresponding user category. Once the data is prepared, the Attribute Fusion module integrates the two attributes.
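The article only states that the classifier is CNN-based; a minimal sketch, assuming a standard TextCNN-style architecture with illustrative layer sizes, could look like this:

```python
import torch
import torch.nn as nn

class AspectClassifier(nn.Module):
    # TextCNN-style classifier used to pseudo-label recommendation reasons
    # with aspects; layer sizes are illustrative, not production values.
    def __init__(self, vocab_size, num_aspects, emb_dim=128,
                 num_filters=100, kernel_sizes=(2, 3, 4)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, num_filters, k) for k in kernel_sizes)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_aspects)

    def forward(self, token_ids):                     # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)     # (batch, emb_dim, seq_len)
        pooled = [conv(x).relu().max(dim=-1).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=-1))     # aspect logits
```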
In terms of implementation, we use separate embedding layers to obtain the embedding of the aspect and the embedding of the user category, and average the two to obtain the attribute representation. We then add this attribute representation to the word embedding $e_i$ at every position of the product title's embedding sequence e, obtaining a new embedding sequence that is fed into the transformer.
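A minimal sketch of this Attribute Fusion step, with illustrative dimensions:

```python
import torch
import torch.nn as nn

class AttributeFusion(nn.Module):
    # Embed the aspect and the user category, average them, and add the
    # result to every position of the title's word embedding sequence e.
    def __init__(self, num_aspects, num_user_categories, d_model=512):
        super().__init__()
        self.aspect_embed = nn.Embedding(num_aspects, d_model)
        self.user_embed = nn.Embedding(num_user_categories, d_model)

    def forward(self, e, aspect_id, user_category_id):
        # e: (batch, seq_len, d_model) title word embeddings
        a = self.aspect_embed(aspect_id)          # (batch, d_model)
        u = self.user_embed(user_category_id)     # (batch, d_model)
        attr = (a + u) / 2                        # averaged attribute representation
        return e + attr.unsqueeze(1)              # broadcast over every time step
```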
3.3. Knowledge Incorporation
Another common problem in text generation is a lack of informativeness. The recommendation reasons generated by the baseline models are relatively monotonous and carry little information, and in particular convey little relevant knowledge. What readers want to see in a recommendation reason is useful information, i.e., useful knowledge. With this in mind, we designed the Knowledge Incorporation module to bring product-related knowledge into the generation model, so that the generated recommendation reasons become more substantive.
To encode knowledge, we add a separate knowledge encoder. It is also based on the self-attention mechanism, with the same structure as the encoder described above, and it encodes the input knowledge into a representation u. At this point we have both the product title representation h and the related knowledge representation u.
To effectively combine the title representation and the knowledge representation, we implement a bidirectional attention mechanism (Seo et al., 2016), consisting of title-to-knowledge attention and knowledge-to-title attention. The two serve different purposes: title-to-knowledge attention gathers the knowledge relevant to the title, while knowledge-to-title attention captures the parts of the title relevant to the knowledge. In both directions, we first compute a similarity between the two representations and then take a weighted sum of the attended content according to those weights. Following Seo et al. (2016), the similarity between title position t and knowledge position j can be computed as:

$$S_{tj} = \mathbf{w}^{\top}\left[\mathbf{h}_t; \mathbf{u}_j; \mathbf{h}_t \odot \mathbf{u}_j\right]$$

where $\mathbf{w}$ is a trainable weight vector, $[\cdot\,;\cdot]$ denotes concatenation and $\odot$ element-wise multiplication.
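A minimal sketch of this bidirectional attention between h and u, following the formulation of Seo et al. (2016); the fusion at the end (concatenation) is one common choice and is assumed here, not stated in the article:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiAttention(nn.Module):
    # Bidirectional attention between the title representation h and the
    # knowledge representation u, in the style of Seo et al. (2016).
    def __init__(self, d_model=512):
        super().__init__()
        # similarity(h_t, u_j) = w^T [h_t; u_j; h_t * u_j]
        self.w = nn.Linear(3 * d_model, 1, bias=False)

    def forward(self, h, u):
        # h: (batch, T, d) title; u: (batch, J, d) knowledge
        T, J = h.size(1), u.size(1)
        h_exp = h.unsqueeze(2).expand(-1, -1, J, -1)          # (batch, T, J, d)
        u_exp = u.unsqueeze(1).expand(-1, T, -1, -1)          # (batch, T, J, d)
        sim = self.w(torch.cat([h_exp, u_exp, h_exp * u_exp], dim=-1)).squeeze(-1)

        # title-to-knowledge: each title position attends over the knowledge
        t2k = F.softmax(sim, dim=-1) @ u                      # (batch, T, d)
        # knowledge-to-title: attend over title positions using max-pooled scores
        k2t_w = F.softmax(sim.max(dim=-1).values, dim=-1)     # (batch, T)
        k2t = (k2t_w.unsqueeze(-1) * h).sum(dim=1, keepdim=True)  # (batch, 1, d)

        # fuse the title with both attended representations (one common choice)
        return torch.cat([h, t2k, h * t2k, h * k2t.expand_as(h)], dim=-1)
```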
4. Application in the Spring Festival Cloud Theme Campaign
During the Spring Festival, we launched the KOBE model in the Spring Festival cloud theme project, providing a personalized recommendation reason for each category. The generated recommendation reasons read fluently, choose descriptions consistent with product-related knowledge according to the category and product characteristics, and include some engaging expressions. Below is a schematic of the online effect and a demo video:
5. Results & Outlook
Generating recommendation reasons is not the same as explaining why a classic recommender system recommends a specific product, but the generation technology described above has gradually matured. Content consumption is the current trend, and the move toward content-rich e-commerce platforms is promising. We hope this work helps the platform in its exploration of content generation.