Knowledge Graph Prediction and Interpretation Based on Interactive Embedding

Summary

Knowledge graph embedding algorithms map entities and relations into a continuous vector space. Although many such algorithms have been proposed and successfully applied to knowledge graph tasks, the cross-interaction between entities and relations has received little attention. Cross-interaction refers to the mutual influence between entities and relations during relational reasoning: the relation to be predicted affects which information about an entity is selected, and the information held by the entity affects how the relation to be predicted is inferred. Cross-interaction is ubiquitous in human relational reasoning.

In this paper, we propose a knowledge graph reasoning algorithm, CrossE, that explicitly models cross-interactions between entities and relations. CrossE learns not only a general embedding for each entity and relation but also multiple interaction embeddings for each of them, so that the model captures the interaction information between entities and relations more fully and achieves more accurate relational reasoning. We also analyze the knowledge graph embedding results from the perspective of providing explanations.

Research Background

Knowledge graphs store numerous facts as triples of the form (head entity, relation, tail entity) and are applied to many artificial intelligence tasks, such as search, question answering, and recommendation. Knowledge graph embedding methods learn vector-space representations of entities and relations, called entity embeddings and relation embeddings, usually in the form of vectors or matrices. Many effective knowledge graph embedding methods have been proposed in recent years, including RESCAL, TransE, NTN, and DistMult, and they are widely used in tasks such as knowledge graph completion, question answering, and relation extraction.

Cross-interaction refers to the mutual influence between entities and relations during relational reasoning. It can be divided into interactions from relations to entities and interactions from entities to relations. Cross-interaction is a widespread phenomenon in relational reasoning. Let's take the following figure as an example to illustrate what cross-interaction is:

Assume that the knowledge graph shown in the figure above is given and that the task is to predict (X, isFatherOf, ?). Entity X appears in six triples. Four of them (shown in shades of red) are relevant to the relation being predicted, isFatherOf. The other two (shown in blue) describe X's occupation and are irrelevant to the current prediction. This is the interaction from relation to entity: the relation to be predicted affects which information about the entity is selected. Conversely, the figure shows two inference paths for the relation isFatherOf (drawn in dark red and light red). Only one of them is available for entity X, namely the path containing hasWife and hasChild. This is the interaction from entity to relation: the information the entity holds affects which inference path is chosen for the relation.
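The two directions of interaction in this example can be illustrated with a small sketch. It is only an illustration, not part of CrossE: the entity names Y, Z, and CompanyA, the relation worksAt, and the exact direction of the hasWife and hasChild edges are assumptions, since the original figure is not reproduced here.

```python
# Toy knowledge graph around entity X. Only hasWife and hasChild come from
# the text; worksAt, Y, Z and CompanyA are hypothetical placeholders.
triples = [
    ("X", "hasWife", "Y"),
    ("Y", "hasChild", "Z"),
    ("X", "worksAt", "CompanyA"),  # occupational fact, irrelevant to isFatherOf
]


def infer_is_father_of(triples, head):
    """Follow the path head -hasWife-> y -hasChild-> z and return every z."""
    # Relation-to-entity interaction: keep only the facts about `head`
    # that matter for the relation being predicted.
    wives = {t for h, r, t in triples if h == head and r == "hasWife"}
    # Entity-to-relation interaction: this inference path is usable only
    # because `head` actually has a hasWife edge.
    children = {t for h, r, t in triples if h in wives and r == "hasChild"}
    return sorted(children)


candidates = infer_is_father_of(triples, "X")
print(candidates)
```

An entity without a hasWife edge yields no candidates along this path, which mirrors the point that the information an entity holds determines which inference path can be used.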

Given cross-interactions, the embeddings of an entity and a relation within a specific triple should reflect the effect of those interactions and should therefore differ from one triple to another. However, previously proposed knowledge graph embedding methods ignore cross-interaction, and most of them learn a single general embedding for each entity and relation, which does not match how reasoning actually proceeds. This paper therefore proposes CrossE, a new embedding method that explicitly simulates cross-interactions.

Algorithm Description

CrossE learns not only general embeddings for entities and relations but also, through an interaction matrix, interaction embeddings that vary with the triple in which they appear. The overall approach of CrossE is shown in the diagram below:

Here E is the entity matrix, in which each row is the embedding of an entity; C is the interaction matrix, in which each row corresponds to a relation in the knowledge graph; and R is the relation embedding matrix, in which each row is the embedding of a relation. To predict a tail entity, CrossE first looks up the embedding h of the current head entity in E, the embedding r of the current relation in R, and the interaction vector c_r for the current relation in C. It then combines h with c_r to obtain the interaction embedding of h, and combines r with h and c_r to obtain the interaction embedding of r. These are then added to a bias vector and passed through an activation function to give the embedding of the predicted tail entity, which is compared with the embedding of the actual tail entity t to produce the final triple score. The specific formula is as follows:
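The steps just described can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the text does not say which operations are used, so the element-wise product for "interacts with", tanh as the activation, and a sigmoid over the dot product with t as the comparison are assumptions, as are the helper names and the random toy matrices.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def crosse_score(h, r, c_r, t, b):
    """Score one triple (h, r, t) following the steps described in the text."""
    # Interaction embedding of the head: h combined with c_r
    # (element-wise product is an assumption).
    h_inter = [c * x for c, x in zip(c_r, h)]
    # Interaction embedding of the relation: r combined with h and c_r.
    r_inter = [hi * ri for hi, ri in zip(h_inter, r)]
    # Add the bias and apply the activation (tanh is an assumption)
    # to get the embedding of the predicted tail entity.
    q = [math.tanh(hi + ri + bi) for hi, ri, bi in zip(h_inter, r_inter, b)]
    # Compare with the actual tail embedding (sigmoid of the dot product,
    # also an assumption).
    return sigmoid(sum(qi * ti for qi, ti in zip(q, t)))


def random_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
dim = 4
E = random_matrix(5, dim)  # entity matrix: one row per entity
R = random_matrix(3, dim)  # relation matrix: one row per relation
C = random_matrix(3, dim)  # interaction matrix: one row per relation
b = [0.0] * dim            # bias vector

head, rel = 0, 1
# Tail prediction: score every entity as a candidate tail and rank them.
scores = [crosse_score(E[head], R[rel], C[rel], t, b) for t in E]
ranking = sorted(range(len(E)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Because the interaction embeddings are recomputed from h, r, and c_r for every triple, the same entity contributes different vectors under different relations without any extra per-triple parameters.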

We use a log-likelihood loss function as the training objective. Note that although the model introduces many interaction embeddings for entities and relations, they are all computed from existing embeddings rather than learned separately, so the only additional parameters come from the interaction matrix C.
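As a rough illustration of both points, the sketch below computes a negative log-likelihood over triple scores and counts the learnable parameters. The binary cross-entropy form, the absence of a regularization term, the single shared bias vector, and the function names are assumptions made for this sketch; the text only states that a log-likelihood loss is used and that C is the only source of extra parameters.

```python
import math


def negative_log_likelihood(scores, labels, eps=1e-12):
    """Mean negative log-likelihood of binary labels given scores in (0, 1)."""
    total = 0.0
    for p, y in zip(scores, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(scores)


def parameter_count(num_entities, num_relations, dim, with_interaction=True):
    """Count parameters: E, R and one bias vector, plus C if enabled."""
    count = num_entities * dim + num_relations * dim + dim
    if with_interaction:
        count += num_relations * dim  # interaction matrix C
    return count


# A model that scores true triples high and false ones low has a lower loss.
good = negative_log_likelihood([0.9, 0.1], [1, 0])
bad = negative_log_likelihood([0.1, 0.9], [1, 0])

# Hypothetical sizes, chosen only to show where the extra parameters come from.
extra = parameter_count(100, 10, 8) - parameter_count(100, 10, 8, with_interaction=False)
print(good, bad, extra)
```

With these hypothetical sizes, the interaction matrix adds num_relations × dim parameters, which is small compared with the entity matrix whenever there are far more entities than relations.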

Experiment

In the experiments, we evaluate the embeddings from two aspects: link prediction performance and the ability to explain prediction results.

For link prediction, we tested CrossE on three standard datasets: WN18, FB15k, and FB15k-237. Because it effectively captures the cross-interaction between entities and relations, CrossE achieved a clearer improvement on the more challenging datasets, FB15k and FB15k-237. The experimental results are as follows:

We also evaluate CrossE on its ability to provide explanations for its predictions. An explanation in this paper is a path between the head entity and the tail entity. After summarizing all possible paths and structures within two hops of the head and tail entities, we propose six analogy structures and provide explanations for predictions by searching the knowledge graph for these structures. We propose two metrics for a model's ability to provide explanations, Recall and AvgSupport. Recall is the proportion of predictions for which an analogy structure provides an explanation, out of all predictions. AvgSupport is the average support of the predictions that can be explained, where support is the number of instances of similar structures between the head and tail entities that exist in the knowledge graph. In general, embedding models that are better at providing explanations have higher Recall and AvgSupport, because they embed entities better.
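The two metrics can be computed from a list of per-prediction support counts, as in the sketch below. It assumes the search for analogy structures has already been done and that a prediction counts as explained when its support is greater than zero; the function name and the sample counts are hypothetical.

```python
def explanation_metrics(supports):
    """Return (Recall, AvgSupport) from per-prediction support counts.

    supports[i] is the number of analogy-structure instances found in the
    knowledge graph for prediction i; 0 means no explanation was found.
    """
    explained = [s for s in supports if s > 0]
    # Recall: share of all predictions that have at least one explanation.
    recall = len(explained) / len(supports) if supports else 0.0
    # AvgSupport: mean support over the explained predictions only.
    avg_support = sum(explained) / len(explained) if explained else 0.0
    return recall, avg_support


# Hypothetical support counts for five predictions.
recall, avg_support = explanation_metrics([3, 0, 5, 0, 4])
print(recall, avg_support)
```

In this toy input, three of the five predictions have an explanation, and those three have 12 supporting instances in total.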

Thanks to its interaction embeddings, CrossE explains its predictions better when Recall and AvgSupport are considered together. The specific results are as follows:

We also provide example explanations for the six types of analogy structures, taken from CrossE's results on FB15k-237, as follows:

The explanations in the table above help us judge whether the model's prediction is correct, which is one reason explanations are necessary.

Conclusion

As a common phenomenon, cross-interaction should be captured in knowledge graph reasoning. We therefore designed CrossE, an embedding model that explicitly simulates the cross-interaction between entities and relations. Experiments show that capturing cross-interaction effectively improves the model's reasoning ability, especially on complex and more challenging datasets. We also evaluate embedding models from a new perspective, namely their ability to provide explanations for prediction results, and propose two evaluation metrics for it. The experimental results show that the predictive ability of an embedding model is not directly related to its ability to provide explanations: a model with better predictive ability does not necessarily provide better explanations, and the two are separate evaluation dimensions. We hope our research can inspire other researchers.
