Alibaba Open-Sources Graph-Learn


Alibaba recently open-sourced Graph-Learn (GL, formerly AliGraph), a framework for Graph Neural Networks (GNNs). The framework was developed in-house by engineers from the Computing Platform Division (PAI team), the New Retail Intelligent Engine Business Group (Intelligent Computing Laboratory), and the Security Department (Data and Algorithm team).

GL aims to reduce the cost of putting GNNs into production and to accelerate iteration across the whole GNN ecosystem. Alibaba began exploring GNNs several years ago and has accumulated valuable experience along the way from research to real deployment. We are gradually sharing that experience through GL, in the hope that it helps practitioners in the field.

GL is designed for industrial scenarios and provides a basic runtime framework for today's mainstream GNN algorithms. Because it originated in industry, GL natively supports large-scale graph data, heterogeneous graphs, and attributed graphs. These are necessary but hard problems that current deep learning frameworks (TensorFlow, PyTorch, etc.) do not handle well. Since the upper-layer neural network is highly business-specific and customized, GL can be combined with any deep learning framework that has a Python interface. The framework is lightweight and flexible, and its internal modules leave ample room for extension so that it can be customized for different scenarios. GL also ships with a variety of built-in GNN models and TensorFlow-based programming interfaces for reuse and reference.

Design Philosophy
GNNs are a very popular subfield of AI today, and researchers have high expectations for them. In deep learning, where everything is a vector, we hope to incorporate more knowledge so that deep learning can move from perceptual learning to cognitive learning. Human knowledge is stored in computers as graph structures, which is why graphs and neural networks are being combined.

Having moved from deep learning to GNNs as developers, we are well aware of how hard it is to implement a GNN algorithm and of what keeps GNNs from being widely used. We also know that, although GNNs are hot, they are not yet as good as everyone says. Broader application scenarios, changes in algorithm theory, and changes in programming paradigms may all force a platform to change or even be overturned. Faced with these uncertainties, what should a platform do?

We have distilled this experience into GL. Whether you use GL directly or design a similar system with GL as a reference, we hope it helps. Only when more people understand and work with GNNs will the leap from perception to cognition be more than empty talk.

GL follows the principles of portability and ease of use, preserves the extensibility of its internal sub-modules, and stays compatible with the open source ecosystem. In short, it is lightweight and portable, built from extensible modules, exposed through reusable interfaces, and compatible with the surrounding ecosystem.

Lightweight and Portable

Like mainstream deep learning frameworks, the GL platform code is written in C++. On Linux, any compiler that supports C++11 can build and package the source code, and the few external libraries it depends on are well known in the open source community. The first build takes a few minutes because external dependencies must be downloaded; each subsequent build and packaging cycle during development takes only seconds, so the deployment cost is very low. GL can run on physical machines or in Docker, and it can also be launched with one click on Alibaba Cloud's ACK service, provided the distributed machines have network connectivity to one another.

Extensible Modules

The system is highly modular, and each module can be extended independently. This extensibility gives the system enough flexibility to cope with the uncertainty of future development and to fit different operating environments. It also gives developers room to extend the system freely and to validate prototypes quickly, without being limited to the functions the system currently provides.

The storage module is abstracted into two layers, FileSystem and Storage. To add custom storage, you only need to implement the FileSystem interface; alternatively, you can connect directly to another graph storage system by implementing the Storage interface.

The Partition module defines how data is distributed across servers and how a distributed computing request is forwarded to the right place. A new partition strategy (for example, a graph partitioning better suited to a business scenario) only requires extending this module.

The computation module is composed of operators, and operators can be customized. The framework executes operators in a distributed manner: the computation defined by each operator is forwarded according to the specified partition policy, and global storage is accessible to it as a resource. The built-in operator types currently include Sampling, Negative Sampling, Aggregation, Graph Traverse, Graph Query, and Graph Update.

The RPC module is decoupled from the other modules and is only responsible for sending and receiving requests, so it is easy to plug in another RPC framework. The Naming module handles distributed address discovery, which makes it easy to integrate with different scheduling environments.
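
The role of a partition policy in forwarding requests can be illustrated with a minimal sketch. This is not GL's C++ implementation: the names `hash_partition`, `range_partition`, and `route_request` are hypothetical, and the sketch only assumes that a policy maps a vertex id to a server index and that a batched request is split accordingly.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical stand-in for a partition strategy:
# (vertex_id, num_servers) -> index of the server that owns the vertex.
PartitionFn = Callable[[int, int], int]


def hash_partition(vertex_id: int, num_servers: int) -> int:
    """Place a vertex on a server by taking its id modulo the server count."""
    return vertex_id % num_servers


def range_partition(vertex_id: int, num_servers: int,
                    ids_per_server: int = 100) -> int:
    """Alternative policy: contiguous id ranges live on the same server."""
    return min(vertex_id // ids_per_server, num_servers - 1)


def route_request(vertex_ids: List[int], num_servers: int,
                  partition: PartitionFn = hash_partition) -> Dict[int, List[int]]:
    """Split one batched request into per-server sub-requests."""
    shards: Dict[int, List[int]] = defaultdict(list)
    for vid in vertex_ids:
        shards[partition(vid, num_servers)].append(vid)
    return dict(shards)


if __name__ == "__main__":
    batch = [3, 10, 101, 205, 7]
    print(route_request(batch, 3))                   # default hash policy
    print(route_request(batch, 3, range_partition))  # swapped-in policy
```

Swapping the policy changes where each id is sent, while the code that issues the request stays the same. That is the property the Partition module is described as providing.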

Reusable Interfaces

Interface reusability has two aspects: backward compatibility and functional extension. The need for compatibility is self-evident, and many developers have suffered from interface breakage after a version update. For functional extension, the usual practice is to add new APIs, which imposes a learning cost on users, assuming they notice the new API at all. New APIs annoy users less than compatibility breaks do, but every added API also brings a large maintenance cost for the system. With GNNs iterating as fast as they are now, functional extension happens almost constantly (for example, adding a new graph sampling algorithm), and backward compatibility is hard to maintain as application scenarios and algorithms change. We do our best to minimize the impact on users when such changes occur.

GL reduces these risks by raising the level of abstraction of the user interface. We found that many GNN researchers have a background in graphs and are more or less familiar with Gremlin, an abstract graph query language. Gremlin is to graphs what SQL is to tables. GL provides a set of Gremlin-like Python interfaces, which are translated into the various operator implementations. When a new operator is added, only the parameters of the corresponding interface call need to change. For example, if a user randomly samples the neighbors of a vertex and a new sampling algorithm is then added, only the following change is required:

import graphlearn as gl
g = gl.Graph()
# sample 10 neighbors for each node in this batch by random sampler
# sample 10 neighbors for each node in this batch by new sampler
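
One way to picture why a Gremlin-like chain keeps user code stable is the self-contained sketch below. It does not use the graphlearn package: the `Query` class, the `SAMPLERS` registry, the `first_k` strategy, and the toy adjacency list are all hypothetical stand-ins. The point is only that switching to a newly added sampler changes the argument passed to `by(...)` and nothing else.

```python
import random
from typing import Callable, Dict, List

# Toy adjacency list; the data is made up for illustration.
ADJ: Dict[int, List[int]] = {1: [2, 3, 4], 2: [1, 4], 3: [1], 4: [1, 2]}

Sampler = Callable[[List[int], int, random.Random], List[int]]


def random_sampler(nbrs: List[int], count: int, rng: random.Random) -> List[int]:
    """Sample `count` neighbors uniformly, with replacement."""
    return [rng.choice(nbrs) for _ in range(count)]


def first_k_sampler(nbrs: List[int], count: int, rng: random.Random) -> List[int]:
    """A 'new' strategy: take neighbors in order, cycling to pad to `count`."""
    return [nbrs[i % len(nbrs)] for i in range(count)]


# Registry mapping a strategy name to its implementation (hypothetical).
SAMPLERS: Dict[str, Sampler] = {"random": random_sampler, "first_k": first_k_sampler}


class Query:
    """Hypothetical fluent query object; not the graphlearn API."""

    def __init__(self, adj: Dict[int, List[int]], seed: int = 0) -> None:
        self._adj = adj
        self._rng = random.Random(seed)
        self._batch: List[int] = []
        self._count = 0

    def V(self, ids: List[int]) -> "Query":
        self._batch = list(ids)
        return self

    def sample(self, count: int) -> "Query":
        self._count = count
        return self

    def by(self, strategy: str) -> Dict[int, List[int]]:
        sampler = SAMPLERS[strategy]
        return {v: sampler(self._adj[v], self._count, self._rng) for v in self._batch}


if __name__ == "__main__":
    # Same chain, different strategy name.
    print(Query(ADJ).V([1, 2]).sample(3).by("random"))
    print(Query(ADJ).V([1, 2]).sample(3).by("first_k"))
```
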
We have also abstracted the programming interface for GNN models to some degree and provide many examples for reference.

Ecosystem Compatibility

GL exposes its user interface in Python and returns results as NumPy arrays, which makes it easy to pick up. In addition, GL can be used together with the current mainstream deep learning frameworks, such as TensorFlow and PyTorch, to enrich the expressive power of the upper-layer neural network. In an end-to-end GNN application, GL and the deep learning framework complement each other well. Our consistent principle is to hand each kind of computation to the framework that is best at it: graph computation to GL, and numeric computation to TensorFlow or PyTorch.
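
The hand-off between the graph side and the numeric side can be sketched as follows. This is an illustration under assumptions, not GL code: `mean_aggregate` is a hypothetical helper, the arrays are toy data, and plain NumPy stands in for the step that TensorFlow or PyTorch would normally perform. It assumes the sampler returns neighbor ids as an integer array of shape [batch, count].

```python
import numpy as np


def mean_aggregate(features: np.ndarray, neighbor_ids: np.ndarray) -> np.ndarray:
    """Average the feature vectors of each node's sampled neighbors.

    features:     [num_nodes, dim] feature matrix
    neighbor_ids: [batch, count] integer ids, as a graph sampler might return
    returns:      [batch, dim] aggregated neighbor representation
    """
    gathered = features[neighbor_ids]  # integer indexing -> [batch, count, dim]
    return gathered.mean(axis=1)


if __name__ == "__main__":
    feats = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    nbrs = np.array([[1, 2], [3, 3]])  # two nodes, two sampled neighbors each
    print(mean_aggregate(feats, nbrs))
```

Because the graph side produces ordinary arrays, the same `neighbor_ids` could be fed to an embedding lookup in whichever deep learning framework the upper-layer model uses.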

Results in Production
GL has been deployed in dozens of scenarios within Alibaba Group, including search and recommendation, security risk control, new retail, and knowledge graphs. The graphs handled by GL's daily tasks are heterogeneous, with tens of billions of edges, billions of vertices, and more than a hundred attributes of mixed types. The previous approach used big-data computing jobs (such as MapReduce) to turn graph data into samples that deep learning frameworks could consume. Compared with that approach, GL saves 10,000 CPU hours (core x hour) of computing power and hundreds of TB of storage per model per day, shortens the time from GNN algorithm development to launch to one third of what it was, and has brought significant improvements in business results. In addition, GL received the SAIL Pioneer Award at the 2019 World Artificial Intelligence Conference.

Let us take several typical scenarios in security risk control as examples of GL in practice. The Data and Algorithm team of the Alibaba Security Department is committed to fighting black- and gray-market operations and to protecting the experience and vital interests of users on platforms such as Taobao, Tmall, and Xianyu. To counter these operations, the team has developed a series of algorithmic weapons, and graph neural networks (GNNs) are one of the important prevention and control technologies. As a technology that has emerged in recent years, a GNN considers not only the attributes of a node itself but also the structure of the network around it, which lets it characterize the relationships, gangs, and industrial-chain information behind black- and gray-market activity. It has delivered gains across a wide range of risk control scenarios. Applying GNNs to risk control is nonetheless very challenging. The graphs we face often have the following two characteristics:

Highly heterogeneous: nodes and edges both come in many diverse types
Huge data scale: many graphs have billions of nodes and billions or even tens of billions of edges

Spam Registration Identification

Among the users who register on Taobao every day, normal users are the vast majority, but many black- and gray-market users pose as normal users to obtain accounts for activities such as fake orders and spam comments. We call these accounts "junk accounts". Once registered, a junk account may engage in all kinds of harmful activity, so it is essential to identify and remove it at registration time. We connect accounts through relationships such as shared mobile phone numbers, device information, and IP addresses, and use graph-learn to build a homogeneous account-to-account graph from which new account representations are learned. The spam registration graph model has been running stably online for nearly a year. Compared with using account features alone, it identifies an additional 10-15% of spam accounts every day while maintaining very high accuracy.
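
The graph construction step described above can be sketched in a few lines. This is a toy illustration, not the production pipeline: the function `build_account_edges`, the field names `account`, `phone`, `device`, and `ip`, and the sample records are all made up. It assumes that two accounts are linked whenever they share a value for any of the listed attributes.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple


def build_account_edges(records: List[Dict[str, str]],
                        keys: Tuple[str, ...] = ("phone", "device", "ip"),
                        ) -> Set[Tuple[str, str]]:
    """Connect two accounts whenever they share a value for any key."""
    # Group account ids by (attribute name, attribute value).
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for rec in records:
        for key in keys:
            value = rec.get(key)
            if value:
                groups[(key, value)].append(rec["account"])

    # Every pair of accounts inside one group becomes an undirected edge.
    edges: Set[Tuple[str, str]] = set()
    for accounts in groups.values():
        for a, b in combinations(sorted(set(accounts)), 2):
            edges.add((a, b))
    return edges


if __name__ == "__main__":
    # Made-up registration records.
    records = [
        {"account": "a1", "phone": "p1", "device": "d1", "ip": "ip1"},
        {"account": "a2", "phone": "p2", "device": "d1", "ip": "ip2"},
        {"account": "a3", "phone": "p3", "device": "d3", "ip": "ip2"},
        {"account": "a4", "phone": "p4", "device": "d4", "ip": "ip4"},
    ]
    print(sorted(build_account_edges(records)))
```

In this toy data, `a1` and `a2` are linked by a shared device, `a2` and `a3` by a shared IP address, and `a4` stays isolated. The resulting edge set is the kind of homogeneous account graph on which a GNN can then learn account representations.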

Taobao Counterfeit Identification

Alibaba has put great effort into protecting intellectual property rights and has achieved remarkable results. However, a very small number of sellers still sell counterfeit goods on Taobao, which we have always abhorred. In addition to using the features of the counterfeit goods themselves, we have therefore carefully selected various relationships among counterfeit goods and counterfeit sellers, such as gang relationships between sellers and industrial-chain relationships such as logistics, and used them to build a heterogeneous graph of merchants and commodities. The Taobao counterfeit graph model developed with graph-learn has been deployed in many categories, including clothing, footwear, and jewelry. Compared with using commodity and merchant features directly, the graph model identifies more than 10% additional counterfeit goods.

Xianyu Spam Comment Identification

Xianyu is currently the largest trading platform for second-hand goods in China. Buyers and sellers can communicate and ask questions by commenting on listings. However, black- and gray-market users also leave comments under listings that are suspected advertisements, fraudulent, related to counterfeits, or even prohibited content. This degrades the user experience and exposes users to risk. To identify spam comments on Xianyu, we designed GAS, an anti-spam system based on a heterogeneous graph convolutional network and tailored to the characteristics of the business. Compared with a single-node deep model, it achieves a 16% improvement at the same accuracy. We also wrote up the methods used in the project: the paper "Spam Review Detection with Graph Convolutional Networks" was published at CIKM 2019, a top conference in information retrieval, and won the Best Applied Research Paper Award.

Malicious Review Identification

Malicious reviews include review extortion, attacks by competitors, and fake reviews, and they have long been one of the main problems troubling merchants on the Taobao platform. Compared with a traditional graph model, a heterogeneous graph neural network aggregates different subgraphs, which removes the need for subjective judgments about strong and weak edges, and it can fuse edge information of different strengths through graph fusion. In the Taobao malicious review scenario, the graph model developed with graph-learn has streamlined the overall data preparation process, improved training efficiency, and improved the business experience of merchants.

"Professional Foodie" Identification

Some "professional foodies" remain active on the Taobao platform. They place orders frantically on platforms such as Taobao and Ele.me, but apply for "refund only" as soon as the goods arrive and refuse to return them, then pressure merchants into compromising by threatening professional complaints. This is typical "professional foodie" behavior. To deal with these buyers, who abuse Taobao membership rights and damage the normal operating order of the platform, we built a "professional foodie" graph model with graph-learn on top of various intermediary relationships. Compared with a traditional GBDT model, and at the same accuracy, the graph model identifies an additional 15% of malicious buyers, protecting the rights and interests of merchants on the platform.

Future Plans
Going forward, we will invest in the following areas around GL. GL's extensibility also leaves plenty of room for new directions.

New Hardware

Image-centric applications catalyzed the development of GPUs. It is foreseeable that combining graphs, the most widely used data format in real production, with neural networks will likewise prompt more thinking about how to pursue peak performance from a hardware perspective. Alibaba has already begun exploring this area internally.

New Algorithms

In recent years, papers on GNN algorithms have mostly been extensions within the GCN framework, and the programming pattern is relatively fixed. This line of work has now run into bottlenecks to varying degrees. We have begun trying some innovations in algorithm theory, and GL's ease of use and extensibility will help with that process.

New Business

GNNs apply to a wide range of businesses and can bring many unexpected benefits. The applications deployed so far are concentrated in a few large companies and have not yet become widespread. By open-sourcing GL, we also hope to work with practitioners to expand the GNN ecosystem together.
