This topic describes how to use Bipartite Graph SAmple and aggreGatE (GraphSAGE) to
obtain feature vectors of users and items for matching recall.
Background information
Graph neural networks are a widely discussed topic in deep learning. The open source
Graph-Learn framework of Machine Learning Platform for AI (PAI) provides a large number
of graph learning algorithms. GraphSAGE is one of these algorithms: it learns a vector
representation of each vertex by sampling adjacent vertices and aggregating their features.
Bipartite GraphSAGE is an extension of GraphSAGE that processes bipartite
graphs. Taobao uses Bipartite GraphSAGE for matching recall.
In a bipartite graph, each user or item is represented by a vertex. The correlation,
such as clicking or purchasing, between a user and an item is represented by an edge.
The system samples the adjacent vertices of each user vertex and each item vertex
along the meta paths User-Item-User-Item... and Item-User-Item-User..., so the vertex type alternates at every hop.
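The meta-path sampling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Graph-Learn implementation: the toy edge list, the function names `build_adjacency` and `sample_meta_path`, the `fanouts` parameter, and the choice to sample with replacement are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy interaction log: one (user_id, item_id) pair per click or purchase.
EDGES = [(1, 101), (1, 102), (2, 101), (2, 103), (3, 102), (3, 103)]


def build_adjacency(edges):
    """Build user->items and item->users adjacency lists from (user, item) edges."""
    user_to_items = defaultdict(list)
    item_to_users = defaultdict(list)
    for user, item in edges:
        user_to_items[user].append(item)
        item_to_users[item].append(user)
    return user_to_items, item_to_users


def sample_meta_path(start, start_is_user, user_to_items, item_to_users, fanouts, rng):
    """Sample neighbors hop by hop, switching vertex type at every hop.

    Returns a list of layers: layer 0 is [start], layer k holds the vertices
    sampled at hop k. Sampling is with replacement (an assumption), so every
    vertex that has at least one neighbor contributes exactly `fanout` samples.
    """
    layers = [[start]]
    frontier_is_user = start_is_user
    for fanout in fanouts:
        adjacency = user_to_items if frontier_is_user else item_to_users
        next_layer = []
        for vertex in layers[-1]:
            neighbors = adjacency.get(vertex, [])
            if neighbors:
                next_layer.extend(rng.choices(neighbors, k=fanout))
        layers.append(next_layer)
        # User-Item-User-... or Item-User-Item-...: flip the type for the next hop.
        frontier_is_user = not frontier_is_user
    return layers


rng = random.Random(0)
u2i, i2u = build_adjacency(EDGES)
# Follow the meta path User-Item-User from user 1, sampling 2 neighbors per hop.
layers = sample_meta_path(1, True, u2i, i2u, fanouts=[2, 2], rng=rng)
print(layers)
```

Here `layers[1]` contains two sampled items of user 1, and `layers[2]` contains four users sampled from those items. A GraphSAGE-style model would then aggregate features from the outer layers inward to produce the vector of the start vertex.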
Limits
The RecSys-GraphEmbedding experiment template is provided only in the China (Beijing), China (Hangzhou), China
(Shanghai), and China (Shenzhen) regions.
Procedure
- Go to the Machine Learning Studio console.
- Log on to the PAI console.
- In the left-side navigation pane, choose .
- On the PAI Visualization Modeling page, find the project in which you want to create an experiment and click Machine Learning in the Operation column.

- Create an experiment.
- In the left-side navigation pane, click Home.
- In the Templates section, click Create below RecSys-GraphEmbedding.
- In the New Experiment dialog box, set the parameters that are described in the following table. You can
use the default values of the parameters.
Parameter | Description
Name | The name of the experiment. Default value: RecSys-GraphEmbedding.
Project | The name of the project to which the experiment belongs. You cannot change the value of this parameter.
Description | The description of the experiment. Default value: Rec system GraphEmbedding matching.
Save To | The directory for storing the experiment. Default value: My Experiments.
- Click OK.
- Optional: Wait about 10 seconds. Then, click Experiments in the left-side navigation pane.
- Optional: Click RecSys-GraphEmbedding_XX under My Experiments. The canvas of the experiment appears.
My Experiments is the directory that stores the experiment that you created, and RecSys-GraphEmbedding_XX is the name of the experiment. In the experiment name, _XX is the ID that the system automatically generates for the experiment.
- View the components of the experiment on the canvas, as shown in the following figure.
The system automatically creates the experiment based on the preset template.

Component No. | Description
1 | This component imports data from the table that records user behavior on items. The table contains the following fields:
- user: the ID of the user. The value must be of the BIGINT type.
- item: the ID of the item. The value must be of the BIGINT type.
- weight: the behavior that the user performed on the item. The value must be of the DOUBLE type. For example, the value 1 indicates that the user has purchased the item, and the value 2 indicates that the user has added the item to favorites.
2 | This component imports data from the user feature table. The table contains the following fields:
- user: the ID of the user. The value must be of the BIGINT type.
- feature: one or more features of the user. The value must be of the STRING type. If the user has multiple features, separate them with colons (:). The feature value 0 must be included in the value of feature. Each feature must be a FLOAT-type number. The system processes the features as continuous features.
3 | This component imports data from the item feature table. The table contains the following fields:
- item: the ID of the item. The value must be of the BIGINT type.
- feature: one or more features of the item. The value must be of the STRING type. If the item has multiple features, separate them with colons (:). The feature value 0 must be included in the value of feature. Each feature must be a FLOAT-type number. The system processes the features as continuous features.
4 | This component generates a user vector table and an item vector table for matching recall.
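Before you load your own data into the input tables, it can help to check rows against the field formats listed for components 1 to 3. The following Python sketch is one possible pre-check, not part of PAI. The helper names `parse_features` and `check_behavior_row` are hypothetical, and the sketch assumes that BIGINT means a 64-bit signed integer. It does not enforce the requirement about the feature value 0.

```python
# Assumption: BIGINT is a 64-bit signed integer.
BIGINT_MIN, BIGINT_MAX = -2**63, 2**63 - 1


def parse_features(raw):
    """Split a colon-separated feature string into a list of floats.

    Raises ValueError if a feature is empty or is not a number.
    """
    parts = raw.split(":")
    if any(part.strip() == "" for part in parts):
        raise ValueError(f"empty feature in {raw!r}")
    return [float(part) for part in parts]


def check_behavior_row(user, item, weight):
    """Validate one (user, item, weight) row of the behavior table."""
    for name, value in (("user", user), ("item", item)):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an integer, got {value!r}")
        if not BIGINT_MIN <= value <= BIGINT_MAX:
            raise ValueError(f"{name} is out of the BIGINT range: {value}")
    return user, item, float(weight)


features = parse_features("0.5:1.0:0")   # three continuous features
row = check_behavior_row(1, 101, 2)      # user 1 added item 101 to favorites
print(features, row)
```

With these inputs, `features` is `[0.5, 1.0, 0.0]` and `row` is `(1, 101, 2.0)`.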
- Run the experiment and view the result.
- In the top toolbar of the canvas, click Run.
- After the experiment is run, right-click graphSage-1 on the canvas and choose . In the dialog box that appears, view the feature vectors that are generated for
users.
- Right-click graphSage-1 on the canvas and choose . In the dialog box that appears, view the feature vectors that are generated for
items.
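After the user and item vectors are generated, a downstream matching recall step typically scores items against a user vector and keeps the best candidates. The following Python sketch shows one way to do this. It rests on several assumptions: this topic does not describe the layout of the vector tables or the scoring function, so the sketch assumes that the vectors are already loaded into Python lists of floats and that candidates are ranked by inner product. The names `dot` and `recall_top_k` and the sample vectors are hypothetical.

```python
import heapq


def dot(a, b):
    """Inner product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def recall_top_k(user_vector, item_vectors, k):
    """Return the IDs of the k items whose vectors score highest for the user.

    item_vectors maps item ID -> vector. Scoring by inner product is an
    assumption; ties are broken in favor of the larger item ID.
    """
    scored = ((dot(user_vector, vec), item_id) for item_id, vec in item_vectors.items())
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


# Hypothetical 2-dimensional vectors for illustration only.
user_vector = [1.0, 0.0]
item_vectors = {101: [0.9, 0.1], 102: [0.1, 0.9], 103: [0.5, 0.5]}
top_items = recall_top_k(user_vector, item_vectors, k=2)
print(top_items)
```

In this toy example the scores are 0.9, 0.1, and 0.5, so `top_items` is `[101, 103]`. For large item sets, an approximate nearest neighbor index is usually a better fit than this exhaustive scan.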