This topic describes how to use graph algorithms to manage financial risks.

Background information

Graph algorithms are used for relationship analysis. They model data as a relationship graph in which vertices are connected by edges. Machine Learning Platform for AI (PAI) provides several graph algorithm components, including K-Core, Maximum Connected Subgraph, and Label Propagation Classification.

The following figure shows the relationship graph of an interlinked group of people. The arrows in the figure represent the relationships between these people, such as colleagues or relatives. In this graph, Enoch is a trusted customer and Evan is a fraudster. Based on this information and the relationship graph, you can use graph algorithms to calculate the credit index of each person, that is, the probability of the person being a fraudster.
Figure: Relationship graph

Dataset

The following table describes the fields in the dataset that is used in this topic.
Field | Meaning | Data type | Description
start_point | Start vertex of an edge | STRING | The name of a person.
end_point | End vertex of an edge | STRING | The name of a person.
count | Closeness | DOUBLE | The closeness between two persons. A greater value indicates a closer relationship between the two persons.
The following figure shows the sample data that is used in the experiment.
Figure: Dataset
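
As a rough illustration of the schema described above, the following Python sketch parses a few rows into typed records. The column names and data types come from the field table; the sample rows, the CSV layout, and the names Edge, SAMPLE_CSV, and load_edges are assumptions made for illustration and are not part of the PAI dataset or API.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    start_point: str  # STRING: name of the person at the start vertex
    end_point: str    # STRING: name of the person at the end vertex
    count: float      # DOUBLE: closeness; a greater value means a closer relationship


# Hypothetical rows. Only the column names are taken from the field table.
SAMPLE_CSV = """start_point,end_point,count
Enoch,Ann,3
Ann,Bob,1.5
Bob,Evan,4
"""


def load_edges(text):
    """Parse CSV text into Edge records, converting count to a float."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        Edge(row["start_point"], row["end_point"], float(row["count"]))
        for row in reader
    ]


edges = load_edges(SAMPLE_CSV)
for edge in edges:
    print(edge)
```

Each record corresponds to one edge of the relationship graph, so a table of this shape is enough to rebuild the graph for any of the algorithms used later in this topic.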

Procedure

  1. Go to the Machine Learning Studio console.
    1. Log on to the PAI console.
    2. In the left-side navigation pane, choose Model Training > Studio-Modeling Visualization.
    3. On the PAI Visualization Modeling page, find the project in which you want to create an experiment and click Machine Learning in the Operation column.
  2. Create an experiment.
    1. In the left-side navigation pane, click Home.
    2. In the Templates section, click Create below [Chart Algorithms] Financial Risk Management.
    3. In the New Experiment dialog box, set the experiment parameters. You can use the default values of the parameters.
      Parameter | Description
      Name | The name of the experiment. Default value: [Chart Algorithms] Financial Risk Management. The name must be 1 to 32 characters in length. Enter a name that meets this requirement, for example, Financial Risk Management.
      Project | The project in which you want to create the experiment. You cannot change the value of this parameter.
      Description | The description of the experiment. Default value: Use chart algorithms to resolve manage financial issues based on customer credit.
      Save To | The directory for storing the experiment. Default value: My Experiments.
    4. Click OK.
    5. Optional: Wait about 10 seconds. Then, click Experiments in the left-side navigation pane.
    6. Optional: Click Financial Risk Management_XX under My Experiments. The canvas of the experiment appears.
      My Experiments is the directory for storing the experiment that you created and Financial Risk Management_XX is the name of the experiment. In the experiment name, _XX is the ID that the system automatically creates for the experiment.
    7. View the components of the experiment on the canvas, as shown in the following figure. The system automatically creates the experiment based on the preset template.
      Figure: Experiment on using graph algorithms to manage financial risks
      Area No. | Description
      1 | The Maximum Connected Subgraph-1 component classifies the people in the relationship graph into two groups, and assigns an ID to each group. Then, the SQL Script-1 and Join-1 components remove unrelated people in the relationship graph. The Maximum Connected Subgraph-1 component can find the set that contains the largest number of interlinked people and remove unrelated people, as shown in the following figure. (Figure: Maximum Connected Subgraph)
      2 | The component in this area computes the distance between two vertices. In the output of the Single-source Shortest Path-1 component, the distance field indicates the number of people that Enoch must contact to reach the target, as shown in the following figure. (Figure: Single-source Shortest Path)
      3 | The Provided data component imports the labeled data. The weight field indicates the probability of a person being a fraudster. Then, the Label Propagation Classification-1 component predicts the labels of unlabeled vertices. Finally, the fraudulent weight_SQL script component filters results and shows the probability of each person being a fraudster. (Figure: Example of labeled data) Label propagation classification is a semi-supervised classification algorithm. It uses a relationship graph and labeled data as its input and predicts the labels of unlabeled vertices based on the labels of labeled ones. Label propagation classification propagates the label of each vertex to the vertices next to the vertex.

  3. Run the experiment and view the result.
    1. In the top toolbar of the canvas, click Run.
    2. After the experiment is run, right-click fraudulent weight_SQL script on the canvas and select View Data. In the dialog box that appears, view the probability of each person being a fraudster.
      Figure: Result of the experiment
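
To make the three areas of the experiment more concrete, the following Python sketch mimics them on a tiny made-up graph. It is not the PAI implementation. It assumes that edges are undirected, that the distance is a hop count (as the description of the distance field suggests), and that label propagation can be approximated by repeatedly averaging neighbor scores weighted by closeness while the labeled vertices stay fixed. Only the names Enoch and Evan come from this topic; all other names, weights, and function names are hypothetical.

```python
from collections import deque

# Hypothetical edges (start_point, end_point, count). Only the names Enoch and
# Evan come from the text; every other name and all weights are made up.
EDGES = [
    ("Enoch", "Ann", 3.0),
    ("Ann", "Bob", 1.0),
    ("Bob", "Evan", 4.0),
    ("Ann", "Evan", 1.0),
    ("Zed", "Yan", 2.0),  # a pair with no link to the main group
]


def build_adjacency(edges):
    """Treat each edge as undirected (an assumption) and store its weight."""
    adj = {}
    for a, b, w in edges:
        adj.setdefault(a, {})[b] = w
        adj.setdefault(b, {})[a] = w
    return adj


def largest_component(adj):
    """Area 1: return the vertices of the largest connected group."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        group, queue = {start}, deque([start])
        while queue:
            v = queue.popleft()
            for n in adj[v]:
                if n not in group:
                    group.add(n)
                    queue.append(n)
        seen |= group
        if len(group) > len(best):
            best = group
    return best


def hop_distances(adj, source):
    """Area 2: breadth-first search that counts hops from the source."""
    dist, queue = {source: 0}, deque([source])
    while queue:
        v = queue.popleft()
        for n in adj[v]:
            if n not in dist:
                dist[n] = dist[v] + 1
                queue.append(n)
    return dist


def propagate_labels(adj, seeds, iterations=50):
    """Area 3: keep labeled scores fixed; give each unlabeled vertex the
    closeness-weighted average of its neighbors' scores (weights must be > 0)."""
    scores = {v: seeds.get(v, 0.5) for v in adj}
    for _ in range(iterations):
        updated = {}
        for v, neighbors in adj.items():
            if v in seeds:
                updated[v] = seeds[v]
            else:
                total = sum(neighbors.values())
                updated[v] = sum(w * scores[n] for n, w in neighbors.items()) / total
        scores = updated
    return scores


adj = build_adjacency(EDGES)
core = largest_component(adj)
core_adj = {v: adj[v] for v in core}  # drop people outside the largest group
distances = hop_distances(core_adj, "Enoch")
scores = propagate_labels(core_adj, {"Enoch": 0.0, "Evan": 1.0})
for name in sorted(core):
    print(name, distances[name], round(scores[name], 3))
```

In this made-up graph, the unlinked pair is dropped before scoring, and the person whose strongest tie is to Evan ends up with a higher score than the person whose strongest tie is to Enoch. The scores that the PAI components produce on the real dataset may differ, because the components' exact algorithms and parameters are not described in this topic.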