Platform For AI: Lorenz Curve

Last Updated: Oct 31, 2023

A Lorenz curve shows the income distribution of a country or region. How far the curve bows away from the diagonal line of perfect equality indicates the degree of income inequality: the greater the curvature, the more unequal the income distribution.

In a rectangle, the height represents the total wealth and is equally divided into N parts. The length represents the families arranged from least wealthy to most wealthy and is also equally divided into N parts, so the first part represents the least wealthy 1/N of families. At each division, a point marks the cumulative share of wealth held by all families up to that division. Connecting these points forms the Lorenz curve.
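The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the component's implementation: the function name `lorenz_points` and the rounding rule (division k covers the ceil(k × m / N) least wealthy of m families) are assumptions made for this example.

```python
from itertools import accumulate


def lorenz_points(wealth, n):
    """Return n + 1 cumulative wealth shares, one per division 0..n.

    Assumption: division k covers the ceil(k * m / n) least wealthy of the
    m families. Other rounding conventions are possible.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    ordered = sorted(wealth)                # least wealthy first
    total = sum(ordered)
    if not ordered or total <= 0:
        raise ValueError("wealth must have a positive total")
    # running[i] is the wealth held by the i least wealthy families
    running = [0, *accumulate(ordered)]
    m = len(ordered)
    points = []
    for k in range(n + 1):
        covered = (k * m + n - 1) // n      # integer ceil(k * m / n)
        points.append(running[covered] / total)
    return points


equal = lorenz_points([5, 5, 5, 5], 4)      # everyone holds the same wealth
skewed = lorenz_points([0, 0, 0, 20], 4)    # one family holds everything
print(equal)
print(skewed)
```

With equal wealth the points fall on the diagonal. When one family holds everything, the points stay at 0 until the last division, which is the most strongly bowed curve possible.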

Configure the component

You can use one of the following methods to configure the Lorenz Curve component.

Method 1: Configure the component on the pipeline page

You can configure the parameters of the Lorenz Curve component on the pipeline page of Machine Learning Designer of Machine Learning Platform for AI (PAI). Machine Learning Designer was formerly known as Machine Learning Studio. The following table describes the parameters.

| Tab | Parameter | Description |
| --- | --- | --- |
| Fields Setting | Columns | N/A |
| Parameters Setting | Quantile | Default value: 100. |
| Tuning | Computing Cores | The number of cores used in computing. The value must be a positive integer. |
| Tuning | Memory Size per Core (Unit: MB) | The memory size of each core. |

Method 2: Use PAI commands

Configure the component parameters by using PAI commands. You can use the SQL Script component to call PAI commands. For more information, see SQL Script.

PAI -name LorenzCurve
    -project algo_public
    -DinputTableName=maple_test_lorenz_basic10_input
    -DcolName=col0
    -DoutputTableName=maple_test_lorenz_basic10_output
    -DcoreNum=20
    -DmemSizePerCore=110;

| Parameter | Required | Description | Default value |
| --- | --- | --- | --- |
| inputTableName | Yes | The name of the input table. | No default value |
| outputTableName | Yes | The name of the output table. | No default value |
| colName | No | The columns selected from the input table. To select multiple columns, separate them with commas (,). | No default value |
| N | No | The number of quantiles, that is, the number of equal parts into which the data is divided. | 100 |
| inputTablePartitions | No | The partitions selected from the input table. Supported formats: Partition_name=value, or name1=value1/name2=value2 for multi-level partitions. Note: If you specify multiple partitions, separate them with commas (,). | No default value |
| lifecycle | No | The lifecycle of the output table. The value must be an integer. Unit: days. | 28 |
| coreNum | No | The number of cores used in computing. This parameter is used with memSizePerCore. The value must be a positive integer. By default, the system calculates the number of instances based on the amount of input data. | Determined by the system |
| memSizePerCore | No | The memory size of each core. Unit: MB. The value must be a positive integer. Recommended values: (1024, 64 × 1024). | Determined by the system |
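To show how the optional parameters combine, the following Python sketch assembles a command string from the parameters in the table. It is an illustration under stated assumptions: the helper `build_lorenz_command` is hypothetical, the table names and the `ds` partitions are placeholders, and the sketch assumes every parameter is passed as -D<name>=<value>, following the pattern of the command shown above. It only builds a string and does not submit anything to PAI.

```python
def build_lorenz_command(input_table, output_table, **optional):
    """Format a LorenzCurve PAI command string (hypothetical helper).

    Assumption: each documented parameter is passed as -D<name>=<value>.
    """
    allowed = {"colName", "N", "inputTablePartitions",
               "lifecycle", "coreNum", "memSizePerCore"}
    unknown = set(optional) - allowed
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # The table requires coreNum and memSizePerCore to be positive integers.
    for name in ("coreNum", "memSizePerCore"):
        value = optional.get(name)
        if value is not None and (not isinstance(value, int) or value <= 0):
            raise ValueError(f"{name} must be a positive integer")

    parts = [
        "PAI -name LorenzCurve",
        "-project algo_public",
        f"-DinputTableName={input_table}",
        f"-DoutputTableName={output_table}",
    ]
    for name, value in optional.items():
        # Multiple columns or partitions are separated with commas.
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        parts.append(f"-D{name}={value}")
    return "\n    ".join(parts) + ";"


command = build_lorenz_command(
    "my_input_table",                                      # placeholder name
    "my_output_table",                                     # placeholder name
    colName="col0",
    N=100,
    inputTablePartitions=["ds=20231030", "ds=20231031"],   # placeholder partitions
    lifecycle=28,
)
print(command)
```

Parameters that are omitted, such as coreNum and memSizePerCore here, are simply left out of the command so that the defaults in the table apply.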

Example

  1. Generate the following test data:

    | col0:double |
    | --- |
    | 4 |
    | 7 |
    | 2 |
    | 8 |
    | 6 |
    | 3 |
    | 9 |
    | 5 |
    | 0 |
    | 1 |
    | 10 |

  2. Run the following PAI command:

    PAI -name LorenzCurve
        -project algo_public
        -DinputTableName=maple_test_lorenz_basic10_input
        -DcolName=col0
        -DoutputTableName=maple_test_lorenz_basic10_output
        -DcoreNum=20
        -DmemSizePerCore=110;

  3. View the output as described in the following table.

    | quantile | col0 |
    | --- | --- |
    | 0 | 0 |
    | 1 | 0.01818181818181818 |
    | 2 | 0.01818181818181818 |
    | 3 | 0.01818181818181818 |
    | 4 | 0.01818181818181818 |
    | 5 | 0.01818181818181818 |
    | 6 | 0.01818181818181818 |
    | 7 | 0.01818181818181818 |
    | 8 | 0.01818181818181818 |
    | 9 | 0.01818181818181818 |
    | 10 | 0.01818181818181818 |
    | 11 | 0.05454545454545454 |
    | 12 | 0.05454545454545454 |
    | 13 | 0.05454545454545454 |
    | 14 | 0.05454545454545454 |
    | ... | ... |
    | 85 | 0.8181818181818182 |
    | 86 | 0.8181818181818182 |
    | 87 | 0.8181818181818182 |
    | 88 | 0.8181818181818182 |
    | 89 | 0.8181818181818182 |
    | 90 | 1 |
    | 91 | 1 |
    | 92 | 1 |
    | 93 | 1 |
    | 94 | 1 |
    | 95 | 1 |
    | 96 | 1 |
    | 97 | 1 |
    | 98 | 1 |
    | 99 | 1 |
    | 100 | 1 |
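The values in the output can be checked by hand: each is a cumulative sum of the sorted test data divided by the total, 55. The following Python sketch reproduces the listed values under that reading. It does not attempt to reproduce how the component assigns each quantile to a number of rows, because this page does not state that rule.

```python
from itertools import accumulate
from math import isclose

data = [4, 7, 2, 8, 6, 3, 9, 5, 0, 1, 10]    # col0 from step 1

ordered = sorted(data)                         # 0, 1, 2, ..., 10
total = sum(ordered)                           # 55
# shares[i] is the share of the total held by the i + 1 smallest rows
shares = [c / total for c in accumulate(ordered)]

# Values listed in the output table, keyed by the number of smallest rows
# whose cumulative sum produces them.
listed = {
    2: 0.01818181818181818,    # (0 + 1) / 55
    3: 0.05454545454545454,    # (0 + 1 + 2) / 55
    10: 0.8181818181818182,    # (0 + 1 + ... + 9) / 55
    11: 1.0,                   # (0 + 1 + ... + 10) / 55
}
for rows, value in listed.items():
    assert isclose(shares[rows - 1], value)
```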