A Lorenz curve shows how income is distributed in a country or region. The curvature of the curve indicates the degree of income inequality: the further the curve bows away from the diagonal line of perfect equality, the more unequal the income distribution.

Picture a rectangle. Its height represents the total wealth and is divided into N equal parts. Its length represents the families arranged from least wealthy to most wealthy, and is also divided into N equal parts, so the first part covers the least wealthy 1/N of families. For each part, plot the cumulative share of total wealth held by all families up to that point. Connecting these points forms the Lorenz curve.
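The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the component's implementation: the function name `lorenz_points` and the sample values are hypothetical, and it plots one point per family rather than resampling to N quantiles.

```python
from itertools import accumulate


def lorenz_points(wealth):
    """Return (population_share, wealth_share) points, least wealthy first."""
    ordered = sorted(wealth)          # arrange families from least to most wealthy
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total wealth must be positive")
    n = len(ordered)
    points = [(0.0, 0.0)]             # the curve always starts at the origin
    for i, running in enumerate(accumulate(ordered), start=1):
        # After the i least wealthy families, record their cumulative share.
        points.append((i / n, running / total))
    return points


# Hypothetical sample: four families with wealth 3, 1, 4, and 2.
sample = [3, 1, 4, 2]
for population_share, wealth_share in lorenz_points(sample):
    print(population_share, wealth_share)
```

For this sample, the least wealthy half of the families holds 30% of the total wealth, so the curve passes through (0.5, 0.3) and ends at (1.0, 1.0).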

Configure the component

You can configure the component by using one of the following methods:
  • Machine Learning Platform for AI console
    Tab | Parameter | Description
    Fields Setting | Columns | N/A
    Parameters Setting | Quantile | Default value: 100.
    Tuning | Computing Cores | The number of cores used in computing. The value must be a positive integer.
    Tuning | Memory Size per Core (Unit: MB) | The memory of each core.
  • PAI command
    PAI -name LorenzCurve
        -project algo_public
        -DinputTableName=maple_test_lorenz_basic10_input
        -DcolName=col0
        -DoutputTableName=maple_test_lorenz_basic10_output
        -DcoreNum=20
        -DmemSizePerCore=110;
    Parameter | Required | Description | Default value
    inputTableName | Yes | The name of the input table. | No default value
    outputTableName | Yes | The name of the output table. | No default value
    colName | No | The columns selected from the input table. Separate multiple columns with commas (,). | No default value
    N | No | The number of quantiles. | 100
    inputTablePartitions | No | The partitions selected from the input table. Supported formats: partition_name=value, or name1=value1/name2=value2 for multi-level partitions. Separate multiple partitions with commas (,). | No default value
    lifecycle | No | The lifecycle of the output table. The value must be an integer. Unit: days. | 28
    coreNum | No | The number of cores used in computing. This parameter is used together with memSizePerCore. The value must be a positive integer. The system calculates the number of instances based on the amount of input data. | Automatically calculated
    memSizePerCore | No | The memory size of each core. Unit: MB. The value must be a positive integer in the range (1024, 64 × 1024). | Automatically calculated
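When the command is generated from a script, the parameters above can be assembled programmatically. The following is a rough sketch only: `build_lorenz_command` and `ALLOWED_OPTIONS` are hypothetical names introduced for illustration and are not part of any PAI SDK. The helper only formats the command text and does not submit it.

```python
# Optional parameters listed in the parameter table above.
ALLOWED_OPTIONS = {"N", "inputTablePartitions", "lifecycle", "coreNum", "memSizePerCore"}


def build_lorenz_command(input_table, output_table, col_names=None, **options):
    """Format a PAI LorenzCurve command string (hypothetical helper)."""
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    params = {"inputTableName": input_table}
    if col_names:
        # Multiple columns are separated with commas.
        params["colName"] = ",".join(col_names)
    params["outputTableName"] = output_table
    params.update(options)
    lines = ["PAI -name LorenzCurve", "    -project algo_public"]
    lines.extend(f"    -D{name}={value}" for name, value in params.items())
    return "\n".join(lines) + ";"


print(build_lorenz_command(
    "maple_test_lorenz_basic10_input",
    "maple_test_lorenz_basic10_output",
    col_names=["col0"],
    coreNum=20,
    memSizePerCore=110,
))
```

Rejecting unknown option names early is a design choice of this sketch, so that a misspelled parameter fails in the script rather than at submission.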

Example

  1. Generate the following test data.
    col0:double
    4
    7
    2
    8
    6
    3
    9
    5
    0
    1
    10
  2. Run the following PAI command:
    PAI -name LorenzCurve
        -project algo_public
        -DinputTableName=maple_test_lorenz_basic10_input
        -DcolName=col0
        -DoutputTableName=maple_test_lorenz_basic10_output
        -DcoreNum=20
        -DmemSizePerCore=110;
  3. View the output as shown in the following table.
    quantile col0
    0 0
    1 0.01818181818181818
    2 0.01818181818181818
    3 0.01818181818181818
    4 0.01818181818181818
    5 0.01818181818181818
    6 0.01818181818181818
    7 0.01818181818181818
    8 0.01818181818181818
    9 0.01818181818181818
    10 0.01818181818181818
    11 0.05454545454545454
    12 0.05454545454545454
    13 0.05454545454545454
    14 0.05454545454545454
    ... ...
    85 0.8181818181818182
    86 0.8181818181818182
    87 0.8181818181818182
    88 0.8181818181818182
    89 0.8181818181818182
    90 1
    91 1
    92 1
    93 1
    94 1
    95 1
    96 1
    97 1
    98 1
    99 1
    100 1
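As a sanity check, the distinct values in the col0 column can be compared with the cumulative wealth shares of the sorted test data. The sketch below recomputes those shares locally. It does not reproduce how the component assigns each of the 101 quantile rows to a data point, which is not described here.

```python
from itertools import accumulate

# Test data from step 1 (column col0).
values = [4, 7, 2, 8, 6, 3, 9, 5, 0, 1, 10]

# Running totals of the sorted values: 0, 1, 3, 6, ..., 45, 55.
running = list(accumulate(sorted(values)))

# Divide by the grand total (55) to get cumulative shares between 0 and 1.
shares = [r / running[-1] for r in running]

for share in shares:
    print(share)
```

The shares 1/55, 3/55, 45/55, and 1 correspond to the values 0.018..., 0.054..., 0.818..., and 1 that appear in the output table.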