The Pearson correlation coefficient measures the strength and direction of the linear relationship between two variables. Its value ranges from -1 to 1. In Machine Learning Platform for AI (PAI), the Pearson Coefficient component calculates the Pearson correlation coefficient of two numeric columns in an input table or partition and writes the result to an output table.
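For reference, the usual sample definition of the coefficient for n paired values is given below. This is the textbook formula, not a statement of how the component is implemented internally; the symbols x_i, y_i (values of the two columns in row i) and the means x̄ and ȳ are introduced here for illustration only.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

A value near 1 or -1 indicates a strong positive or negative linear relationship, and a value near 0 indicates little or no linear relationship.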

Configure the component

You can configure the component by using one of the following methods:
  • Machine Learning Platform for AI console
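    +----------------+----------------+----------------------------------------------------+
    | Tab            | Parameter      | Description                                        |
    +----------------+----------------+----------------------------------------------------+
    | Fields Setting | Input Column 1 | The first numeric column used in the calculation.  |
    |                | Input Column 2 | The second numeric column used in the calculation. |
    +----------------+----------------+----------------------------------------------------+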
  • PAI command
    pai -name pearson
        -project algo_public
        -DinputTableName=wpbc
        -Dcol1Name=f1
        -Dcol2Name=f2
        -DoutputTableName=wpbc_pear;
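    +----------------------+-----------------------------------------------------+----------+
    | Parameter            | Description                                         | Required |
    +----------------------+-----------------------------------------------------+----------+
    | inputTableName       | The name of the input table.                        | Yes      |
    +----------------------+-----------------------------------------------------+----------+
    | inputTablePartitions | The partitions to read from the input table.        | No       |
    |                      | By default, all partitions are selected.            |          |
    |                      | • Single partition: partition_name=value            |          |
    |                      | • Multiple partitions, separated by commas (,):     |          |
    |                      |   name1=value1,name2=value2                         |          |
    |                      | • Multi-level partitions: name1=value1/name2=value2 |          |
    +----------------------+-----------------------------------------------------+----------+
    | col1Name             | The name of input column 1.                         | Yes      |
    +----------------------+-----------------------------------------------------+----------+
    | col2Name             | The name of input column 2.                         | Yes      |
    +----------------------+-----------------------------------------------------+----------+
    | outputTableName      | The name of the output table.                       | Yes      |
    +----------------------+-----------------------------------------------------+----------+
    | lifecycle            | The lifecycle of the output table. The value must   | No       |
    |                      | be a positive integer. By default, the output       |          |
    |                      | table has no lifecycle.                             |          |
    +----------------------+-----------------------------------------------------+----------+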
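The partition formats accepted by inputTablePartitions can be illustrated with a small Python sketch. The helper `format_partitions` and the partition names `dt` and `region` are hypothetical and not part of PAI. The sketch only assembles a string according to the rules above: levels joined with a slash, partitions joined with commas. Combining several multi-level partitions in one value is an assumption that follows from those two rules and is not confirmed by this page.

```python
def format_partitions(partitions):
    """Build a value for -DinputTablePartitions (hypothetical helper).

    `partitions` is a list of partitions. Each partition is a list of
    (name, value) pairs, one pair per partition level.
    """
    specs = []
    for levels in partitions:
        if not levels:
            raise ValueError("a partition needs at least one level")
        # Levels of one partition are joined with "/".
        specs.append("/".join(f"{name}={value}" for name, value in levels))
    # Separate partitions are joined with ",".
    return ",".join(specs)


# Single partition
print(format_partitions([[("dt", "20240101")]]))
# Multiple partitions
print(format_partitions([[("dt", "20240101")], [("dt", "20240102")]]))
# Multi-level partition
print(format_partitions([[("dt", "20240101"), ("region", "cn")]]))
```

The resulting string would be passed as the value of -DinputTablePartitions in the PAI command.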

Example

  • Input table
    create table pai_pearson_test_input as
    select * from
    (
    select 1.0 as f0,0.11 as f1
    union all
    select 2.0 as f0,0.12 as f1
    union all
    select 3.0 as f0,0.13 as f1
    union all
    select 5.0 as f0,0.15 as f1
    union all
    select 8.0 as f0,0.18 as f1
    )tmp;
  • PAI command
    pai -name pearson
        -project algo_public
        -DinputTableName=pai_pearson_test_input
        -Dcol1Name=f0
        -Dcol2Name=f1
        -DoutputTableName=pai_pearson_test_output;
  • Output table
    +-----------------------------------------+------------+------------+------------+-------------+-------------+---------------------+
    | src_table                               | src_parts  | col1_name  | col2_name  | count_total | count_valid | pearson_coefficient |
    +-----------------------------------------+------------+------------+------------+-------------+-------------+---------------------+
    | sre_mpi_algo_dev.pai_pearson_test_input |            | f0         | f1         | 5           | 5           | 0.9999999999999973  |
    +-----------------------------------------+------------+------------+------------+-------------+-------------+---------------------+
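The result can be sanity-checked outside PAI. The following Python sketch recomputes the coefficient for the five example rows with the textbook sample formula. The function `pearson` is written for this illustration and is not the component's implementation, so the last few digits may differ from the value in the output table because of floating-point rounding.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("coefficient is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Values of columns f0 and f1 from pai_pearson_test_input.
f0 = [1.0, 2.0, 3.0, 5.0, 8.0]
f1 = [0.11, 0.12, 0.13, 0.15, 0.18]

print(pearson(f0, f1))
```

In the example data every row satisfies f1 = 0.1 + 0.01 × f0, an exact linear relationship, which is why the coefficient is essentially 1.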