Platform for AI: Pearson Coefficient

Last Updated: Apr 29, 2025

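The Pearson coefficient measures the strength and direction of the linear correlation between two variables. Its value ranges from -1 to 1: values near 1 or -1 indicate a strong positive or negative linear relationship, and values near 0 indicate little or no linear relationship. In Machine Learning Designer, the Pearson Coefficient component calculates the Pearson correlation coefficient of two numeric columns in an input table or partition.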
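For reference, the textbook sample definition of the Pearson coefficient for n paired values (x_i, y_i) is sketched below. This page does not state how the component computes the value internally, so treat the formula as the standard definition rather than a description of the implementation.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Here \bar{x} and \bar{y} are the means of the two columns. The coefficient is undefined when either column is constant, because the denominator is then zero.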

Configure the component

You can use one of the following methods to configure the Pearson Coefficient component.

Method 1: Configure the component on the pipeline page

You can configure the parameters of the Pearson Coefficient component on the pipeline page of Machine Learning Designer of Platform for AI (PAI). The following table describes the parameters.

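+----------------+----------------+----------------------------------------------------------------+
| Tab            | Parameter      | Description                                                    |
+----------------+----------------+----------------------------------------------------------------+
| Fields Setting | Input Column 1 | The name of the first numeric column used in the calculation.  |
|                | Input Column 2 | The name of the second numeric column used in the calculation. |
+----------------+----------------+----------------------------------------------------------------+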

Method 2: Use PAI commands

You can also configure the component by running a PAI command, for example from the SQL Script component. For more information, see SQL Script.

pai -name pearson
    -project algo_public
    -DinputTableName=wpbc
    -Dcol1Name=f1
    -Dcol2Name=f2
    -DoutputTableName=wpbc_pear;

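+----------------------+----------------------------------------------------------------------+----------+
| Parameter            | Description                                                          | Required |
+----------------------+----------------------------------------------------------------------+----------+
| inputTableName       | The name of the input table.                                         | Yes      |
+----------------------+----------------------------------------------------------------------+----------+
| inputTablePartitions | The partitions in the input table. By default, all partitions are    | No       |
|                      | selected.                                                            |          |
|                      |   • Single partition: partition_name=value                           |          |
|                      |   • Multiple partitions (comma-separated): name1=value1,name2=value2 |          |
|                      |   • Multi-level partitions: name1=value1/name2=value2                |          |
+----------------------+----------------------------------------------------------------------+----------+
| col1Name             | The name of Input Column 1.                                          | Yes      |
+----------------------+----------------------------------------------------------------------+----------+
| col2Name             | The name of Input Column 2.                                          | Yes      |
+----------------------+----------------------------------------------------------------------+----------+
| outputTableName      | The name of the output table.                                        | Yes      |
+----------------------+----------------------------------------------------------------------+----------+
| lifecycle            | The lifecycle of the output table. The value must be a positive      | No       |
|                      | integer. By default, the output table has no lifecycle.              |          |
+----------------------+----------------------------------------------------------------------+----------+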
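To show how the required and optional parameters fit together, the following Python sketch renders a `pai -name pearson` command string. The helper `build_pearson_command` is hypothetical and not part of PAI. The partition name `dt` and the lifecycle value are illustrative assumptions, and any quoting rules for partition values are not covered here.

```python
def build_pearson_command(input_table, col1, col2, output_table,
                          partitions=None, lifecycle=None):
    """Render a `pai -name pearson` command string (hypothetical helper)."""
    # The lifecycle, if given, must be a positive integer.
    if lifecycle is not None and (not isinstance(lifecycle, int) or lifecycle <= 0):
        raise ValueError("lifecycle must be a positive integer")

    args = [
        "pai -name pearson",
        "-project algo_public",
        f"-DinputTableName={input_table}",
    ]
    # Optional: omit to read all partitions of the input table.
    if partitions:
        args.append(f"-DinputTablePartitions={partitions}")
    args += [
        f"-Dcol1Name={col1}",
        f"-Dcol2Name={col2}",
        f"-DoutputTableName={output_table}",
    ]
    # Optional: omit to create the output table without a lifecycle.
    if lifecycle is not None:
        args.append(f"-Dlifecycle={lifecycle}")
    return "\n    ".join(args) + ";"


# Example: a single partition (assumed partition column `dt`) and a lifecycle of 7.
print(build_pearson_command("wpbc", "f1", "f2", "wpbc_pear",
                            partitions="dt=20250101", lifecycle=7))
```

The resulting string can be pasted into the SQL Script component or a MaxCompute SQL task in the same way as the command shown above.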

Example

  • Input table

    Develop a MaxCompute SQL task to create the pai_pearson_test_input table. Sample statements:

    create table pai_pearson_test_input as
    select * from
    (
    select 1.0 as f0,0.11 as f1
    union all
    select 2.0 as f0,0.12 as f1
    union all
    select 3.0 as f0,0.13 as f1
    union all
    select 5.0 as f0,0.15 as f1
    union all
    select 8.0 as f0,0.18 as f1
    )tmp;
  • PAI command

    Run the following PAI command by using the SQL Script component or a MaxCompute SQL task.

    pai -name pearson
        -project algo_public
        -DinputTableName=pai_pearson_test_input
        -Dcol1Name=f0
        -Dcol2Name=f1
        -DoutputTableName=pai_pearson_test_output;
  • Output table

    +-----------------------------------------+------------+------------+------------+-------------+-------------+---------------------+
    | src_table                               | src_parts  | col1_name  | col2_name  | count_total | count_valid | pearson_coefficient |
    +-----------------------------------------+------------+------------+------------+-------------+-------------+---------------------+
    | sre_mpi_algo_dev.pai_pearson_test_input |            | f0         | f1         | 5           | 5           | 0.9999999999999973  |
    +-----------------------------------------+------------+------------+------------+-------------+-------------+---------------------+
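As a sanity check on the result above, the sample data can be run through the textbook Pearson formula in plain Python. This is a minimal sketch, not the component's implementation, and the function name `pearson` is an assumption made for illustration. In the sample table every row satisfies f1 = 0.1 + 0.01 * f0, so the two columns are perfectly linearly related and the coefficient should equal 1 up to floating-point rounding.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = math.fsum(xs) / len(xs)
    mean_y = math.fsum(ys) / len(ys)
    # Deviations of each value from its column mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = math.fsum(a * b for a, b in zip(dx, dy))
    var_x = math.fsum(a * a for a in dx)
    var_y = math.fsum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Same values as the pai_pearson_test_input table.
f0 = [1.0, 2.0, 3.0, 5.0, 8.0]
f1 = [0.11, 0.12, 0.13, 0.15, 0.18]

r = pearson(f0, f1)
print(r)  # very close to 1; the last digits depend on rounding
```

The small gap between the component's output and exactly 1 is consistent with ordinary floating-point rounding rather than a modeling difference.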