Platform For AI: Correlation Coefficient Matrix

Last Updated: Jan 03, 2025

A correlation coefficient matrix is a tool used to quantify and display the pairwise correlations between multiple variables. Each element in the matrix represents the correlation coefficient between corresponding variables. In most cases, the Pearson correlation coefficient is used to measure linear relationships. The correlation coefficient matrix is essential to feature selection, data analytics, and model building, helping identify linear dependencies and multicollinearity issues among variables.
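To illustrate what each matrix element represents, the following is a minimal Python sketch of the standard Pearson coefficient and of a matrix built from it. The function names `pearson` and `correlation_matrix` and the toy data are assumptions made for illustration; this is not the component's internal implementation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Note: a constant column has zero variance and raises ZeroDivisionError here.
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Square matrix whose element [i][j] is pearson(columns[i], columns[j])."""
    return [[pearson(ci, cj) for cj in columns] for ci in columns]

# Toy data for illustration only: the second column is a positive linear
# function of the first, and the third is a negative linear function of it.
cols = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
matrix = correlation_matrix(cols)
```

In this sketch the diagonal is 1 and the matrix is symmetric, because swapping the two columns does not change the coefficient. Values near 1 or -1 off the diagonal point to a strong linear dependency between two features.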

Configure the component

Method 1: Configure the component on the pipeline page

On the pipeline details page in Machine Learning Designer, add the Correlation Coefficient Matrix component to the pipeline and configure the parameters described in the following table.

| Tab | Parameter | Description |
| --- | --- | --- |
| Fields Setting | All Selected by Default | The feature columns that are used in matrix calculation. By default, all feature columns are selected for correlation analysis. |
| Tuning | Cores | This parameter must be used with the Memory Size parameter. |
| Tuning | Memory Size | This parameter must be used with the Cores parameter. |

Method 2: Use PAI commands

Configure the component parameters by using PAI commands. You can use the SQL Script component to call PAI commands. For more information, see Scenario 4: Execute PAI commands within the SQL script component.

PAI -name corrcoef
    -project algo_public
    -DinputTableName=maple_test_corrcoef_basic12x10_input
    -DoutputTableName=maple_test_corrcoef_basic12x10_output
    -DcoreNum=1
    -DmemSizePerCore=110;

| Parameter | Required | Default value | Description |
| --- | --- | --- | --- |
| inputTableName | Yes | No default value | The name of the input table. |
| inputTablePartitions | No | No default value | The partitions that are selected from the input table for training. Supported formats: partition_name=value, or name1=value1/name2=value2 for multi-level partitions. If you specify multiple partitions, separate them with commas (,). Example: name1=value1,value2. |
| outputTableName | Yes | No default value | The name of the output table. |
| selectedColNames | No | All columns | The columns selected from the input table. |
| lifecycle | No | No default value | The lifecycle of the output table. |
| coreNum | No | Determined by the system | This parameter must be used with the memSizePerCore parameter. The value must be a positive integer. Valid values: 1 to 9999. |
| memSizePerCore | No | Determined by the system | The memory size of each core. Unit: MB. The value must be a positive integer in the range of [1024, 64 × 1024]. |
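The parameter rules above can be sketched as a small Python helper that assembles the command text. The helper name `build_corrcoef_command` and its keyword arguments are hypothetical and not part of PAI; the sketch only checks the constraints stated in the table (required table names, coreNum and memSizePerCore set together, and their documented ranges) and passes all other values through unchanged.

```python
def build_corrcoef_command(input_table, output_table, *, partitions=None,
                           selected_cols=None, lifecycle=None,
                           core_num=None, mem_size_per_core=None):
    """Assemble a corrcoef PAI command string (hypothetical helper)."""
    if not input_table or not output_table:
        raise ValueError("inputTableName and outputTableName are required")
    # The table states that coreNum and memSizePerCore are used together.
    if (core_num is None) != (mem_size_per_core is None):
        raise ValueError("coreNum and memSizePerCore must be set together")
    if core_num is not None and not 1 <= core_num <= 9999:
        raise ValueError("coreNum must be in [1, 9999]")
    if mem_size_per_core is not None and not 1024 <= mem_size_per_core <= 64 * 1024:
        raise ValueError("memSizePerCore must be in [1024, 65536] MB")
    params = {
        "inputTableName": input_table,
        "inputTablePartitions": partitions,
        "outputTableName": output_table,
        "selectedColNames": selected_cols,
        "lifecycle": lifecycle,
        "coreNum": core_num,
        "memSizePerCore": mem_size_per_core,
    }
    lines = ["PAI -name corrcoef", "    -project algo_public"]
    # Optional parameters left as None are omitted so the system defaults apply.
    lines += [f"    -D{key}={value}" for key, value in params.items() if value is not None]
    return "\n".join(lines) + ";"

cmd = build_corrcoef_command(
    "maple_test_corrcoef_basic12x10_input",
    "maple_test_corrcoef_basic12x10_output",
    core_num=1,
    mem_size_per_core=1024,
)
```

The resulting string can be pasted into the SQL Script component. Leaving both tuning arguments unset lets the system determine the resources.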

Example

  1. Generate the following test data.

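    | col0:double | col1:bigint | col2:double | col3:bigint | col4:double | col5:bigint | col6:double | col7:bigint | col8:double | col9:double |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | 19 | 95 | 33 | 52 | 115 | 43 | 32 | 98 | 76 | 40 |
    | 114 | 26 | 101 | 69 | 56 | 59 | 116 | 23 | 109 | 105 |
    | 103 | 89 | 7 | 9 | 65 | 118 | 73 | 50 | 55 | 81 |
    | 79 | 20 | 63 | 71 | 5 | 24 | 77 | 31 | 21 | 75 |
    | 87 | 16 | 66 | 47 | 25 | 14 | 42 | 99 | 108 | 57 |
    | 11 | 104 | 38 | 37 | 106 | 51 | 3 | 91 | 80 | 97 |
    | 84 | 30 | 70 | 46 | 8 | 6 | 94 | 22 | 45 | 48 |
    | 35 | 17 | 107 | 64 | 10 | 112 | 53 | 34 | 90 | 96 |
    | 13 | 61 | 39 | 1 | 29 | 117 | 112 | 2 | 82 | 28 |
    | 62 | 4 | 102 | 88 | 100 | 36 | 67 | 54 | 12 | 85 |
    | 49 | 27 | 44 | 93 | 68 | 110 | 60 | 72 | 86 | 58 |
    | 92 | 119 | 0 | 113 | 41 | 15 | 74 | 83 | 18 | 111 |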

  2. Run the following PAI commands:

    PAI -name corrcoef
        -project algo_public
        -DinputTableName=maple_test_corrcoef_basic12x10_input
        -DoutputTableName=maple_test_corrcoef_basic12x10_output
        -DcoreNum=1
        -DmemSizePerCore=110;
  3. View the returned results.

    | columnsnames | col0 | col1 | col2 | col3 | col4 | col5 | col6 | col7 | col8 | col9 |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | col0 | 1 | -0.2115657251820724 | 0.0598306259706561 | 0.2599903570684693 | -0.3483249188225586 | -0.28716254396809926 | 0.47880162127435116 | -0.13646519484213326 | -0.19500158764680092 | 0.3897390240949085 |
    | col1 | -0.2115657251820724 | 1 | -0.8444477377898585 | -0.17507636221594533 | 0.40943384150571377 | 0.09135976026101403 | -0.3018506374626574 | 0.40733726912808044 | -0.11827739124590071 | 0.12433851389455183 |
    | col2 | 0.0598306259706561 | -0.8444477377898585 | 1 | 0.18518346647293102 | -0.20934839228057014 | -0.1896417512389659 | 0.1799377498863213 | -0.3858885676469948 | 0.20254569203773892 | 0.13476160753756655 |
    | col3 | 0.2599903570684693 | -0.17507636221594533 | 0.18518346647293102 | 1 | 0.03988018649854009 | -0.43737887418329147 | -0.053818296425267184 | 0.2900856441586986 | -0.3607547910075688 | 0.4912019074930449 |
    | col4 | -0.3483249188225586 | 0.40943384150571377 | -0.20934839228057014 | 0.03988018649854009 | 1 | 0.1465605209246875 | -0.5016030364347955 | 0.5496024325711117 | 0.013743256115394122 | 0.07497231559184887 |
    | col5 | -0.28716254396809926 | 0.09135976026101403 | -0.1896417512389659 | -0.43737887418329147 | 0.1465605209246875 | 1 | 0.16729809310873522 | -0.29890655828796964 | 0.3618518101014617 | -0.1713960957286885 |
    | col6 | 0.47880162127435116 | -0.3018506374626574 | 0.1799377498863213 | -0.053818296425267184 | -0.5016030364347955 | 0.16729809310873522 | 1 | -0.8165019880156462 | -0.11173420918721436 | -0.10363860378347944 |
    | col7 | -0.13646519484213326 | 0.40733726912808044 | -0.3858885676469948 | 0.2900856441586986 | 0.5496024325711117 | -0.29890655828796964 | -0.8165019880156462 | 1 | 0.07435907471544469 | 0.11711976051999162 |
    | col8 | -0.19500158764680092 | -0.11827739124590071 | 0.20254569203773892 | -0.3607547910075688 | 0.013743256115394122 | 0.3618518101014617 | -0.11173420918721436 | 0.07435907471544469 | 1 | -0.18463012549540175 |
    | col9 | 0.3897390240949085 | 0.12433851389455183 | 0.13476160753756655 | 0.4912019074930449 | 0.07497231559184887 | -0.1713960957286885 | -0.10363860378347944 | 0.11711976051999162 | -0.18463012549540175 | 1 |
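As a cross-check of the returned results, the following Python sketch recomputes the matrix locally from the test data in step 1. It assumes the component returns the standard Pearson coefficient; the names `ROWS`, `pearson`, and `matrix` are illustrative and not part of PAI.

```python
import math

# Test data from step 1, one inner list per row, in column order col0..col9.
ROWS = [
    [19, 95, 33, 52, 115, 43, 32, 98, 76, 40],
    [114, 26, 101, 69, 56, 59, 116, 23, 109, 105],
    [103, 89, 7, 9, 65, 118, 73, 50, 55, 81],
    [79, 20, 63, 71, 5, 24, 77, 31, 21, 75],
    [87, 16, 66, 47, 25, 14, 42, 99, 108, 57],
    [11, 104, 38, 37, 106, 51, 3, 91, 80, 97],
    [84, 30, 70, 46, 8, 6, 94, 22, 45, 48],
    [35, 17, 107, 64, 10, 112, 53, 34, 90, 96],
    [13, 61, 39, 1, 29, 117, 112, 2, 82, 28],
    [62, 4, 102, 88, 100, 36, 67, 54, 12, 85],
    [49, 27, 44, 93, 68, 110, 60, 72, 86, 58],
    [92, 119, 0, 113, 41, 15, 74, 83, 18, 111],
]

def pearson(x, y):
    """Standard Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Transpose the rows into columns, then correlate every pair of columns.
columns = list(zip(*ROWS))
matrix = [[pearson(a, b) for b in columns] for a in columns]

# Compare one entry with the value reported for (col1, col2) in step 3.
documented_col1_col2 = -0.8444477377898585
difference = abs(matrix[1][2] - documented_col1_col2)
print(f"col1 vs col2: {matrix[1][2]:.10f} (difference {difference:.2e})")
```

A strongly negative entry such as the one for col1 and col2 indicates that the two columns carry largely redundant linear information, which is the kind of multicollinearity the matrix is meant to expose.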