This topic describes the Scatter Plot component provided by Machine Learning Studio.

In regression analysis, a scatter chart shows the distribution of data points in a Cartesian coordinate system.

Configure the component

You can use one of the following methods to configure the Scatter Plot component.

Method 1: Configure the component on the pipeline page

You can configure the parameters of the Scatter Plot component on the pipeline page of Machine Learning Designer of Machine Learning Platform for AI (PAI). Machine Learning Designer was formerly known as Machine Learning Studio. The following table describes the parameters.
Parameter | Description
Feature Columns | The columns that represent the features of the training samples.
Label Column | The label column.
Samples | The number of samples.

Method 2: Use PAI commands

Configure the component parameters by using PAI commands. You can use the SQL Script component to call PAI commands. For more information, see SQL Script.
PAI -name scatter_diagram -project algo_public
 -DselectedCols=emp_var_rate,cons_price_rate,cons_conf_idx,euribor3m
 -DlabelCol=y
 -DmapTable=pai_temp_2447_22859_2
 -DinputTable=scatter_diagram
 -DoutputTable=pai_temp_2447_22859_1;
Parameter | Required | Description | Default value
inputTable | Yes | The name of the input table. | No default value
inputTablePartitions | No | The partitions that are selected from the input table for training. The following formats are supported: Partition_name=value, and name1=value1/name2=value2 for multi-level partitions. Note: If you specify multiple partitions, separate them with commas (,). | No default value
outputTable | Yes | The name of the output table. | No default value
mapTable | Yes | The name of the output table that stores the maximum value, minimum value, and enumeration value of each feature. | No default value
selectedCols | Yes | The columns that are selected from the input table and used to draw the scatter chart. A maximum of five columns can be selected. | No default value
labelCol | Yes | The column of the INT or STRING type that you want to use as the label column. | Empty
lifecycle | Yes | The lifecycle of the output table. Unit: days. | 28
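The parameter rules above can be checked before a command is submitted. The following Python sketch only assembles the command text and does not call any PAI API. The helper name build_scatter_command is hypothetical, and the sketch assumes that only the parameters listed with "Yes" and no default value must always be present.

```python
# Hypothetical helper: formats a scatter_diagram PAI command as text.
# Assumption: only parameters marked "Yes" with no default are mandatory.
REQUIRED = ("inputTable", "outputTable", "mapTable", "selectedCols")
MAX_SELECTED_COLS = 5  # limit stated in the parameter table


def build_scatter_command(params):
    """Validate the parameters and return the PAI command string."""
    missing = [name for name in REQUIRED if not params.get(name)]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))

    cols = params["selectedCols"]
    if len(cols) > MAX_SELECTED_COLS:
        raise ValueError("selectedCols accepts at most 5 columns")

    # Column names are passed to PAI as one comma-separated value.
    rendered = dict(params, selectedCols=",".join(cols))
    lines = ["PAI -name scatter_diagram -project algo_public"]
    lines += [" -D{}={}".format(key, value) for key, value in rendered.items()]
    return "\n".join(lines) + ";"
```

The returned string can be pasted into an SQL Script component. The table names passed to the helper are whatever your project uses.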

Example

  • Input data
    create table scatter_diagram as select emp_var_rate, cons_price_rate, cons_conf_idx, euribor3m, y from pai_bank_data limit 100;
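    emp_var_rate | cons_price_rate | cons_conf_idx | euribor3m | y
    1.4 | 93.918 | -42.7 | 4.962 | 0
    -0.1 | 93.2 | -42.0 | 4.021 | 0
    -1.7 | 94.055 | -39.8 | 0.729 | 1
    -1.8 | 93.075 | -47.1 | 1.405 | 0
    -2.9 | 92.201 | 31.4 | 0.869 | 1
    1.4 | 93.918 | -42.7 | 4.961 | 0
    -1.8 | 92.893 | -46.2 | 1.327 | 0
    -1.8 | 92.893 | 92.893 | 1.313 | 0
    -2.9 | 92.963 | -40.8 | 1.266 | 1
    -1.8 | 93.075 | -47.1 | 1.41 | 0
    1.1 | 93.994 | -36.4 | 4.864 | 0
    1.4 | 93.444 | -36.1 | 4.964 | 0
    1.4 | 93.444 | -36.1 | 4.965 | 1
    -1.8 | 92.893 | -46.2 | 1.291 | 0
    1.4 | 94.465 | -41.8 | 4.96 | 0
    1.4 | 93.918 | -42.7 | 4.962 | 0
    -1.8 | 93.075 | -47.1 | 1.365 | 1
    -0.1 | 93.798 | -40.4 | 4.86 | 1
    1.1 | 93.994 | -36.4 | 4.86 | 0
    1.4 | 93.918 | -42.7 | 4.96 | 0
    -1.8 | 93.075 | -47.1 | 1.405 | 0
    1.4 | 94.465 | -41.8 | 4.967 | 0
    1.4 | 93.918 | -42.7 | 4.963 | 0
    1.4 | 93.918 | -42.7 | 4.968 | 0
    1.4 | 93.918 | -42.7 | 4.962 | 0
    -1.8 | 92.893 | -46.2 | 1.344 | 0
    -3.4 | 92.431 | -26.9 | 0.754 | 0
    -1.8 | 93.075 | -47.1 | 1.365 | 0
    -1.8 | 92.893 | -46.2 | 1.313 | 0
    1.4 | 93.918 | -42.7 | 4.961 | 0
    1.4 | 94.465 | -41.8 | 4.961 | 0
    -1.8 | 92.893 | -46.2 | 1.327 | 0
    -1.8 | 92.893 | -46.2 | 1.299 | 0
    -2.9 | 92.963 | -40.8 | 1.268 | 1
    1.4 | 93.918 | -42.7 | 4.963 | 0
    -1.8 | 92.893 | -46.2 | 1.334 | 0
    1.4 | 93.918 | -42.7 | 4.96 | 0
    -1.8 | 93.075 | -47.1 | 1.405 | 0
    1.4 | 94.465 | -41.8 | 4.96 | 0
    1.4 | 93.444 | -36.1 | 4.962 | 0
    1.1 | 93.994 | -36.4 | 4.86 | 0
    1.1 | 93.994 | -36.4 | 4.857 | 0
    1.4 | 93.918 | -42.7 | 4.961 | 0
    -3.4 | 92.649 | -30.1 | 0.715 | 1
    1.4 | 93.444 | -36.1 | 4.966 | 0
    -0.1 | 93.2 | -42.0 | 4.076 | 0
    1.4 | 93.444 | -36.1 | 4.965 | 0
    -1.8 | 92.893 | -46.2 | 1.354 | 0
    1.4 | 93.444 | -36.1 | 4.967 | 0
    1.4 | 94.465 | -41.8 | 4.959 | 0
    -1.8 | 92.893 | -46.2 | 1.354 | 0
    1.4 | 94.465 | -41.8 | 4.958 | 0
    -1.8 | 92.893 | -46.2 | 1.354 | 0
    1.4 | 94.465 | -41.8 | 4.864 | 0
    1.1 | 93.994 | -36.4 | 4.859 | 0
    1.1 | 93.994 | -36.4 | 4.857 | 0
    -1.8 | 92.893 | -46.2 | 1.27 | 0
    1.1 | 93.994 | -36.4 | 4.857 | 0
    1.1 | 93.994 | -36.4 | 4.859 | 0
    1.4 | 94.465 | -41.8 | 4.959 | 0
    1.1 | 93.994 | -36.4 | 4.856 | 0
    -1.8 | 93.075 | -47.1 | 1.405 | 0
    -1.8 | 92.843 | -50.0 | 1.811 | 1
    -0.1 | 93.2 | -42.0 | 4.021 | 0
    -2.9 | 92.469 | -33.6 | 1.029 | 0
    1.4 | 93.918 | -42.7 | 4.962 | 0
    -1.8 | 93.075 | -47.1 | 1.365 | 0
    1.1 | 93.994 | -36.4 | 4.857 | 0
    -1.8 | 92.893 | -46.2 | 1.259 | 0
    1.1 | 93.994 | -36.4 | 4.857 | 0
    1.4 | 94.465 | -41.8 | 4.866 | 0
    -2.9 | 92.201 | -31.4 | 0.883 | 0
    -0.1 | 93.2 | -42.0 | 4.076 | 0
    1.1 | 93.994 | -36.4 | 4.857 | 0
    1.4 | 93.918 | -42.7 | 4.96 | 0
    1.4 | 93.444 | -36.1 | 4.962 | 0
    1.1 | 93.994 | -36.4 | 4.858 | 0
    1.1 | 93.994 | -36.4 | 4.857 | 0
    1.1 | 93.994 | -36.4 | 4.856 | 0
    1.4 | 93.918 | -42.7 | 4.968 | 0
    1.4 | 93.444 | -36.1 | 4.966 | 0
    1.4 | 94.465 | -41.8 | 4.962 | 0
    1.4 | 93.444 | -36.1 | 4.963 | 0
    -1.8 | 92.843 | -50.0 | 1.56 | 1
    1.4 | 93.918 | -42.7 | 4.96 | 0
    1.4 | 93.444 | -36.1 | 4.963 | 0
    -3.4 | 92.431 | -26.9 | 0.74 | 0
    1.1 | 93.994 | -36.4 | 4.856 | 0
    1.4 | 93.918 | -42.7 | 4.962 | 0
    1.1 | 93.994 | -36.4 | 4.856 | 0
    -0.1 | 93.2 | -42.0 | 4.245 | 1
    1.1 | 93.994 | -36.4 | 4.857 | 0
    -1.8 | 93.075 | -47.1 | 1.405 | 0
    -1.8 | 92.893 | -46.2 | 1.327 | 0
    -0.1 | 93.2 | -42.0 | 4.12 | 0
    1.4 | 94.465 | -41.8 | 4.958 | 0
    -1.8 | 93.749 | -34.6 | 0.659 | 1
    1.1 | 93.994 | -36.4 | 4.858 | 0
    1.1 | 93.994 | -36.4 | 4.858 | 0
    1.4 | 93.444 | -36.1 | 4.963 | 0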
  • Parameter settings

    Select the y column as the optional label column for the scatter chart. Select the emp_var_rate, cons_price_rate, cons_conf_idx, and euribor3m columns as feature columns.

  • Output

    You can view the distribution of the objects specified by the label column for different features in the scatter chart.

    [Figure: Scatter chart]
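To make the idea of "distribution by label" concrete, the following Python sketch groups a few of the sample rows by the value of y for one pair of features and computes the minimum and maximum of each feature. It is an illustration only. It is not the component's implementation and does not reproduce the schema of the output tables. The function names feature_ranges and points_by_label are hypothetical. The five rows are copied from the sample input.

```python
from collections import defaultdict

COLUMNS = ["emp_var_rate", "cons_price_rate", "cons_conf_idx", "euribor3m"]

# Five rows from the sample input; the last field of each row is the label y.
ROWS = [
    (1.4, 93.918, -42.7, 4.962, 0),
    (-0.1, 93.2, -42.0, 4.021, 0),
    (-1.7, 94.055, -39.8, 0.729, 1),
    (-1.8, 93.075, -47.1, 1.405, 0),
    (-2.9, 92.963, -40.8, 1.266, 1),
]


def feature_ranges(rows):
    """Return {feature name: (min, max)} over the given rows."""
    ranges = {}
    for index, name in enumerate(COLUMNS):
        values = [row[index] for row in rows]
        ranges[name] = (min(values), max(values))
    return ranges


def points_by_label(rows, x_name, y_name):
    """Group (x, y) points for one feature pair by the label value."""
    xi, yi = COLUMNS.index(x_name), COLUMNS.index(y_name)
    groups = defaultdict(list)
    for row in rows:
        groups[row[-1]].append((row[xi], row[yi]))
    return dict(groups)


if __name__ == "__main__":
    print(feature_ranges(ROWS))
    print(points_by_label(ROWS, "emp_var_rate", "euribor3m"))
```

Each group returned by points_by_label corresponds to the points that would share one color in a scatter chart of the two chosen features.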