A histogram, also known as a mass distribution profile, is a statistical chart that uses a series of vertical bars or line segments of different heights to show how data is distributed. The horizontal axis represents the value intervals, and the vertical axis represents the number of values that fall into each interval.

Configure the component

You can configure the component by using one of the following methods:
  • Machine Learning Platform for AI console
    | Tab | Parameter | Description |
    |---|---|---|
    | Fields Setting | Select Column | Select the columns to be analyzed. Only the DOUBLE and BIGINT types are supported. Note: A maximum of 1,024 columns are supported. |
    | Parameters Setting | Intervals | The number of intervals into which the histogram is divided. |
    | Tuning | Cores | The number of cores. The parameter value must be a positive integer. |
    | Tuning | Memory Size per Core | The memory size of each core, in MB. Valid values: 1 to 65536. |
  • PAI command
    PAI -name histogram
          -project algo_public
          -DinputTableName=maple_histogram_1to20_input
          -DoutputTableName=maple_histogram_1to20_output
          -DselectedColNames=col0,col1 -DintervalNum=20;
    | Parameter | Required | Description | Default value |
    |---|---|---|---|
    | inputTableName | Yes | The name of the input table. | No default value |
    | inputTablePartitions | No | The partitions of the input table to be analyzed. Supported formats: Partition_name=value, or name1=value1/name2=value2 for multi-level partitions. Note: Separate multiple partitions with commas (,). | No default value |
    | outputTableName | Yes | The name of the output table. | No default value |
    | selectedColNames | Yes | The names of the input table columns to be analyzed. Separate multiple column names with commas (,). The INT and DOUBLE types are supported. Note: A maximum of 1,024 columns are supported. | No default value |
    | intervalNum | No | The number of intervals into which the histogram is divided. | 100 |
    | lifecycle | No | The lifecycle of the output table. | No default value |
    | coreNum | No | The number of cores. The parameter value must be a positive integer. Valid values: [1,9999]. | Automatically allocated |
    | memSizePerCore | No | The memory size of each core, in MB. Valid values: 1 to 65536. | Automatically allocated |
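
The parameter rules above can be checked before a command is submitted. The following Python sketch assembles the PAI command text from a dictionary of parameters and validates the required parameters, the 1,024-column limit, and the documented ranges for coreNum and memSizePerCore. The helper name build_histogram_command is hypothetical and is not part of PAI; the sketch only builds a string and does not submit anything.

```python
# Hypothetical helper (not part of PAI): builds the command text only.
REQUIRED = ("inputTableName", "outputTableName", "selectedColNames")
RANGES = {"coreNum": (1, 9999), "memSizePerCore": (1, 65536)}
MAX_COLUMNS = 1024


def build_histogram_command(params, project="algo_public"):
    """Return a PAI histogram command string built from params."""
    missing = [name for name in REQUIRED if not params.get(name)]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")

    # selectedColNames is a comma-separated list of at most 1,024 columns.
    columns = params["selectedColNames"].split(",")
    if len(columns) > MAX_COLUMNS:
        raise ValueError("a maximum of 1,024 columns are supported")

    # Range checks documented for the tuning parameters.
    for name, (low, high) in RANGES.items():
        if name in params and not low <= int(params[name]) <= high:
            raise ValueError(f"{name} must be in [{low}, {high}]")

    args = " ".join(f"-D{name}={value}" for name, value in params.items())
    return f"PAI -name histogram -project {project} {args};"


command = build_histogram_command({
    "inputTableName": "maple_histogram_1to20_input",
    "outputTableName": "maple_histogram_1to20_output",
    "selectedColNames": "col0,col1",
    "intervalNum": 20,
})
print(command)
```

Optional parameters that are left out of the dictionary are omitted from the command, so the component falls back to its default values.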

Example

  • Input
    col0 col1
    1 1.0
    2 2.0
    3 3.0
    4 4.0
    5 5.0
    6 6.0
    7 7.0
    8 8.0
    9 9.0
    10 10.0
    11 11.0
    12 12.0
    13 13.0
    14 14.0
    15 15.0
    16 16.0
    17 17.0
    18 18.0
    19 19.0
    20 20.0
  • PAI command
    PAI -name histogram
        -project algo_public
        -DinputTableName=maple_histogram_1to20_input
        -DoutputTableName=maple_histogram_1to20_output
        -DselectedColNames=col0,col1 -DintervalNum=20;
  • Output
    colname histogram
    col0 [1, 1.95):1;[1.95, 2.9):1;[2.9, 3.85):1;[3.85, 4.8):1;[4.8, 5.75):1;[5.75, 6.7):1;[6.7, 7.65):1;[7.65, 8.6):1;[8.6, 9.55):1;[9.55, 10.5):1;[10.5, 11.45):1;[11.45, 12.4):1;[12.4, 13.35):1;[13.35, 14.3):1;[14.3, 15.25):1;[15.25, 16.2):1;[16.2, 17.15):1;[17.15, 18.1):1;[18.1, 19.05):1;[19.05, 20]:1
    col1 [1, 1.95):1;[1.95, 2.9):1;[2.9, 3.85):1;[3.85, 4.8):1;[4.8, 5.75):1;[5.75, 6.7):1;[6.7, 7.65):1;[7.65, 8.6):1;[8.6, 9.55):1;[9.55, 10.5):1;[10.5, 11.45):1;[11.45, 12.4):1;[12.4, 13.35):1;[13.35, 14.3):1;[14.3, 15.25):1;[15.25, 16.2):1;[16.2, 17.15):1;[17.15, 18.1):1;[18.1, 19.05):1;[19.05, 20]:1
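
The output above is consistent with equal-width binning: the range from the column minimum to the column maximum is split into intervalNum intervals of equal width, every interval is left-closed and right-open, and the last interval is closed on both ends so that it includes the maximum. The following Python sketch illustrates that reading of the output. It is inferred from the example only and is not the component's actual implementation; the function names are hypothetical.

```python
def equal_width_histogram(values, interval_num):
    """Count values in interval_num equal-width intervals over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / interval_num
    counts = [0] * interval_num
    for v in values:
        if v == hi:
            # The maximum belongs to the last interval, which is closed.
            idx = interval_num - 1
        else:
            idx = min(int((v - lo) / width), interval_num - 1)
        counts[idx] += 1
    # interval_num + 1 boundaries; the last boundary is exactly the maximum.
    edges = [lo + i * width for i in range(interval_num)] + [hi]
    return edges, counts


def format_histogram(edges, counts):
    """Render intervals in the '[a, b):n;...;[y, z]:n' style shown above."""
    last = len(counts) - 1
    parts = []
    for i, count in enumerate(counts):
        close = "]" if i == last else ")"
        # 'g' formatting hides floating-point noise such as 3.8499999999999996.
        parts.append(f"[{edges[i]:g}, {edges[i + 1]:g}{close}:{count}")
    return ";".join(parts)


col0 = list(range(1, 21))  # the sample input column: 1 to 20
edges, counts = equal_width_histogram(col0, 20)
print(format_histogram(edges, counts))
```

With the sample column and 20 intervals, the interval width is (20 - 1) / 20 = 0.95, which matches the boundaries 1, 1.95, 2.9, and so on in the output table.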