The Random Sampling component randomly samples the input data. You can specify the proportion or number of samples. The samples are independent of each other.
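
As a rough illustration of what sampling by proportion or by count means, the following Python sketch draws rows with the standard library. It is not the component's implementation: the function name `draw_sample`, its arguments, and the truncation used to turn a proportion into a row count are assumptions made for this example.

```python
import random


def draw_sample(rows, size=None, ratio=None, replace=False, seed=None):
    """Illustrative uniform sampling by row count or by proportion."""
    if size is None and ratio is None:
        raise ValueError("specify size or ratio")
    if size is None:
        # Assumption: a proportion is turned into a row count by truncation.
        size = int(len(rows) * ratio)
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    if replace:
        # With replacement: each draw is independent, so rows may repeat.
        return [rng.choice(rows) for _ in range(size)]
    # Without replacement: every row appears at most once.
    return rng.sample(rows, size)


rows = list(range(1000))
by_count = draw_sample(rows, size=50, seed=7)     # 50 distinct rows
by_ratio = draw_sample(rows, ratio=0.25, seed=7)  # a quarter of the rows
```

With replacement enabled, the requested size in this sketch can exceed the number of input rows, because a row can be drawn more than once.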

You can configure the component by using one of the following methods:
  • Use the Machine Learning Platform for AI console
    Tab | Parameter | Description
    Parameters Setting | Sample Size | The value must be a positive integer.
    Parameters Setting | Sample Ratio | The value must be a floating-point number. Valid values: (0,1).
    Parameters Setting | Return Samples | By default, this check box is not selected. If you select this check box, sampling with replacement is enabled.
    Parameters Setting | Random Number Seed | By default, the system determines the value.
    Tuning | Number of Cores | The value must be a positive integer. By default, the system determines the value.
    Tuning | Core Memory Allocation | The value must be a positive integer. Valid values: (1,65536). By default, the system determines the value.
  • Use commands
    PAI -name WeightedSample
        -project algo_public \
        -Dlifecycle="28" \
        -DoutputTableName="test2" \
        -DprobCol="previous" \
        -Dreplace="false" \
        -DsampleSize="500" \
        -DinputPartitions="pt=20150501" \
        -DinputTableName="bank_data_partition";
    Parameter | Required | Description | Default value
    inputTableName | Yes | The name of the input table. | N/A
    inputTablePartitions | No | The partitions that are selected from the input table for sampling. Specify this parameter in one of the following formats: partition_name=value, or name1=value1/name2=value2 for multi-level partitions. Separate multiple partitions with commas (,). | N/A
    outputTableName | Yes | The name of the output table. | N/A
    sampleSize | No | The number of samples. If both sampleSize and sampleRatio are empty, an error is returned. If both are specified, sampleSize takes precedence. | N/A
    sampleRatio | No | The sampling proportion. The value must be a floating-point number. Valid values: (0,1). | N/A
    replace | No | Specifies whether to enable sampling with replacement. The value must be of the BOOLEAN type. | false
    randomSeed | No | The random seed. The value must be a positive integer. | Determined by the system
    lifecycle | No | The lifecycle of the output table. Valid values: [1,3650]. | N/A
    coreNum | No | The number of cores. The value must be a positive integer. | Determined by the system
    memSizePerCore | No | The memory size of each core. Valid values: (1,65536). Unit: MB. | Determined by the system
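
The rules in the parameter table can be checked before a command is submitted. The sketch below is one possible client-side check written in Python; the function name `check_sampling_params` and the dictionary layout are hypothetical and are not part of the PAI command itself. It only encodes the constraints stated above: the two required table names, the sampleSize and sampleRatio rules, and the valid ranges of lifecycle and memSizePerCore.

```python
def check_sampling_params(params):
    """Validate a dict of command parameters against the documented rules.

    Returns ("size", n) or ("ratio", r) to show which setting takes effect.
    """
    for name in ("inputTableName", "outputTableName"):
        if not params.get(name):
            raise ValueError(f"{name} is required")

    size = params.get("sampleSize")
    ratio = params.get("sampleRatio")
    if size is None and ratio is None:
        raise ValueError("set sampleSize or sampleRatio")
    if size is not None and (not isinstance(size, int) or size <= 0):
        raise ValueError("sampleSize must be a positive integer")
    if ratio is not None and not (0 < ratio < 1):
        raise ValueError("sampleRatio must be in (0,1)")

    lifecycle = params.get("lifecycle")
    if lifecycle is not None and not (1 <= lifecycle <= 3650):
        raise ValueError("lifecycle must be in [1,3650]")

    mem = params.get("memSizePerCore")
    if mem is not None and not (1 < mem < 65536):
        raise ValueError("memSizePerCore must be in (1,65536)")

    # sampleSize takes precedence when both are given.
    return ("size", size) if size is not None else ("ratio", ratio)
```

For example, passing both `sampleSize=500` and `sampleRatio=0.2` returns `("size", 500)`, which mirrors the precedence rule in the table.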