Platform for AI: Standard Scaler Batch Predict

Last Updated: Feb 14, 2025

Standard Scaler Batch Predict is a data preprocessing component that standardizes batch data to reduce the effect of differing scales and ranges across columns. The algorithm assumes that the data is approximately normally distributed and standardizes each column using its mean and standard deviation (the square root of the variance), as computed by the Standard Scaler Train component, so that the selected columns end up on a comparable scale. This generally improves the stability and accuracy of model training and prediction. Because the same statistics are applied to every batch, the component is well suited to large datasets where a consistent data distribution matters.
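To make the standardization step concrete, the following is a minimal pure-Python sketch rather than the component's actual implementation. The function names `fit_standard_scaler` and `predict_standard_scaler` are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption. It shows statistics being computed once on training data and then applied to a new batch as z = (x - mean) / std.

```python
import math


def fit_standard_scaler(rows, columns):
    """Compute the per-column mean and standard deviation (the 'model')."""
    model = {}
    for col in columns:
        values = [row[col] for row in rows]
        mean = sum(values) / len(values)
        # Assumption: sample variance with an n - 1 denominator.
        variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
        model[col] = (mean, math.sqrt(variance))
    return model


def predict_standard_scaler(model, rows):
    """Apply stored statistics to new rows: z = (x - mean) / std."""
    scaled = []
    for row in rows:
        out = dict(row)  # copy so the input batch is left untouched
        for col, (mean, std) in model.items():
            # Guard against constant columns, where std is 0.
            out[col] = (row[col] - mean) / std if std > 0 else 0.0
        scaled.append(out)
    return scaled


# Illustrative data only.
train = [{"x": 1.0, "y": 10.0}, {"x": 2.0, "y": 20.0}, {"x": 3.0, "y": 30.0}]
model = fit_standard_scaler(train, ["x", "y"])

batch = [{"x": 2.0, "y": 40.0}]
result = predict_standard_scaler(model, batch)
```

The point of the sketch is that prediction never recomputes statistics from the incoming batch. It reuses the values stored in the model, which is why the component requires a trained model as its first input.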

Limits

The supported compute engines are MaxCompute and Realtime Compute for Apache Flink.

Configure the component in Machine Learning Designer

Input ports

| Input port (from left to right) | Data type | Recommended upstream component | Required |
| --- | --- | --- | --- |
| Input model of the prediction | None | Standard Scaler Train | Yes |
| Input data of the prediction | Numeric Type | Read Table, Read CSV File | Yes |

Component parameters

| Tab | Parameter | Description |
| --- | --- | --- |
| Parameter Setting | outputCols | Optional. The names of the output columns. If you leave this parameter empty, the prediction results replace the original input columns. If you specify it, the number of output columns must equal the number of columns selected for training. Separate multiple column names with commas (,). |
| Parameter Setting | numThreads | The number of threads used by this component. Default value: 1. |
| Execution Tuning | Number of Workers | The number of workers. This parameter must be used together with the Memory per worker, unit MB parameter. The value must be a positive integer. Valid values: [1,9999]. |
| Execution Tuning | Memory per worker, unit MB | The memory size of each worker. Valid values: 1024 to 65536. Unit: MB. |
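The outputCols behavior described above can be sketched as follows. This is an illustration, not the component's code: `parse_output_cols` and `write_scaled_columns` are hypothetical helpers, and keeping the original columns alongside the newly named output columns is an assumption.

```python
def parse_output_cols(text):
    """Split a comma-separated outputCols string into column names."""
    return [name.strip() for name in text.split(",") if name.strip()]


def write_scaled_columns(row, selected_cols, scaled_values, output_cols=None):
    """Place scaled values into a copy of the row.

    With no output_cols, the scaled values overwrite the input columns.
    Otherwise they are written under the given names, and the number of
    names must match the number of columns selected for training.
    """
    if output_cols is None:
        output_cols = selected_cols  # default: replace the input columns
    if len(output_cols) != len(selected_cols):
        raise ValueError(
            "outputCols must list %d names, got %d"
            % (len(selected_cols), len(output_cols))
        )
    out = dict(row)
    for name, value in zip(output_cols, scaled_values):
        out[name] = value
    return out


# Illustrative data only.
row = {"id": 1, "x": 2.0, "y": 40.0}
replaced = write_scaled_columns(row, ["x", "y"], [0.0, 2.0])
named = write_scaled_columns(
    row, ["x", "y"], [0.0, 2.0], parse_output_cols("x_scaled, y_scaled")
)
```

Passing a list whose length differs from the number of training columns raises an error in this sketch, mirroring the constraint stated in the parameter description.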

Output ports

| Output port (from left to right) | Storage location | Recommended downstream component | Model type |
| --- | --- | --- | --- |
| Output result | N/A | None | None |

Example

You can copy the following code to the code editor of the PyAlink Script component. This allows the PyAlink Script component to function like the Standard Scaler Batch Predict component.

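from pyalink.alink import *

def main(sources, sinks, parameter):
    # sources[0]: the model produced by Standard Scaler Train
    model = sources[0]
    # sources[1]: the numeric batch data to standardize
    batchData = sources[1]
    # Apply the trained scaling statistics to the batch data
    predictor = StandardScalerPredictBatchOp()
    result = predictor.linkFrom(model, batchData)
    # Write the standardized data to the first output port
    result.link(sinks[0])
    # Trigger execution of the batch job
    BatchOperator.execute()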