Platform for AI: GBDT Binary Classification Prediction V2

Last Updated: Feb 26, 2024

The GBDT Binary Classification Prediction V2 component of Platform for AI (PAI) generates predictions from a model trained by the GBDT Binary Classification V2 component. The model uses gradient boosting decision trees (GBDT) to predict binary classification results. This topic describes how to configure the GBDT Binary Classification Prediction V2 component.

Supported computing resources

The GBDT Binary Classification Prediction V2 component can run on MaxCompute and Flink computing resources.

Principle

A gradient boosting decision tree model consists of multiple decision trees. Each decision tree is a weak learner. Combining these weak learners can achieve better classification and regression results than a single tree can.

The basic recursive structure of gradient boosting, written here in standard notation, is:

$$F_m(x) = F_{m-1}(x) + \rho_m \, h(x; a_m)$$

In most cases, $h(x; a_m)$ is a CART decision tree, $a_m$ are the parameters of the decision tree, and $\rho_m$ is the step size. Each decision tree optimizes the objective function on the basis of the previous decision trees. Repeating this step yields a model that contains multiple decision trees.
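The recursion described above can be illustrated with a minimal sketch. It is not the PAI implementation: it assumes a single numeric feature, one-split trees (stumps) in place of full CART trees, log loss as the objective, and a fixed step size. All function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, residuals):
    """Fit a one-split regression tree on one feature by least squares."""
    best = None
    # Every distinct value except the largest is a candidate threshold,
    # so both sides of the split are always non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_gbdt(xs, ys, n_trees=20, step_size=0.5):
    """Build the model one tree at a time, each tree correcting the last."""
    trees = []
    scores = [0.0] * len(xs)  # initial model: score 0 for every sample
    for _ in range(n_trees):
        # Negative gradient of log loss with respect to the current score.
        residuals = [y - sigmoid(f) for y, f in zip(ys, scores)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        # Recursive update: new score = previous score + step size * tree.
        scores = [f + step_size * tree(x) for f, x in zip(scores, xs)]
    return trees


def predict_proba(trees, x, step_size=0.5):
    """Probability of the positive class: sigmoid of the summed tree outputs."""
    return sigmoid(sum(step_size * tree(x) for tree in trees))


# Toy data: small feature values are class 0, large values are class 1.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]
trees = fit_gbdt(xs, ys)
```

Calling `predict_proba(trees, 2.0)` returns a probability below 0.5 and `predict_proba(trees, 7.0)` returns one above 0.5, which reflects how each added tree moves the scores further toward the correct class.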

Configure the component in the PAI console

  • Input ports

    | Input port (from left to right) | Data type | Recommended upstream component | Required |
    | --- | --- | --- | --- |
    | Input | N/A | GBDT Binary Classification V2 | Yes |
    | Predicted Data Table | N/A | Read Table | Yes |

  • Parameters

    | Tab | Parameter | Required | Description | Default value |
    | --- | --- | --- | --- | --- |
    | Fields Information | Prediction result column name | Yes | The name of the prediction result column. | prediction_result |
    | Fields Information | predictionDetailCol | No | The name of the prediction details column. | prediction_detail |
    | Fields Information | Reserved Columns | No | The names of reserved columns. By default, all columns are reserved. | N/A |
    | Tuning | Number of Instances | No | The number of instances that are used to run the job. | The value is automatically calculated based on the input data. |
    | Tuning | Memory Per Instance | No | The memory size of each instance. Unit: MB. Valid values: [100,65536]. | The value is automatically calculated based on the input data. |

  • Output ports

    | Port | Storage location | Recommended downstream component | Model type |
    | --- | --- | --- | --- |
    | Output | N/A | Binary Classification Evaluation | N/A |
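The parameter settings listed above can be summarized in a small sketch. This is not a PAI API: the class, field, and function names are hypothetical, and the order of the output columns is an assumption. Only the default column names and the memory range [100,65536] come from the parameter descriptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PredictionConfig:
    """Hypothetical container for the component parameters."""

    prediction_col: str = "prediction_result"
    prediction_detail_col: Optional[str] = "prediction_detail"
    # None means that all input columns are reserved (the documented default).
    reserved_cols: Optional[List[str]] = None
    # None means that the value is calculated from the input data.
    num_instances: Optional[int] = None
    memory_per_instance_mb: Optional[int] = None

    def validate(self) -> None:
        if not self.prediction_col:
            raise ValueError("The prediction result column name is required.")
        mem = self.memory_per_instance_mb
        if mem is not None and not 100 <= mem <= 65536:
            raise ValueError("Memory Per Instance must be in [100,65536] MB.")


def output_columns(config: PredictionConfig, input_cols: List[str]) -> List[str]:
    """List the output table columns (column order is an assumption)."""
    reserved = input_cols if config.reserved_cols is None else config.reserved_cols
    cols = list(reserved) + [config.prediction_col]
    if config.prediction_detail_col:
        cols.append(config.prediction_detail_col)
    return cols
```

For example, `PredictionConfig(memory_per_instance_mb=50).validate()` raises a `ValueError` because 50 is below the documented minimum of 100 MB.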
