The Binary Classification Evaluation component calculates the AUC, Kolmogorov-Smirnov (KS), and F1 score metrics, and generates KS curves, precision-recall (P-R) curves, receiver operating characteristic (ROC) curves, lift charts, and gain charts.
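For reference, the following Python sketch shows how these metrics are commonly computed from a label column and a prediction score column. It uses scikit-learn rather than the component itself, and the sample data and column names label and prediction_score are assumptions for illustration only.

# Hypothetical illustration of the metrics that the component reports.
# Assumes a pandas DataFrame `df` with a binary label column and a score column.
import pandas as pd
from sklearn.metrics import roc_auc_score, roc_curve, f1_score

df = pd.DataFrame({
    "label": [1, 0, 1, 1, 0, 0, 1, 0],
    "prediction_score": [0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3],
})

auc = roc_auc_score(df["label"], df["prediction_score"])

# KS statistic: the largest gap between the cumulative true positive rate
# and the cumulative false positive rate across all score thresholds.
fpr, tpr, _ = roc_curve(df["label"], df["prediction_score"])
ks = (tpr - fpr).max()

# The F1 score requires hard predictions; a 0.5 cutoff is assumed here.
f1 = f1_score(df["label"], (df["prediction_score"] >= 0.5).astype(int))

print(f"AUC={auc:.3f}, KS={ks:.3f}, F1={f1:.3f}")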

Configure the component

You can use one of the following methods to configure the Binary Classification Evaluation component.

Method 1: Configure the component on the pipeline page

You can configure the parameters of the Binary Classification Evaluation component on the pipeline page in Machine Learning Designer of Machine Learning Platform for AI (PAI). Machine Learning Designer was formerly known as Machine Learning Studio. The following table describes the parameters.
Parameter | Description
Original Label Column | The name of the original label column.
Score Column | The prediction score column. Default value: prediction_score.
Positive Sample Label | The label value that identifies positive samples.
Number of Bins with Same Frequency when Calculating Indexes such as KS and PR | The number of bins obtained by using the equal frequency binning method (see the sketch after this table).
Grouping Column | The group ID column. This parameter is used to calculate evaluation metrics for each group.
Advanced Options | If you select Advanced Options, the Prediction Result Detail Column, Prediction Targets Consistent With Evaluation Targets, and Save Performance Index parameters take effect.
Prediction Result Detail Column | The name of the prediction result detail column.
Prediction Targets Consistent with Evaluation Targets | Specifies whether the prediction objective is consistent with the evaluation objective. For example, in a financial scenario, a model is trained to predict the probability that a customer is bad. The larger the probability, the more likely the customer is bad. Metrics such as lift evaluate the bad-customer detection rate, so the prediction objective is consistent with the evaluation objective. In a credit scoring scenario, a model is trained to predict the probability that a customer is good. The larger the probability, the more likely the customer is good. However, the metrics still evaluate the bad-customer detection rate, so the prediction objective is inconsistent with the evaluation objective.
Save Performance Index | Specifies whether to save the performance metrics.
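The binning parameter above controls how the KS and P-R curves are discretized. The following sketch illustrates equal frequency binning under assumed, synthetic data: bin edges are score quantiles, so each bin holds roughly the same number of samples, and the KS statistic is taken as the largest gap between the cumulative positive and negative distributions at the bin edges. The variable names and the bin count of 1000 only mirror the defaults described in this topic; this is not the component's implementation.

# Hypothetical sketch of equal frequency binning for the KS calculation.
# `scores` and `labels` are assumed synthetic arrays, not component outputs.
import numpy as np

rng = np.random.default_rng(0)
scores = rng.random(10_000)
labels = (rng.random(10_000) < scores).astype(int)  # synthetic labels

bin_count = 1000  # "Number of Bins with Same Frequency" parameter
# Equal frequency binning: edges are score quantiles, so each bin
# contains roughly the same number of samples.
edges = np.quantile(scores, np.linspace(0, 1, bin_count + 1))

pos_total = labels.sum()
neg_total = len(labels) - pos_total
ks = 0.0
for edge in edges:
    below = scores <= edge
    cum_pos = labels[below].sum() / pos_total                   # cumulative positives
    cum_neg = (below.sum() - labels[below].sum()) / neg_total   # cumulative negatives
    ks = max(ks, abs(cum_pos - cum_neg))

print(f"KS over {bin_count} equal-frequency bins: {ks:.3f}")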

Method 2: Use PAI commands

You can also configure the component parameters by using PAI commands. To call PAI commands, use the SQL Script component. For more information, see SQL Script.
PAI -name=evaluate -project=algo_public
    -DoutputMetricTableName=output_metric_table
    -DoutputDetailTableName=output_detail_table
    -DinputTableName=input_data_table
    -DlabelColName=label
    -DscoreColName=score
Parameter | Required | Description | Default value
inputTableName | Yes | The name of the input table. | N/A
inputTablePartitions | No | The partitions that are selected from the input table for evaluation. | Full table
labelColName | Yes | The name of the label column. | N/A
scoreColName | Yes | The name of the prediction score column. | N/A
groupColName | No | The name of the group column. This parameter is used to calculate evaluation metrics for each group (see the sketch after this table). | N/A
binCount | No | The number of bins obtained by using the equal frequency binning method when metrics such as KS and PR are calculated. | 1000
outputMetricTableName | Yes | The name of the output metric table. The metrics include AUC, KS, and F1 score. | N/A
outputDetailTableName | No | The name of the output table that contains detailed data. | N/A
positiveLabel | No | The label value that identifies positive samples. | 1
lifecycle | No | The lifecycle of the output table. | N/A
coreNum | No | The number of cores. | Determined by the system
memSizePerCore | No | The memory size of each core. | Determined by the system
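To make the groupColName behavior concrete, the following sketch computes a metric separately for each group. The DataFrame and the column names group_id, label, and score are assumptions for illustration; the component itself produces the per-group metrics in its output tables.

# Hypothetical illustration of per-group evaluation (groupColName).
import pandas as pd
from sklearn.metrics import roc_auc_score

df = pd.DataFrame({
    "group_id": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "label":    [1, 0, 1, 0, 1, 0, 0, 1],
    "score":    [0.8, 0.3, 0.6, 0.4, 0.9, 0.2, 0.5, 0.7],
})

# Compute AUC separately for each group, mirroring what a grouping
# column does: every group gets its own row of evaluation metrics.
per_group_auc = df.groupby("group_id").apply(
    lambda g: roc_auc_score(g["label"], g["score"])
)
print(per_group_auc)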