The Multiclass Classification Evaluation component evaluates multiclass classification models by comparing their predictions with the original labels. It generates evaluation metrics such as accuracy, kappa, and the F1 score.

Configure the component

You can configure the component by using one of the following methods:
  • Use the Machine Learning Platform for AI console
    Fields Setting tab:
      • Expected Classification Result Column: the original label column. A maximum of 1,000 classification results is supported.
      • Predicted Classification Result Column: the predicted classification result column. In most cases, the default value prediction_result is used.
      • Advanced Options: if you select Advanced Options, the Predicted Result Probability Column parameter takes effect.
      • Predicted Result Probability Column: the column that is used to calculate the log loss of the model. In most cases, this parameter is set to prediction_detail. This parameter is valid only for the random forest model. If you specify it for other models, the system may report an error.
    Tuning tab:
      • Number of Cores: the number of cores to use. By default, the system determines the value. This parameter must be used together with the Core Memory Allocation parameter.
      • Core Memory Allocation: the memory size of each core. Unit: MB. By default, the system determines the value.
  • Use commands
    PAI -name MultiClassEvaluation -project algo_public \
        -DinputTableName="test_input" \
        -DoutputTableName="test_output" \
        -DlabelColName="label" \
        -DpredictionColName="prediction_result" \
        -Dlifecycle=30;
    • inputTableName (required): the name of the input table.
    • inputTablePartitions (optional): the partitions that are selected from the input table for training. Default value: the full table.
    • outputTableName (required): the name of the output table.
    • labelColName (required): the name of the original label column in the input table.
    • predictionColName (required): the name of the label column of the prediction result.
    • predictionDetailColName (optional): the name of the probability column of the prediction result. Example: {"A":0.2,"B":0.3,"C":0.5}. Default value: an empty string.
    • lifecycle (optional): the lifecycle of the output table.
    • coreNum (optional): the number of cores. By default, the system determines the value.
    • memSizePerCore (optional): the memory size of each core. By default, the system determines the value.
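The probability column referenced by predictionDetailColName stores a JSON map of per-class probabilities for each row. The following sketch (assuming the standard definition of log loss as the mean negative log probability assigned to the true label; the component's internal implementation is not documented here) illustrates how such a column can be turned into a log loss value:

```python
import json
import math

# Sketch of deriving a log loss from the label column and the probability
# column (e.g. prediction_detail), whose cells are JSON maps such as
# {"A":0.2,"B":0.3,"C":0.5}. Assumption: standard mean negative log likelihood.
def multiclass_log_loss(labels, detail_cells):
    total = 0.0
    for label, cell in zip(labels, detail_cells):
        probs = json.loads(cell)         # per-class probabilities for this row
        total -= math.log(probs[label])  # penalize low probability on the true label
    return total / len(labels)

labels = ["A", "B"]
cells = ['{"A": 0.9, "B": 0.1}', '{"A": 0.2, "B": 0.8}']
print(round(multiclass_log_loss(labels, cells), 4))  # 0.1643
```

Rows whose probability maps put little mass on the true label dominate the sum, which is why this metric is sensitive to confidently wrong predictions.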

Output

The evaluation report generated by the Multiclass Classification Evaluation component contains the following tabs:
  • Overview

    On the Overview tab, the overall metrics are displayed. MacroAveraged indicates the average value of each metric across all labels.

  • Confusion Matrix
  • Proportion Matrix
  • Statistics

    On the Statistics tab, the metrics of each label are calculated by using the one-vs.-all method.
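    For illustration, the one-vs.-all computation can be sketched as follows: each label in turn is treated as the positive class, and TP, FP, FN, and TN are read off the confusion matrix, indexed as [actual][predicted]. This is an assumption-level sketch of the standard technique, not the component's code:

```python
# One-vs.-all sketch: treat label `idx` as positive and all other labels as
# negative, then read TP/FP/FN/TN off the confusion matrix cm[actual][predicted].
def one_vs_all_metrics(cm, idx):
    total = sum(sum(row) for row in cm)
    tp = cm[idx][idx]
    fp = sum(cm[a][idx] for a in range(len(cm))) - tp  # predicted as idx, actually another label
    fn = sum(cm[idx]) - tp                             # actually idx, predicted as another label
    tn = total - tp - fp - fn
    return {
        "Sensitivity": tp / (tp + fn),
        "Specificity": tn / (tn + fp),
        "Precision": tp / (tp + fp),
    }

cm = [[4, 1], [2, 3]]  # a small 2x2 confusion matrix
print(one_vs_all_metrics(cm, 0))
```

    Running this for every label index yields one row of per-label statistics each, which is the shape of the Statistics tab.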

The following example shows the output table in JSON format generated by the Multiclass Classification Evaluation component:
{
    "LabelNumber": 3,
    "LabelList": ["A", "B", "C"],
    "ConfusionMatrix": [ // The confusion matrix [actual][predict]. 
        [100, 10, 20],
        [30, 50, 9],
        [7, 40, 90] ],
    "ProportionMatrix": [ // The proportion matrix [actual][predict] that is based on the proportion of each row. 
        [0.6, 0.2, 0.2],
        [0.3, 0.6, 0.1],
        [0.1, 0.4, 0.5] ],
    "ActualLabelFrequencyList": [ // The actual number of data records in each label. 
        200, 300, 600],
    "ActualLabelProportionList": [ // The actual proportion of data records in each label. 
        0.1, 0.2, 0.7],
    "PredictedLabelFrequencyList": [ // The predicted number of data records in each label. 
        300, 400, 400],
    "PredictedLabelProportionList": [ // The predicted proportion of data records in each label. 
        0.2, 0.1, 0.7],
    "OverallMeasures": {        // The overall metrics. 
        "Accuracy": 0.70,
        "Kappa" : 0.3,
        "MacroList": {       // The average values of metrics of each label. 
            "Sensitivity": 0.4,
            "Specificity": 0.3,
        },
        "MicroList": {      // Calculate the metric based on the sum of TP, TN, FP, and FN of each label. 
            "Sensitivity": 0.4,
            "Specificity": 0.3,
        },
        "LabelFrequencyBasedMicro": { // Calculate the weighted average values of metrics of each label. 
            "Sensitivity": 0.4,
            "Specificity": 0.3,
        },
    },
    "LabelMeasuresList": [ // The metrics of each label. 
        {
            "Accuracy": 0.6,
            "Sensitivity": 0.4,
            "Specificity": 0.3,
            "Kappa": 0.3
        },
        {
            "Accuracy": 0.6,
            "Sensitivity": 0.4,
            "Specificity": 0.3,
            "Kappa": 0.3
        }
    ]
}
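The difference between the macro and micro variants above can be illustrated with a short sketch, assuming the standard definitions: macro averages each per-label metric value, while micro first sums TP, FP, FN, and TN across labels and then computes the metric once.

```python
# Macro vs. micro averaging sketch over a confusion matrix cm[actual][predicted].
def per_label_counts(cm):
    n, total = len(cm), sum(map(sum, cm))
    for i in range(n):
        tp = cm[i][i]
        fp = sum(cm[a][i] for a in range(n)) - tp
        fn = sum(cm[i]) - tp
        yield tp, fp, fn, total - tp - fp - fn

def macro_micro_sensitivity(cm):
    counts = list(per_label_counts(cm))
    # Macro: average the per-label sensitivities.
    macro = sum(tp / (tp + fn) for tp, _, fn, _ in counts) / len(counts)
    # Micro: sum TP and FN over all labels, then compute sensitivity once.
    tp_sum = sum(tp for tp, _, _, _ in counts)
    fn_sum = sum(fn for _, _, fn, _ in counts)
    return macro, tp_sum / (tp_sum + fn_sum)

cm = [[4, 1], [2, 3]]
print(macro_micro_sensitivity(cm))
```

For this balanced 2x2 matrix both averages come out at about 0.7; with imbalanced label frequencies the two values diverge, which is why the report lists them separately.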

Example

  1. Use the following test data as the input.
    id  label  prediction  detail
    0   A      A           {"A": 0.6, "B": 0.4}
    1   A      B           {"A": 0.45, "B": 0.55}
    2   A      A           {"A": 0.7, "B": 0.3}
    3   A      A           {"A": 0.9, "B": 0.1}
    4   B      B           {"A": 0.2, "B": 0.8}
    5   B      B           {"A": 0.1, "B": 0.9}
    6   B      A           {"A": 0.52, "B": 0.48}
    7   B      B           {"A": 0.4, "B": 0.6}
    8   B      A           {"A": 0.6, "B": 0.4}
    9   A      A           {"A": 0.75, "B": 0.25}
  2. Create the experiment shown in the following figure. For more information, see Generate a model by using an algorithm. (Figure: Experiment of Multiclass Classification Evaluation)
  3. Configure the parameters listed in the following table for the Multiclass Classification Evaluation component. Retain the default values of the parameters that are not listed in the table.
    Fields Setting tab:
      • Expected Classification Result Column: select the label column.
      • Predicted Classification Result Column: set the parameter to prediction.
      • Advanced Options: select Advanced Options.
      • Predicted Result Probability Column: set the parameter to detail.
  4. Run the experiment and view the evaluation report generated by the Multiclass Classification Evaluation component:
    • Click the Overview tab to view the overview of the evaluation report.
    • Click the Confusion Matrix tab to view the confusion matrix.
    • Click the Proportion Matrix tab to view the proportion matrix.
    • Click the Statistics tab to view the statistics of the model.
    The following code shows the evaluation report in the JSON format:
    {
        "ActualLabelFrequencyList": [5, 5],
        "ActualLabelProportionList": [0.5, 0.5],
        "ConfusionMatrix": [
            [4, 1],
            [2, 3]
        ],
        "LabelList": ["A", "B"],
        "LabelMeasureList": [
            {
                "Accuracy": 0.7,
                "Auc": 0.9,
                "F1": 0.7272727272727273,
                "FalseDiscoveryRate": 0.3333333333333333,
                "FalseNegative": 1,
                "FalseNegativeRate": 0.2,
                "FalsePositive": 2,
                "FalsePositiveRate": 0.4,
                "Kappa": 0.3999999999999999,
                "NegativePredictiveValue": 0.75,
                "Precision": 0.6666666666666666,
                "Sensitivity": 0.8,
                "Specificity": 0.6,
                "TrueNegative": 3,
                "TruePositive": 4
            },
            {
                "Accuracy": 0.7,
                "Auc": 0.9,
                "F1": 0.6666666666666666,
                "FalseDiscoveryRate": 0.25,
                "FalseNegative": 2,
                "FalseNegativeRate": 0.4,
                "FalsePositive": 1,
                "FalsePositiveRate": 0.2,
                "Kappa": 0.3999999999999999,
                "NegativePredictiveValue": 0.6666666666666666,
                "Precision": 0.75,
                "Sensitivity": 0.6,
                "Specificity": 0.8,
                "TrueNegative": 4,
                "TruePositive": 3
            }
        ],
        "LabelNumber": 2,
        "OverallMeasures": {
            "Accuracy": 0.7,
            "Kappa": 0.3999999999999999,
            "LabelFrequencyBasedMicro": {
                "Accuracy": 0.7,
                "F1": 0.696969696969697,
                "FalseDiscoveryRate": 0.2916666666666666,
                "FalseNegative": 1.5,
                "FalseNegativeRate": 0.3,
                "FalsePositive": 1.5,
                "FalsePositiveRate": 0.3,
                "Kappa": 0.3999999999999999,
                "NegativePredictiveValue": 0.7083333333333333,
                "Precision": 0.7083333333333333,
                "Sensitivity": 0.7,
                "Specificity": 0.7,
                "TrueNegative": 3.5,
                "TruePositive": 3.5
            },
            "LogLoss": 0.4548640449724484,
            "MacroAveraged": {
                "Accuracy": 0.7,
                "F1": 0.696969696969697,
                "FalseDiscoveryRate": 0.2916666666666666,
                "FalseNegative": 1.5,
                "FalseNegativeRate": 0.3,
                "FalsePositive": 1.5,
                "FalsePositiveRate": 0.3,
                "Kappa": 0.3999999999999999,
                "NegativePredictiveValue": 0.7083333333333333,
                "Precision": 0.7083333333333333,
                "Sensitivity": 0.7,
                "Specificity": 0.7,
                "TrueNegative": 3.5,
                "TruePositive": 3.5
            },
            "MicroAveraged": {
                "Accuracy": 0.7,
                "F1": 0.7,
                "FalseDiscoveryRate": 0.3,
                "FalseNegative": 3,
                "FalseNegativeRate": 0.3,
                "FalsePositive": 3,
                "FalsePositiveRate": 0.3,
                "Kappa": 0.3999999999999999,
                "NegativePredictiveValue": 0.7,
                "Precision": 0.7,
                "Sensitivity": 0.7,
                "Specificity": 0.7,
                "TrueNegative": 7,
                "TruePositive": 7
            }
        },
        "PredictedLabelFrequencyList": [6, 4],
        "PredictedLabelProportionList": [0.6, 0.4],
        "ProportionMatrix": [
            [0.8, 0.2],
            [0.4, 0.6]
        ]
    }
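As a sanity check, the headline numbers in the report above can be re-derived from the ten test rows, assuming the standard definitions of accuracy, Cohen's kappa, and log loss. This is a sketch, not the component's code:

```python
import json
import math

# Hedged re-derivation of Accuracy, Kappa, and LogLoss from the example's
# ten test rows, assuming the standard definitions of those metrics.
rows = [  # (label, prediction, detail)
    ("A", "A", '{"A": 0.6, "B": 0.4}'),
    ("A", "B", '{"A": 0.45, "B": 0.55}'),
    ("A", "A", '{"A": 0.7, "B": 0.3}'),
    ("A", "A", '{"A": 0.9, "B": 0.1}'),
    ("B", "B", '{"A": 0.2, "B": 0.8}'),
    ("B", "B", '{"A": 0.1, "B": 0.9}'),
    ("B", "A", '{"A": 0.52, "B": 0.48}'),
    ("B", "B", '{"A": 0.4, "B": 0.6}'),
    ("B", "A", '{"A": 0.6, "B": 0.4}'),
    ("A", "A", '{"A": 0.75, "B": 0.25}'),
]
n = len(rows)
accuracy = sum(lab == pred for lab, pred, _ in rows) / n
# Chance agreement: sum over labels of (actual proportion * predicted proportion).
pe = sum(
    (sum(l == x for l, _, _ in rows) / n) * (sum(p == x for _, p, _ in rows) / n)
    for x in ("A", "B")
)
kappa = (accuracy - pe) / (1 - pe)
# Mean negative log probability assigned to the true label.
log_loss = sum(-math.log(json.loads(d)[lab]) for lab, _, d in rows) / n
print(accuracy, kappa, round(log_loss, 6))
```

Run as-is, this should reproduce the report's Accuracy (0.7), Kappa (about 0.4), and LogLoss (about 0.454864), which suggests these definitions match what the component computes.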