
Container Service for Kubernetes: Evaluate models

Last Updated: Oct 30, 2023

This topic describes how to evaluate the performance of a model, such as the accuracy and recall of the model, and how to view and compare model evaluation results.

Prerequisites

Create an evaluation job

  1. Log on to the AI Developer Console.
  2. In the left-side navigation pane of the AI Developer Console, click Model Manage.
  3. In the Model Manage List section, find the model that you want to evaluate and click New Model Evaluate in the Operation column.

  4. In the EvaluateJob Message and EvaluateJob Config sections, configure the parameters.

    EvaluateJob Message section:

      EvaluateJob Name: The name of the evaluation job. The name must be 1 to 256 characters in length, and can contain digits, letters, and hyphens (-).

      EvaluateJob Image: The base image used by the evaluation job.

      Namespace: The namespace to which the evaluation job belongs.

      Image Pull Secrets: The credentials for pulling private images. This parameter is optional.

      Data Configuration: The data source of the evaluation job. This parameter is optional. If you want to use a PVC, you must specify a data source.

      Model Path: The path of the model that you want to evaluate.

      Dataset Path: The path of the dataset that is used by the evaluation job.

      Metrics Path: The path of the evaluation results generated by the evaluation job.

      Execution Command: The command that you want the pods of the evaluation job to run.

      Code Configuration: The Git repository from which code is pulled. This parameter is required only if you want to pull code from Git.

    EvaluateJob Config section:

      CPU (Cores): The amount of CPU resources requested by the evaluation job.

      Memory (GB): The amount of memory requested by the evaluation job.

      GPU (Card Numbers): The number of GPUs requested by the evaluation job.

    The preceding parameters correspond to the parameters for submitting evaluation jobs in Arena. You can use the default evaluation job code or customize the code. For more information about how to write evaluation job code, see Write evaluation job code.

  5. Click Submit Evaluation Job.

    You can view information about the evaluation job on the Evaluate Jobs page.

Write evaluation job code

Procedure

Perform the following steps to write custom evaluation job code. For more information about the sample code, see Sample evaluation job code.

  1. Run the following command to install the latest KubeAI package:

    pip install kubeai
  2. Import the Evaluator class, the ABC class, and the KubeAI API in your evaluation code:

    from kubeai.evaluate.evaluator import Evaluator
    from abc import ABC
    from kubeai.api import KubeAI 
  3. Define a custom evaluator class that inherits from the Evaluator abstract class and override the preprocess_dataset, load_model, evaluate_model, and report_metrics methods. These methods preprocess the dataset, load the model, evaluate the model, and export the evaluation report.

    class CustomerEvaluatorDemo(Evaluator, ABC):
        def preprocess_dataset(self):  # Preprocess the dataset and return it.
            ...
        def load_model(self):  # Load the model that you want to evaluate.
            ...
        def evaluate_model(self, dataset):  # Evaluate the model and return the metrics.
            ...
        def report_metrics(self, metrics):  # Export the evaluation report.
            ...
  4. Create an evaluator object and pass it to the KubeAI.evaluate method to run the evaluation job.

    customer_evaluator = CustomerEvaluatorDemo()
    KubeAI.evaluate(customer_evaluator)

    If you want to test the evaluator on your on-premises machine, call the KubeAI.test method and pass the model_dir, dataset_dir, and report_dir parameters:

    customer_evaluator = CustomerEvaluatorDemo()
    KubeAI.test(customer_evaluator, model_dir, dataset_dir, report_dir)
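
    The following snippet is a minimal sketch of such an on-premises test. The three path values are placeholders used only for illustration (they are not defined in this topic); replace them with the actual locations on your machine.

    # On-premises test sketch. The paths below are placeholders, not values from this topic.
    model_dir = "/path/to/model"      # Directory that contains the trained model.
    dataset_dir = "/path/to/dataset"  # Dataset file or directory used for the evaluation.
    report_dir = "/path/to/report"    # Directory to which the evaluation report is written.
    customer_evaluator = CustomerEvaluatorDemo()
    KubeAI.test(customer_evaluator, model_dir, dataset_dir, report_dir)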

Sample evaluation job code

The following sample code evaluates a model that is trained on the MNIST dataset by using TensorFlow 1.15.

from kubeai.evaluate.evaluator import Evaluator
from abc import ABC
from kubeai.api import KubeAI
import tensorflow as tf
import numpy as np
from tensorflow.keras import layers, models
from sklearn import metrics as mt  # Used below to compute the confusion matrix.
from kubeai.evaluate.utils import Utils  # Helper functions used below. This import path is an assumption; adjust it to match your kubeai version.

class CNN(object):
    def __init__(self):
        model = models.Sequential()
        model.add(layers.Conv2D(
            32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
        model.add(layers.MaxPooling2D((2, 2)))
        model.add(layers.Conv2D(64, (3, 3), activation='relu'))
        model.add(layers.MaxPooling2D((2, 2)))
        model.add(layers.Conv2D(64, (3, 3), activation='relu'))
        model.add(layers.Flatten())
        model.add(layers.Dense(64, activation='relu'))
        model.add(layers.Dense(10, activation='softmax'))
        model.summary()
        self.model = model

class TensorflowEvaluatorDemo(Evaluator, ABC):

    def preprocess_dataset(self):   #Preprocess the dataset. 
        with np.load(self.dataset_dir) as f:
            x_train, y_train = f['x_train'], f['y_train']
            x_test, y_test = f['x_test'], f['y_test']

        train_images = x_train.reshape((60000, 28, 28, 1))
        test_images = x_test.reshape((10000, 28, 28, 1))
        train_images, test_images = train_images / 255.0, test_images / 255.0

        train_images, train_labels = train_images, y_train
        test_images, test_labels = test_images, y_test
        test_loader = {
            "test_images" : test_images,
            "test_labels" : test_labels
        }
        return test_loader

    def load_model(self):  #Load the model. 
        latest = tf.train.latest_checkpoint(self.model_dir)
        self.cnn = CNN()
        self.model = self.cnn.model
        self.model.load_weights(latest)

    def evaluate_model(self, dataset):  #Evaluate the model.  
        metrics = Utils.evaluate_function_classification_tensorflow1(model=self.model, evaluate_x=dataset["test_images"], evaluate_y=dataset["test_labels"])
        predictions = self.model.predict(dataset["test_images"])
        pred = []
        for arr in predictions:
            pred.append(np.argmax(arr))
        pred = np.array(pred)
        confusion_matrix = mt.confusion_matrix(dataset["test_labels"], pred)
        metrics["Confusion_matrix"] = str(confusion_matrix)
        return metrics

    def report_metrics(self, metrics):  #Export the evaluation report. 
        print(metrics)
        Utils.ROC_plot(fpr=metrics["ROC"]["fpr"], tpr=metrics["ROC"]["tpr"], report_dir=self.report_dir)
        print("Here is the customer-defined report method")

if __name__ == '__main__':
    tensorflow_evaluator = TensorflowEvaluatorDemo()  # Create the evaluator object.
    KubeAI.evaluate(tensorflow_evaluator)  # Run the evaluation job.
    # KubeAI.test(tensorflow_evaluator, model_dir, dataset_dir, report_dir)  # Test on your on-premises machine.
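
In the sample, the evaluator reads its inputs and writes its outputs through the model_dir, dataset_dir, and report_dir attributes. When the job runs in the cluster, these attributes presumably take the values of the Model Path, Dataset Path, and Metrics Path parameters that you configure in the console. The following comments summarize this assumed mapping; verify it against your KubeAI version.

# Assumed mapping between console parameters and evaluator attributes (based on how the sample uses them):
#   Model Path   -> self.model_dir    # load_model reads TensorFlow checkpoints from this path.
#   Dataset Path -> self.dataset_dir  # preprocess_dataset loads the MNIST .npz file from this path.
#   Metrics Path -> self.report_dir   # report_metrics writes the ROC plot to this path.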

View evaluation metrics

  1. In the left-side navigation pane of the AI Developer Console, click Evaluate Jobs.

    Figure: View evaluation jobs
  2. In the Job List section, click the name of an evaluation job to view its metrics.

    Figure: View evaluation metrics

    The preceding figure shows the following metrics: Accuracy, Precision, Recall, F1_score, ROC, and AOC. F1_score is the harmonic mean of Precision and Recall. The Receiver Operating Characteristic (ROC) curve shows the performance of the model at different classification thresholds. AOC indicates the area under the ROC curve (commonly abbreviated AUC), which summarizes the ROC curve as a single value.
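
    The following sketch (not part of the KubeAI sample) shows how these metrics are typically computed with scikit-learn. The y_true and y_score arrays are hypothetical values used only for illustration.

    import numpy as np
    from sklearn import metrics as mt

    # Hypothetical ground-truth labels and predicted probabilities for a binary classifier.
    y_true = np.array([0, 1, 1, 0, 1, 0, 1, 0])
    y_score = np.array([0.1, 0.9, 0.8, 0.3, 0.4, 0.2, 0.7, 0.6])
    y_pred = (y_score >= 0.5).astype(int)  # Threshold the scores to get predicted labels.

    accuracy = mt.accuracy_score(y_true, y_pred)
    precision = mt.precision_score(y_true, y_pred)
    recall = mt.recall_score(y_true, y_pred)
    f1 = mt.f1_score(y_true, y_pred)             # Harmonic mean of precision and recall.
    fpr, tpr, _ = mt.roc_curve(y_true, y_score)  # Points on the ROC curve.
    auc = mt.auc(fpr, tpr)                       # Area under the ROC curve.
    print(accuracy, precision, recall, f1, auc)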

Compare evaluation metrics

  1. In the left-side navigation pane of the AI Developer Console, click Evaluate Jobs.
  2. In the Job List section, select two or more evaluation jobs and click Metrics Compare in the upper-right corner of the section.

    Figure: Compare evaluation metrics

    On the page that appears, the metrics of the selected evaluation jobs are displayed in column charts. This provides a visual comparison that helps you select the best model among multiple candidates.

    Figure: Evaluation metrics comparison