The Kubeflow service of an E-MapReduce (EMR) Data Science cluster includes the built-in Seldon Core component, which serves models online on Kubernetes. With Kubernetes, you do not need to handle the O&M of online services yourself. You can run the dsdemo code to deploy TensorFlow, PyTorch, and Python models to Seldon Core.

Prerequisites

  • A Data Science cluster is created, and Kubeflow is selected from the optional services when you create the cluster. For more information, see Create a cluster.
  • A model is trained by calling the Keras API. For more information, see Use Kubeflow for model training.
  • The dsdemo code is downloaded. If you have created a Data Science cluster, you can join DingTalk group 32497587 to obtain the dsdemo code.

Procedure

Important This example uses an exported model that was trained by calling the Keras API. For more information, see Use Kubeflow for model training.
  1. Log on to your cluster in SSH mode. For more information, see Log on to a cluster.
  2. Run the following command to access the mnist_from_pvcmodel directory:
    cd dsdemo/kubeflow_samples/serving/seldon/tf/mnist_from_pvcmodel/
  3. Run the following command to install the Seldon Core Python package:
    sudo pip3.7 install seldon_core
  4. Configure the mnist_grpc.yaml file based on your business requirements. The following example shows the content of the file:
    apiVersion: machinelearning.seldon.io/v1alpha2
    kind: SeldonDeployment
    metadata:
      name: tfserving
    spec:
      name: mnist
      predictors:
      - graph:
          children: []
          implementation: TENSORFLOW_SERVER
          modelUri: "pvc://strategy-volume/saved_model/master/"
          name: mnist-model
          parameters:
            - name: signature_name
              type: STRING
              value: serving_default
            - name: model_name
              type: STRING
              value: mnist-model
            - name: model_input
              type: STRING
              value: images
            - name: model_output
              type: STRING
              value: scores
        name: default
        replicas: 1
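The parameters section of the manifest in step 4 repeats the same name/type/value triple for every setting. The following is a minimal sketch, assuming you prefer to generate the manifest instead of editing it by hand. The helper names build_parameters and build_deployment are hypothetical and are not part of dsdemo or Seldon Core. The output is JSON, which kubectl accepts in place of YAML.

```python
import json


def build_parameters(values):
    """Turn a {name: value} mapping into the name/type/value list used in the manifest."""
    return [{"name": name, "type": "STRING", "value": value}
            for name, value in values.items()]


def build_deployment(model_uri, params):
    """Return a dict that mirrors the mnist_grpc.yaml example (hypothetical helper)."""
    return {
        "apiVersion": "machinelearning.seldon.io/v1alpha2",
        "kind": "SeldonDeployment",
        "metadata": {"name": "tfserving"},
        "spec": {
            "name": "mnist",
            "predictors": [{
                "graph": {
                    "children": [],
                    "implementation": "TENSORFLOW_SERVER",
                    "modelUri": model_uri,
                    "name": "mnist-model",
                    "parameters": build_parameters(params),
                },
                "name": "default",
                "replicas": 1,
            }],
        },
    }


manifest = build_deployment(
    "pvc://strategy-volume/saved_model/master/",
    {
        "signature_name": "serving_default",
        "model_name": "mnist-model",
        "model_input": "images",
        "model_output": "scores",
    },
)

# Print the manifest as JSON so that it can be saved to a file.
print(json.dumps(manifest, indent=2))
```

Only the model URI and the four parameter values vary between models in this sketch, so they are the only arguments.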
  5. Run the following command to obtain the IP address of the Istio ingress gateway:
    kubectl get svc istio-ingressgateway -n istio-system
    Information similar to the following output is returned:
    # kubectl get svc istio-ingressgateway -n istio-system
    NAME                   TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)                                                                                                                                      AGE
    istio-ingressgateway   NodePort   10.104.**.**   <none>        15020:31872/TCP,80:31380/TCP,443:31390/TCP,31400:31400/TCP,15029:30016/TCP,15030:30264/TCP,15031:31961/TCP,15032:31309/TCP,15443:31254/TCP   86m
    Note The value of CLUSTER-IP is the IP address of the Istio ingress gateway.
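If you want to read the gateway address in a script instead of copying it by hand, the table printed in step 5 can be parsed as plain text. The following is a rough sketch under stated assumptions: the function name parse_ingress_service is hypothetical, and the sample IP address 10.104.0.1 is a placeholder because the address in the output above is masked. The sketch also extracts the node port that is mapped to port 80, which is the port to use if you reach the gateway through the IP address of a node.

```python
def parse_ingress_service(output):
    """Return (cluster_ip, {service_port: node_port}) from `kubectl get svc` table output."""
    # Skip blank lines and an echoed "# kubectl ..." prompt line, if present.
    lines = [ln for ln in output.splitlines()
             if ln.strip() and not ln.lstrip().startswith("#")]
    header = lines[0].split()
    row = lines[1].split()
    fields = dict(zip(header, row))

    node_ports = {}
    for item in fields["PORT(S)"].split(","):
        spec = item.split("/")[0]          # "80:31380/TCP" -> "80:31380"
        if ":" in spec:                    # only NodePort entries contain a colon
            port, node_port = spec.split(":")
            node_ports[int(port)] = int(node_port)
    return fields["CLUSTER-IP"], node_ports


# Placeholder sample that follows the layout of the output in step 5.
SAMPLE = """\
# kubectl get svc istio-ingressgateway -n istio-system
NAME                   TYPE       CLUSTER-IP   EXTERNAL-IP   PORT(S)                                      AGE
istio-ingressgateway   NodePort   10.104.0.1   <none>        15020:31872/TCP,80:31380/TCP,443:31390/TCP   86m
"""

cluster_ip, node_ports = parse_ingress_service(SAMPLE)
print(cluster_ip, node_ports[80])
```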
  6. Run model prediction over gRPC or REST.
    Note Before you run a script, change the value of minikube_ambassador_endpoint in predict_rest.py or predict_grpc.py to the IP address of the Istio ingress gateway that you obtained in Step 5.
    • Perform model prediction over gRPC.
      python3.7 predict_grpc.py
    • Perform model prediction over the REST protocol.
      python3.7 predict_rest.py
    Information similar to the following output is returned:
    Response:
    null
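To show what a REST prediction call to Seldon Core can look like, the following sketch builds the request by using only the Python standard library. It is not the content of predict_rest.py. The function name build_prediction_request, the default namespace, the api/v0.1 path segment, the placeholder gateway address 10.104.0.1, and the flattened 784-pixel input are all assumptions. Adjust them to match your deployment and the input shape of your model.

```python
import json
import urllib.request


def build_prediction_request(gateway, deployment, namespace="default",
                             api_version="v0.1", pixels=None):
    """Build a POST request for the prediction endpoint of a SeldonDeployment."""
    url = (f"http://{gateway}/seldon/{namespace}/{deployment}"
           f"/api/{api_version}/predictions")
    if pixels is None:
        # Assumption: one blank 28 x 28 MNIST image, flattened to 784 values.
        pixels = [0.0] * 784
    payload = {"data": {"ndarray": [pixels]}}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "tfserving" is metadata.name in mnist_grpc.yaml; the IP address is a placeholder.
request = build_prediction_request("10.104.0.1", "tfserving")
print(request.full_url)

# To send the request from a host that can reach the gateway, uncomment:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.load(response))
```

Building the request separately from sending it lets you inspect the URL and payload before any traffic reaches the cluster.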
    You can follow the preceding steps to deploy TensorFlow, PyTorch, and Python models to Seldon Core as online services.