Machine learning practice based on KubeVela

In the current wave of machine learning, AI engineers not only train and debug their own models, they also need to deploy them online to verify how well they work (sometimes this part of the work is handled by AI system engineers). This work is tedious and takes extra effort from AI engineers.

In the cloud native era, model training and model services usually run in the cloud. This improves both scalability and resource utilization, which matters a great deal for machine learning scenarios that require large amounts of computing resources.

However, AI engineers often find cloud native capabilities difficult to use. Over time, the cloud native ecosystem has become increasingly complex. To deploy even a simple model service in a cloud native environment, AI engineers may need to learn several additional concepts, such as Deployment, Service, and Ingress.

As a simple, easy-to-use, and highly extensible cloud native application management tool, KubeVela lets developers define and deliver applications on Kubernetes quickly, without needing to understand the details of the underlying cloud native infrastructure. KubeVela is highly extensible, and its AI plugin provides model training, model services, and A/B testing, covering the basic needs of AI engineers and helping them run model training and model services in cloud native environments quickly.

This article shows how to use KubeVela's AI plugin to complete model training and model services more conveniently.

KubeVela AI plugin

The KubeVela AI plugin consists of two plugins: model training and model service. The model training plugin is based on Kubeflow's training operator and supports distributed model training for different frameworks such as TensorFlow, PyTorch, and MXNet. The model service plugin is based on Seldon Core, which makes it easy to start a model service from a trained model. It also supports advanced functions such as traffic distribution and A/B testing.

The KubeVela AI plugin greatly simplifies the deployment of model training tasks and model services. In addition, model training and model services can be combined with KubeVela's own workflow, multi-cluster, and other features to deploy production-ready services.

Note: You can find all the source code and YAML files in KubeVela Samples [1]. If you want to use the pre-trained models in this example, style-model.yaml and color-model.yaml in that folder copy the models into a PVC.

Model training

First, enable the two plugins for model training and model service (in KubeVela, plugins are enabled with the vela addon enable command).

The model training plugin provides two component types, model-training and jupyter-notebook, while the model service plugin provides the model-serving component type. You can view the parameters of these three components with the vela show command.

You can also consult the KubeVela AI plugin documentation [2] for more information.

Let's train a simple model with the TensorFlow framework that turns grayscale images into color images.
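A minimal sketch of such a training application might look like the following. It assumes the model-training component type accepts image, framework, and storage properties; confirm the exact parameters with vela show before using it. The application name, component name, image, and PVC name are placeholders for illustration, not values taken from the samples repository.

```yaml
# Sketch only: property names are assumptions to verify with `vela show model-training`.
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: training-serving          # placeholder application name
  namespace: default
spec:
  components:
    - name: demo-training         # placeholder component name
      type: model-training
      properties:
        # Image that contains the training code (hypothetical image name)
        image: example/train-color:v1
        # Framework used for training
        framework: tensorflow
        # Storage used to persist the trained model (assumed to create or mount a PVC)
        storage:
          - name: my-pvc
            mountPath: /model
```

A manifest like this would typically be deployed with vela up -f followed by the file name.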

After deployment, KubeVela starts a TFJob to train the model.

Training alone does not show how well the model works. Let's modify the YAML file and add a model service that starts after the model training step. Because the model service serves the model directly, and its inputs and outputs (ndarray or Tensor) are not very intuitive, we also deploy a test service that calls the model service and converts the results into images.
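The ordering described above can be sketched with the dependsOn field of a KubeVela component. The properties under model-serving (protocol, predictors, graph, modelUri) are assumptions modeled on Seldon Core's predictor layout and should be checked with vela show; all names, the image, and the PVC path are placeholders. The test service is left out here because its image and endpoint are specific to the samples.

```yaml
# Sketch only: model-serving property names are assumptions to verify with `vela show model-serving`.
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: training-serving          # placeholder application name
  namespace: default
spec:
  components:
    - name: demo-training         # placeholder component name
      type: model-training
      properties:
        image: example/train-color:v1   # hypothetical training image
        framework: tensorflow
        storage:
          - name: my-pvc
            mountPath: /model
    - name: demo-serving          # placeholder component name
      type: model-serving
      # Start the model service only after the training component is ready
      dependsOn:
        - demo-training
      properties:
        protocol: tensorflow
        predictors:
          - name: model
            replicas: 1
            graph:
              name: my-model
              implementation: tensorflow
              # Assumed URI scheme for loading the model from the PVC written by training
              modelUri: pvc://my-pvc
```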

Model service: canary testing

In addition to starting a model service directly, we can also serve multiple versions of a model in one model service and split traffic between them for canary testing.

In this deployment, both the v1 and v2 versions of the model are given 50% of the traffic. As before, we deploy a test service behind the model service.
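A 50/50 split between two model versions could be expressed roughly as follows. This is a sketch under the assumption that each predictor of the model-serving component accepts a traffic weight, as Seldon Core predictors do; the component name, predictor names, and PVC paths are placeholders.

```yaml
# Sketch only: the `traffic` field and PVC paths are assumptions to verify with `vela show model-serving`.
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: color-serving             # placeholder application name
  namespace: default
spec:
  components:
    - name: color-model-serving   # placeholder component name
      type: model-serving
      properties:
        protocol: tensorflow
        predictors:
          - name: model1
            replicas: 1
            traffic: 50           # v1 receives 50% of the requests
            graph:
              name: my-model
              implementation: tensorflow
              modelUri: pvc://color-model/model/v1   # hypothetical path to the v1 model
          - name: model2
            replicas: 1
            traffic: 50           # v2 receives the other 50%
            graph:
              name: my-model
              implementation: tensorflow
              modelUri: pvc://color-model/model/v2   # hypothetical path to the v2 model
```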

By distributing traffic across different versions of the model, we can compare their results and judge the models more reliably.

Model service: A/B testing

We can turn a black-and-white image into a color image with one model, or we can upload a second, style image and transfer its style onto the original image with another model.

Do users prefer colorized images or images rendered in a different style? We can explore this question with A/B testing.

In this deployment, setting customRouting forwards requests that carry style: transfer in the header to the style transfer model. The style transfer model and the colorization model share the same address.
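The header-based routing described above might be sketched like this. Only the customRouting setting and the style: transfer header come from the text; the header and serviceName sub-fields, the component names, and the PVC paths are assumptions for illustration and should be checked with vela show.

```yaml
# Sketch only: the sub-fields of `customRouting` are assumptions to verify with `vela show model-serving`.
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: color-style-ab-serving    # placeholder application name
  namespace: default
spec:
  components:
    - name: color-model-serving   # default target: the colorization model
      type: model-serving
      properties:
        protocol: tensorflow
        predictors:
          - name: model1
            replicas: 1
            graph:
              name: my-model
              implementation: tensorflow
              modelUri: pvc://color-model/model/v2   # hypothetical model path
    - name: style-model-serving   # target for requests that carry the custom header
      type: model-serving
      properties:
        protocol: tensorflow
        customRouting:
          # Requests with this header are routed to the style transfer model
          header: "style: transfer"
          # Reuse the address of the colorization service (assumed field name)
          serviceName: color-model-serving
        predictors:
          - name: model2
            replicas: 1
            graph:
              name: my-model
              implementation: tensorflow
              modelUri: pvc://style-model/model      # hypothetical model path
```

With a layout like this, a client would call a single address and choose the model by adding or omitting the style: transfer header.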


KubeVela's AI plugin helps you carry out model training and model services more conveniently.

In addition, KubeVela's multi-environment capabilities let us distribute a tested model to different environments, which makes model deployment more flexible.
