Assistant Engineer

Establish a TensorFlow Serving cluster easily using Docker and Alibaba Cloud Container Service

Posted time: Jan 13, 2017 14:18

This series will utilize Docker and Alibaba Cloud Container Service to help you get started with TensorFlow machine learning schemes.
• Article 1: Create a TensorFlow experimental environment
• Article 2: Establish a TensorFlow Serving cluster easily - this article
• Article 3: Streamline a TensorFlow continuous training pipeline
This article is the second in the series. It aims to quickly familiarize you with the principles and usage of TensorFlow Serving, and shows how to easily set up a TensorFlow Serving cluster on the cloud with Alibaba Cloud Container Service.
TensorFlow Serving is an open-source, flexible, high-performance machine learning model serving system from Google. It simplifies and accelerates the path from trained model to production service. Besides its native support for TensorFlow models, TensorFlow Serving can also serve other types of machine learning models through extensions.
The typical TensorFlow Serving workflow is as follows: a learner, such as TensorFlow, trains a model on the input data. Once the model is trained and validated, it is published to the TensorFlow Serving server side. The client sends a request and the server returns a prediction result; client and server communicate over the gRPC protocol.

The original figure is from First Contact With TensorFlow.
An example of running TensorFlow Serving on a local machine
TensorFlow Serving can also be installed and used through Docker, but at the time of writing there is neither an official image nor a Dockerfile for automated image building. You need to build TensorFlow Serving images yourself.
To simplify the deployment, I provide two pre-built TensorFlow Serving example images for testing.
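For reference, a base image like the one above can be built roughly along the following lines. This is only a sketch of the era's build process, not the exact Dockerfile behind the image: the base OS, package list, and Bazel setup are assumptions.

```dockerfile
# Hedged sketch of building TensorFlow Serving from source (circa 2016/2017).
# Assumes Bazel has been installed separately in the image (omitted here).
FROM ubuntu:16.04

# Build tools and Python dependencies needed to compile the serving binaries.
RUN apt-get update && apt-get install -y \
    build-essential curl git python-dev python-pip openjdk-8-jdk \
    && pip install grpcio

# Check out the serving source (with its TensorFlow submodule) and build
# the model server plus the example clients with Bazel.
RUN git clone --recurse-submodules https://github.com/tensorflow/serving /serving
WORKDIR /serving
RUN bazel build -c opt tensorflow_serving/...

CMD ["/serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server"]
```

The Inception image would then layer the exported Inception model on top of this base and point the model server at its export directory.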
• registry.cn-hangzhou.aliyuncs.com/denverdino/tensorflow-serving: the base image of TensorFlow Serving
• registry.cn-hangzhou.aliyuncs.com/denverdino/inception-serving: the service image with Inception model added based on the above base image
We use the Docker command to start a container named “inception-serving” as the TF Serving server:
docker run -d --name inception-serving registry.cn-hangzhou.aliyuncs.com/denverdino/inception-serving
Then we start the “tensorflow-serving” image interactively as the client, defining a container link so that the “inception-serving” container can be reached under the alias “serving” from inside the client container:
docker run -ti --name client --link inception-serving:serving registry.cn-hangzhou.aliyuncs.com/denverdino/tensorflow-serving
In the client container, execute the following commands to run image recognition against the “inception-serving” service:
# persian cat
curl http://f.hiphotos.baidu.com/baike/w%3D268%3Bg%3D0/sign=6268660aafec8a13141a50e6cf38f6b2/32fa828ba61ea8d3c85b36e1910a304e241f58dd.jpg -o persian_cat_image.jpg

/serving/bazel-bin/tensorflow_serving/example/inception_client --server=serving:9000 --image=$PWD/persian_cat_image.jpg

# garfield cat
curl http://a2.att.hudong.com/60/11/01300000010387125853110118750_s.jpg -o garfield_image.jpg

/serving/bazel-bin/tensorflow_serving/example/inception_client --server=serving:9000 --image=$PWD/garfield_image.jpg

Note: The client code “inception_client.py” can access the gRPC service provided by the “inception-serving” container through “serving:9000”.
The Inception model classifies both of our cats correctly.
The computing capacity of a single TensorFlow Serving node is limited, so a cluster is required in production to achieve load balancing and high availability. TensorFlow currently provides a Kubernetes-based cluster deployment prototype, and other container orchestration technologies are supported as well.
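The load balancer's basic job can be pictured as round-robin dispatch over the serving replicas. A toy sketch (the replica addresses are made up for illustration):

```python
from itertools import cycle

# Hypothetical addresses of the three serving replicas behind the balancer.
replicas = ["10.0.0.1:9000", "10.0.0.2:9000", "10.0.0.3:9000"]

# Round-robin: each incoming request is handed to the next replica in turn,
# so load spreads evenly and a request never depends on one fixed node.
next_replica = cycle(replicas).__next__

# Six requests end up assigned two per replica.
assigned = [next_replica() for _ in range(6)]
print(assigned)
```

A real SLB instance additionally health-checks back ends and drops failed nodes, which is what gives the cluster its high availability.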
Deploy a TensorFlow Serving distributed cluster using the Container Service
Alibaba Cloud Container Service provides simple yet powerful container orchestration for deploying and managing TensorFlow Serving clusters on the cloud, with load balancing provided through Alibaba Cloud Server Load Balancer (SLB).
We can deploy a TensorFlow Serving distributed cluster on Alibaba Cloud in one click using the following docker-compose template:
version: '2'
services:
  inception-serving:
    image: registry.cn-hangzhou.aliyuncs.com/denverdino/inception-serving
    ports:
      - 9000:9000
    labels:
      aliyun.scale: "3"
      aliyun.lb.port_9000: tcp://inception-serving:9000

Note: The Alibaba Cloud extension labels are as follows.
• aliyun.scale indicates that three container instances are needed to provide the TensorFlow Serving service.
• aliyun.lb.port_9000 provides load balancing for the container's service port 9000 through the SLB instance named “inception-serving”.
First, we need to create a load balancing (SLB) instance and name it “inception-serving”.
Then add a TCP listener on port 9000, with the corresponding back-end port also set to 9000.

Deployment of the orchestration template completes in a few minutes. Port 9000 of every “serving” container is exposed on its host machine, and Container Service automatically binds the corresponding nodes to the “inception-serving” SLB instance as back-end servers.

We can now execute the following command in the client container created earlier on the local machine to send a prediction request to the service on Alibaba Cloud. Note: replace the gRPC server address in the request with the address of the load balancing instance.
/serving/bazel-bin/tensorflow_serving/example/inception_client --server=<SLB_IP>:9000 --image=$PWD/garfield_image.jpg
The execution results are as follows:
D0922 14:31:39.463336540      31 ev_posix.c:101]             Using polling engine: poll
outputs {
  key: "classes"
  value {
    dtype: DT_STRING
    tensor_shape {
      dim {
        size: 1
      }
      dim {
        size: 5
      }
    }
    string_val: "tabby, tabby cat"
    string_val: "Egyptian cat"
    string_val: "tiger cat"
    string_val: "Persian cat"
    string_val: "lynx, catamount"
  }
}
outputs {
  key: "scores"
  value {
    dtype: DT_FLOAT
    tensor_shape {
      dim {
        size: 1
      }
      dim {
        size: 5
      }
    }
    float_val: 8.45185947418
    float_val: 7.37638807297
    float_val: 7.24321079254
    float_val: 7.21496248245
    float_val: 4.0578494072
  }
}

E0922 14:31:41.027554353      31 chttp2_transport.c:1810]    close_transport: {"created":"@1474554701.027514401","description":"FD shutdown","file":"src/core/lib/iomgr/ev_poll_posix.c","file_line":427}
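The response carries two parallel tensors, "classes" and "scores", which line up element by element. Pairing them (values copied from the output above) gives the top-5 ranking:

```python
# Values copied from the gRPC response printed above.
classes = ["tabby, tabby cat", "Egyptian cat", "tiger cat",
           "Persian cat", "lynx, catamount"]
scores = [8.45185947418, 7.37638807297, 7.24321079254,
          7.21496248245, 4.0578494072]

# The server returns the top-5 already sorted by descending score;
# zipping the two parallel lists yields (label, score) pairs.
top5 = list(zip(classes, scores))
for label, score in top5:
    print(f"{score:.2f}  {label}")
```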

Our Garfield is easily identified.
With Alibaba Cloud Container Service, we can quickly test and deploy deep learning applications on the cloud and bring machine learning closer to the public. Alibaba Cloud offers rich infrastructure for machine learning, from elastic computing and load balancing to object storage, logs and monitoring. Container Service can elegantly integrate these capabilities to unleash the power of deep learning applications.
Meanwhile, TensorFlow Serving is well suited for continuous training and for dynamically adjusting multiple models based on real data. Combined with the DevOps strengths of Alibaba Cloud Container Service, it can simplify and optimize the model testing and release processes.
Alibaba Cloud Container Service will also work with the high-performance computing (HPC) team to provide machine learning solutions that integrate GPU acceleration with Docker cluster management on Alibaba Cloud, in a bid to further improve machine learning efficiency in the cloud.