
TensorFlow Service Request Construction

Last Updated: Jul 21, 2019

Input data description

Elastic Algorithm Service (EAS) provides the TensorFlow processor for you to deploy TensorFlow models. To guarantee model performance, both the input and output data must be serialized by using Protocol Buffers. The following example shows how to create a request to call a TensorFlow model that uses the built-in TensorFlow processor.

Examples

We have deployed a test service in the China (Shanghai) region. The service is accessible to all users in VPC networks in the China (Shanghai) region. The service name is mnist_saved_model_example, and no token is specified. You can call the service by sending requests to the following endpoint:

http://pai-eas-vpc.cn-shanghai.aliyuncs.com/api/predict/mnist_saved_model_example

1. Obtain the model information

You can send a GET request to obtain the model information. The response contains the signature_name and the name, type, and shape of the model input and output.

    $ curl http://pai-eas-vpc.cn-shanghai.aliyuncs.com/api/predict/mnist_saved_model_example | python -mjson.tool
    {
        "inputs": [
            {
                "name": "images",
                "shape": [
                    -1,
                    784
                ],
                "type": "DT_FLOAT"
            }
        ],
        "outputs": [
            {
                "name": "scores",
                "shape": [
                    -1,
                    10
                ],
                "type": "DT_FLOAT"
            }
        ],
        "signature_name": "predict_images"
    }
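The metadata can also be consumed programmatically. The following is a minimal sketch that parses the response above with the standard json module to recover the input specification; the JSON here is embedded as a string for illustration, whereas in practice you would obtain it from the GET request:

```python
import json

# The JSON body returned by the GET request in the example above.
metadata_json = '''
{
    "inputs": [{"name": "images", "shape": [-1, 784], "type": "DT_FLOAT"}],
    "outputs": [{"name": "scores", "shape": [-1, 10], "type": "DT_FLOAT"}],
    "signature_name": "predict_images"
}
'''

metadata = json.loads(metadata_json)
input_spec = metadata["inputs"][0]

# A request must use this signature name, input name, and shape.
print(metadata["signature_name"])               # predict_images
print(input_spec["name"], input_spec["shape"])  # images [-1, 784]
```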

This model is an MNIST classification model. The input data type is DT_FLOAT, and the input shape is [-1, 784]. The first dimension is the batch_size; if the request contains only one image, the batch_size is 1. The second dimension is a 784-dimensional vector. Because this model was trained on flattened input, each 28 × 28 image must be converted to a 784-dimensional vector before it is sent. If the model had instead been trained with an input shape of [-1, 28, 28], the request data would have to be kept as a 28 × 28 matrix and the shape set to [-1, 28, 28]. If the shape in the request does not match the shape of the model input, the request fails.
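As a concrete illustration of the shape handling described above, the sketch below flattens a 28 × 28 matrix into a 784-dimensional vector for a [-1, 784] input, and keeps the matrix (with a batch dimension of 1) for a [-1, 28, 28] input. The array here is dummy data standing in for a decoded MNIST image:

```python
import numpy as np

# Dummy data standing in for a decoded 28 x 28 grayscale MNIST image.
img = np.zeros((28, 28), dtype=np.float32)

# For a model input shape of [-1, 784]: flatten to a 784-dimensional
# vector and send it with a batch_size of 1, i.e. shape [1, 784].
flat = img.reshape(784)

# For a model input shape of [-1, 28, 28]: keep the matrix and prepend
# a batch dimension instead, i.e. shape [1, 28, 28].
batched = img.reshape(1, 28, 28)

print(flat.shape)     # (784,)
print(batched.shape)  # (1, 28, 28)
```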

2. Use the Python client to call the service

This example shows how to use the Python client to call a TensorFlow service. First, you must construct the request input in the Protocol Buffers format. We provide a Protocol Buffers package for the Python client. You can run the following command to install it:

    $ pip install http://eas-data.oss-cn-shanghai.aliyuncs.com/sdk/pai_tf_predict_proto-1.0-py2.py3-none-any.whl

The sample code is as follows:

    #!/usr/bin/env python
    # -*- coding: UTF-8 -*-
    import requests
    import cv2
    import numpy as np
    from pai_tf_predict_proto import tf_predict_pb2

    with open('2.jpg', 'rb') as infile:
        buf = infile.read()
        # Use NumPy to convert the bytes to an array.
        x = np.frombuffer(buf, dtype='uint8')
        # Decode the array to a 28 x 28 matrix.
        img = cv2.imdecode(x, cv2.IMREAD_UNCHANGED)
        # The service API only supports 784-dimensional vectors, so reshape
        # the matrix to a 784-dimensional vector.
        img = np.reshape(img, 784)

    # Build the request.
    request = tf_predict_pb2.PredictRequest()
    request.signature_name = 'predict_images'
    request.inputs['images'].dtype = tf_predict_pb2.DT_FLOAT   # Same as the inputs.type of the model
    request.inputs['images'].array_shape.dim.extend([1, 784])  # Same as the inputs.shape of the model
    request.inputs['images'].float_val.extend(img)             # The input data

    # Serialize the protobuf request to a byte string.
    data = request.SerializeToString()

    # Only users in VPC networks in the China (Shanghai) region can call an API
    # whose endpoint starts with pai-eas-vpc.cn-shanghai.
    url = 'http://pai-eas-vpc.cn-shanghai.aliyuncs.com/api/predict/mnist_saved_model_example'
    headers = {"Authorization": 'your-token'}  # In this example, no token is required.
    resp = requests.post(url, data=data, headers=headers)
    if resp.status_code != 200:
        print(resp.content)
    else:
        response = tf_predict_pb2.PredictResponse()
        response.ParseFromString(resp.content)
        print(response)

You can download the MNIST images used in the preceding example from http://eas-data.oss-cn-shanghai.aliyuncs.com/data/test_images.zip

The following shows the output:

    outputs {
      key: "scores"
      value {
        dtype: DT_FLOAT
        array_shape {
          dim: 1
          dim: 10
        }
        float_val: 0.0
        float_val: 0.0
        float_val: 1.0
        float_val: 0.0
        float_val: 0.0
        float_val: 0.0
        float_val: 0.0
        float_val: 0.0
        float_val: 0.0
        float_val: 0.0
      }
    }

The output lists the scores for the ten categories. Because the input image 2.jpg is the digit 2, the score at index 2 (value[2]) is 1.0 and all other values are 0. The classification result is correct.
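To turn the raw scores into a class label, take the index of the largest value. A minimal sketch, using the float_val list from the response above as plain Python data:

```python
# The ten scores returned for 2.jpg in the example response.
scores = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# The predicted class is the index of the highest score.
predicted = max(range(len(scores)), key=lambda i: scores[i])
print(predicted)  # 2
```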

3. Use a client for other languages to call the service

If you use a client for a language other than Python, you must generate the request code from the following tf.proto file by using Protocol Buffers:

    syntax = "proto3";
    option cc_enable_arenas = true;
    option java_package = "com.aliyun.openservices.eas.predict.proto";
    option java_outer_classname = "PredictProtos";

    enum ArrayDataType {
      // Not a legal value for DataType. Used to indicate a DataType field
      // has not been set.
      DT_INVALID = 0;
      // Data types that all computation devices are expected to be
      // capable to support.
      DT_FLOAT = 1;
      DT_DOUBLE = 2;
      DT_INT32 = 3;
      DT_UINT8 = 4;
      DT_INT16 = 5;
      DT_INT8 = 6;
      DT_STRING = 7;
      DT_COMPLEX64 = 8;  // Single-precision complex
      DT_INT64 = 9;
      DT_BOOL = 10;
      DT_QINT8 = 11;     // Quantized int8
      DT_QUINT8 = 12;    // Quantized uint8
      DT_QINT32 = 13;    // Quantized int32
      DT_BFLOAT16 = 14;  // Float32 truncated to 16 bits. Only for cast ops.
      DT_QINT16 = 15;    // Quantized int16
      DT_QUINT16 = 16;   // Quantized uint16
      DT_UINT16 = 17;
      DT_COMPLEX128 = 18;  // Double-precision complex
      DT_HALF = 19;
      DT_RESOURCE = 20;
      DT_VARIANT = 21;  // Arbitrary C++ data types
    }

    // Dimensions of an array.
    message ArrayShape {
      repeated int64 dim = 1 [packed = true];
    }

    // Protocol buffer representing an array.
    message ArrayProto {
      // Data type.
      ArrayDataType dtype = 1;
      // Shape of the array.
      ArrayShape array_shape = 2;
      // DT_FLOAT.
      repeated float float_val = 3 [packed = true];
      // DT_DOUBLE.
      repeated double double_val = 4 [packed = true];
      // DT_INT32, DT_INT16, DT_INT8, DT_UINT8.
      repeated int32 int_val = 5 [packed = true];
      // DT_STRING.
      repeated bytes string_val = 6;
      // DT_INT64.
      repeated int64 int64_val = 7 [packed = true];
      // DT_BOOL.
      repeated bool bool_val = 8 [packed = true];
    }

    // PredictRequest specifies which TensorFlow model to run, as well as
    // how inputs are mapped to tensors and how outputs are filtered before
    // returning to user.
    message PredictRequest {
      // A named signature to evaluate. If unspecified, the default signature
      // will be used.
      string signature_name = 1;
      // Input tensors.
      // Names of input tensors are alias names. The mapping from aliases to real
      // input tensor names is expected to be stored as named generic signature
      // under the key "inputs" in the model export.
      // Each alias listed in a generic signature named "inputs" should be provided
      // exactly once in order to run the prediction.
      map<string, ArrayProto> inputs = 2;
      // Output filter.
      // Names specified are alias names. The mapping from aliases to real output
      // tensor names is expected to be stored as named generic signature under
      // the key "outputs" in the model export.
      // Only tensors specified here will be run/fetched and returned, with the
      // exception that when none is specified, all tensors specified in the
      // named signature will be run/fetched and returned.
      repeated string output_filter = 3;
    }

    // Response for PredictRequest on successful run.
    message PredictResponse {
      // Output tensors.
      map<string, ArrayProto> outputs = 1;
    }

For more information about how to use Protocol Buffers, see https://developers.google.com/protocol-buffers/. PredictRequest defines the input format of the TensorFlow service and PredictResponse defines the output format of the service.

Install Protocol Buffers

You can run the following script to install Protocol Buffers:

    #!/bin/bash
    PROTOC_ZIP=protoc-3.3.0-linux-x86_64.zip
    curl -OL https://github.com/google/protobuf/releases/download/v3.3.0/$PROTOC_ZIP
    unzip -o $PROTOC_ZIP -d ./ bin/protoc
    rm -f $PROTOC_ZIP

Generate a request code file

  • Java:
    $ bin/protoc --java_out=./ tf.proto

The request code file com/aliyun/openservices/eas/predict/proto/PredictProtos.java is generated in the current directory. Import the file into your project.

  • Python:
    $ bin/protoc --python_out=./ tf.proto

The request code file tf_pb2.py is generated in the current directory. Import the file into your project.

  • C++:
    $ bin/protoc --cpp_out=./ tf.proto

The request code files tf.pb.cc and tf.pb.h are generated in the current directory. Include tf.pb.h in your code and add tf.pb.cc to the compile list.