We recommend that you use the official Elastic Algorithm Service (EAS) SDKs provided by Machine Learning Platform for AI (PAI) to call services that are deployed based on models. The SDKs encapsulate the call logic, which reduces development effort and improves call stability. This topic describes EAS SDK for Python and provides demos that show how to use the SDK to call services whose inputs and outputs are of commonly used types.
Installation command
```bash
pip install -U eas-prediction --user
```
Methods
| Class | Method | Detailed information |
| --- | --- | --- |
| PredictClient | PredictClient(endpoint, service_name) | Creates a client object of the PredictClient class. The endpoint parameter specifies the endpoint of the server, and the service_name parameter specifies the name of the service to call. |
| | set_endpoint(endpoint) | Sets the endpoint of the server. |
| | set_service_name(service_name) | Sets the name of the service to call. |
| | set_endpoint_type(endpoint_type) | Sets the gateway type of the server. For example, specify ENDPOINT_TYPE_DIRECT to use a VPC direct connection channel. |
| | set_token(token) | Sets the token that is used to access the service. |
| | set_retry_count(max_retry_count) | Sets the maximum number of retries that are allowed after a request fails. |
| | set_max_connection_count(max_connection_count) | Sets the maximum number of connections in the connection pool of the client. |
| | set_timeout(timeout) | Sets the timeout period of a request. Unit: milliseconds. |
| | init() | Initializes the client object. The parameters that are set by the preceding methods take effect only after you call the init() method. |
| | predict(request) | Sends a prediction request to the service and returns the response. The request parameter specifies a request object, such as a StringRequest or TFRequest object. |
| StringRequest | StringRequest(request_data) | Creates an object of the StringRequest class. The request_data parameter specifies the request string to send. |
| StringResponse | to_string() | Returns the body of the StringResponse object as a string. |
| TFRequest | TFRequest(signature_name) | Creates an object of the TFRequest class. The signature_name parameter specifies the signature name of the model to call. |
| | add_feed(self, input_name, shape, data_type, content) | Specifies an input tensor of the TensorFlow model to call: its name, shape, data type, and data. |
| | add_fetch(self, output_name) | Specifies the name of an output tensor to export. If you do not call this method, all output tensors are exported. |
| | to_string() | Serializes the TFRequest object to a string. |
| TFResponse | get_tensor_shape(output_name) | Returns the shape of the specified output tensor. |
| | get_values(output_name) | Returns the data of the specified output tensor as a one-dimensional array. |
| TorchRequest | TorchRequest() | Creates an object of the TorchRequest class. |
| | add_feed(self, index, shape, data_type, content) | Specifies an input tensor of the PyTorch model to call by index: its shape, data type, and data. |
| | add_fetch(self, output_index) | Specifies the index of an output tensor to export. This method is optional. |
| | to_string() | Serializes the TorchRequest object to a string. |
| TorchResponse | get_tensor_shape(output_index) | Returns the shape of the output tensor at the specified index. |
| | get_values(output_index) | Returns the data of the output tensor at the specified index as a one-dimensional array. |
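The setter methods take effect only after init() is called, so you can group them before initialization. The following minimal sketch shows how the configuration methods from the table fit together; the endpoint, service name, token, and numeric values are illustrative placeholders, not recommended defaults.

```python
#!/usr/bin/env python
from eas_prediction import PredictClient

# Minimal configuration sketch. The endpoint, service name, token,
# and numeric values are illustrative placeholders.
client = PredictClient('http://<endpoint>', '<service_name>')
client.set_token('<token>')
client.set_retry_count(3)            # retry a failed request up to 3 times
client.set_max_connection_count(50)  # cap the client connection pool at 50
client.set_timeout(2000)             # request timeout in milliseconds
client.init()                        # settings take effect only after init()
```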
Demos
- Input and output as strings: If you use a custom processor to deploy a model as a service, strings are often used to call the service, for example, a service deployed based on a Predictive Model Markup Language (PMML) model. The following demo is for your reference. A sketch that parses the returned string follows the demo.
  ```python
  #!/usr/bin/env python
  from eas_prediction import PredictClient
  from eas_prediction import StringRequest

  if __name__ == '__main__':
      client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'scorecard_pmml_example')
      client.set_token('YWFlMDYyZDNmNTc3M2I3MzMwYmY0MmYwM2Y2MTYxMTY4NzBkNzdj****')
      client.init()

      request = StringRequest('[{"fea1": 1, "fea2": 2}]')
      for x in range(0, 1000000):
          resp = client.predict(request)
          print(resp)
  ```
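  If the service returns a JSON string, you can parse the body of the StringResponse object. The following sketch continues from the demo above (client and StringRequest are already defined) and assumes a JSON response; the exact layout depends on the model, so treat the parsed structure as illustrative.

  ```python
  import json

  # Assumption: the service returns JSON; the exact layout depends on the model.
  resp = client.predict(StringRequest('[{"fea1": 1, "fea2": 2}]'))
  result = json.loads(resp.to_string())  # to_string() returns the response body
  print(result)
  ```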
- Call a TensorFlow model: If you use TensorFlow to deploy a model as a service, you must use the TFRequest and TFResponse classes to call the service. The following demo is for your reference. A sketch that reads the response tensors follows the demo.
  ```python
  #!/usr/bin/env python
  from eas_prediction import PredictClient
  from eas_prediction import TFRequest

  if __name__ == '__main__':
      client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'mnist_saved_model_example')
      client.set_token('YTg2ZjE0ZjM4ZmE3OTc0NzYxZDMyNmYzMTJjZTQ1YmU0N2FjMTAy****')
      client.init()

      req = TFRequest('predict_images')
      req.add_feed('images', [1, 784], TFRequest.DT_FLOAT, [1] * 784)
      for x in range(0, 1000000):
          resp = client.predict(req)
          print(resp)
  ```
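  To read the prediction from the returned TFResponse object instead of printing the raw response, use the methods listed in the table. The following sketch continues from the demo above; the output name 'scores' is an assumption about the MNIST SavedModel signature, so replace it with the output name that your model's signature defines.

  ```python
  # Assumption: the signature exposes an output tensor named 'scores'.
  resp = client.predict(req)
  print(resp.get_tensor_shape('scores'))  # for example, [1, 10] for 10 classes
  print(resp.get_values('scores'))        # output data as a one-dimensional array
  ```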
- Use a VPC direct connection channel to call a service: You can use a VPC direct connection channel to access only services that are deployed in a dedicated resource group for EAS. In addition, the dedicated resource group and the specified vSwitch must be connected to the VPC. For more information, see Dedicated resource groups and VPC direct connection channel. Compared with the regular mode, this mode requires one additional line of code: `client.set_endpoint_type(ENDPOINT_TYPE_DIRECT)`. You can use this mode in high-concurrency, heavy-traffic scenarios. The following demo is for your reference:
  ```python
  #!/usr/bin/env python
  from eas_prediction import PredictClient
  from eas_prediction import TFRequest
  from eas_prediction import ENDPOINT_TYPE_DIRECT

  if __name__ == '__main__':
      client = PredictClient('http://pai-eas-vpc.cn-hangzhou.aliyuncs.com', 'mnist_saved_model_example')
      client.set_token('M2FhNjJlZDBmMzBmMzE4NjFiNzZhMmUxY2IxZjkyMDczNzAzYjFi****')
      client.set_endpoint_type(ENDPOINT_TYPE_DIRECT)  # enable the VPC direct connection channel
      client.init()

      request = TFRequest('predict_images')
      request.add_feed('images', [1, 784], TFRequest.DT_FLOAT, [1] * 784)
      for x in range(0, 1000000):
          resp = client.predict(request)
          print(resp)
  ```
- Call a PyTorch model: If you use PyTorch to deploy a model as a service, you must use the TorchRequest and TorchResponse classes to call the service. The following demo is for your reference. A sketch that reads the output values follows the demo.
  ```python
  #!/usr/bin/env python
  import time

  from eas_prediction import PredictClient
  from eas_prediction import TorchRequest

  if __name__ == '__main__':
      client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'pytorch_gpu_wl')
      client.init()

      req = TorchRequest()
      req.add_feed(0, [1, 3, 224, 224], TorchRequest.DT_FLOAT, [1] * 150528)
      # req.add_fetch(0)

      st = time.time()
      timer = 0
      for x in range(0, 10):
          resp = client.predict(req)
          timer += (time.time() - st)
          st = time.time()
          print(resp.get_tensor_shape(0))
      print("average response time: %s s" % (timer / 10))
  ```
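  The demo prints only the shape of output tensor 0. To inspect the predicted values as well, the TorchResponse methods from the table apply. This minimal sketch continues from the demo above:

  ```python
  # Read the output tensor at index 0 from the TorchResponse object.
  resp = client.predict(req)
  print(resp.get_tensor_shape(0))  # shape of the first output tensor
  print(resp.get_values(0))        # its data as a one-dimensional array
  ```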
- Call a Blade processor-based model: If you use the Blade processor to deploy a model as a service, you must use the BladeRequest and BladeResponse classes to call the service. The following demo is for your reference:
  ```python
  #!/usr/bin/env python
  import time

  from eas_prediction import PredictClient
  from eas_prediction import BladeRequest

  if __name__ == '__main__':
      client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'nlp_model_example')
      client.init()

      req = BladeRequest()
      req.add_feed('input_data', 1, [1, 360, 128], BladeRequest.DT_FLOAT, [0.8] * 85680)
      req.add_feed('input_length', 1, [1], BladeRequest.DT_INT32, [187])
      req.add_feed('start_token', 1, [1], BladeRequest.DT_INT32, [104])
      req.add_fetch('output', BladeRequest.DT_FLOAT)

      st = time.time()
      timer = 0
      for x in range(0, 10):
          resp = client.predict(req)
          timer += (time.time() - st)
          st = time.time()
          # print(resp.get_values('output'))
          print(resp.get_tensor_shape('output'))
      print("average response time: %s s" % (timer / 10))
  ```
- Call a Blade processor-based model that is compatible with default TensorFlow methods: You can use the TFRequest and TFResponse classes to call a Blade processor-based model that is compatible with the default TensorFlow methods supported by EAS. The following demo is for your reference:
  ```python
  #!/usr/bin/env python
  import time

  from eas_prediction import PredictClient
  from eas_prediction.blade_tf_request import TFRequest  # import the Blade-compatible TFRequest class

  if __name__ == '__main__':
      client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'nlp_model_example')
      client.init()

      req = TFRequest(signature_name='predict_words')
      req.add_feed('input_data', [1, 360, 128], TFRequest.DT_FLOAT, [0.8] * 85680)
      req.add_feed('input_length', [1], TFRequest.DT_INT32, [187])
      req.add_feed('start_token', [1], TFRequest.DT_INT32, [104])
      req.add_fetch('output')

      st = time.time()
      timer = 0
      for x in range(0, 10):
          resp = client.predict(req)
          timer += (time.time() - st)
          st = time.time()
          # print(resp.get_values('output'))
          print(resp.get_tensor_shape('output'))
      print("average response time: %s s" % (timer / 10))
  ```