We recommend that you use official Elastic Algorithm Service (EAS) SDKs to reduce the time required to define call logic and to improve call stability. This topic describes EAS SDK for Python, lists its commonly used methods and their inputs and outputs, and provides demos that show how to use the SDK to call services.
Install the SDK
pip install -U eas-prediction --user
Methods
Common parameter description
endpoint: the endpoint of the server.
- If you want to call a service in regular mode, set this parameter to the endpoint of the default gateway. Example: 182848887922***.cn-shanghai.pai-eas.aliyuncs.com.
- If you want to call a service over a Virtual Private Cloud (VPC) direct connection, set this parameter to the common endpoint of the region. For example, in the China (Shanghai) region, set this parameter to pai-eas-vpc.cn-shanghai.aliyuncs.com.
PredictClient
| Method | Description |
| --- | --- |
| PredictClient(endpoint, service_name) | Creates a client object. The endpoint parameter specifies the endpoint of the server, and the service_name parameter specifies the name of the service to call. |
| set_token(token) | Specifies the authentication token of the service. |
| set_endpoint_type(endpoint_type) | Specifies the network type of the server, for example, ENDPOINT_TYPE_DIRECT for a VPC direct connection. |
| init() | Initializes a client object. After all of the preceding methods are called, you must call the init() method for the settings to take effect. |
| predict(request) | Sends a prediction request, such as a StringRequest or TFRequest object, to the server and returns the corresponding response. |
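For reference, the methods above are typically combined in the following order. This is a minimal sketch; the endpoint, service name, and token are placeholders that you must replace with your own values.

```python
from eas_prediction import PredictClient
from eas_prediction import StringRequest

# Placeholder endpoint, service name, and token: replace them with your own values.
client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'your_service_name')
client.set_token('your_service_token')
client.init()  # must be called after all configuration methods

response = client.predict(StringRequest('[{}]'))
print(response)
```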
StringRequest
| Method | Description |
| --- | --- |
| StringRequest(request_data) | Creates an object of the StringRequest class. The request_data parameter specifies the request string to be sent, such as a JSON string. |
StringResponse
| Method | Description |
| --- | --- |
| to_string() | Converts the response body of the StringResponse object to a string. |
TFRequest
| Method | Description |
| --- | --- |
| TFRequest(signature_name) | Creates an object of the TFRequest class. The signature_name parameter specifies the signature name of the model to call. |
| add_feed(input_name, shape, data_type, content) | Specifies an input tensor of the TensorFlow model: its name, shape, data type (such as TFRequest.DT_FLOAT), and data. |
| add_fetch(output_name) | Specifies the name of an output tensor to export. This method is optional; if no output tensor is specified, all output tensors are exported. |
TFResponse
| Method | Description |
| --- | --- |
| get_tensor_shape(output_name) | Returns the shape of the output tensor that has the specified name. |
| get_values(output_name) | Returns the data of the output tensor that has the specified name. |
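To illustrate how TFRequest and TFResponse work together, the following minimal sketch sends one request and reads one output tensor. The endpoint and service name are placeholders; the signature name predict_images and the input tensor images are modeled on the demo later in this topic, and the output tensor name scores is an assumption that you must adapt to your own model.

```python
from eas_prediction import PredictClient
from eas_prediction import TFRequest

# Placeholder endpoint and service name: replace them with your own values.
client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'your_tf_service')
client.init()

req = TFRequest('predict_images')                                   # signature name of the model
req.add_feed('images', [1, 784], TFRequest.DT_FLOAT, [0.0] * 784)   # input tensor: name, shape, data type, data
req.add_fetch('scores')                                             # optional: export only this output tensor

resp = client.predict(req)
print(resp.get_tensor_shape('scores'))  # shape of the output tensor
print(resp.get_values('scores'))        # data of the output tensor
```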
TorchRequest
| Method | Description |
| --- | --- |
| TorchRequest() | Creates an object of the TorchRequest class. |
| add_feed(index, shape, data_type, content) | Specifies an input tensor of the PyTorch model by index: its shape, data type (such as TorchRequest.DT_FLOAT), and data. |
| add_fetch(output_index) | Specifies the index of an output tensor to export. This method is optional; if no output tensor is specified, all output tensors are exported. |
TorchResponse
| Method | Description |
| --- | --- |
| get_tensor_shape(output_index) | Returns the shape of the output tensor at the specified index. |
| get_values(output_index) | Returns the data of the output tensor at the specified index. |
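Unlike the TensorFlow classes, TorchRequest and TorchResponse address tensors by index rather than by name. A minimal sketch follows; the endpoint and service name are placeholders, and the input shape matches the PyTorch demo later in this topic.

```python
from eas_prediction import PredictClient
from eas_prediction import TorchRequest

# Placeholder endpoint and service name: replace them with your own values.
client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'your_torch_service')
client.init()

req = TorchRequest()
req.add_feed(0, [1, 3, 224, 224], TorchRequest.DT_FLOAT, [0.0] * (3 * 224 * 224))  # input tensor 0
req.add_fetch(0)                         # optional: export only output tensor 0

resp = client.predict(req)
print(resp.get_tensor_shape(0))          # shape of output tensor 0
```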
QueueClient
| Method | Description |
| --- | --- |
| QueueClient(endpoint, queue_name) | Creates a client object for the queuing service. The endpoint parameter specifies the endpoint of the server, and the queue_name parameter specifies the name of the queue. |
| set_token(token) | Specifies the authentication token of the queuing service. |
| init() | Initializes the client object. |
| set_timeout(timeout) | Specifies the request timeout period in milliseconds. |
| attributes() | Returns the attributes of the queue, such as the index of the last entry (stream.lastEntry). |
| put(data) | Sends data to the queue and returns the index and request ID of the data. |
| truncate(index) | Removes all data whose index is smaller than the specified index from the queue. |
| watch(index, window, auto_commit) | Subscribes to the data in the queue starting from the specified index and returns a Watcher object. The window parameter specifies the maximum number of entries that can be pushed before they are committed. If auto_commit is set to True, entries are committed automatically after they are pushed. |
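For reference, the following minimal sketch sends a single entry to a queue and reads the queue attributes. The endpoint, queue name, and token are placeholders that you must replace with your own values.

```python
from eas_prediction import QueueClient

# Placeholder endpoint, queue name, and token: replace them with your own values.
queue = QueueClient('182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'test_group.qservice/sink')
queue.set_token('your_queue_token')
queue.init()

index, request_id = queue.put('[{}]')   # send one entry to the queue
print(index, request_id)
print(queue.attributes())               # queue attributes such as 'stream.lastEntry'
```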
Watcher
| Method | Description |
| --- | --- |
| run() | Starts the watcher and returns a generator that yields the data pushed by the queuing service. |
| close() | Closes a watcher to terminate backend connections. Note: Only one watcher can be started for a single client. You must close the watcher before you can start another watcher. |
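For reference, the following minimal sketch subscribes to a queue with watch(), reads one entry, and then closes the watcher. The endpoint, queue name, and token are placeholders that you must replace with your own values.

```python
from eas_prediction import QueueClient

# Placeholder endpoint, queue name, and token: replace them with your own values.
queue = QueueClient('182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'test_group.qservice/sink')
queue.set_token('your_queue_token')
queue.init()

watcher = queue.watch(0, 5, auto_commit=True)   # subscribe from index 0 with a window of 5
for entry in watcher.run():                     # run() yields the entries pushed by the queue
    print(entry.index, entry.tags['requestId'])
    break                                       # stop after the first entry for brevity
watcher.close()                                 # close the watcher before starting another one
```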
Demos
Input and output as strings
If you use custom processors to deploy services, strings are often used to call the services. An example is a service deployed based on a Predictive Model Markup Language (PMML) model. Demo code:
```python
#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction import StringRequest

if __name__ == '__main__':
    client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'scorecard_pmml_example')
    client.set_token('YWFlMDYyZDNmNTc3M2I3MzMwYmY0MmYwM2Y2MTYxMTY4NzBkNzdj****')
    client.init()

    request = StringRequest('[{"fea1": 1, "fea2": 2}]')
    for x in range(0, 1000000):
        resp = client.predict(request)
        print(resp)
```
Input and output as tensors
If you use TensorFlow to deploy services, you must use the TFRequest and TFResponse classes to call the services. Demo code:
```python
#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction import StringRequest
from eas_prediction import TFRequest

if __name__ == '__main__':
    client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'mnist_saved_model_example')
    client.set_token('YTg2ZjE0ZjM4ZmE3OTc0NzYxZDMyNmYzMTJjZTQ1YmU0N2FjMTAy****')
    client.init()

    # request = StringRequest('[{}]')
    req = TFRequest('predict_images')
    req.add_feed('images', [1, 784], TFRequest.DT_FLOAT, [1] * 784)
    for x in range(0, 1000000):
        resp = client.predict(req)
        print(resp)
```
Use a VPC direct connection channel to call a service
You can use a VPC direct connection channel to access only services that are deployed in an EAS dedicated resource group. To use the VPC direct connection channel, the EAS dedicated resource group and the specified vSwitch must reside in the same VPC. For more information about how to purchase EAS dedicated resource groups and how to configure network connectivity, see Work with dedicated resource groups and Configure network connectivity. Compared with the regular mode, this mode requires one additional line of code: client.set_endpoint_type(ENDPOINT_TYPE_DIRECT). You can use this mode in high-concurrency and heavy-traffic scenarios. Demo code:
```python
#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction import StringRequest
from eas_prediction import TFRequest
from eas_prediction import ENDPOINT_TYPE_DIRECT

if __name__ == '__main__':
    client = PredictClient('http://pai-eas-vpc.cn-hangzhou.aliyuncs.com', 'mnist_saved_model_example')
    client.set_token('M2FhNjJlZDBmMzBmMzE4NjFiNzZhMmUxY2IxZjkyMDczNzAzYjFi****')
    client.set_endpoint_type(ENDPOINT_TYPE_DIRECT)
    client.init()

    request = TFRequest('predict_images')
    request.add_feed('images', [1, 784], TFRequest.DT_FLOAT, [1] * 784)
    for x in range(0, 1000000):
        resp = client.predict(request)
        print(resp)
```
Call a PyTorch model
If you use PyTorch to deploy services, you must use the TorchRequest and TorchResponse classes to call the services. Demo code:
```python
#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction import TorchRequest

if __name__ == '__main__':
    client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'pytorch_gpu_wl')
    client.init()

    req = TorchRequest()
    req.add_feed(0, [1, 3, 224, 224], TorchRequest.DT_FLOAT, [1] * 150528)
    # req.add_fetch(0)
    import time
    st = time.time()
    timer = 0
    for x in range(0, 10):
        resp = client.predict(req)
        timer += (time.time() - st)
        st = time.time()
        print(resp.get_tensor_shape(0))
        # print(resp)
    print("average response time: %s s" % (timer / 10))
```
Call a Blade processor-based model
If you use Blade processors to deploy services, you must use the BladeRequest and BladeResponse classes to call the services. Demo code:
```python
#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction import BladeRequest

if __name__ == '__main__':
    client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'nlp_model_example')
    client.init()

    req = BladeRequest()
    req.add_feed('input_data', 1, [1, 360, 128], BladeRequest.DT_FLOAT, [0.8] * 85680)
    req.add_feed('input_length', 1, [1], BladeRequest.DT_INT32, [187])
    req.add_feed('start_token', 1, [1], BladeRequest.DT_INT32, [104])
    req.add_fetch('output', BladeRequest.DT_FLOAT)
    import time
    st = time.time()
    timer = 0
    for x in range(0, 10):
        resp = client.predict(req)
        timer += (time.time() - st)
        st = time.time()
        # print(resp)
        # print(resp.get_values('output'))
        print(resp.get_tensor_shape('output'))
    print("average response time: %s s" % (timer / 10))
```
Call an EAS Blade processor-based model that is compatible with default TensorFlow methods
You can use the TFRequest and TFResponse classes to call a Blade processor-based model that is compatible with default TensorFlow methods supported by EAS. Demo code:
```python
#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction.blade_tf_request import TFRequest  # Import the Blade-compatible TFRequest class.

if __name__ == '__main__':
    client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'nlp_model_example')
    client.init()

    req = TFRequest(signature_name='predict_words')
    req.add_feed('input_data', [1, 360, 128], TFRequest.DT_FLOAT, [0.8] * 85680)
    req.add_feed('input_length', [1], TFRequest.DT_INT32, [187])
    req.add_feed('start_token', [1], TFRequest.DT_INT32, [104])
    req.add_fetch('output')
    import time
    st = time.time()
    timer = 0
    for x in range(0, 10):
        resp = client.predict(req)
        timer += (time.time() - st)
        st = time.time()
        # print(resp)
        # print(resp.get_values('output'))
        print(resp.get_tensor_shape('output'))
    print("average response time: %s s" % (timer / 10))
```
Use the queuing service to send and subscribe to data
You can send and query data in a queue, query the state of a queue, and subscribe to data pushed by a queue. In the following demo, a thread pushes data to a queue, and another thread uses a watcher to subscribe to the pushed data. Demo code:
```python
#!/usr/bin/env python
from eas_prediction import QueueClient
import threading

if __name__ == '__main__':
    endpoint = '182848887922****.cn-shanghai.pai-eas.aliyuncs.com'
    queue_name = 'test_group.qservice/sink'
    token = 'YmE3NDkyMzdiMzNmMGM3ZmE4ZmNjZDk0M2NiMDA3OTZmNzc1MTUx****'

    queue = QueueClient(endpoint, queue_name)
    queue.set_token(token)
    queue.init()
    queue.set_timeout(30000)

    # truncate all messages in the queue
    attributes = queue.attributes()
    if 'stream.lastEntry' in attributes:
        queue.truncate(int(attributes['stream.lastEntry']) + 1)

    count = 100

    # create a thread to send messages to the queue
    def send_thread():
        for i in range(count):
            index, request_id = queue.put('[{}]')
            print('send: ', i, index, request_id)

    # create a thread to watch messages from the queue
    def watch_thread():
        watcher = queue.watch(0, 5, auto_commit=True)
        i = 0
        for x in watcher.run():
            print('recv: ', i, x.index, x.tags['requestId'])
            i += 1
            if i == count:
                break
        watcher.close()

    thread1 = threading.Thread(target=watch_thread)
    thread2 = threading.Thread(target=send_thread)
    thread1.start()
    thread2.start()
    thread1.join()
    thread2.join()
```