Call EAS services using the official Python SDK with support for TensorFlow, PyTorch, string-based models, and VPC direct connections.
Installation
pip install -U eas-prediction --user
API reference
Common parameters
endpoint: server endpoint.
- For regular mode, use the default gateway endpoint. Example: 182848887922***.cn-shanghai.pai-eas.aliyuncs.com.
- For VPC direct connection, use the regional common endpoint. For example, in China (Shanghai): pai-eas-vpc.cn-shanghai.aliyuncs.com.
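The two endpoint forms can be sketched as a small helper. This is only an illustration: `build_endpoint`, `gateway_prefix`, and `vpc_direct` are hypothetical names that are not part of the SDK, and the host patterns are inferred from the two example endpoints above.

```python
from typing import Optional


def build_endpoint(region: str,
                   gateway_prefix: Optional[str] = None,
                   vpc_direct: bool = False) -> str:
    """Return an endpoint host for regular or VPC direct connection mode."""
    if vpc_direct:
        # Regional common endpoint, shared by services in the region.
        return f"pai-eas-vpc.{region}.aliyuncs.com"
    if not gateway_prefix:
        raise ValueError("gateway_prefix is required for the default gateway endpoint")
    # Default gateway endpoint: the leading label is specific to the gateway.
    return f"{gateway_prefix}.{region}.pai-eas.aliyuncs.com"


print(build_endpoint("cn-shanghai", vpc_direct=True))
# pai-eas-vpc.cn-shanghai.aliyuncs.com
```

The regular-mode prefix is passed in rather than derived, because it cannot be computed from the region alone.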
PredictClient
| Method | Description |
|---|---|
| init() | Initializes the client object. Call this method after all configuration methods to make parameters take effect. |
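The ordering rule for initialization (configure first, initialize last) can be sketched as follows. The helper `configure_and_init` and the `RecordingClient` stand-in are hypothetical and exist only for illustration; `set_token`, `set_endpoint_type`, and `init` are the method names used in the examples in this document.

```python
def configure_and_init(client, token, endpoint_type=None):
    """Apply all settings, then call init() last so they take effect."""
    client.set_token(token)
    if endpoint_type is not None:
        client.set_endpoint_type(endpoint_type)
    client.init()  # always the final step
    return client


class RecordingClient:
    """Hypothetical stand-in for PredictClient that records the call order."""

    def __init__(self):
        self.calls = []

    def set_token(self, token):
        self.calls.append("set_token")

    def set_endpoint_type(self, endpoint_type):
        self.calls.append("set_endpoint_type")

    def init(self):
        self.calls.append("init")


demo = configure_and_init(RecordingClient(), "<token>", endpoint_type="DIRECT")
print(demo.calls)  # ['set_token', 'set_endpoint_type', 'init']
```

With the real SDK, a `PredictClient` instance would be passed in place of `RecordingClient`.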
StringRequest
StringResponse
TFRequest
TFResponse
TorchRequest
| Method | Description |
|---|---|
| TorchRequest() | Creates a TorchRequest object. |
TorchResponse
QueueClient
Watcher
| Method | Description |
|---|---|
| close() | Closes the watcher and terminates backend connections. Note: Only one watcher can run per client. Close the current watcher before starting another. |
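Because only one watcher can run per client, it helps to guarantee that the watcher is closed even when consumption fails midway. A minimal sketch, assuming the watcher exposes `run()` and `close()` as in the queue example in this document; `consume`, `FakeQueue`, and `FakeWatcher` are hypothetical names, and the `watch(0, 5, auto_commit=True)` arguments are copied from that example.

```python
from contextlib import closing


def consume(queue, limit):
    """Collect up to `limit` (>= 1) items, closing the watcher afterwards."""
    received = []
    # closing() calls watcher.close() on exit, even if the loop raises.
    with closing(queue.watch(0, 5, auto_commit=True)) as watcher:
        for item in watcher.run():
            received.append(item)
            if len(received) >= limit:
                break
    return received


class FakeWatcher:
    """Hypothetical stand-in for the SDK watcher, for illustration only."""

    def __init__(self, items):
        self.items = items
        self.closed = False

    def run(self):
        yield from self.items

    def close(self):
        self.closed = True


class FakeQueue:
    """Hypothetical stand-in for QueueClient, for illustration only."""

    def __init__(self, items):
        self.watcher = FakeWatcher(items)

    def watch(self, index, window, auto_commit=False):
        return self.watcher


queue = FakeQueue(["a", "b", "c"])
print(consume(queue, 2))     # ['a', 'b']
print(queue.watcher.closed)  # True
```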
Examples
- String input and output
For services deployed with custom processors (e.g., PMML models), use strings for service calls:

    #!/usr/bin/env python
    from eas_prediction import PredictClient
    from eas_prediction import StringRequest

    if __name__ == '__main__':
        # Connect through the default gateway endpoint.
        client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'scorecard_pmml_example')
        client.set_token('YWFlMDYyZDNmNTc3M2I3MzMwYmY0MmYwM2Y2MTYxMTY4NzBkNzdj****')
        client.init()

        # The request body is sent to the service as a raw string.
        request = StringRequest('[{"fea1": 1, "fea2": 2}]')
        for x in range(0, 1000000):
            resp = client.predict(request)
            print(resp)
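The request string in the example above looks like JSON. Assuming the service does expect JSON, building the string with `json.dumps` avoids hand-written quoting mistakes. A minimal sketch; `make_payload` is a hypothetical helper name, not part of the SDK.

```python
import json


def make_payload(records):
    """Serialize a list of feature dicts to a JSON string."""
    return json.dumps(records)


payload = make_payload([{"fea1": 1, "fea2": 2}])
print(payload)  # [{"fea1": 1, "fea2": 2}]
```

The resulting string could then be passed to `StringRequest(payload)` in place of the literal.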
- Tensor input and output
For TensorFlow services, use the TFRequest and TFResponse classes:

    #!/usr/bin/env python
    from eas_prediction import PredictClient
    from eas_prediction import StringRequest
    from eas_prediction import TFRequest

    if __name__ == '__main__':
        client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'mnist_saved_model_example')
        client.set_token('YTg2ZjE0ZjM4ZmE3OTc0NzYxZDMyNmYzMTJjZTQ1YmU0N2FjMTAy****')
        client.init()

        #request = StringRequest('[{}]')
        # Build a request for the 'predict_images' signature.
        req = TFRequest('predict_images')
        # Feed the 'images' input: shape [1, 784], float values, flattened content.
        req.add_feed('images', [1, 784], TFRequest.DT_FLOAT, [1] * 784)
        for x in range(0, 1000000):
            resp = client.predict(req)
            print(resp)
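In the TensorFlow example above, the flattened content holds 784 values for the shape [1, 784], which is the product of the shape dimensions. A small pre-flight check along these lines can catch mismatches before a request is sent. This is a sketch under the assumption that the service expects content length to equal that product; `check_feed` is a hypothetical helper, not part of the SDK.

```python
import math


def check_feed(shape, content):
    """Return the element count implied by `shape`, or raise on a mismatch."""
    expected = math.prod(shape)
    if len(content) != expected:
        raise ValueError(
            f"shape {shape} implies {expected} values, got {len(content)}"
        )
    return expected


print(check_feed([1, 784], [1] * 784))  # 784
```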
- VPC direct connection
VPC direct connection is available only for services in EAS dedicated resource groups. The resource group and vSwitch must be in the same VPC. For setup, see Work with dedicated resource groups and Network access configuration. Compared with regular mode, the code adds one call, client.set_endpoint_type(ENDPOINT_TYPE_DIRECT). This mode is recommended for high-concurrency scenarios:

    #!/usr/bin/env python
    from eas_prediction import PredictClient
    from eas_prediction import StringRequest
    from eas_prediction import TFRequest
    from eas_prediction import ENDPOINT_TYPE_DIRECT

    if __name__ == '__main__':
        # Use the regional common endpoint instead of the gateway endpoint.
        client = PredictClient('http://pai-eas-vpc.cn-hangzhou.aliyuncs.com', 'mnist_saved_model_example')
        client.set_token('M2FhNjJlZDBmMzBmMzE4NjFiNzZhMmUxY2IxZjkyMDczNzAzYjFi****')
        # Switch to VPC direct connection before calling init().
        client.set_endpoint_type(ENDPOINT_TYPE_DIRECT)
        client.init()

        request = TFRequest('predict_images')
        request.add_feed('images', [1, 784], TFRequest.DT_FLOAT, [1] * 784)
        for x in range(0, 1000000):
            resp = client.predict(request)
            print(resp)
- PyTorch model
For PyTorch services, use the TorchRequest and TorchResponse classes:

    #!/usr/bin/env python
    import time

    from eas_prediction import PredictClient
    from eas_prediction import TorchRequest

    if __name__ == '__main__':
        client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'pytorch_gpu_wl')
        client.init()

        req = TorchRequest()
        # Feed input 0 with a [1, 3, 224, 224] float tensor (1 * 3 * 224 * 224 = 150528 values).
        req.add_feed(0, [1, 3, 224, 224], TorchRequest.DT_FLOAT, [1] * 150528)
        # req.add_fetch(0)

        # Accumulate the elapsed time over 10 requests.
        st = time.time()
        timer = 0
        for x in range(0, 10):
            resp = client.predict(req)
            timer += (time.time() - st)
            st = time.time()
            print(resp.get_tensor_shape(0))
            # print(resp)
        print("average response time: %s s" % (timer / 10))
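The timing loop in the PyTorch example can be factored into a reusable helper. The sketch below is one possible variant, not the SDK's own tooling: `average_latency` is a hypothetical name, and it uses `time.perf_counter` and times only the call itself, so printing between requests does not count toward the average.

```python
import time


def average_latency(call, runs=10):
    """Invoke `call` `runs` times and return the mean duration in seconds."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        call()
        total += time.perf_counter() - start
    return total / runs


# Stand-in workload; with the SDK this could be `lambda: client.predict(req)`.
avg = average_latency(lambda: time.sleep(0.001), runs=3)
print("average response time: %.4f s" % avg)
```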
- Blade processor-based model
For Blade processor services, use the BladeRequest and BladeResponse classes:

    #!/usr/bin/env python
    import time

    from eas_prediction import PredictClient
    from eas_prediction import BladeRequest

    if __name__ == '__main__':
        client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'nlp_model_example')
        client.init()

        req = BladeRequest()
        req.add_feed('input_data', 1, [1, 360, 128], BladeRequest.DT_FLOAT, [0.8] * 85680)
        req.add_feed('input_length', 1, [1], BladeRequest.DT_INT32, [187])
        req.add_feed('start_token', 1, [1], BladeRequest.DT_INT32, [104])
        # Request the 'output' tensor as float values.
        req.add_fetch('output', BladeRequest.DT_FLOAT)

        # Accumulate the elapsed time over 10 requests.
        st = time.time()
        timer = 0
        for x in range(0, 10):
            resp = client.predict(req)
            timer += (time.time() - st)
            st = time.time()
            # print(resp)
            # print(resp.get_values('output'))
            print(resp.get_tensor_shape('output'))
        print("average response time: %s s" % (timer / 10))
- Blade with TensorFlow compatibility
For EAS Blade models compatible with TensorFlow, use the TFRequest and TFResponse classes:

    #!/usr/bin/env python
    import time

    from eas_prediction import PredictClient
    from eas_prediction.blade_tf_request import TFRequest  # Import the Blade-specific TFRequest

    if __name__ == '__main__':
        client = PredictClient('http://182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'nlp_model_example')
        client.init()

        req = TFRequest(signature_name='predict_words')
        req.add_feed('input_data', [1, 360, 128], TFRequest.DT_FLOAT, [0.8] * 85680)
        req.add_feed('input_length', [1], TFRequest.DT_INT32, [187])
        req.add_feed('start_token', [1], TFRequest.DT_INT32, [104])
        req.add_fetch('output')

        # Accumulate the elapsed time over 10 requests.
        st = time.time()
        timer = 0
        for x in range(0, 10):
            resp = client.predict(req)
            timer += (time.time() - st)
            st = time.time()
            # print(resp)
            # print(resp.get_values('output'))
            print(resp.get_tensor_shape('output'))
        print("average response time: %s s" % (timer / 10))
- Queue service for data streaming
Use QueueClient to send and query data in queues, query queue state, and subscribe to pushed data. In this example, one thread pushes data to a queue while another thread uses a watcher to subscribe to it:

    #!/usr/bin/env python
    from eas_prediction import QueueClient
    import threading

    if __name__ == '__main__':
        endpoint = '182848887922****.cn-shanghai.pai-eas.aliyuncs.com'
        queue_name = 'test_group.qservice/sink'
        token = 'YmE3NDkyMzdiMzNmMGM3ZmE4ZmNjZDk0M2NiMDA3OTZmNzc1MTUx****'

        queue = QueueClient(endpoint, queue_name)
        queue.set_token(token)
        queue.init()
        queue.set_timeout(30000)

        # Truncate all messages in the queue.
        attributes = queue.attributes()
        if 'stream.lastEntry' in attributes:
            queue.truncate(int(attributes['stream.lastEntry']) + 1)

        count = 100

        # Thread body that sends messages to the queue.
        def send_thread():
            for i in range(count):
                index, request_id = queue.put('[{}]')
                print('send: ', i, index, request_id)

        # Thread body that watches messages from the queue.
        def watch_thread():
            watcher = queue.watch(0, 5, auto_commit=True)
            i = 0
            for x in watcher.run():
                print('recv: ', i, x.index, x.tags['requestId'])
                i += 1
                if i == count:
                    break
            watcher.close()

        thread1 = threading.Thread(target=watch_thread)
        thread2 = threading.Thread(target=send_thread)

        thread1.start()
        thread2.start()

        thread1.join()
        thread2.join()