Elastic Algorithm Service (EAS) allows you to call a service over the virtual private cloud (VPC) direct connection channel by using the official SDK for Python or custom call logic. This topic describes these call methods in detail.

How it works

The following figure shows the request paths for calling a service over the VPC direct connection channel, a public endpoint, and a VPC endpoint.

After you enable the VPC direct connection channel feature for the resource group where a service runs, you do not need to use gateways to access the service from your VPC. Requests from your VPC are sent directly to the EAS instance of the service, without Layer 4 load balancing or Layer 7 network forwarding. In addition, the built-in remote procedure call (RPC) technology of EAS implements the HTTP protocol stack. This improves performance and reduces latency for services with high queries per second (QPS), such as image services.

Prerequisites

  • A dedicated resource group is purchased. You can enable the VPC direct connection channel feature only for services that are deployed in dedicated resource groups. For more information, see Activation and purchase.
  • The VPC direct connection channel feature is enabled for the dedicated resource group before you deploy a service. For more information, see VPC direct connection channel.

Call methods

To facilitate service calls, EAS provides the following methods to call services over the VPC direct connection channel:
  • Use the official SDK for Python

    EAS encapsulates the call logic and provides the official SDK for Python. You can use this SDK to call services over the VPC direct connection channel.

  • Use custom call logic

    We recommend that you use the official SDK to call services. This reduces the time that is required for defining call logic and improves the call stability. If you need to use other languages or custom call logic, you can follow the method that is provided in the Use custom call logic section. To implement the custom call logic, you must construct service requests based on specific frameworks. For more information, see Construct requests based on a universal processor.

Use the official SDK for Python

To use the official SDK for Python to call a service, perform the following steps:
  1. Install the SDK.
    pip install -U eas-prediction --user
    For more information about how to use the SDK for Python, visit GitHub.
  2. Write a call program.
    The following sample sends a TensorFlow request (TFRequest). For sample programs with other input and output formats, such as string and PyTorch requests, visit GitHub.
    #!/usr/bin/env python
    from eas_prediction import PredictClient
    from eas_prediction import StringRequest
    from eas_prediction import TFRequest
    from eas_prediction import ENDPOINT_TYPE_DIRECT

    if __name__ == '__main__':
        # First argument: the endpoint. Second argument: the service name.
        client = PredictClient('http://pai-eas-vpc.cn-shanghai.aliyuncs.com', 'mnist_saved_model_example')
        # Token of the service.
        client.set_token('M2FhNjJlZDBmMzBmMzE4NjFiNzZhMmUxY2IxZjkyMDczNzAzYjFi****')
        client.set_endpoint_type(ENDPOINT_TYPE_DIRECT)  # Access the service over the VPC direct connection channel.
        client.init()
        # For a service that takes string input, build a StringRequest instead:
        # req = StringRequest('[{}]')
        req = TFRequest('predict_images')
        req.add_feed('images', [1, 784], TFRequest.DT_FLOAT, [1] * 784)
        # Send the same request repeatedly and print each response.
        for x in range(0, 1000000):
            resp = client.predict(req)
            print(resp)
    PredictClient() takes two arguments: the endpoint that is used to call the service, and the service name. For example, if the VPC endpoint is http://166408185518****.vpc.cn-hangzhou.pai-eas.aliyuncs.com/api/predict/heart_predict_online, create the client as client = PredictClient('http://166408185518****.vpc.cn-hangzhou.pai-eas.aliyuncs.com','heart_predict_online').
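The split described above can be sketched with the Python standard library. This is only an illustration: the helper name split_vpc_endpoint is hypothetical and not part of the SDK, and it assumes that the service name always follows the /api/predict/ path prefix, as in the example URL.

```python
from urllib.parse import urlsplit


def split_vpc_endpoint(url):
    """Split a full service URL into (endpoint, service_name).

    Hypothetical helper; assumes the path has the form /api/predict/<service_name>.
    """
    parts = urlsplit(url)
    prefix = "/api/predict/"
    if not parts.path.startswith(prefix):
        raise ValueError("unexpected path: %r" % parts.path)
    service_name = parts.path[len(prefix):].strip("/")
    if not service_name:
        raise ValueError("service name is missing in %r" % url)
    endpoint = "%s://%s" % (parts.scheme, parts.netloc)
    return endpoint, service_name


endpoint, name = split_vpc_endpoint(
    "http://166408185518****.vpc.cn-hangzhou.pai-eas.aliyuncs.com"
    "/api/predict/heart_predict_online"
)
print(endpoint)
print(name)
```

The two returned values can then be passed to PredictClient() in that order.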

Use custom call logic

If you want to use other languages or custom call logic, you can use the following method to call a service over the VPC direct connection channel by sending HTTP requests. EAS provides the service discovery feature. In a VPC, you can use the URLs in the following table to obtain the backend addresses of the service.
Region              URL
China (Shanghai)    http://pai-eas-vpc.cn-shanghai.aliyuncs.com/exported/apis/eas.alibaba-inc.k8s.io/v1/upstreams/
China (Beijing)     http://pai-eas-vpc.cn-beijing.aliyuncs.com/exported/apis/eas.alibaba-inc.k8s.io/v1/upstreams/
China (Hangzhou)    http://pai-eas-vpc.cn-hangzhou.aliyuncs.com/exported/apis/eas.alibaba-inc.k8s.io/v1/upstreams/
The following command queries the backend addresses of the mnist_saved_model_example service in the China (Shanghai) region. This service has two instances.
$ curl http://pai-eas-vpc.cn-shanghai.aliyuncs.com/exported/apis/eas.alibaba-inc.k8s.io/v1/upstreams/mnist_saved_model_example
The following response lists the backend addresses of the service:
{
  "correlative": [
    "mnist_saved_model_example"
  ],
  "endpoints": {
    "items": [
      {
        "app": "mnist-saved-model-example",
        "ip": "172.16.XX.XX",
        "port": 50000,
        "weight": 100
      },
      {
        "app": "mnist-saved-model-example",
        "ip": "172.16.XX.XX",
        "port": 50000,
        "weight": 100
      }
    ]
  }
}
As shown in the preceding response, the client obtains the IP address, port number, and weight of each of the two backend instances of the service. Before each service call, use the weighted round robin (WRR) algorithm to select one instance, and then send the request directly to that instance over the VPC direct connection channel.
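As a minimal sketch, the discovery response can be reduced to a list of (ip, port, weight) tuples with the standard json module. The function name parse_upstreams is hypothetical, and the code assumes that every item carries the ip, port, and weight fields shown in the sample response. The masked IP addresses are kept exactly as they appear in the sample.

```python
import json

# Sample response copied from the discovery call shown earlier in this topic.
SAMPLE_RESPONSE = """
{
  "correlative": ["mnist_saved_model_example"],
  "endpoints": {
    "items": [
      {"app": "mnist-saved-model-example", "ip": "172.16.XX.XX", "port": 50000, "weight": 100},
      {"app": "mnist-saved-model-example", "ip": "172.16.XX.XX", "port": 50000, "weight": 100}
    ]
  }
}
"""


def parse_upstreams(body):
    """Return [(ip, port, weight), ...] from a discovery response body.

    Hypothetical helper; assumes the field names shown in the sample response.
    """
    doc = json.loads(body)
    items = doc.get("endpoints", {}).get("items", [])
    return [(item["ip"], int(item["port"]), int(item["weight"])) for item in items]


backends = parse_upstreams(SAMPLE_RESPONSE)
for ip, port, weight in backends:
    print(ip, port, weight)
```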
Note: Synchronize the endpoint list from the server to a local cache on the client at regular intervals. Before each request is sent, select an instance from the local cache by using the WRR algorithm. If you obtain the endpoint list from the server before each request, access performance is significantly reduced.
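One possible shape for such a client-side cache is sketched below. It uses the smooth weighted round robin variant, which is one common way to implement WRR; this topic does not specify which variant to use. The class name EndpointCache, the fetch callable, the 5-second refresh interval, and the 10.0.0.x addresses are all assumptions made for illustration.

```python
import time


class EndpointCache:
    """Caches the endpoint list and picks instances by smooth weighted round robin."""

    def __init__(self, fetch, refresh_seconds=5.0, clock=time.monotonic):
        self._fetch = fetch  # hypothetical callable returning [(ip, port, weight), ...]
        self._refresh_seconds = refresh_seconds
        self._clock = clock
        self._backends = []  # each entry: [ip, port, weight, current_weight]
        self._next_refresh = 0.0

    def _refresh_if_due(self):
        now = self._clock()
        if self._backends and now < self._next_refresh:
            return  # cache is still fresh, do not contact the server
        fresh = self._fetch()
        if fresh:  # keep the previous list if the server returns nothing
            self._backends = [[ip, port, weight, 0] for ip, port, weight in fresh]
        self._next_refresh = now + self._refresh_seconds

    def pick(self):
        """Return (ip, port) of the next instance according to the weights."""
        self._refresh_if_due()
        if not self._backends:
            raise RuntimeError("no backend instances available")
        total = 0
        best = None
        for backend in self._backends:
            backend[3] += backend[2]  # raise current weight by the static weight
            total += backend[2]
            if best is None or backend[3] > best[3]:
                best = backend
        best[3] -= total  # penalize the chosen instance
        return best[0], best[1]


# Demo with a stand-in fetch function and made-up addresses.
calls = []


def fake_fetch():
    calls.append(1)
    return [("10.0.0.1", 50000, 100), ("10.0.0.2", 50000, 300)]


cache = EndpointCache(fake_fetch, refresh_seconds=5.0)
picks = [cache.pick() for _ in range(8)]
print(picks)
print("server lookups:", len(calls))
```

In this sketch the refresh runs inside pick() when the cache has expired, so most requests never contact the discovery URL. A background thread that refreshes the list would keep the refresh latency out of the request path entirely.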
When a failover occurs due to a service update or a node exception, some instances may become unavailable. A failed instance is removed from the endpoint list only after a delay, and requests that are sent to it during this delay fail. Make sure that the client automatically retries failed requests so that this delay does not reduce service quality.
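The retry behavior can be sketched independently of how requests are actually sent. In the following example, pick and send are hypothetical callables: pick returns an (ip, port) pair, for example from a WRR selector, and send performs the HTTP request and raises an OSError subclass (such as ConnectionRefusedError or a timeout) on failure. The limit of three attempts and the 10.0.0.x addresses are assumptions.

```python
import itertools


def call_with_retry(pick, send, max_attempts=3):
    """Send a request, retrying on another instance when the call fails."""
    failed = set()
    last_error = None
    for _ in range(max_attempts):
        backend = pick()
        # Prefer an instance that has not failed during this call.
        # Stop looking after a few extra picks so the loop always terminates.
        for _ in range(8):
            if backend not in failed:
                break
            backend = pick()
        try:
            return send(backend)
        except OSError as exc:  # connection refused, reset, timeout, ...
            failed.add(backend)
            last_error = exc
    raise RuntimeError("all %d attempts failed" % max_attempts) from last_error


# Demo: the first instance is down, the second one answers.
_cycle = itertools.cycle([("10.0.0.1", 50000), ("10.0.0.2", 50000)])


def pick():
    return next(_cycle)


def send(backend):
    if backend[0] == "10.0.0.1":
        raise ConnectionRefusedError("instance is down")
    return "ok from %s:%d" % backend


result = call_with_retry(pick, send)
print(result)
```

Retrying only on connection-level errors is a deliberate choice in this sketch. Whether application-level errors should also be retried depends on whether the request is safe to repeat.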