Platform for AI: Call a service over the VPC direct connection channel

Last Updated: Mar 28, 2024

Elastic Algorithm Service (EAS) of Platform for AI (PAI) allows you to call a service over the virtual private cloud (VPC) direct connection channel by using an official SDK (Python, Java, or Go) or custom call logic. This topic describes the call methods.

How it works

The following figure compares the request paths for calling a service over VPC direct connection, a public endpoint, and a VPC endpoint.

[Figure: request paths for VPC direct connection, a public endpoint, and a VPC endpoint]

After you enable the VPC direct connection feature for the resource group in which a service runs, EAS automatically associates a secondary elastic network interface (ENI) with the security group and connects the network between your VPC and the EAS service instance. You do not need to use gateways to access the service from your VPC. Requests from your VPC are directly sent to the EAS service instance, without load balancing at Layer 4 or network forwarding at Layer 7. The built-in remote procedure call (RPC) technology of EAS supports HTTP protocol stacks. This significantly improves performance and reduces the access latency of services with high queries per second (QPS), such as image services.

Prerequisites

If you want to deploy a service to a dedicated resource group, make sure that the VPC direct connection feature is enabled for the resource group. For more information, see Configure network connectivity.

Important

Security groups control the inbound and outbound traffic of ECS instances and the network communication between your ECS instances and EAS service instances. By default, instances in the same basic security group can communicate with each other over an internal network. When you configure the VPC direct connection feature, select the security group to which the ECS instances that need to access the service belong. This allows those instances and the EAS service instances to communicate over the internal network. If you want to use different security groups, configure security group rules that allow the instances in those groups to communicate with each other.

Call methods

To facilitate service calls, EAS provides the following methods to call services over VPC direct connection:

  • Use an official SDK

    EAS encapsulates the call logic in official SDKs for Python, Java, and Go. You can use these SDKs to call services over VPC direct connection.

  • Use custom call logic

    We recommend that you use the official SDK to call services. This reduces the time to define call logic and improves the call stability. If you want to use other languages or custom call logic, you can use the method that is provided in the Use custom call logic section. To implement custom call logic, you must construct service requests based on specific frameworks. For more information, see Construct a request for a TensorFlow service.

Use an official SDK

SDK for Python

To use the official SDK for Python to call the service, perform the following steps:

  1. Install the SDK.

    pip install -U eas-prediction --user

    For more information about how to use the SDK for Python, see SDK for Python.

  2. Write a call program.

    The following example calls a TensorFlow service by using TFRequest. For information about sample programs that use other input and output formats, such as strings or PyTorch tensors, see SDK for Python.

    #!/usr/bin/env python
    from eas_prediction import PredictClient
    from eas_prediction import StringRequest
    from eas_prediction import TFRequest
    from eas_prediction import ENDPOINT_TYPE_DIRECT
    if __name__ == '__main__':
        client = PredictClient('http://pai-eas-vpc.cn-shanghai.aliyuncs.com', 'mnist_saved_model_example')
        # Replace the value with the service token. Click Invocation Method in the Service Type column of the service to obtain the token. 
        client.set_token('M2FhNjJlZDBmMzBmMzE4NjFiNzZhMmUxY2IxZjkyMDczNzAzYjFi****')
        client.set_endpoint_type(ENDPOINT_TYPE_DIRECT)  # Indicates that the service is accessed over a VPC direct connection channel. 
        client.init()
        # request = StringRequest('[{}]')
        req = TFRequest('predict_images')
        req.add_feed('images', [1, 784], TFRequest.DT_FLOAT, [1] * 784)
        for x in range(0, 1000000):
            resp = client.predict(req)
            print(resp)
    

    The PredictClient constructor takes two arguments: the endpoint that is used to call the service (the endpoint parameter) and the service name (the service_name parameter). The VPC direct connection endpoint is region-specific and uses the pai-eas-vpc.{RegionId}.aliyuncs.com format. For example, the VPC direct connection endpoint in the China (Shanghai) region is pai-eas-vpc.cn-shanghai.aliyuncs.com.

SDK for Java

To use the official SDK for Java to call the service, perform the following steps:

  1. Add dependencies. To obtain the latest version of the EAS SDK, see MVN Repository.

    <dependency>
      <groupId>com.aliyun.openservices.eas</groupId>
      <artifactId>eas-sdk</artifactId>
      <version>2.0.13</version>
    </dependency>

    For more information about how to use the SDK for Java, see SDK for Java.

  2. Write a call program.

    import com.aliyun.openservices.eas.predict.http.PredictClient;
    import com.aliyun.openservices.eas.predict.http.HttpConfig;
    
    public class TestString {
        public static void main(String[] args) throws Exception {
            // To ensure that the client object is shared as expected, create and initialize the client object when you start the service instead of creating a new client object for each request. 
            PredictClient client = new PredictClient(new HttpConfig());
            // Replace the value with the service token. Click Invocation Method in the Service Type column of the service to obtain the token. 
            client.setToken("YWFlMDYyZDNmNTc3M2I3MzMwYmY0MmYwM2Y2MTYxMTY4NzBkNzdj****");
            // Use the setDirectEndpoint method to set the VPC direct connection endpoint, which is in the pai-eas-vpc.{region_id}.aliyuncs.com format. For example, the region ID of the China (Shanghai) region is cn-shanghai. 
            client.setDirectEndpoint("pai-eas-vpc.cn-shanghai.aliyuncs.com");
            // Replace the value with the name of the service. 
            client.setModelName("scorecard_pmml_example");
    
            // Define the input string. 
            String request = "[{\"money_credit\": 3000000}, {\"money_credit\": 10000}]";
            System.out.println(request);
    
            // Return a string by using EAS. 
            try {
                String response = client.predict(request);
                System.out.println(response);
            } catch (Exception e) {
                e.printStackTrace();
            }
    
            // Close the client. 
            client.shutdown();
            return;
        }
    }

SDK for Go

You do not need to install EAS SDK for Go in advance. The Go package manager automatically downloads the SDK from GitHub when you compile the code. For information about how to use the SDK, see SDK for Go.

The following sample code provides an example on how to use the official SDK for Go:

package main

import (
        "fmt"
        "github.com/pai-eas/eas-golang-sdk/eas"
)

func main() {
    // Set the VPC direct connection endpoint, which is in the pai-eas-vpc.{region_id}.aliyuncs.com format. For example, the region ID of the China (Shanghai) region is cn-shanghai. Replace the values with the region in which the service resides and the name of the service. 
    client := eas.NewPredictClient("pai-eas-vpc.cn-shanghai.aliyuncs.com", "scorecard_pmml_example")
    // Replace the value with the service token. Click Invocation Method in the Service Type column of the service to obtain the token. 
    client.SetToken("YWFlMDYyZDNmNTc3M2I3MzMwYmY0MmYwM2Y2MTYxMTY4NzBkNzdj****")
    client.SetEndpointType(eas.EndpointTypeDirect)
    client.Init()
    req := "[{\"fea1\": 1, \"fea2\": 2}]"
    for i := 0; i < 100; i++ {
        resp, err := client.StringPredict(req)
        if err != nil {
            fmt.Printf("failed to predict: %v\n", err.Error())
        } else {
            fmt.Printf("%v\n", resp)
        }
    }
}

Use custom call logic

If you want to use other languages or custom call logic, you can use the following method to call a service over VPC direct connection by sending HTTP requests. EAS provides the service discovery feature. In a VPC, you can use the URLs in the following table to obtain the backend addresses of the service.

Region              Endpoint
China (Shanghai)    http://pai-eas-vpc.cn-shanghai.aliyuncs.com/exported/apis/eas.alibaba-inc.k8s.io/v1/upstreams/
China (Beijing)     http://pai-eas-vpc.cn-beijing.aliyuncs.com/exported/apis/eas.alibaba-inc.k8s.io/v1/upstreams/
China (Hangzhou)    http://pai-eas-vpc.cn-hangzhou.aliyuncs.com/exported/apis/eas.alibaba-inc.k8s.io/v1/upstreams/
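The endpoints listed above share one pattern, so a client can derive the discovery URL from a region ID and a service name. The following Python code is a minimal sketch, assuming that the same pattern applies to the region that you use. The discovery_url and fetch_backends helpers are hypothetical names for illustration and are not part of the EAS SDK.

```python
import json
import urllib.request

# Path copied from the service discovery endpoints listed above.
UPSTREAMS_PATH = "/exported/apis/eas.alibaba-inc.k8s.io/v1/upstreams/"


def discovery_url(region_id: str, service_name: str) -> str:
    """Build the service discovery URL for a service in a region."""
    return f"http://pai-eas-vpc.{region_id}.aliyuncs.com{UPSTREAMS_PATH}{service_name}"


def fetch_backends(region_id: str, service_name: str, timeout: float = 3.0) -> dict:
    """Fetch the backend list as a dict.

    This only works from inside a VPC for which direct connection is enabled.
    The 3-second timeout is an arbitrary assumption.
    """
    url = discovery_url(region_id, service_name)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

For example, discovery_url("cn-shanghai", "mnist_saved_model_example") produces the URL that is used in the curl command in this section.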

The following sample command queries the backend addresses of the mnist_saved_model_example service that resides in the China (Shanghai) region. This service has two instances.

$ curl http://pai-eas-vpc.cn-shanghai.aliyuncs.com/exported/apis/eas.alibaba-inc.k8s.io/v1/upstreams/mnist_saved_model_example

The following sample response lists the backend addresses of the service:

{
  "correlative": [
    "mnist_saved_model_example"
  ],
  "endpoints": {
    "items": [
      {
        "app": "mnist-saved-model-example",
        "ip": "172.16.XX.XX",
        "port": 50000,
        "weight": 100
      },
      {
        "app": "mnist-saved-model-example",
        "ip": "172.16.XX.XX",
        "port": 50000,
        "weight": 100
      }
    ]
  }
}

As shown in the preceding response, the client can obtain the IP addresses, port numbers, and weights of the two backend instances of the service. Before each service call, you can use the weighted round robin (WRR) algorithm to select an instance and then access the instance over the VPC direct connection channel.
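The selection step can be sketched in Python as follows. This is an illustration only: it uses the smooth variant of weighted round robin, which is one of several ways to implement WRR and is not necessarily what the official SDKs do. The Backend class and the parse_backends and pick functions are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass
class Backend:
    ip: str
    port: int
    weight: int
    current: int = 0  # running score used by smooth WRR


def parse_backends(payload: str) -> list:
    """Extract the backends from a discovery response shaped like the sample above."""
    items = json.loads(payload)["endpoints"]["items"]
    return [Backend(item["ip"], item["port"], item["weight"]) for item in items]


def pick(backends: list) -> Backend:
    """Return the next backend by smooth weighted round robin.

    Each call raises every score by the backend's weight, chooses the
    highest score, and lowers the chosen score by the total weight.
    The list must not be empty.
    """
    total = sum(b.weight for b in backends)
    for b in backends:
        b.current += b.weight
    chosen = max(backends, key=lambda b: b.current)
    chosen.current -= total
    return chosen
```

With two instances of equal weight, as in the sample response, successive calls to pick alternate between the two instances.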

Note

You must regularly synchronize the endpoint list from the server to a local cache on the client. Before each request is sent, select an instance from the local cache by using the WRR algorithm. If you obtain the endpoint list from the server before each request, access performance is significantly reduced.
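One way to keep a local copy of the endpoint list is a background thread that refreshes the list at a fixed interval. The following Python code is a minimal sketch under stated assumptions: BackendCache is a hypothetical class, the fetch callable stands for whatever code queries the service discovery URL, and the 3-second default interval is arbitrary because this topic does not specify a refresh interval.

```python
import threading


class BackendCache:
    """Keeps a local copy of the endpoint list and refreshes it in the background."""

    def __init__(self, fetch, interval_seconds=3.0):
        self._fetch = fetch                # callable that returns the latest endpoint list
        self._interval = interval_seconds  # assumed value, tune for your workload
        self._lock = threading.Lock()
        self._items = fetch()              # initial synchronous load
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def refresh(self):
        """Fetch the list once and replace the cached copy."""
        items = self._fetch()
        with self._lock:
            self._items = items

    def _run(self):
        # Event.wait returns False on timeout, so the loop runs until stop() is called.
        while not self._stop.wait(self._interval):
            try:
                self.refresh()
            except Exception:
                pass  # keep serving the previous list if a refresh fails

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread.is_alive():
            self._thread.join()

    def snapshot(self):
        """Return a copy of the cached list for use on the request path."""
        with self._lock:
            return list(self._items)
```

The request path reads only from snapshot(), so no request waits on a call to the discovery URL.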

When a failover occurs due to service updates or node exceptions, specific instances may be unavailable. Make sure that the client automatically resends a request after the request fails. Otherwise, requests that reach a failed instance before the instance is removed from the instance list fail, which may affect service quality.
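The retry behavior can be sketched in Python as follows. This is an illustration, not the logic of the official SDKs: call_with_retry is a hypothetical helper, the send callable stands for the code that sends the HTTP request to one instance, and the limit of three attempts is an assumption.

```python
def call_with_retry(candidates, send, max_attempts=3):
    """Send a request, moving to the next candidate instance when one fails.

    candidates: iterable of (ip, port) pairs, already ordered by the
                load-balancing policy (for example, WRR).
    send:       callable that takes (ip, port) and returns the response,
                and raises an exception on failure.
    """
    last_error = None
    for attempt, (ip, port) in enumerate(candidates):
        if attempt >= max_attempts:
            break
        try:
            return send(ip, port)
        except Exception as exc:  # narrow this to network errors in real code
            last_error = exc
    raise RuntimeError("all attempts failed") from last_error
```

Retrying against a different instance, rather than the one that just failed, keeps the request from hitting the same unavailable backend again.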

For more information, see SDK for Python.

References

For information about how to call services, see Overview.