We recommend that you use official Elastic Algorithm Service (EAS) SDKs provided by Machine Learning Platform for AI (PAI) to call the services deployed based on models. This reduces the time required for defining the call logic and improves the call stability. This topic describes EAS SDK for Java in detail. This topic also provides demos to describe how to use EAS SDK for Java to call services. In the first demo, the input and output are strings. In the second demo, the input and output are tensors.

Add dependencies

If you write code in Java, use Maven to manage your project and add the eas-sdk dependency for the client to the pom.xml file. The latest version of the eas-sdk dependency is 2.0.3. Add the following code to the pom.xml file:
<dependency>
  <groupId>com.aliyun.openservices.eas</groupId>
  <artifactId>eas-sdk</artifactId>
  <version>2.0.3</version>
</dependency>

Methods

The following methods are grouped by class.

PredictClient

PredictClient(HttpConfig httpConfig)
  • Description: Creates a client object of the PredictClient class.
  • Parameter: httpConfig: the HttpConfig object that is used to configure the client.
void setToken(String token)
  • Description: Sets the token parameter for an HTTP request.
  • Parameter: token: the token that is used for service authentication.
void setModelName(String modelName)
  • Description: Specifies the name of the model of the online prediction service to be called.
  • Parameter: modelName: the model name.
void setEndpoint(String endpoint)
  • Description: Specifies the host and port of the service, in the "host:port" format.
  • Parameter: endpoint: the endpoint to which messages are sent.
void setDirectEndpoint(String endpoint)
  • Description: Specifies the endpoint that is used to access the service by using a Virtual Private Cloud (VPC) direct connection channel. Endpoint example: pai-eas-vpc.cn-shanghai.aliyuncs.com.
  • Parameter: endpoint: the endpoint of the service.
void setRetryCount(int retryCount)
  • Description: Sets the maximum number of retries allowed after a request failure.
  • Parameter: retryCount: the maximum number of retries allowed after a request failure.
void setContentType(String contentType)
  • Description: Sets the data stream type for the HTTP client. By default, the data stream type is set to "application/octet-stream".
  • Parameter: contentType: the type of the data stream to be sent.
PredictClient createChildClient(String token, String endpoint, String modelName)
  • Description: Creates a child client object. The child client object shares the thread pool with the parent client object. This method can be called to make predictions in multi-threaded mode.
  • Parameters:
    • token: the token that is used for service authentication.
    • endpoint: the endpoint of the service.
    • modelName: the model name.
TFResponse predict(TFRequest runRequest)
  • Description: Sends a TensorFlow request to the online prediction service.
  • Parameter: runRequest: the object of the TensorFlow request.
  • Return value: the TFResponse object returned by the service.
String predict(String requestContent)
  • Description: Sends a request to the online prediction service by using a string as the request content.
  • Parameter: requestContent: the string used as the request content.
  • Return value: the response string returned by the service.
byte[] predict(byte[] requestContent)
  • Description: Sends a request to the service by using a byte array as the request content.
  • Parameter: requestContent: the byte array used as the request content.
  • Return value: the response body returned by the service, as a byte array.
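The following sketch illustrates the multi-threaded pattern enabled by createChildClient. The token, endpoint, and service name are placeholders, and the code cannot run without a reachable EAS service:

```java
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.http.PredictClient;

public class ChildClientExample {
    public static void main(String[] args) throws Exception {
        // The parent client owns the I/O thread pool.
        PredictClient parent = new PredictClient(new HttpConfig());
        // Each worker thread uses a child client that shares the pool.
        // The token, endpoint, and service name below are placeholders.
        for (int i = 0; i < 4; i++) {
            PredictClient child = parent.createChildClient(
                "<token>", "<host:port>", "<service_name>");
            new Thread(() -> {
                try {
                    String response = child.predict("<request>");
                    System.out.println(response);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }).start();
        }
    }
}
```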
HttpConfig

void setIoThreadNum(int ioThreadNum)
  • Description: Sets the number of I/O threads that are used to send HTTP requests. By default, two I/O threads are used.
  • Parameter: ioThreadNum: the number of I/O threads that are used to send HTTP requests.
void setReadTimeout(int readTimeout)
  • Description: Sets the timeout period of a socket read. Default value: 5000, which indicates 5s.
  • Parameter: readTimeout: the timeout period for reading request information.
void setConnectTimeout(int connectTimeout)
  • Description: Sets the timeout period of a socket connection. Default value: 5000, which indicates 5s.
  • Parameter: connectTimeout: the connection timeout period of a request.
void setMaxConnectionCount(int maxConnectionCount)
  • Description: Sets the maximum number of connections allowed. Default value: 1000.
  • Parameter: maxConnectionCount: the maximum number of connections allowed in the connection pool of the client.
void setMaxConnectionPerRoute(int maxConnectionPerRoute)
  • Description: Sets the default maximum number of connections allowed on each route. Default value: 1000.
  • Parameter: maxConnectionPerRoute: the default maximum number of connections allowed on each route.
void setKeepAlive(boolean keepAlive)
  • Description: Specifies whether to enable HTTP keep-alive for HTTP-based connections.
  • Parameter: keepAlive: specifies whether to enable the keep-alive mechanism for HTTP-based connections. Default value: true.
int getErrorCode()
  • Description: Returns the error code of the last call.
String getErrorMessage()
  • Description: Returns the error message of the last call.
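The HttpConfig setters described above are typically combined before the client is created. The following is a configuration sketch; the values are illustrative, not recommendations:

```java
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.http.PredictClient;

public class HttpConfigExample {
    public static void main(String[] args) {
        HttpConfig config = new HttpConfig();
        config.setIoThreadNum(4);             // default: 2
        config.setReadTimeout(10000);         // default: 5000 (5s)
        config.setConnectTimeout(3000);       // default: 5000 (5s)
        config.setMaxConnectionCount(500);    // default: 1000
        config.setMaxConnectionPerRoute(500); // default: 1000
        config.setKeepAlive(true);            // default: true
        PredictClient client = new PredictClient(config);
    }
}
```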
TFRequest

void setSignatureName(String value)
  • Description: Specifies the name of the SignatureDef if the model of the online prediction service to be called is a TensorFlow model in the SavedModel format.
  • Parameter: value: the name of the SignatureDef of the TensorFlow model.
void addFetch(String value)
  • Description: Specifies the alias of the output tensor to be exported of the TensorFlow model.
  • Parameter: value: the alias of the output tensor to be exported.
void addFeed(String inputName, TFDataType dataType, long[] shape, ?[] content)
  • Description: Specifies the input tensor of the TensorFlow model of the online prediction service to be called.
  • Parameters:
    • inputName: the alias of the input tensor.
    • dataType: the data type of the input tensor.
    • shape: the shape of the input tensor.
    • content: the data of the input tensor. Specify the value in the form of a one-dimensional array.

      If the data type of the input tensor is DT_FLOAT, DT_COMPLEX64, DT_BFLOAT16, or DT_HALF, the elements in the one-dimensional array are of the FLOAT type. If the data type of the input tensor is DT_COMPLEX64, every two adjacent elements of the FLOAT type in the one-dimensional array represent the real part and imaginary part of a complex number.

      If the data type of the input tensor is DT_DOUBLE or DT_COMPLEX128, the elements in the one-dimensional array are of the DOUBLE type. If the data type of the input tensor is DT_COMPLEX128, every two adjacent elements of the DOUBLE type in the one-dimensional array represent the real part and imaginary part of a complex number.

      If the data type of the input tensor is DT_INT32, DT_UINT8, DT_INT16, DT_INT8, DT_QINT8, DT_QUINT8, DT_QINT32, DT_QINT16, DT_QUINT16, or DT_UINT16, the elements in the one-dimensional array are of the INT type.

      If the data type of the input tensor is DT_INT64, the elements in the one-dimensional array are of the LONG type.

      If the data type of the input tensor is DT_STRING, the elements in the one-dimensional array are of the STRING type.

      If the data type of the input tensor is DT_BOOL, the elements in the one-dimensional array are of the BOOLEAN type.
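For example, for DT_COMPLEX64 input, each complex number occupies two consecutive FLOAT elements. The following helper is hypothetical (not part of the SDK) and only shows how to build the flat array that addFeed expects:

```java
// Packs complex numbers into the flat float[] layout expected by addFeed
// for DT_COMPLEX64: [re0, im0, re1, im1, ...].
public class ComplexPacking {
    public static float[] packComplex64(float[] real, float[] imag) {
        float[] flat = new float[real.length * 2];
        for (int i = 0; i < real.length; i++) {
            flat[2 * i] = real[i];       // real part
            flat[2 * i + 1] = imag[i];   // imaginary part
        }
        return flat;
    }

    public static void main(String[] args) {
        // Pack 1+2i and 3+4i into a flat array.
        float[] flat = packComplex64(new float[]{1.0f, 3.0f}, new float[]{2.0f, 4.0f});
        System.out.println(java.util.Arrays.toString(flat)); // prints [1.0, 2.0, 3.0, 4.0]
    }
}
```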

TFResponse

List<Long> getTensorShape(String outputName)
  • Description: Queries the shape of the output tensor identified by the specified alias.
  • Parameter: outputName: the alias of the output tensor whose shape you want to query.
  • Return value: a one-dimensional array that represents the shape of the output tensor.
List<Float> getFloatVals(String outputName)
  • Description: Queries the data of the specified output tensor whose data type is DT_FLOAT, DT_COMPLEX64, DT_BFLOAT16, or DT_HALF.
  • Parameter: outputName: the alias of the output tensor whose data to be queried is of the FLOAT type.
  • Return value: a one-dimensional array converted from the retrieved output tensor data.
List<Double> getDoubleVals(String outputName)
  • Description: Queries the data of the specified output tensor whose data type is DT_DOUBLE or DT_COMPLEX128.
  • Parameter: outputName: the alias of the output tensor whose data to be queried is of the DOUBLE type.
  • Return value: a one-dimensional array converted from the retrieved output tensor data.
List<Integer> getIntVals(String outputName)
  • Description: Queries the data of the specified output tensor whose data type is DT_INT32, DT_UINT8, DT_INT16, DT_INT8, DT_QINT8, DT_QUINT8, DT_QINT32, DT_QINT16, DT_QUINT16, or DT_UINT16.
  • Parameter: outputName: the alias of the output tensor whose data to be queried is of the INT type.
  • Return value: a one-dimensional array converted from the retrieved output tensor data.
List<String> getStringVals(String outputName)
  • Description: Queries the data of the specified output tensor whose data type is DT_STRING.
  • Parameter: outputName: the alias of the output tensor whose data to be queried is of the STRING type.
  • Return value: a one-dimensional array converted from the retrieved output tensor data.
List<Long> getInt64Vals(String outputName)
  • Description: Queries the data of the specified output tensor whose data type is DT_INT64.
  • Parameter: outputName: the alias of the output tensor whose data to be queried is of the INT64 type.
  • Return value: a one-dimensional array converted from the retrieved output tensor data.
List<Boolean> getBoolVals(String outputName)
  • Description: Queries the data of the specified output tensor whose data type is DT_BOOL.
  • Parameter: outputName: the alias of the output tensor whose data to be queried is of the BOOL type.
  • Return value: a one-dimensional array converted from the retrieved output tensor data.
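Each of the preceding methods returns the tensor data as a flat one-dimensional list; you can combine it with the shape returned by getTensorShape to index individual elements. The following is a plain-Java sketch with hypothetical shape and data (no SDK calls):

```java
import java.util.Arrays;
import java.util.List;

// Demonstrates row-major indexing into the flat list returned by
// methods such as getFloatVals, using the shape from getTensorShape.
public class FlatIndexing {
    // Returns element [row][col] of a flat list with the given 2-D shape.
    public static float elementAt(List<Float> flat, List<Long> shape, int row, int col) {
        int cols = shape.get(1).intValue();
        return flat.get(row * cols + col);
    }

    public static void main(String[] args) {
        // Suppose getTensorShape returned [2, 3] and getFloatVals returned:
        List<Float> flat = Arrays.asList(0f, 1f, 2f, 10f, 11f, 12f);
        List<Long> shape = Arrays.asList(2L, 3L);
        System.out.println(elementAt(flat, shape, 1, 2)); // prints 12.0
    }
}
```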

Demos

Input and output as strings

If you use a custom processor to deploy a model as a service, such as a service deployed based on a Predictive Model Markup Language (PMML) model, strings are often used to call the service. The following demo is for your reference:
import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;

public class Test_String {
    public static void main(String[] args) throws Exception{
        // Start and initialize the client. Share a single client object across requests; do not create a new client for each request.
        PredictClient client = new PredictClient(new HttpConfig());
        client.setToken("YWFlMDYyZDNmNTc3M2I3MzMwYmY0MmYwM2Y2MTYxMTY4NzBkNzdj****");                         
        // To use a direct connection channel, call the setDirectEndpoint method. 
        // Example: client.setDirectEndpoint("pai-eas-vpc.cn-shanghai.aliyuncs.com");
        // You must enable the direct connection channel feature on the EAS-Model Serving page of the PAI console. This feature can provide a source vSwitch to access EAS. After the direct connection channel feature is enabled, you can call a service deployed in EAS by using the server load balancing algorithm without using a gateway. This way, better stability and performance can be achieved. 
        // Note: If you want to call a service by using a gateway, the endpoint that you use must start with your user ID. To obtain the endpoint, find the service that you want to call on the EAS-Model Serving page, and click Invoke Intro in the Service Method column. In the Invoke Intro dialog box, you can view the endpoint. If you want to call a service by using a direct connection channel, the endpoint that you use must be in the pai-eas-vpc.{Region ID}.aliyuncs.com format. 
        client.setEndpoint("182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com");
        client.setModelName("scorecard_pmml_example");

        // Define the input string. 
        String request = "[{\"money_credit\": 3000000}, {\"money_credit\": 10000}]";
        System.out.println(request);

        // Return a string by using EAS. 
        try {
            String response = client.predict(request);
            System.out.println(response);
        } catch(Exception e) {
            e.printStackTrace();
        }       
        return;
    }
}
The preceding demo shows that you must perform the following steps to call a service by using EAS SDK for Java:
  1. Create a client object by calling the PredictClient constructor. If multiple services are involved, create multiple client objects.
  2. Set the token, endpoint, and modelName parameters for the client object.
  3. Define the request content as a string, and call the client.predict method to send the HTTP request. The service returns the response string.

Input and output as tensors

If you use TensorFlow to deploy models as services, you must use the TFRequest and TFResponse classes to call the services. The following demo is for your reference:
import java.util.List;
import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.request.TFDataType;
import com.aliyun.openservices.eas.predict.request.TFRequest;
import com.aliyun.openservices.eas.predict.response.TFResponse;

public class Test_TF {
    public static TFRequest buildPredictRequest() {
        TFRequest request = new TFRequest();
        request.setSignatureName("predict_images");
        float[] content = new float[784];
        for (int i = 0; i < content.length; i++) {
            content[i] = 0.0f;
        }
        request.addFeed("images", TFDataType.DT_FLOAT, new long[]{1, 784}, content);
        request.addFetch("scores");
        return request;
    }

    public static void main(String[] args) throws Exception{
        PredictClient client = new PredictClient(new HttpConfig());
        
        // To use a direct connection channel, call the setDirectEndpoint method. 
        // Example: client.setDirectEndpoint("pai-eas-vpc.cn-shanghai.aliyuncs.com");
        // You must enable the direct connection channel feature on the EAS-Model Serving page of the PAI console. This feature can provide a source vSwitch to access EAS. After the direct connection channel feature is enabled, you can call a service deployed in EAS by using the server load balancing algorithm without using a gateway. This way, better stability and performance can be achieved. 
        // Note: If you want to call a service by using a gateway, the endpoint that you use must start with your user ID. To obtain the endpoint, find the service that you want to call on the EAS-Model Serving page, and click Invoke Intro in the Service Method column. In the Invoke Intro dialog box, you can view the endpoint. If you want to call a service by using a direct connection channel, the endpoint that you use must be in the pai-eas-vpc.{Region ID}.aliyuncs.com format. 
        client.setEndpoint("1828488879222746.vpc.cn-shanghai.pai-eas.aliyuncs.com");
        client.setModelName("mnist_saved_model_example");
        client.setToken("YTg2ZjE0ZjM4ZmE3OTc0NzYxZDMyNmYzMTJjZTQ1YmU0N2FjMTAy****");
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < 1000; i++) {
            try {
                TFResponse response = client.predict(buildPredictRequest());
                List<Float> result = response.getFloatVals("scores");
                System.out.print("Predict Result: [");
                for (int j = 0; j < result.size(); j++) {
                    System.out.print(result.get(j).floatValue());
                    if (j != result.size() - 1)
                        System.out.print(", ");
                }
                System.out.print("]\n");
            } catch(Exception e) {
                e.printStackTrace();
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");       
    }
}
The preceding demo shows that you must perform the following steps to call a service whose model is a TensorFlow model by using EAS SDK for Java:
  1. Create a client object by calling the PredictClient constructor. If multiple services are involved, create multiple client objects.
  2. Set the token, endpoint, and modelName parameters for the client object.
  3. Encapsulate the input by using the TFRequest class, call the client.predict method, and obtain the output from the returned TFResponse object.