
Platform For AI: SDK for Java

Last Updated: Dec 19, 2023

Elastic Algorithm Service (EAS) provides official SDKs for calling services deployed from your models. EAS SDKs reduce the amount of code required to define the call logic and improve call stability. This topic describes EAS SDK for Java and provides demos that show how to use it to call services. In the first demo, the input and output are strings. In the second demo, the input and output are tensors.

Add dependencies

Java projects use Maven to manage dependencies. To use the client, add the eas-sdk dependency to the pom.xml file of your project. The latest version of the eas-sdk dependency is 2.0.11. Add the following code to the pom.xml file:

<dependency>
  <groupId>com.aliyun.openservices.eas</groupId>
  <artifactId>eas-sdk</artifactId>
  <version>2.0.11</version>
</dependency>

Versions 2.0.5 and later support the QueueService client and asynchronous queue services with multiple priorities. To use these features and prevent version conflicts, also add the following dependencies:

<dependency>
    <groupId>org.java-websocket</groupId>
    <artifactId>Java-WebSocket</artifactId>
    <version>1.5.1</version>
</dependency>
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.1</version>
</dependency>

Methods

PredictClient

PredictClient(HttpConfig httpConfig)

  • Description: Creates a client object of the PredictClient class.

  • Parameter: httpConfig indicates an HttpConfig object that specifies the configuration of the client.

void setToken(String token)

  • Description: Sets the token parameter for an HTTP request.

  • Parameter: token indicates the token that is used for service authentication.

void setModelName(String modelName)

  • Description: Specifies the name of the model of the online prediction service to be called.

  • Parameter: modelName indicates the model name.

void setEndpoint(String endpoint)

  • Description: Specifies the host and port of the service, in the "host:port" format.

  • Parameter: endpoint indicates the endpoint to which messages are sent.

void setDirectEndpoint(String endpoint)

  • Description: Specifies the endpoint that is used to access the service by using a virtual private cloud (VPC) direct connection channel. Endpoint example: pai-eas-vpc.cn-shanghai.aliyuncs.com.

  • Parameter: endpoint indicates the endpoint of the service.

void setRetryCount(int retryCount)

  • Description: Sets the maximum number of retries allowed after a request failure.

  • Parameter: retryCount indicates the maximum number of retries allowed after a request failure.

void setContentType(String contentType)

  • Description: Sets the data stream type for the HTTP client. By default, the data stream type is set to "application/octet-stream".

  • Parameter: contentType indicates the type of the data stream to be sent.

PredictClient createChildClient(String token, String endpoint, String modelName)

  • Description: Creates a child client object. The child client object shares the thread pool with the parent client object. This method can be called to perform predictions in the multi-threaded mode.

  • Parameters:

    • token: the token that is used for service authentication.

    • endpoint: the endpoint of the service.

    • modelName: the name of the model.

TFResponse predict(TFRequest runRequest)

  • Description: Sends a TensorFlow request to the online prediction service.

  • Parameter: runRequest indicates the object of the TensorFlow request.

String predict(String requestContent)

  • Description: Sends a request to the online prediction service by using a string as the request content.

  • Parameter: requestContent indicates the string used as the request content.

byte[] predict(byte[] requestContent)

  • Description: Sends a request to the service by using a byte array as the request content.

  • Parameter: requestContent indicates the byte array used as the request content.

HttpConfig

void setIoThreadNum(int ioThreadNum)

  • Description: Sets the number of I/O threads that are used to send HTTP requests. By default, two I/O threads are used.

  • Parameter: ioThreadNum indicates the number of I/O threads that are used to send HTTP requests.

void setReadTimeout(int readTimeout)

  • Description: Sets the timeout period of a socket read. Default value: 5000, which is equivalent to 5s.

  • Parameter: readTimeout indicates the timeout period for reading request information.

void setConnectTimeout(int connectTimeout)

  • Description: Sets the timeout period of a socket connection. Default value: 5000, which indicates 5s.

  • Parameter: connectTimeout indicates the connection timeout period of a request.

void setMaxConnectionCount(int maxConnectionCount)

  • Description: Sets the maximum number of connections allowed. Default value: 1000.

  • Parameter: maxConnectionCount indicates the maximum number of connections allowed in the connection pool of the client.

void setMaxConnectionPerRoute(int maxConnectionPerRoute)

  • Description: Sets the maximum number of default connections allowed on each route. Default value: 1000.

  • Parameter: maxConnectionPerRoute indicates the maximum number of default connections allowed on each route.

void setKeepAlive(boolean keepAlive)

  • Description: Specifies whether to enable HTTP keep-alive for HTTP-based connections.

  • Parameter: keepAlive indicates whether to enable the keep-alive mechanism for HTTP-based connections. Default value: true.

int getErrorCode()

Returns the error code of the last call.

String getErrorMessage()

Returns the error message of the last call.
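
As a reference, the following sketch shows how the HttpConfig parameters above can be tuned when a client is created. The endpoint, token, and model name are placeholders, and the chosen values are illustrative only, not recommendations:

```java
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.http.PredictClient;

public class ConfiguredClient {
    public static void main(String[] args) {
        HttpConfig config = new HttpConfig();
        config.setIoThreadNum(4);            // more I/O threads for higher concurrency
        config.setReadTimeout(10000);        // 10-second socket read timeout
        config.setConnectTimeout(3000);      // 3-second connection timeout
        config.setMaxConnectionCount(500);   // cap the connection pool size
        config.setKeepAlive(true);           // reuse HTTP connections

        PredictClient client = new PredictClient(config);
        client.setEndpoint("<your-endpoint>");   // placeholder
        client.setToken("<your-token>");         // placeholder
        client.setModelName("<your-model>");     // placeholder
        // ... send requests, then shut the client down:
        client.shutdown();
    }
}
```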

TFRequest

void setSignatureName(String value)

  • Description: Specifies the name of the SignatureDef if the model of the online prediction service to be called is a TensorFlow model in the SavedModel format.

  • Parameter: value indicates the name of the SignatureDef of the TensorFlow model.

void addFetch(String value)

  • Description: Specifies the alias of the output tensor of the TensorFlow model that you want to export.

  • Parameter: value indicates the alias of the output tensor to be exported.

void addFeed(String inputName, TFDataType dataType, long[] shape, ?[] content)

  • Description: Specifies the input tensor of the TensorFlow model of the online prediction service to be called.

  • Parameters:

    • inputName: the alias of the input tensor.

    • dataType: the data type of the input tensor.

    • shape: the shape of the input tensor.

    • content: the data of the input tensor. Specify the value in the form of a one-dimensional array.

      If the data type of the input tensor is DT_FLOAT, DT_COMPLEX64, DT_BFLOAT16, or DT_HALF, the elements in the one-dimensional array are of the FLOAT type. If the data type of the input tensor is DT_COMPLEX64, every two adjacent elements of the FLOAT type in the one-dimensional array represent the real part and imaginary part of a complex number.

      If the data type of the input tensor is DT_DOUBLE or DT_COMPLEX128, the elements in the one-dimensional array are of the DOUBLE type. If the data type of the input tensor is DT_COMPLEX128, every two adjacent elements of the DOUBLE type in the one-dimensional array represent the real part and imaginary part of a complex number.

      If the data type of the input tensor is DT_INT32, DT_UINT8, DT_INT16, DT_INT8, DT_QINT8, DT_QUINT8, DT_QINT32, DT_QINT16, DT_QUINT16, or DT_UINT16, the elements in the one-dimensional array are of the INT type.

      If the data type of the input tensor is DT_INT64, the elements in the one-dimensional array are of the LONG type.

      If the data type of the input tensor is DT_STRING, the elements in the one-dimensional array are of the STRING type.

      If the data type of the input tensor is DT_BOOL, the elements in the one-dimensional array are of the BOOLEAN type.
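
For example, for a DT_COMPLEX64 input, the real and imaginary parts must be interleaved into a single one-dimensional FLOAT array before the array is passed to addFeed. The interleave helper below is a hypothetical illustration of that layout, not part of the SDK:

```java
import java.util.Arrays;

public class ComplexInterleave {
    // Interleave (real, imaginary) pairs into the flat float[] layout
    // that addFeed expects for DT_COMPLEX64 tensors.
    static float[] interleave(float[] real, float[] imag) {
        float[] out = new float[real.length * 2];
        for (int i = 0; i < real.length; i++) {
            out[2 * i] = real[i];      // real part
            out[2 * i + 1] = imag[i];  // imaginary part
        }
        return out;
    }

    public static void main(String[] args) {
        // Two complex numbers: 1 + 0.5i and 2 - 0.5i.
        float[] flat = interleave(new float[]{1f, 2f}, new float[]{0.5f, -0.5f});
        System.out.println(Arrays.toString(flat)); // prints [1.0, 0.5, 2.0, -0.5]
    }
}
```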

TFResponse

List<Long> getTensorShape(String outputName)

  • Description: Queries the shape of the output tensor identified by the specified alias.

  • Parameter: outputName indicates the alias of the output tensor whose shape you want to query.

  • Return value: a one-dimensional array that represents the shape of the output tensor.

List<Float> getFloatVals(String outputName)

  • Description: Queries the data of the specified output tensor whose data type is DT_FLOAT, DT_COMPLEX64, DT_BFLOAT16, or DT_HALF.

  • Parameter: outputName indicates the alias of the output tensor whose data to be queried is of the FLOAT type.

  • Return value: a one-dimensional array converted from the retrieved output tensor data.

List<Double> getDoubleVals(String outputName)

  • Description: Queries the data of the specified output tensor whose data type is DT_DOUBLE or DT_COMPLEX128.

  • Parameter: outputName indicates the alias of the output tensor whose data to be queried is of the DOUBLE type.

  • Return value: a one-dimensional array converted from the retrieved output tensor data.

List<Integer> getIntVals(String outputName)

  • Description: Queries the data of the specified output tensor whose data type is DT_INT32, DT_UINT8, DT_INT16, DT_INT8, DT_QINT8, DT_QUINT8, DT_QINT32, DT_QINT16, DT_QUINT16, or DT_UINT16.

  • Parameter: outputName indicates the alias of the output tensor whose data to be queried is of the INT type.

  • Return value: a one-dimensional array converted from the retrieved output tensor data.

List<String> getStringVals(String outputName)

  • Description: Queries the data of the specified output tensor whose data type is DT_STRING.

  • Parameter: outputName indicates the alias of the output tensor whose data to be queried is of the STRING type.

  • Return value: a one-dimensional array converted from the retrieved output tensor data.

List<Long> getInt64Vals(String outputName)

  • Description: Queries the data of the specified output tensor whose data type is DT_INT64.

  • Parameter: outputName indicates the alias of the output tensor whose data to be queried is of the INT64 type.

  • Return value: a one-dimensional array converted from the retrieved output tensor data.

List<Boolean> getBoolVals(String outputName)

  • Description: Queries the data of the specified output tensor whose data type is DT_BOOL.

  • Parameter: outputName indicates the alias of the output tensor whose data to be queried is of the BOOL type.

  • Return value: a one-dimensional array converted from the retrieved output tensor data.
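
Because each of the preceding methods returns the tensor data flattened into a one-dimensional list, you can combine it with getTensorShape to address individual elements. The flatIndex helper below is a hypothetical sketch of the row-major index arithmetic, not an SDK method:

```java
import java.util.Arrays;
import java.util.List;

public class ReshapeDemo {
    // Given the shape from getTensorShape, compute the position in the
    // flat list returned by getFloatVals for a multi-dimensional index
    // (row-major order).
    static int flatIndex(List<Long> shape, int[] idx) {
        int offset = 0;
        for (int d = 0; d < idx.length; d++) {
            offset = offset * (int) (long) shape.get(d) + idx[d];
        }
        return offset;
    }

    public static void main(String[] args) {
        List<Long> shape = Arrays.asList(2L, 3L);               // a 2 x 3 tensor
        List<Float> vals = Arrays.asList(0f, 1f, 2f, 3f, 4f, 5f); // flattened data
        // Element at row 1, column 2 of the 2 x 3 tensor:
        System.out.println(vals.get(flatIndex(shape, new int[]{1, 2}))); // prints 5.0
    }
}
```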

QueueClient

QueueClient(String endpoint, String queueName, String token, HttpConfig httpConfig, QueueUser user)

  • Description: Creates a client object of the QueueClient class.

  • Parameters:

    • endpoint: the endpoint of the server.

    • queueName: the name of the service.

    • token: the token that is used for service authentication.

    • httpConfig: the configuration of the service request.

    • user: sets UserId and GroupName. By default, the value of UserId is a random number, and the value of GroupName is eas.

JSONObject attributes()

  • Description: obtains the details of the queue service.

  • Return value: the queue service information of the JSONObject type, which contains the following fields:

    • meta.maxPayloadBytes: the maximum size of each data item in the queue.

    • meta.name: the name of the queue.

    • stream.approxMaxLength: the maximum number of data items that can be stored in the queue.

    • stream.firstEntry: the index of the first data item in the queue.

    • stream.lastEntry: the index of the last data item in the queue.

    • stream.length: the number of data items currently stored in the queue.

Pair<Long, String> put(byte[] data, long priority, Map<String, String> tags)

  • Description: writes data to the queue service.

  • Parameters:

    • data: data of the Byte[] type.

    • priority: the priority of the data. Default value: 0, which indicates regular-priority data. A value of 1 indicates high-priority data.

    • tags: custom parameter.

  • Return value: an ordered pair of the Pair<Long, String> type. The first element is the data index of the Long type, and the second element is the request ID of the String type.

DataFrame[] get(long index, long length, long timeout, boolean autoDelete, Map<String, String> tags)

  • Description: obtains data from the queue service.

  • Parameters:

    • index: the start index from which to retrieve data. If the value is -1, the most recent data is retrieved.

    • length: the amount of data obtained.

    • timeout: the timeout period. Unit: seconds.

    • autoDelete: whether the data is automatically deleted from the queue after it is obtained.

    • tags: custom parameter. Example: RequestID.

  • Return value: an array of DataFrame objects.

void truncate(Long index)

  • Description: deletes all data whose index is less than the specified index in the queue service.

String delete(Long index)

  • Description: deletes the data of a specified index from the queue service.

  • Return value: OK is returned if the data is deleted.

WebSocketWatcher watch(long index, long window, boolean indexOnly, boolean autoCommit, Map<String, String> tags)

  • Description: subscribes to the queue service.

  • Parameters:

    • index: the start index from which to retrieve data. If the value is -1, the most recent data is retrieved.

    • window: the size of the sending window, which is the maximum amount of uncommitted data. After the queue service has sent window DataFrame items that the client has not committed, the service stops sending.

    • indexOnly: specifies whether the returned DataFrame contains only the index and tags, without the actual data, to save bandwidth.

    • autoCommit: specifies whether to automatically commit data after it is sent, which saves you from calling the commit operation. If autoCommit is set to true, the window parameter is ignored.

    • tags: custom subscription parameter.

  • Return value: The returned data is of the WebSocketWatcher type and is used to obtain the subscription data. For more information, see QueueService client.

String commit(Long index) or String commit(Long[] index)

  • Description: confirms that the data is consumed and deletes the data in the queue service.

  • Return value: OK is returned if the commit is successful.

void end(boolean force)

Description: disables the queue service.

DataFrame

byte[] getData()

  • Description: obtains data values.

  • Return value: data value of the Byte[] type.

long getIndex()

  • Description: obtains the data index.

  • Return value: data index of the Long type.

Map<String, String> getTags()

  • Description: obtains data tags.

  • Return value: data tags of the Map<String, String> type, which can be used to obtain the RequestID.

Demos

Input and output as strings

In most cases, if you use a custom processor to deploy a model as a service, strings are used to call the service. An example is a service deployed based on a Predictive Model Markup Language (PMML) model. The following demo is for your reference:

import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;

public class TestString {
    public static void main(String[] args) throws Exception {
        // Start and initialize the client. Share a single client object across requests. Do not create a new client object for each request.
        PredictClient client = new PredictClient(new HttpConfig());
        client.setToken("YWFlMDYyZDNmNTc3M2I3MzMwYmY0MmYwM2Y2MTYxMTY4NzBkNzdj****");
        // To use a direct connection channel, call the setDirectEndpoint method.
        // Example: client.setDirectEndpoint("pai-eas-vpc.cn-shanghai.aliyuncs.com");
        // You must enable the direct connection channel feature on the EAS page of the PAI console. This feature can provide a source vSwitch to access EAS. After you enable the direct connection channel feature, you can call services deployed in EAS by using the server load balancing algorithm without the need to use a gateway. This way, improved stability and performance are achieved. 
        // Note: If you want to call a service by using a gateway, the endpoint that you use must start with your user ID. To obtain the endpoint, find the service that you want to call on the EAS page, and click Invocation Method in the Service Type column. In the Invocation Method dialog box, you can view the endpoint. If you want to call a service by using a direct connection channel, the endpoint that you use must be in the pai-eas-vpc.{region_id}.aliyuncs.com format. 
        client.setEndpoint("182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com");
        client.setModelName("scorecard_pmml_example");

        // Define the input string.
        String request = "[{\"money_credit\": 3000000}, {\"money_credit\": 10000}]";
        System.out.println(request);

        // Return a string by using EAS.
        try {
            String response = client.predict(request);
            System.out.println(response);
        } catch (Exception e) {
            e.printStackTrace();
        }

        // Shut down the client.
        client.shutdown();
        return;
    }
}

The preceding demo shows that you must perform the following steps to call a service by using EAS SDK for Java:

  1. Call the PredictClient method to create a client object. If multiple services are involved, create multiple client objects.

  2. Set the token, endpoint, and modelName parameters for the client object.

  3. Define the request content of the STRING type as the input, and call the client.predict method to send the HTTP request. Then, the system returns the response.

Input and output as tensors

If you use TensorFlow to deploy models as services, you must use the TFRequest and TFResponse classes to call the services. The following demo is for your reference:

import java.util.List;

import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.request.TFDataType;
import com.aliyun.openservices.eas.predict.request.TFRequest;
import com.aliyun.openservices.eas.predict.response.TFResponse;

public class TestTF {
    public static TFRequest buildPredictRequest() {
        TFRequest request = new TFRequest();
        request.setSignatureName("predict_images");
        float[] content = new float[784];
        for (int i = 0; i < content.length; i++) {
            content[i] = (float) 0.0;
        }
        request.addFeed("images", TFDataType.DT_FLOAT, new long[]{1, 784}, content);
        request.addFetch("scores");
        return request;
    }

    public static void main(String[] args) throws Exception {
        PredictClient client = new PredictClient(new HttpConfig());

        // To use a direct connection channel, call the setDirectEndpoint method. 
        // Example: client.setDirectEndpoint("pai-eas-vpc.cn-shanghai.aliyuncs.com");
        // You must enable the direct connection channel feature on the EAS page of the PAI console. This feature can provide a source vSwitch to access EAS. After you enable the direct connection channel feature, you can call services deployed in EAS by using the server load balancing algorithm without the need to use a gateway. This way, improved stability and performance are achieved. 
        // Note: If you want to call a service by using a gateway, the endpoint that you use must start with your user ID. To obtain the endpoint, find the service that you want to call on the EAS page, and click Invocation Method in the Service Type column. In the Invocation Method dialog box, you can view the endpoint. If you want to call a service by using a direct connection channel, the endpoint that you use must be in the pai-eas-vpc.{region_id}.aliyuncs.com format. 
        client.setEndpoint("182848887922****.vpc.cn-shanghai.pai-eas.aliyuncs.com");
        client.setModelName("mnist_saved_model_example");
        client.setToken("YTg2ZjE0ZjM4ZmE3OTc0NzYxZDMyNmYzMTJjZTQ1YmU0N2FjMTAy****");
        long startTime = System.currentTimeMillis();
        int count = 1000;
        for (int i = 0; i < count; i++) {
            try {
                TFResponse response = client.predict(buildPredictRequest());
                List<Float> result = response.getFloatVals("scores");
                System.out.print("Predict Result: [");
                for (int j = 0; j < result.size(); j++) {
                    System.out.print(result.get(j).floatValue());
                    if (j != result.size() - 1) {
                        System.out.print(", ");
                    }
                }
                System.out.print("]\n");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");
        client.shutdown();
    }
}

The preceding demo shows that you must perform the following steps to call a service whose model is a TensorFlow model by using EAS SDK for Java:

  1. Call the PredictClient method to create a client object. If multiple services are involved, create multiple client objects.

  2. Set the token, endpoint, and modelName parameters for the client object.

  3. Encapsulate the input by using the TFRequest class and the output by using the TFResponse class.

QueueService client

You can use the queue service feature through the QueueClient class. The following demo is for your reference:

import com.alibaba.fastjson.JSONObject;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.http.QueueClient;
import com.aliyun.openservices.eas.predict.queue_client.QueueUser;
import com.aliyun.openservices.eas.predict.queue_client.WebSocketWatcher;

public class DemoWatch {
    public static void main(String[] args) throws Exception {
        /** Create the queue service clients. */
        String queueEndpoint = "18*******.cn-hangzhou.pai-eas.aliyuncs.com";
        String inputQueueName = "test_qservice";
        String sinkQueueName = "test_qservice/sink";
        String queueToken = "test-token";

        /** Input queue. Add data to the input queue. The inference service automatically reads the request data from the input queue. */
        QueueClient inputQueue =
            new QueueClient(queueEndpoint, inputQueueName, queueToken, new HttpConfig(), new QueueUser());
        /** Output queue. The inference service processes the input data and writes the results to the output queue. */
        QueueClient sinkQueue =
            new QueueClient(queueEndpoint, sinkQueueName, queueToken, new HttpConfig(), new QueueUser());
        /** Clear the queue data. Proceed with caution. */
        inputQueue.clear();
        sinkQueue.clear();

        /** Add data to the input queue. */
        int count = 10;
        for (int i = 0; i < count; ++i) {
            String data = Integer.toString(i);
            inputQueue.put(data.getBytes(), null);
            /** The queue service supports multi-priority queues. You can use the put function to set the data priority. The default priority is 0. */
            //  inputQueue.put(data.getBytes(), 0L, null);
        }

        /** Subscribe to the data of the output queue by using the watch function. The window size is 5. */
        WebSocketWatcher watcher = sinkQueue.watch(0L, 5L, false, true, null);
        /** You can use the WatchConfig parameter to specify the number of retries, the retry interval (in seconds), and whether to limit the maximum number of retries. If you do not specify the WatchConfig parameter, the default number of retries is 3 and the retry interval is 5 seconds. */
        //  WebSocketWatcher watcher = sinkQueue.watch(0L, 5L, false, true, null, new WatchConfig(3, 1));
        //  WebSocketWatcher watcher = sinkQueue.watch(0L, 5L, false, true, null, new WatchConfig(true, 10));

        /** Obtain the output data. */
        for (int i = 0; i < count; ++i) {
            try {
                /** The getDataFrame function is used to obtain the DataFrame data class. If no data is available, the function blocks. */
                byte[] data = watcher.getDataFrame().getData();
                System.out.println("[watch] data = " + new String(data));
            } catch (RuntimeException ex) {
                System.out.println("[watch] error = " + ex.getMessage());
                break;
            }
        }
        /** Close the watcher object. Each client instance can have only one watcher object. If you do not close a watcher object, an error is reported when you rerun the instance. */
        watcher.close();

        Thread.sleep(2000);
        JSONObject attrs = sinkQueue.attributes();
        System.out.println(attrs.toString());

        /** Shut down the clients. */
        inputQueue.shutdown();
        sinkQueue.shutdown();
    }
}

The preceding demo shows that you must perform the following steps to call a service by using EAS SDK for Java:

  1. Create a queue service client object by using the QueueClient class. For an inference service, create both an input queue object and an output queue object.

  2. Use the put() function to send data to the input queue, and use the watch() function to subscribe to data from the output queue.

    Note

    Data transmission and subscription can occur in different threads. In this example, both are performed in the same thread: the put function sends data and the watch function subscribes to data.

Compress request data

If the amount of request data is large, EAS allows you to compress the data before sending it to the server. The Zlib and Gzip compression formats are supported. To use compression, specify the rpc.decompressor parameter in the service configuration so that the server knows how to decompress the requests.

Sample service configuration:

"metadata": {
  "rpc": {
    "decompressor": "zlib"
  }
}
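
On the client side, the SDK applies the compression for you through the setCompressor method, as the SDK call example below shows. For illustration only, the following standalone sketch uses the JDK's java.util.zip classes to demonstrate a zlib round trip of a request body; it is an assumption-free stand-in for what happens on the wire, not the SDK's internal code:

```java
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class ZlibRoundTrip {
    // Compress a request body with zlib before it is sent.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buf = new byte[input.length + 64];
        int n = deflater.deflate(buf);
        deflater.end();
        return Arrays.copyOf(buf, n);
    }

    // Decompress the body, as a server configured with rpc.decompressor=zlib would.
    static byte[] decompress(byte[] input) throws Exception {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        byte[] buf = new byte[4096];
        int n = inflater.inflate(buf);
        inflater.end();
        return Arrays.copyOf(buf, n);
    }

    public static void main(String[] args) throws Exception {
        byte[] body = "[{\"money_credit\": 3000000}]".getBytes("UTF-8");
        byte[] packed = compress(body);
        // The server restores the original body after decompression.
        System.out.println(new String(decompress(packed), "UTF-8"));
    }
}
```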

The following code provides an example on how to call the SDK:

package com.aliyun.openservices.eas.predict;
import com.aliyun.openservices.eas.predict.http.Compressor;
import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
public class TestString {
    public static void main(String[] args) throws Exception {
        // Start and initialize the client.
        PredictClient client = new PredictClient(new HttpConfig());
        client.setEndpoint("18*******.cn-hangzhou.pai-eas.aliyuncs.com");
        client.setModelName("echo_compress");
        client.setToken("YzZjZjQwN2E4NGRkMDMxNDk5NzhhZDcwZDBjOTZjOGYwZDYxZGM2****");
        // or use Compressor.Gzip. 
        client.setCompressor(Compressor.Zlib);  
        // Specify the input string. 
        String request = "[{\"money_credit\": 3000000}, {\"money_credit\": 10000}]";
        System.out.println(request);
        // Return a string by using EAS. 
        String response = client.predict(request);
        System.out.println(response);
        // Shut down the client. 
        client.shutdown();
        return;
    }
}