EAS SDK for Go lets you call inference services deployed on Elastic Algorithm Service (EAS) from Go applications. The SDK handles connection management and retry logic, freeing you to focus on your inference workflow.
The SDK supports three request types — string, TensorFlow tensor, and PyTorch tensor — and an asynchronous queuing mode for high-throughput scenarios.
Prerequisites
Before you begin, make sure you have:
A service deployed on EAS. Note the service endpoint and service name.
The service token for authentication.
Installation
No manual installation is required. The Go package manager downloads the SDK from GitHub automatically when you compile your code. The import path is:
import "github.com/pai-eas/eas-golang-sdk/eas"To customize the SDK, download the source from eas-golang-sdk and modify it locally.
Choose a client
EAS SDK for Go provides two clients depending on your use case:
| Client | Use when |
|---|---|
PredictClient | Calling a service synchronously — string, TensorFlow, or PyTorch input |
QueueClient | Sending and receiving data through an asynchronous inference queue |
For most services, start with PredictClient. Use QueueClient when your service is deployed as an asynchronous inference service with input and output queues.
Quick start
The following example calls a service that accepts string input and returns a string response — a common pattern for services built with custom processors or Predictive Model Markup Language (PMML) models.
package main
import (
"fmt"
"github.com/pai-eas/eas-golang-sdk/eas"
)
func main() {
// Initialize the client with your service endpoint and service name
client := eas.NewPredictClient("182848887922****.cn-shanghai.pai-eas.aliyuncs.com", "scorecard_pmml_example")
client.SetToken("YWFlMDYyZDNmNTc3M2I3MzMwYmY0MmYwM2Y2MTYxMTY4NzBkNzdj****")
client.Init()
// Send 100 prediction requests
req := "[{\"fea1\": 1, \"fea2\": 2}]"
for i := 0; i < 100; i++ {
resp, err := client.StringPredict(req)
if err != nil {
fmt.Printf("failed to predict: %v\n", err.Error())
} else {
fmt.Printf("%v\n", resp)
}
}
}Replace the endpoint, service name, and token with your actual values.
Call Init() after setting all configuration parameters. Parameters do not take effect until Init() is called.
PredictClient reference
Configuration methods
All setter methods must be called before Init().
| Method | Description | Default |
|---|---|---|
NewPredictClient(endpoint string, serviceName string) *PredictClient | Creates a client for the specified endpoint and service name. | — |
SetEndpoint(endpointName string) | Sets the server endpoint. | — |
SetServiceName(serviceName string) | Sets the service name. | — |
SetEndpointType(endpointType string) | Sets the gateway type. Valid values: "DEFAULT" (default gateway), "DIRECT" (Virtual Private Cloud (VPC) direct connection). | "DEFAULT" |
SetToken(token string) | Sets the service token for authentication. | — |
SetHttpTransport(transport *http.Transport) | Sets the HTTP transport configuration. | — |
SetRetryCount(max_retry_count int) | Sets the maximum number of retries after a request failure. Do not set this to 0 — the client must retry on server process errors, server errors, and closed persistent connections. | 5 |
SetTimeout(timeout int) | Sets the request timeout in milliseconds. | 5000 |
Init() | Initializes the client. Must be called after all configuration methods. | — |
Prediction methods
| Method | Description |
|---|---|
StringPredict(request string) string | Sends a string request and returns a string response. Use for PMML models and custom processors. |
TFPredict(request TFRequest) TFResponse | Sends a TensorFlow prediction request. |
TorchPredict(request TorchRequest) TorchResponse | Sends a PyTorch prediction request. |
Predict(request Request) Response | Generic method accepting string, TFRequest, or TorchRequest. Returns the corresponding response type. |
Call a TensorFlow service
Use TFRequest and TFResponse to call services backed by TensorFlow models.
package main
import (
"fmt"
"github.com/pai-eas/eas-golang-sdk/eas"
)
func main() {
client := eas.NewPredictClient("182848887922****.cn-shanghai.pai-eas.aliyuncs.com", "mnist_saved_model_example")
client.SetToken("YTg2ZjE0ZjM4ZmE3OTc0NzYxZDMyNmYzMTJjZTQ1YmU0N2FjMTAy****")
client.Init()
// Build a TensorFlow request: signature name, input tensor alias, shape, and data
tfreq := eas.TFRequest{}
tfreq.SetSignatureName("predict_images")
// Feed a float32 tensor named "images" with shape [1, 784]
tfreq.AddFeedFloat32("images", []int64{1, 784}, make([]float32, 784))
for i := 0; i < 100; i++ {
resp, err := client.TFPredict(tfreq)
if err != nil {
fmt.Printf("failed to predict: %v", err)
} else {
fmt.Printf("%v\n", resp)
}
}
}TFRequest methods
| Method | Description |
|---|---|
TFRequest(signatureName string) | Creates a request for the specified model signature. |
AddFeed<Type>(inputName string, shape []int64, content []<Type>) | Sets an input tensor. inputName is the tensor alias, shape is the tensor dimensions, and content is a one-dimensional array of data. Supported types: Int32, Int64, Float32, Float64, String, Bool. Example: AddFeedFloat32(). For other data types, construct in protocol buffer (PB) format. |
AddFetch(outputName string) | Specifies an output tensor to retrieve by alias. Optional for SavedModel format (all tensors exported if omitted); required for frozen models. |
TFResponse methods
| Method | Description |
|---|---|
GetTensorShape(outputName string) []int64 | Returns the shape of the output tensor as an array of dimensions. |
Get<Type>Val(outputName string) []<Type> | Returns the output tensor data as a one-dimensional array. Combine with GetTensorShape() to reconstruct multi-dimensional data. Supported types: Float, Double, Int, Int64, String, Bool. Example: GetFloatVal(). |
Call a PyTorch service
Use TorchRequest and TorchResponse to call services backed by PyTorch models.
package main
import (
"fmt"
"github.com/pai-eas/eas-golang-sdk/eas"
)
func main() {
client := eas.NewPredictClient("182848887922****.cn-shanghai.pai-eas.aliyuncs.com", "pytorch_resnet_example")
client.SetTimeout(500)
client.SetToken("ZjdjZDg1NWVlMWI2NTU5YzJiMmY5ZmE5OTBmYzZkMjI0YjlmYWVl****")
client.Init()
// Feed a float32 tensor at index 0 with shape [1, 3, 224, 224] (batch=1, RGB channels, 224x224 image)
req := eas.TorchRequest{}
req.AddFeedFloat32(0, []int64{1, 3, 224, 224}, make([]float32, 150528))
req.AddFetch(0)
for i := 0; i < 10; i++ {
resp, err := client.TorchPredict(req)
if err != nil {
fmt.Printf("failed to predict: %v", err)
} else {
fmt.Println(resp.GetTensorShape(0), resp.GetFloatVal(0))
}
}
}TorchRequest methods
| Method | Description |
|---|---|
TorchRequest() | Creates a PyTorch request object. |
AddFeed<Type>(index int, shape []int64, content []<Type>) | Sets an input tensor by index. Supported types: Int32, Int64, Float32, Float64. Example: AddFeedFloat32(). For other data types, construct in protocol buffer (PB) format. |
AddFetch(outputIndex int) | Specifies an output tensor to retrieve by index. Optional — if not called, all output tensors are returned. |
TorchResponse methods
| Method | Description |
|---|---|
GetTensorShape(outputIndex int) []int64 | Returns the shape of the output tensor as an array of dimensions. |
Get<Type>Val(outputIndex int) []<Type> | Returns the output tensor data as a one-dimensional array. Supported types: Float, Double, Int, Int64. Example: GetFloatVal(). |
Use VPC direct connection
VPC direct connection reduces latency in high-concurrency and heavy-traffic scenarios. It is only available for services deployed in a dedicated resource group for EAS, and requires the resource group and the specified vSwitch to be connected to the same Virtual Private Cloud (VPC).
For setup instructions, see Work with EAS resource groups and Configure network connectivity.
The only difference from the standard mode is one additional line: client.SetEndpointType(eas.EndpointTypeDirect).
package main
import (
"fmt"
"github.com/pai-eas/eas-golang-sdk/eas"
)
func main() {
// Use the VPC endpoint for direct connection
client := eas.NewPredictClient("pai-eas-vpc.cn-shanghai.aliyuncs.com", "scorecard_pmml_example")
client.SetToken("YWFlMDYyZDNmNTc3M2I3MzMwYmY0MmYwM2Y2MTYxMTY4NzBkNzdj****")
client.SetEndpointType(eas.EndpointTypeDirect)
client.Init()
req := "[{\"fea1\": 1, \"fea2\": 2}]"
for i := 0; i < 100; i++ {
resp, err := client.StringPredict(req)
if err != nil {
fmt.Printf("failed to predict: %v\n", err.Error())
} else {
fmt.Printf("%v\n", resp)
}
}
}Configure HTTP transport
Use SetHttpTransport() to tune connection pool size and timeouts for your traffic pattern.
package main
import (
"fmt"
"github.com/pai-eas/eas-golang-sdk/eas"
"net/http"
"time"
)
func main() {
client := eas.NewPredictClient("pai-eas-vpc.cn-shanghai.aliyuncs.com", "network_test")
client.SetToken("MDAwZDQ3NjE3OThhOTI4ODFmMjJiYzE0MDk1NWRkOGI1MmVhMGI0****")
client.SetEndpointType(eas.EndpointTypeDirect)
client.SetHttpTransport(&http.Transport{
MaxConnsPerHost: 300,
TLSHandshakeTimeout: 100 * time.Millisecond,
ResponseHeaderTimeout: 200 * time.Millisecond,
ExpectContinueTimeout: 200 * time.Millisecond,
})
}Use the queuing service
When a service is deployed as an asynchronous inference service, EAS automatically creates an input queue and an output queue:
Input queue:
<domain>/api/predict/<service_name>Output queue:
<domain>/api/predict/<service_name>/sink
Use QueueClient to send data to the input queue and subscribe to results from the output queue. In the example below, one goroutine continuously puts data into the queue while another subscribes using a watcher and commits each message after receiving it.
const (
QueueEndpoint = "182848887922****.cn-shanghai.pai-eas.aliyuncs.com"
// If the EAS service name is test_qservice:
// input queue: test_qservice
// output queue: test_qservice/sink
QueueName = "test_qservice"
QueueToken = "YmE3NDkyMzdiMzNmMGM3ZmE4ZmNjZDk0M2NiMDA3OTZmNzc1MTUx****"
)
queue, err := NewQueueClient(QueueEndpoint, QueueName, QueueToken)
// Clear all existing messages before starting
attrs, err := queue.Attributes()
if index, ok := attrs["stream.lastEntry"]; ok {
idx, _ := strconv.ParseUint(index, 10, 64)
queue.Truncate(context.Background(), idx+1)
}
ctx, cancel := context.WithCancel(context.Background())
// Producer: send messages to the queue every microsecond
go func() {
i := 0
for {
select {
case <-time.NewTicker(time.Microsecond * 1).C:
_, _, err := queue.Put(context.Background(), []byte(strconv.Itoa(i)), types.Tags{})
if err != nil {
fmt.Printf("Error occured, retry to handle it: %v\n", err)
}
i += 1
case <-ctx.Done():
break
}
}
}()
// Consumer: subscribe to the queue with a window of 5 messages, manual commit
watcher, err := queue.Watch(context.Background(), 0, 5, false, false)
if err != nil {
fmt.Printf("Failed to create a watcher to watch the queue: %v\n", err)
return
}
// Read 100 messages and commit each one after processing
for i := 0; i < 100; i++ {
df := <-watcher.FrameChan()
err := queue.Commit(context.Background(), df.Index.Uint64())
if err != nil {
fmt.Printf("Failed to commit index: %v(%v)\n", df.Index, err)
}
}
// Shut down
watcher.Close()
cancel()QueueClient methods
| Method | Description |
|---|---|
NewQueueClient(endpoint, queueName, token string) (*QueueClient, error) | Creates a queue client for the specified endpoint, queue name, and token. |
Put(ctx context.Context, data []byte, tags types.Tags) (index uint64, requestId string, err error) | Writes a data record to the queue. Returns the record's index and a request ID that can be used to retrieve the record later. |
GetByIndex(ctx context.Context, index uint64) (dfs []types.DataFrame, err error) | Retrieves and removes a record by its index. |
GetByRequestId(ctx context.Context, requestId string) (dfs []types.DataFrame, err error) | Retrieves and removes a record by its request ID. |
Get(ctx context.Context, index uint64, length int, timeout time.Duration, autoDelete bool, tags types.Tags) (dfs []types.DataFrame, err error) | Retrieves records starting from index. Returns up to length records within timeout. Set autoDelete to false to allow repeated retrieval; use Del() to remove records manually. Filter by tags to retrieve only tagged records. GetByIndex() and GetByRequestId() are wrappers around this method. |
Del(ctx context.Context, indexes ...uint64) | Deletes records matching the specified index values. |
Truncate(ctx context.Context, index uint64) error | Removes all records before index, retaining only records after index. |
Attributes() (attrs types.Attributes, err error) | Returns queue attributes, including total record count and current record count. All attribute keys and values are strings. |
Watch(ctx context.Context, index, window uint64, indexOnly bool, autocommit bool) (watcher types.Watcher, err error) | Subscribes to the queue starting from index. The server pushes at most window unacknowledged records at a time — once N records are committed, N more are pushed, keeping concurrency bounded. Set autocommit to false and call Commit() manually after each record is processed; uncommitted records are re-delivered to other instances if the current instance fails. |
Commit(ctx context.Context, indexes ...uint64) error | Marks records as processed. Committed records can be removed from the queue. |
types.Watcher methods
| Method | Description |
|---|---|
FrameChan() <-chan types.DataFrame | Returns a channel from which you can read records pushed by the server. The channel can be read repeatedly. |
Close() | Stops the watcher and closes backend connections. Only one watcher can be active per client — close the current watcher before creating another. |