Platform For AI: Call PAI-EAS inference services over gRPC

Last Updated: May 15, 2026

PAI-EAS supports high-performance model inference using the gRPC protocol. Compared to HTTP/JSON, gRPC offers more efficient serialization and better streaming support, making it well-suited for latency-sensitive workloads or scenarios that require streaming inference.

Prerequisites

Before you begin, ensure you meet the following requirements:

  1. You have deployed a PAI-EAS inference service.

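  2. You have configured a development environment for your client language. The examples in this topic use Go (Go 1.18 or later is recommended) and Python.

  3. You have installed the required gRPC dependency packages, for example google.golang.org/grpc for Go, or grpcio and grpcio-tools for Python.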

Key points

Endpoint format

To call a service through the gateway, obtain the service address from the console, extract the domain name, and append the port number :80. For more information, see Obtain the access endpoint and token.

Important

The port is fixed at 80 because PAI-EAS exposes gRPC services through a gateway that listens on port 80.

The following table provides endpoint examples.

Access method | Console endpoint | gRPC endpoint (append :80)
Public network | http://grpc-test.123456***.cn-hangzhou.pai-eas.aliyuncs.com/ | grpc-test.123456***.cn-hangzhou.pai-eas.aliyuncs.com:80
VPC | http://grpc-test.123456***.vpc.cn-hangzhou.pai-eas.aliyuncs.com/ | grpc-test.123456***.vpc.cn-hangzhou.pai-eas.aliyuncs.com:80

Note: A gRPC endpoint does not include the http:// prefix.
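
The conversion shown in the table can be sketched in a few lines of Python. This is only an illustration, not part of any PAI-EAS SDK: the helper name to_grpc_endpoint is hypothetical, and it assumes the console endpoint is a plain URL like the examples above.

```python
from urllib.parse import urlsplit

GATEWAY_PORT = 80  # per this topic, the gateway port for gRPC is fixed at 80


def to_grpc_endpoint(console_endpoint: str) -> str:
    """Convert a console endpoint URL to a host:port gRPC endpoint (hypothetical helper)."""
    url = console_endpoint.strip()
    # urlsplit only recognizes the host when a scheme or "//" prefix is present.
    if "://" not in url:
        url = "//" + url
    host = urlsplit(url).hostname
    if not host:
        raise ValueError(f"cannot extract a domain name from {console_endpoint!r}")
    # Drop the scheme and path, keep the domain name, and append the fixed port.
    return f"{host}:{GATEWAY_PORT}"


if __name__ == "__main__":
    print(to_grpc_endpoint("http://grpc-test.123456.cn-hangzhou.pai-eas.aliyuncs.com/"))
```

The returned string can be passed directly as the channel target in the client examples later in this topic.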

Token authentication

To call a PAI-EAS gRPC inference service, you must pass the authentication token in the request metadata. PAI-EAS uses Bearer Token authentication. When you make a gRPC call, set the authorization field in the metadata as follows:

authorization: Bearer <Token>

The token must be prefixed with Bearer followed by a space. You can obtain the token from the service details page in the PAI-EAS console.
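
A common mistake is to paste a token that already carries the prefix, which produces "Bearer Bearer <Token>". The following sketch builds the metadata defensively. The helper name build_auth_metadata is hypothetical, and stripping an existing prefix is an assumption about what callers want, not documented PAI-EAS behavior.

```python
def build_auth_metadata(token: str) -> tuple:
    """Return gRPC call metadata carrying the PAI-EAS token (hypothetical helper)."""
    token = token.strip()
    if not token:
        raise ValueError("token must not be empty")
    # Tolerate a token that was pasted with the prefix already attached,
    # so the value never becomes "Bearer Bearer <Token>".
    if token.lower().startswith("bearer "):
        token = token[len("bearer "):].strip()
    # The key is lowercase; the value is "Bearer", one space, then the token.
    return (("authorization", f"Bearer {token}"),)


if __name__ == "__main__":
    print(build_auth_metadata("your-service-token"))
```

In Python, the resulting tuple is what you pass as the metadata argument of a stub call.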

Code examples

Before you write your code, prepare the proto file for your service and compile the client stub.

The following examples show the core workflow for calling a PAI-EAS inference service over gRPC. For the corresponding proto file and server code, see gRPC service demo.

Go

package main

import (
	"context"
	"log"
	"time"

	pb "your-project/proto"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/metadata"
)

func main() {
	// Replace with your gRPC endpoint. You can find it in the call info panel in the PAI-EAS console.
	// The public network endpoint format is: service-name.{uid}.{region}.pai-eas.aliyuncs.com:80 (the port is fixed at 80).
	host := "your-service-endpoint:80"
	// Replace with your service token. You can find it on the service details page in the PAI-EAS console.
	token := "your-service-token"

	log.Printf("Connecting to the PAI-EAS gRPC service (%s)...", host)
        
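	// 1. Create the client connection. grpc.NewClient does not dial immediately; the connection is established on first use.
	// A connection is safe for concurrent use. We recommend that you create one connection and reuse it.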
	conn, err := grpc.NewClient(
		host,
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	)
	if err != nil {
		log.Fatalf("Failed to connect: %v", err)
	}
	defer conn.Close()

	// 2. Create a client.
	client := pb.NewGreeterServiceClient(conn)
	
	// 3. Inject the authentication token by using gRPC metadata.
	md := metadata.Pairs("authorization", "Bearer " + token)

	ctx := metadata.NewOutgoingContext(context.Background(), md)

	// 4. Set a timeout.
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()

	// 5. Construct the request.
	req := &pb.HelloRequest{Name: "World"}

	// 6. Make the call.
	resp, err := client.SayHello(ctx, req)
	if err != nil {
		log.Fatalf("Call failed: %v", err)
	}

	log.Printf("Received response: %s", resp.Message)
}

Python

import grpc
import helloworld_pb2
import helloworld_pb2_grpc

# Replace with your gRPC endpoint. You can find it in the call info panel in the PAI-EAS console.
# The public network endpoint format is: service-name.{uid}.{region}.pai-eas.aliyuncs.com:80 (the port is fixed at 80).
host = "your-service-endpoint:80"
# Replace with your service token. You can find it on the service details page in the PAI-EAS console.
token = "your-service-token"

# 1. Create a channel to connect to the server.
channel = grpc.insecure_channel(host)

# 2. Initialize the stub (client handle).
stub = helloworld_pb2_grpc.GreeterServiceStub(channel)

# 3. Construct the request.
request = helloworld_pb2.HelloRequest(name="World")

# 4. Construct the authentication metadata to pass the token to the PAI-EAS service.
metadata = (("authorization", "Bearer " + token),)

print("Initiating gRPC call...")

try:
    # 5. Make a synchronous request. Pass the metadata for authentication.
    response = stub.SayHello(request, metadata=metadata, timeout=5.0)
    print(response.message)
except grpc.RpcError as e:
    print(f"Call failed: {e.code()} - {e.details()}")
finally:
    channel.close()

Best practices

  • Connection reuse: gRPC connections are thread-safe. We recommend creating a connection when your application starts and reusing it for all subsequent requests.

    Important

    Idle connections can be dropped by intermediate network devices. In a production environment, you must enable keepalive pings so that connections stay active and broken connections are detected and re-established promptly. In Go, use the grpc.WithKeepaliveParams dial option. In Python, pass keepalive settings through the options parameter when you create the channel.

  • Timeout: Set an appropriate timeout for each gRPC call to prevent requests from hanging indefinitely due to network or service issues.
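
The connection reuse and keepalive advice above can be sketched in Python as follows. The option names are standard gRPC channel arguments, but the values are illustrative assumptions rather than PAI-EAS recommendations, and create_channel is a hypothetical helper name.

```python
# Keepalive channel arguments. The values are illustrative assumptions;
# tune them for your own deployment.
KEEPALIVE_OPTIONS = [
    ("grpc.keepalive_time_ms", 30_000),          # send a ping after 30 s without activity
    ("grpc.keepalive_timeout_ms", 10_000),       # treat the connection as dead if no ack arrives within 10 s
    ("grpc.keepalive_permit_without_calls", 1),  # keep pinging even when no RPC is in flight
]


def create_channel(host: str):
    """Create one long-lived channel to reuse for all requests (hypothetical helper)."""
    # Imported here so the options above can be inspected without grpcio installed.
    import grpc

    return grpc.insecure_channel(host, options=KEEPALIVE_OPTIONS)
```

Call create_channel once at application startup and share the returned channel across threads. Very short ping intervals may be rejected by the server side, so start with conservative values. For the timeout advice, pass timeout= on each stub call, as the Python client example in this topic does.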

FAQ

Authentication fails (UNAUTHENTICATED)

Check whether your token is correct and has not expired. Also, verify that the authorization field name in the metadata is spelled correctly. The token must be prefixed with Bearer followed by a space.

Connection times out (UNAVAILABLE)

Verify that the gRPC endpoint is correct and that the network is reachable.

When using a VPC endpoint, make sure your client and the PAI-EAS service are in the same VPC and region. Also, check that your security group rules allow traffic on the relevant port.
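
To surface these checks directly in client logs, the two FAQ entries can be keyed by gRPC status code name. This is a minimal sketch: the HINTS table and the troubleshooting_hint function are hypothetical names, and the hint text only condenses the guidance above.

```python
# Hints condensed from the FAQ entries above, keyed by gRPC status code name.
HINTS = {
    "UNAUTHENTICATED": (
        "Check that the token is correct and not expired, that the metadata key "
        "is spelled 'authorization', and that the value starts with 'Bearer '."
    ),
    "UNAVAILABLE": (
        "Check that the endpoint is host:80 without http://, that the network is "
        "reachable, and for VPC endpoints that the VPC, region, and security group rules match."
    ),
}


def troubleshooting_hint(code_name: str) -> str:
    """Return a troubleshooting hint for a gRPC status code name."""
    return HINTS.get(code_name, "No FAQ entry for this status code; inspect the error details.")


if __name__ == "__main__":
    print(troubleshooting_hint("UNAUTHENTICATED"))
```

In a Python client, the name can be obtained inside an except grpc.RpcError as e block with e.code().name.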

Appendix

gRPC service demo

API definition

  1. In your project's root directory, create a file named helloworld.proto to define a GreeterService service that includes a SayHello method.

    • Request: HelloRequest, which contains a string field named name.

    • Response: HelloResponse, which contains a string field named message.

    helloworld.proto

    syntax = "proto3";
    
    package helloworld;
    
    option go_package = "your-project/proto;helloworld";
    
    message HelloRequest {
      string name = 1;
    }
    
    message HelloResponse {
      string message = 1;
    }
    
    service GreeterService {
      rpc SayHello (HelloRequest) returns (HelloResponse);
    }

  2. Compile the .proto file to generate the helloworld_pb2.py and helloworld_pb2_grpc.py code files. The following command uses Python as an example and requires the grpcio-tools package.

    python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. helloworld.proto

Server deployment

Deploy the service to PAI-EAS. The project structure is as follows:

my-grpc-demo
├── helloworld_pb2_grpc.py
├── helloworld_pb2.py
└── server.py  

server.py

import grpc
from concurrent import futures
import helloworld_pb2
import helloworld_pb2_grpc

# Implement the service logic.
class GreeterServicer(helloworld_pb2_grpc.GreeterServiceServicer):
    def SayHello(self, request, context):
        # Business logic: Concatenate a string.
        response_msg = f"Hello, {request.name}!"
        return helloworld_pb2.HelloResponse(message=response_msg)


def serve():
    # 1. Create a server with a thread pool.
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
  
    # 2. Register the service implementation with the server.
    helloworld_pb2_grpc.add_GreeterServiceServicer_to_server(GreeterServicer(), server)
   
    # 3. Listen on a port. This is the deployment address.
    port = "[::]:50051"
    server.add_insecure_port(port)
   
    print(f"Server started, listening on {port} ...")
    server.start()
   
    # 4. Block the main thread to keep the service running.
    server.wait_for_termination()

if __name__ == "__main__":
    serve()

JSON configuration for the PAI-EAS service

{
    "cloud": {
        "computing": {
            "instances": [
                {
                    "type": "ecs.c7.large"
                }
            ]
        }
    },
    "containers": [
        {
            "image": "eas-registry-vpc.cn-hangzhou.cr.aliyuncs.com/pai-eas/python-inference:py39-ubuntu2004",
            "port": 50051,
            "prepare": {
                "pythonRequirements": [
                    "grpcio",
                    "grpcio-tools"
                ]
            },
            "script": "python /mnt/data/server.py"
        }
    ],
    "metadata": {
        "cpu": 2,
        "disk": "30Gi",
        "enable_grpc": true,
        "instance": 1,
        "memory": 4000,
        "name": "grpc_test",
        "rpc": {
            "keepalive": 5000
        },
        "workspace_id": "your-workspace-id"
    },
    "storage": [
        {
            "mount_path": "/mnt/data/",
            "oss": {
                "path": "oss://my-oss/eas/my-grpc-demo/",
                "readOnly": false
            }
        }
    ]
}
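
Before deploying, it can help to sanity-check the configuration against server.py. The sketch below rests on two assumptions inferred from this sample, not on documented PAI-EAS rules: that metadata.enable_grpc must be true for gRPC access, and that a container port must match the port the server listens on (50051 here). The function name check_grpc_config is hypothetical.

```python
import json


def check_grpc_config(config: dict, server_port: int) -> list:
    """Return a list of likely problems in a service config (hypothetical helper)."""
    problems = []
    # Assumption: gRPC access requires metadata.enable_grpc to be true.
    if config.get("metadata", {}).get("enable_grpc") is not True:
        problems.append("metadata.enable_grpc is not true")
    # Assumption: one container must expose the port that server.py listens on.
    ports = [c.get("port") for c in config.get("containers", [])]
    if server_port not in ports:
        problems.append(f"no container exposes port {server_port} (found {ports})")
    return problems


if __name__ == "__main__":
    sample = json.loads('{"containers": [{"port": 50051}], "metadata": {"enable_grpc": true}}')
    print(check_grpc_config(sample, 50051))
```

An empty list means that neither of the two assumed conditions is violated; it does not guarantee that the deployment succeeds.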