Deploy an Agent2Agent (A2A) protocol server in Knative - Container Service for Kubernetes

The Agent2Agent (A2A) protocol is an open standard designed to enable seamless communication and collaboration between AI agents. By deploying an A2A server in Knative, you can leverage its features, such as auto-scaling (including scaling to zero), to achieve on-demand resource consumption and rapid version iteration.

How it works

AI agents possess inference, planning, and memory capabilities, allowing them to learn autonomously and complete tasks on behalf of users. Just as the Model Context Protocol (MCP) provides a standard for Large Language Models (LLMs) to access data and tools, the A2A protocol defines a standardized framework for interoperability between agents.

Deploying an A2A server in Knative involves the following core interactions:

Discovery: After deployment, the Service exposes its interface through a standard Agent Card. This allows other agents to query its skills (AgentSkill) and capabilities (AgentCapabilities).
Communication: Agents communicate through standard HTTP/gRPC message exchanges, which are handled by Knative's gateway and service routing.
Collaboration: Agents delegate tasks and coordinate actions through APIs.

See the A2A specification to learn about the protocol architecture for agent communication and its core concepts.

Prerequisites

You have deployed Knative in your cluster.
You have the gateway address.
The gateway address is shown on the Components or Services tab of the Knative page. The following figure shows an example.

Step 1: Deploy the A2A server

This example deploys a basic agent Service named helloworld-agent-server.

Create an a2a-service.yaml file.

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-agent-server
  # Modify the namespace as needed
  namespace: default 
  annotations:
    # Use a wildcard domain name for quick verification
    knative.aliyun.com/serving-ingress: / 
spec:
  template:
    spec:
      containers:
      # Replace {region} with the actual region, such as cn-hangzhou
      - image: registry-{region}-vpc.ack.aliyuncs.com/acs/knative-samples-a2a:v1.0-952c112
        name: user-container
        env:
        # INVOKE defines the callback URL returned in the Agent Card. Replace <YOUR_GATEWAY_ADDRESS> with the gateway address of the Service.
        - name: INVOKE
          value: http://<YOUR_GATEWAY_ADDRESS>/invoke
        ports:
        - containerPort: 9001
          name: http1
          protocol: TCP

Deploy the Service.
```
kubectl apply -f a2a-service.yaml
```

Check the status of the Knative Service.

kubectl get ksvc helloworld-agent-server

Expected output:

NAME                      URL                                                  LATESTCREATED                   LATESTREADY                     READY   REASON
helloworld-agent-server   http://helloworld-agent-server.default.example.com   helloworld-agent-server-00001   helloworld-agent-server-00001   True

Step 2: Verify the Service and get the Agent Card

After deployment, verify that the Service correctly returns an Agent Card that complies with the A2A protocol.

Obtain the gateway address and the default domain name of the Service.
1. On the ACK Clusters page, click the name of the target cluster. In the left navigation pane, choose Applications > Knative.
2. On the Services tab, obtain the Default Domain of the Service.
  The following figure shows the Gateway and Default Domain.

Access the metadata endpoint of the Service.

# Replace <GATEWAY_ADDRESS> with the actual gateway address.
curl http://<GATEWAY_ADDRESS>/.well-known/agent-card.json | jq .

The output should include the agent's capabilities, description, and a list of skills.

{
  "capabilities": {
    "streaming": true
  },
  "defaultInputModes": [
    "text"
  ],
  "defaultOutputModes": [
    "text"
  ],
  "description": "Just a hello world agent",
  "name": "Hello World Agent",
  "preferredTransport": "JSONRPC",
  "protocolVersion": "",
  "skills": [
    {
      "description": "Returns a 'Hello, world!'",
      "examples": [
        "hi",
        "hello"
      ],
      "id": "hello_world",
      "name": "Hello, world!",
      "tags": [
        "hello world"
      ]
    }
  ],
  "url": "http://XXX/invoke",
  "version": ""
}
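
The same check can be scripted. The following Go sketch decodes an Agent Card and applies a few sanity checks. It mirrors only the fields shown in the sample output above; the type `agentCard`, the helpers `cardURL` and `checkCard`, the specific checks, and the `gateway.example.invalid` host are assumptions made for illustration and are not part of the A2A SDK.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// agentCard covers only the fields that appear in the sample output.
type agentCard struct {
	Name               string `json:"name"`
	Description        string `json:"description"`
	URL                string `json:"url"`
	PreferredTransport string `json:"preferredTransport"`
	Capabilities       struct {
		Streaming bool `json:"streaming"`
	} `json:"capabilities"`
	Skills []struct {
		ID   string `json:"id"`
		Name string `json:"name"`
	} `json:"skills"`
}

// cardURL joins a gateway base URL with the well-known Agent Card path.
func cardURL(base string) string {
	return strings.TrimRight(base, "/") + "/.well-known/agent-card.json"
}

// checkCard decodes raw JSON and rejects cards that lack a name, URL, or skills.
// These checks are a choice made for this sketch, not protocol requirements.
func checkCard(raw []byte) (agentCard, error) {
	var c agentCard
	if err := json.Unmarshal(raw, &c); err != nil {
		return c, fmt.Errorf("decode agent card: %w", err)
	}
	if c.Name == "" {
		return c, errors.New("agent card has no name")
	}
	if c.URL == "" {
		return c, errors.New("agent card has no url")
	}
	if len(c.Skills) == 0 {
		return c, errors.New("agent card lists no skills")
	}
	return c, nil
}

func main() {
	// Trimmed copy of the sample card; in practice, fetch it from cardURL(...).
	sample := []byte(`{
	  "capabilities": {"streaming": true},
	  "name": "Hello World Agent",
	  "preferredTransport": "JSONRPC",
	  "skills": [{"id": "hello_world", "name": "Hello, world!"}],
	  "url": "http://XXX/invoke"
	}`)
	card, err := checkCard(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s (%s): %d skill(s), streaming=%t\n",
		card.Name, card.PreferredTransport, len(card.Skills), card.Capabilities.Streaming)
	fmt.Println(cardURL("http://gateway.example.invalid/"))
}
```

If the `url` field does not point at your gateway, recheck the `INVOKE` environment variable set in Step 1.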

Step 3: Call the Service using an A2A Client

Use Go to write client code that simulates another agent communicating with the deployed A2A server.

Prepare the development environment by installing Go.

Create a main.go file and add the following code to it.

package main

import (
	"context"
	"flag"
	"log"

	// Import the core libraries related to the A2A protocol
	"github.com/a2aproject/a2a-go/a2a"
	"github.com/a2aproject/a2a-go/a2aclient"
	"github.com/a2aproject/a2a-go/a2aclient/agentcard"

	// Import the libraries related to gRPC
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

// Replace <GATEWAY_ADDRESS> with the actual gateway address.
var cardURL = flag.String("card-url", "http://<GATEWAY_ADDRESS>", "Base URL of the AgentCard server.")

func main() {
	flag.Parse()
	ctx := context.Background()

	// Service discovery
	card, err := agentcard.DefaultResolver.Resolve(ctx, *cardURL)
	if err != nil {
		log.Fatalf("Failed to resolve an AgentCard: %v", err)
	}

	// Configure the transport layer
	withInsecureGRPC := a2aclient.WithGRPCTransport(grpc.WithTransportCredentials(insecure.NewCredentials()))

	// Create the client
	client, err := a2aclient.NewFromCard(ctx, card, withInsecureGRPC)
	if err != nil {
		log.Fatalf("Failed to create a client: %v", err)
	}

	// Build the message
	msg := a2a.NewMessage(a2a.MessageRoleUser, a2a.TextPart{Text: "Hello, world"})
	resp, err := client.SendMessage(ctx, &a2a.MessageSendParams{Message: msg})
	if err != nil {
		log.Fatalf("Failed to send a message: %v", err)
	}

	log.Printf("Server responded with: %+v", resp)
}

Run the code to test it.

go mod init a2a-demo
go mod tidy
go run main.go

The expected output is as follows. This output indicates that the client successfully connected to the Knative Service, and the server processed the request and returned a response.

2025/11/27 17:24:21 Server responded with: &{ID:019ac4a0-c386-7cdc-9aad-d40fb8f98ae2 ContextID: Extensions:[] Metadata:map[] Parts:[{Text:Hello, world! Metadata:map[]}] ReferenceTasks:[] Role:agent TaskID:}

Apply in production

Custom domain names and HTTPS: Avoid using test domain names in a production environment. Instead, configure a custom domain name and add an HTTPS certificate to secure communication between agents.
Cold start optimization: If the agent is called infrequently, Knative scales the Service down to zero instances, and the first request after that incurs cold start latency, which can affect user experience. To avoid this, configure a minimum number of instances (MinScale) or a reserved instance.
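
As a minimal sketch of the minimum-instance setting, the fragment below adds the standard Knative autoscaling annotation `autoscaling.knative.dev/min-scale` to the revision template of the Service from Step 1. The value `"1"` is only an example, and this fragment assumes the default Knative autoscaler; adjust it to your traffic pattern and confirm the behavior in your cluster.

```
spec:
  template:
    metadata:
      annotations:
        # Keep at least one instance running so that the first request does not wait for a cold start
        autoscaling.knative.dev/min-scale: "1"
```

Keeping an instance running trades the scale-to-zero savings for lower first-request latency, so it suits agents that are called often enough to justify the standing cost.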

Billing

There is no additional cost for Knative itself. However, you are responsible for the costs of any underlying cloud resources that are provisioned while using Knative, such as compute resources (Elastic Compute Service instances) and network resources (Application Load Balancer). These resources are billed separately by each respective cloud service. For detailed pricing information, see Cloud resource fee.