The Agent2Agent (A2A) protocol is an open standard designed to enable seamless communication and collaboration between AI agents. By deploying an A2A server in Knative, you can leverage its features, such as auto-scaling (including scaling to zero), to achieve on-demand resource consumption and rapid version iteration.
How it works
AI agents possess inference, planning, and memory capabilities, allowing them to learn autonomously and complete tasks on behalf of users. Just as the Model Context Protocol (MCP) provides a standard for Large Language Models (LLMs) to access data and tools, the A2A protocol defines a standardized framework for interoperability between agents.
Deploying an A2A server in Knative involves the following core interactions:
Discovery: After deployment, the Service exposes its interface through a standard Agent Card. This allows other agents to query its skills (AgentSkill) and capabilities (AgentCapabilities).
Communication: Agents communicate through standard HTTP/gRPC message exchanges, which are handled by Knative's gateway and service routing.
Collaboration: Agents delegate tasks and coordinate actions through APIs.
See the A2A specification to learn about the protocol architecture for agent communication and its core concepts.
Prerequisites
You have deployed Knative in your cluster.
You have the gateway address.
You can find the gateway address on the Components or Services tab on the Knative page. The following figure shows an example.

Step 1: Deploy the A2A server
This example deploys a basic agent Service named helloworld-agent-server.
Create an a2a-service.yaml file.

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: helloworld-agent-server
      # Modify the namespace as needed
      namespace: default
      annotations:
        # Use a wildcard domain name for quick verification
        knative.aliyun.com/serving-ingress: /
    spec:
      template:
        spec:
          containers:
          # Replace {region} with the actual region, such as cn-hangzhou
          - image: registry-{region}-vpc.ack.aliyuncs.com/acs/knative-samples-a2a:v1.0-952c112
            name: user-container
            env:
            # INVOKE defines the callback URL returned in the Agent Card. Replace it with the gateway access address of the service.
            - name: INVOKE
              value: http://<YOUR_GATEWAY_ADDRESS>/invoke
            ports:
            - containerPort: 9001
              name: http1
              protocol: TCP

Deploy the Service.

    kubectl apply -f a2a-service.yaml

Check the status of the Knative Service.

    kubectl get ksvc helloworld-agent-server

Expected output:

    NAME                      URL                                                  LATESTCREATED                   LATESTREADY                     READY   REASON
    helloworld-agent-server   http://helloworld-agent-server.default.example.com   helloworld-agent-server-00001   helloworld-agent-server-00001   True
Step 2: Verify the Service and get the Agent Card
After deployment, verify that the Service correctly returns an Agent Card that complies with the A2A protocol.
Obtain the gateway address and the default domain name of the Service.
On the ACK Clusters page, click the name of the target cluster. In the left navigation pane, choose .
On the Services tab, obtain the Default Domain of the Service.
The following figure shows the Gateway and Default Domain.

Access the metadata endpoint of the Service.

    # Replace <GATEWAY_ADDRESS> with the actual gateway address.
    curl http://<GATEWAY_ADDRESS>/.well-known/agent-card.json | jq .

The output should include the agent's capabilities, description, and a list of skills.

    {
      "capabilities": {
        "streaming": true
      },
      "defaultInputModes": [
        "text"
      ],
      "defaultOutputModes": [
        "text"
      ],
      "description": "Just a hello world agent",
      "name": "Hello World Agent",
      "preferredTransport": "JSONRPC",
      "protocolVersion": "",
      "skills": [
        {
          "description": "Returns a 'Hello, world!'",
          "examples": [
            "hi",
            "hello"
          ],
          "id": "hello_world",
          "name": "Hello, world!",
          "tags": [
            "hello world"
          ]
        }
      ],
      "url": "http://XXX/invoke",
      "version": ""
    }
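If you want to check the Agent Card programmatically instead of reading the jq output, the following is a minimal sketch that decodes the sample output above with the Go standard library only. The struct types and the `parseAgentCard` helper are hypothetical and cover only the fields that appear in the sample; the A2A Go SDK ships its own types, which this sketch does not use.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// AgentSkill mirrors the skill fields shown in the sample output.
type AgentSkill struct {
	ID          string   `json:"id"`
	Name        string   `json:"name"`
	Description string   `json:"description"`
	Examples    []string `json:"examples"`
	Tags        []string `json:"tags"`
}

// AgentCard holds only the subset of fields present in the sample output.
type AgentCard struct {
	Name               string `json:"name"`
	Description        string `json:"description"`
	URL                string `json:"url"`
	PreferredTransport string `json:"preferredTransport"`
	Capabilities       struct {
		Streaming bool `json:"streaming"`
	} `json:"capabilities"`
	Skills []AgentSkill `json:"skills"`
}

// parseAgentCard decodes an Agent Card document.
// Hypothetical helper for illustration only.
func parseAgentCard(data []byte) (*AgentCard, error) {
	var card AgentCard
	if err := json.Unmarshal(data, &card); err != nil {
		return nil, fmt.Errorf("decode agent card: %w", err)
	}
	return &card, nil
}

// sampleCard is a trimmed copy of the sample output from this step.
const sampleCard = `{
  "capabilities": {"streaming": true},
  "description": "Just a hello world agent",
  "name": "Hello World Agent",
  "preferredTransport": "JSONRPC",
  "skills": [
    {
      "description": "Returns a 'Hello, world!'",
      "examples": ["hi", "hello"],
      "id": "hello_world",
      "name": "Hello, world!",
      "tags": ["hello world"]
    }
  ],
  "url": "http://XXX/invoke"
}`

func main() {
	card, err := parseAgentCard([]byte(sampleCard))
	if err != nil {
		log.Fatalf("Failed to parse the Agent Card: %v", err)
	}
	fmt.Printf("%s (streaming=%t, transport=%s)\n",
		card.Name, card.Capabilities.Streaming, card.PreferredTransport)
	// List every advertised skill with its description.
	for _, skill := range card.Skills {
		fmt.Printf("- %s: %s\n", skill.ID, skill.Description)
	}
}
```

To run the same check against a live Service, replace `sampleCard` with the body returned by the `/.well-known/agent-card.json` endpoint.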
Step 3: Call the Service using an A2A Client
Use Go to write client code that simulates another agent communicating with the deployed A2A server.
Prepare the development environment. Install the Go language environment.
Create a main.go file and add the following code to it.

    package main

    import (
        "context"
        "flag"
        "log"

        // Import the core libraries related to the A2A protocol
        "github.com/a2aproject/a2a-go/a2a"
        "github.com/a2aproject/a2a-go/a2aclient"
        "github.com/a2aproject/a2a-go/a2aclient/agentcard"

        // Import the libraries related to gRPC
        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials/insecure"
    )

    // Replace <GATEWAY_ADDRESS> with the actual gateway address.
    var cardURL = flag.String("card-url", "http://<GATEWAY_ADDRESS>", "Base URL of AgentCard client.")

    func main() {
        flag.Parse()
        ctx := context.Background()

        // Service discovery
        card, err := agentcard.DefaultResolver.Resolve(ctx, *cardURL)
        if err != nil {
            log.Fatalf("Failed to resolve an AgentCard: %v", err)
        }

        // Configure the transport layer
        withInsecureGRPC := a2aclient.WithGRPCTransport(grpc.WithTransportCredentials(insecure.NewCredentials()))

        // Create the client
        client, err := a2aclient.NewFromCard(ctx, card, withInsecureGRPC)
        if err != nil {
            log.Fatalf("Failed to create a client: %v", err)
        }

        // Build the message
        msg := a2a.NewMessage(a2a.MessageRoleUser, a2a.TextPart{Text: "Hello, world"})
        resp, err := client.SendMessage(ctx, &a2a.MessageSendParams{Message: msg})
        if err != nil {
            log.Fatalf("Failed to send a message: %v", err)
        }

        log.Printf("Server responded with: %+v", resp)
    }

Run the code to test it.

    go mod init a2a-demo
    go mod tidy
    go run main.go

The expected output is as follows. This output indicates that the client successfully connected to the Knative Service, and the server processed the request and returned a response.

    2025/11/27 17:24:21 Server responded with: &{ID:019ac4a0-c386-7cdc-9aad-d40fb8f98ae2 ContextID: Extensions:[] Metadata:map[] Parts:[{Text:Hello, world! Metadata:map[]}] ReferenceTasks:[] Role:agent TaskID:}
Apply in production
Custom domain names and HTTPS: Avoid using test domain names in a production environment. Instead, configure a custom domain name and add an HTTPS certificate to secure communication between agents.
Cold start optimization: If the agent is called infrequently, Knative scales the number of instances down to zero. To avoid cold start latency on the first request, which can affect user experience, configure a minimum number of instances (MinScale) or a reserved instance.
Billing
There is no additional cost for Knative itself. However, you are responsible for the costs of any underlying cloud resources that are provisioned while using Knative, such as compute resources (Elastic Compute Service instances) and network resources (Application Load Balancer). These resources are billed separately by each respective cloud service. For detailed pricing information, see Cloud resource fee.