Deploy an agent application developed with the AgentScope framework to an ACK Knative environment with a single command to leverage Serverless capabilities like auto scaling (including scaling to zero) and revision management for fast, elastic, and low-cost agent hosting.
How it works
The agentscope deploy knative command wraps the complex process of containerizing an agent application and deploying it to a cluster. The core workflow is as follows:
- Packages the application: It packages the specified Python agent application code, its dependencies (requirements.txt), and environment variables.
- Builds the image: It builds a container image that includes the agent application in your local Docker environment, based on a specified base image.
- Pushes the image: It pushes the built image to a specified container image registry, such as Container Registry (ACR).
- Deploys the service: It generates and applies a Knative Service (ksvc) manifest in the target cluster. Knative then uses this manifest to create a Deployment and Pods, and automatically configures network routing, load balancing, and auto scaling policies.
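To make the last step more concrete, the following is a minimal sketch of what a Knative Service manifest can look like, built as a Python dictionary. It is not the manifest that the agentscope CLI actually generates: the function name build_ksvc_manifest, the service name, the container port, and the way the image path is composed from the registry URL, namespace, image name, and tag are all assumptions made for illustration.

```python
import json


def build_ksvc_manifest(name, namespace, image, env=None, port=8080):
    """Return a minimal Knative Service manifest as a plain dict.

    Hypothetical helper: every argument value used below is a placeholder.
    """
    env_list = [{"name": key, "value": value} for key, value in (env or {}).items()]
    return {
        "apiVersion": "serving.knative.dev/v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        # A minimum scale of 0 lets the revision scale to zero.
                        "autoscaling.knative.dev/min-scale": "0",
                    }
                },
                "spec": {
                    "containers": [
                        {
                            "image": image,
                            "env": env_list,
                            # Assumed port; use the port your app listens on.
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            }
        },
    }


manifest = build_ksvc_manifest(
    name="agent-example",  # placeholder service name
    namespace="default",
    # Assumed image path layout: <registry-url>/<registry-namespace>/<image-name>:<tag>
    image="registry.cn-hangzhou.aliyuncs.com/knative-sample/agent_app:linux-amd64-18",
    env={"DASHSCOPE_API_KEY": "sk-xxx"},
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed document could be saved to a file and applied with kubectl apply -f to experiment with the fields by hand.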
Before you begin
- You have deployed Knative components in the cluster (ACK), or you have deployed Knative in the ACS cluster.
- Docker is installed and running on your local machine for building container images.
- You have installed AgentScope Runtime by using the pip command. Quote the package specifier so that the shell does not treat > as an output redirection.

      # Basic installation
      pip install "agentscope-runtime>=1.1.0"
      # Dependencies for Kubernetes deployment
      pip install "agentscope-runtime[ext]>=1.1.0"
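As a quick sanity check of the local prerequisites, a small script along the following lines can report whether the docker executable is on the PATH and which version of the agentscope-runtime package is installed. The function name check_prerequisites is hypothetical, and the check is deliberately shallow: it does not confirm that the Docker daemon is running or that the cluster has Knative installed.

```python
import shutil
from importlib import metadata


def check_prerequisites():
    """Report whether the local tools named in the prerequisites are present."""
    report = {"docker_on_path": shutil.which("docker") is not None}
    try:
        # Distribution name as used in the pip install command above.
        report["agentscope_runtime_version"] = metadata.version("agentscope-runtime")
    except metadata.PackageNotFoundError:
        report["agentscope_runtime_version"] = None
    return report


print(check_prerequisites())
```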
Step 1: Create an agent project
If you do not have an agent application, use the following example file structure and code.
- Create the project directory structure:

      my-agent-project/
      ├── app_agent.py       # Main file for the agent application
      ├── requirements.txt   # Python dependencies (optional)
      └── .env               # Environment variables (optional)

- Write the agent code in the app_agent.py file. The following code creates an agent that uses a Qwen model and supports code execution and multi-turn conversations.

      # -*- coding: utf-8 -*-
      import os
      from contextlib import asynccontextmanager

      from fastapi import FastAPI

      from agentscope.agent import ReActAgent
      from agentscope.formatter import DashScopeChatFormatter
      from agentscope.model import DashScopeChatModel
      from agentscope.pipeline import stream_printing_messages
      from agentscope.tool import Toolkit, execute_python_code
      from agentscope.memory import InMemoryMemory
      from agentscope.session import JSONSession
      from agentscope_runtime.engine.app import AgentApp
      from agentscope_runtime.engine.schemas.agent_schemas import AgentRequest


      @asynccontextmanager
      async def lifespan(app: FastAPI):
          """Initialize the service."""
          app.state.session = JSONSession(
              save_dir="./",  # Directory to save all session files
          )
          try:
              yield
          finally:
              print("AgentApp is shutting down...")


      # Create an AgentApp
      agent_app = AgentApp(
          app_name="MyAssistant",
          app_description="A helpful assistant agent",
          lifespan=lifespan,
      )


      @agent_app.query(framework="agentscope")
      async def query_func(
          self,
          msgs,
          request: AgentRequest = None,
          **kwargs,
      ):
          """Process user queries."""
          session_id = request.session_id
          user_id = request.user_id

          # Create toolkit with Python execution
          toolkit = Toolkit()
          toolkit.register_tool_function(execute_python_code)

          # Create agent
          agent = ReActAgent(
              name="MyAssistant",
              model=DashScopeChatModel(
                  "qwen-turbo",
                  api_key=os.getenv("DASHSCOPE_API_KEY"),
                  enable_thinking=True,
                  stream=True,
              ),
              sys_prompt="You're a helpful assistant.",
              toolkit=toolkit,
              memory=InMemoryMemory(),
              formatter=DashScopeChatFormatter(),
          )
          agent.set_console_output_enabled(False)

          # Restore any saved state for this user and session
          await agent_app.state.session.load_session_state(
              session_id=session_id,
              user_id=user_id,
              agent=agent,
          )

          # Stream messages back to the caller as the agent produces them
          async for msg, last in stream_printing_messages(
              agents=[agent],
              coroutine_task=agent(msgs),
          ):
              yield msg, last

          # Persist the updated state after the reply completes
          await agent_app.state.session.save_session_state(
              session_id=session_id,
              user_id=user_id,
              agent=agent,
          )


      if __name__ == "__main__":
          agent_app.run()
Step 2: Deploy the agent application
Use the agentscope deploy command to deploy your local agent application as a Knative service.
- Run the deployment command.
  Navigate to the my-agent-project directory and run the following command to deploy the service. Replace the value of DASHSCOPE_API_KEY (sk-xxx) with your actual API key.

      agentscope deploy knative app_agent.py \
        --image-name agent_app \
        --env DASHSCOPE_API_KEY=sk-xxx \
        --image-tag linux-amd64-18 \
        --registry-url registry.cn-hangzhou.aliyuncs.com \
        --base-image registry.cn-hangzhou.aliyuncs.com/knative-sample/python:3.10-slim-bookworm \
        --registry-namespace knative-sample \
        --namespace default \
        --push

- View the deployment result.
  After the deployment succeeds, the terminal outputs information such as the service URL. Record this URL to use in the next step.

      Deployment successful!
      Deployment ID: d4b4a54d-9976-443c-a1da-a77643******
      Resource Name: agent-d4b4*****
      URL: http://agent-03e*****.default.example.com
      Namespace: default
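If the deployment runs from a script, the service URL can be pulled out of the captured terminal output instead of being copied by hand. The sketch below assumes the output contains a field labeled URL followed by the address, as in the sample above; the helper name extract_service_host is hypothetical, and the real CLI output may differ between versions.

```python
import re
from urllib.parse import urlparse

# Sample text copied from the deployment output shown above (masked values kept).
SAMPLE_OUTPUT = """\
Deployment successful!
Deployment ID: d4b4a54d-9976-443c-a1da-a77643******
Resource Name: agent-d4b4*****
URL: http://agent-03e*****.default.example.com
Namespace: default
"""


def extract_service_host(output):
    """Return (url, host) parsed from the 'URL:' field, or None if it is absent."""
    match = re.search(r"\bURL:\s*(\S+)", output)
    if match is None:
        return None
    url = match.group(1)
    # The host part is what goes into the Host header in the next step.
    return url, urlparse(url).netloc


print(extract_service_host(SAMPLE_OUTPUT))
```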
Step 3: Access the deployed agent
- Obtain the access gateway.
  - ACK cluster: On the ACK Clusters page, click the name of your cluster. In the left navigation pane, click .
  - ACS cluster:
    - Log on to the ACS console. In the left navigation pane, click Clusters.
    - On the Clusters page, click the name of the target cluster. In the left navigation pane, choose .
  - On the Services or Component management page, obtain the Gateway.
    The IP address of the Access Gateway (for example, 120.xxx.159) is at the bottom of the page. Before accessing the service, you must resolve the service domain to this IP address.
- Send a request to the agent service using the curl command. Replace 115.29.xxx.xxx with the access gateway IP address, and replace the Host value with the domain name from the service URL you obtained earlier (without the http:// prefix).

      curl -i -X POST "http://115.29.xxx.xxx:80/process" \
        -H "Content-Type: application/json" \
        -H "Host: agent-03e*****.default.example.com" \
        -d '{
          "input": [
            {
              "role": "user",
              "content": [
                { "type": "text", "text": "Hello, how are you?" }
              ]
            }
          ],
          "session_id": "123"
        }'

- View the result. The response streams the agent's thinking process and the final output.
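The same request can be sent from Python with only the standard library. The sketch below mirrors the curl example: it posts to the /process path on the gateway IP address and sets the Host header to the service domain. The function names build_agent_request and stream_lines, the gateway IP address, and the host name are placeholders introduced here; the framing of the streamed response lines is not specified in this document, so the reader function simply yields non-empty lines as text.

```python
import json
import urllib.request


def build_agent_request(gateway_ip, service_host, text, session_id):
    """Build a POST request for the /process endpoint used in the curl example."""
    payload = {
        "input": [
            {"role": "user", "content": [{"type": "text", "text": text}]}
        ],
        "session_id": session_id,
    }
    return urllib.request.Request(
        url=f"http://{gateway_ip}:80/process",
        data=json.dumps(payload).encode("utf-8"),
        # The Host header tells the gateway which Knative service to route to.
        headers={"Content-Type": "application/json", "Host": service_host},
        method="POST",
    )


def stream_lines(request, timeout=60):
    """Yield non-empty response lines as they arrive (performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        for raw_line in response:
            line = raw_line.decode("utf-8").rstrip("\n")
            if line:
                yield line


# Placeholder values; substitute the gateway IP and host from the previous steps.
req = build_agent_request(
    "203.0.113.10", "agent-example.default.example.com", "Hello, how are you?", "123"
)
print(req.get_method(), req.full_url, req.get_header("Host"))
```

To actually call the service, iterate over stream_lines(req) and print each line. The request is built separately from the network call so that the payload and headers can be inspected before anything is sent.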
Step 4: Observe auto scaling
- Scale-down: After a period of inactivity, Knative automatically scales the service's Pods down to zero to save resources. You can observe the changes in the Pods by running the following command:

      # Continuously watch the Pods in the specified namespace
      kubectl get pods -n default -w

  Alternatively, on the Knative page, click Services, and then click the service name. You can view the running status of the Pods for the current revision at the bottom of the page.
  On the Revision information tab, the Ready/requested Pods column for the current revision shows 0/0, which confirms that the Pods have scaled to zero.
- Scale-up: When a new request arrives, Knative automatically starts new Pods within seconds to handle the request.
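The scaling behavior described above can be approximated with a short calculation. The following is a simplified model of a concurrency-based autoscaler, not Knative's actual algorithm: it ignores averaging windows, panic mode, scale rate limits, and the delay before scaling to zero. The function name desired_pods is hypothetical, and the default target of 100 concurrent requests per Pod at 70% utilization is an assumption; check the config-autoscaler ConfigMap in your cluster for the values in effect.

```python
import math


def desired_pods(observed_concurrency, target_per_pod=100.0,
                 target_utilization=0.7, min_scale=0, max_scale=None):
    """Simplified estimate of the Pod count for a given number of in-flight requests."""
    if observed_concurrency <= 0:
        pods = 0  # No in-flight requests: the revision is eligible to scale to zero.
    else:
        # Each Pod is expected to carry only a fraction of its nominal target.
        effective_target = target_per_pod * target_utilization
        pods = math.ceil(observed_concurrency / effective_target)
    pods = max(pods, min_scale)
    if max_scale is not None:
        pods = min(pods, max_scale)
    return pods


for load in (0, 1, 71, 500):
    print(load, desired_pods(load))
```

In this model, a single request is enough to bring the Pod count from zero to one, which corresponds to the scale-up case, and setting min_scale to 1 would keep one Pod running to avoid cold starts at the cost of idle resources.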
Production recommendations
- Persist state: The example keeps conversation memory in InMemoryMemory and writes session files to the Pod's local file system through JSONSession. Like other in-memory implementations such as InMemoryStateService and InMemorySessionHistoryService, these lose all state and conversation history when a Pod restarts or is scaled to zero, which makes them unsuitable for production. For production deployments, switch to an implementation backed by Redis or another persistent storage solution.
- Manage secrets: Do not pass sensitive information like API_KEY directly on the command line or hard-code it in your application. Use Kubernetes Secrets and mount them as environment variables in the container through the Knative Service configuration.
- Plan resources: Based on your expected workload, stress-test and configure parameters such as --cpu-request, --cpu-limit, --memory-request, and --memory-limit to ensure service performance and stability.
- Improve observability: Configure log collection (such as Log Service) and monitoring with alerts (such as Application Real-Time Monitoring Service (ARMS)) to troubleshoot issues and monitor service health.
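As an illustration of the secret and resource recommendations, the sketch below builds the container section of a Knative Service template with an environment variable sourced from a Kubernetes Secret and explicit resource requests and limits. The helper name build_container_spec, the Secret name and key, the image path, and the resource values are all placeholders chosen for this example; size the values from your own stress tests.

```python
import json


def build_container_spec(image, env_name, secret_name, secret_key,
                         cpu_request, cpu_limit, memory_request, memory_limit):
    """Return a container spec fragment for a Knative Service template."""
    return {
        "image": image,
        "env": [
            {
                "name": env_name,
                # The value is read from a Kubernetes Secret when the Pod starts,
                # so it does not appear on the command line or in the manifest.
                "valueFrom": {
                    "secretKeyRef": {"name": secret_name, "key": secret_key}
                },
            }
        ],
        "resources": {
            "requests": {"cpu": cpu_request, "memory": memory_request},
            "limits": {"cpu": cpu_limit, "memory": memory_limit},
        },
    }


container = build_container_spec(
    image="registry.cn-hangzhou.aliyuncs.com/knative-sample/agent_app:linux-amd64-18",
    env_name="DASHSCOPE_API_KEY",
    secret_name="agent-secrets",   # hypothetical Secret name
    secret_key="dashscope-api-key",  # hypothetical key inside the Secret
    cpu_request="500m",
    cpu_limit="1",
    memory_request="512Mi",
    memory_limit="1Gi",
)
print(json.dumps(container, indent=2))
```

The Secret itself has to exist in the same namespace as the service before the Pods start; otherwise the container cannot be created.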
Related operations
- Update the agent: After modifying the code, rerun the agentscope deploy knative command with a new image tag (for example, --image-tag v1.1) to perform a rolling update.
- Uninstall the agent:

      # Replace <resource-name> and <namespace> with your actual values
      kubectl delete ksvc <resource-name> -n <namespace>

  This operation only deletes the service in Kubernetes. It does not delete the container image that was pushed to the container registry.