Container Service for Kubernetes: Deploy an agent application as a Knative service

Last Updated: Jun 18, 2026

Use a single command to deploy an agent application built with the AgentScope framework to an ACK Knative environment. The deployed service gets serverless capabilities such as auto scaling (including scale to zero) and revision management, which makes agent hosting fast, elastic, and low-cost.

How it works

The agentscope deploy knative command wraps the complex process of containerizing an agent application and deploying it to a cluster. The core workflow is as follows:

  1. Packages the application: It packages the specified Python agent application code, its dependencies (requirements.txt), and environment variables.

  2. Builds the image: It builds a container image that includes the agent application in your local Docker environment, based on a specified base image.

  3. Pushes the image: It pushes the built image to a specified container image registry, such as Container Registry (ACR).

  4. Deploys the service: It generates and applies a Knative Service (ksvc) manifest in the target cluster. Knative then uses this manifest to create a Deployment and Pods, and automatically configures network routing, load balancing, and auto scaling policies.
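To make step 4 more concrete, the following is a minimal sketch of the kind of Knative Service manifest involved, expressed as a Python dictionary. The function name build_ksvc_manifest, its parameters, and the sample values are hypothetical. The manifest that agentscope deploy knative actually generates may contain different or additional fields; only the apiVersion, kind, and general spec layout are taken from the Knative Serving API.

```python
import json


def build_ksvc_manifest(name, image, namespace="default", port=8080, env=None):
    """Return a minimal Knative Service manifest as a dict (illustrative only)."""
    container = {
        "image": image,
        "ports": [{"containerPort": port}],
        # Environment variables are expressed as a list of name/value pairs.
        "env": [{"name": k, "value": v} for k, v in sorted((env or {}).items())],
    }
    return {
        "apiVersion": "serving.knative.dev/v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"template": {"spec": {"containers": [container]}}},
    }


if __name__ == "__main__":
    manifest = build_ksvc_manifest(
        name="my-agent",  # hypothetical service name
        image="registry.example.com/demo/agent_app:v1",  # hypothetical image
        env={"LOG_LEVEL": "info"},
    )
    print(json.dumps(manifest, indent=2))
```

Knative creates a new revision whenever the spec.template section of such a manifest changes, which is what makes revision management possible.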

Before you begin

  • You have deployed Knative in your ACK or ACS cluster.

  • Docker is installed and running on your local machine for building container images.

  • You have installed AgentScope Runtime by using the pip command.

    # Basic installation
    pip install "agentscope-runtime>=1.1.0"
    # Dependencies for Kubernetes deployment
    pip install "agentscope-runtime[ext]>=1.1.0"
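As an optional sanity check after installation, the following sketch compares the installed agentscope-runtime version against the 1.1.0 minimum stated above. The helper names parse_version and meets_minimum are hypothetical, and the comparison is deliberately simplified: it looks only at the numeric parts of the version and ignores pre-release suffixes.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn a dotted version such as '1.1.0' into a 3-tuple of ints."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad so that '1.1' compares equal to '1.1.0'.
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def meets_minimum(installed, minimum="1.1.0"):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    try:
        installed = metadata.version("agentscope-runtime")
    except metadata.PackageNotFoundError:
        print("agentscope-runtime is not installed")
    else:
        status = "OK" if meets_minimum(installed) else "older than 1.1.0"
        print(f"agentscope-runtime {installed}: {status}")
```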

Step 1: Create an agent project

If you do not have an agent application, use the following example file structure and code.

  1. Create the project directory structure:

    my-agent-project/
    ├── app_agent.py          # Main file for the agent application
    ├── requirements.txt      # Python dependencies (optional)
    └── .env                  # Environment variables (optional)

  2. Write the agent code in the app_agent.py file.

    The following code creates an agent that uses a Qwen model and supports code execution and multi-turn conversations.

    # -*- coding: utf-8 -*-
    import os
    from contextlib import asynccontextmanager
    from fastapi import FastAPI
    from agentscope.agent import ReActAgent
    from agentscope.formatter import DashScopeChatFormatter
    from agentscope.model import DashScopeChatModel
    from agentscope.pipeline import stream_printing_messages
    from agentscope.tool import Toolkit, execute_python_code
    from agentscope.memory import InMemoryMemory
    from agentscope.session import JSONSession
    from agentscope_runtime.engine.app import AgentApp
    from agentscope_runtime.engine.schemas.agent_schemas import AgentRequest


    @asynccontextmanager
    async def lifespan(app: FastAPI):
        """Initialize the service."""
        app.state.session = JSONSession(
            save_dir="./",  # Directory to save all session files
        )
        try:
            yield
        finally:
            print("AgentApp is shutting down...")


    # Create an AgentApp
    agent_app = AgentApp(
        app_name="MyAssistant",
        app_description="A helpful assistant agent",
        lifespan=lifespan,
    )


    @agent_app.query(framework="agentscope")
    async def query_func(
        self,
        msgs,
        request: AgentRequest = None,
        **kwargs,
    ):
        """Process user queries."""
        session_id = request.session_id
        user_id = request.user_id
        # Create toolkit with Python execution
        toolkit = Toolkit()
        toolkit.register_tool_function(execute_python_code)
        # Create agent
        agent = ReActAgent(
            name="MyAssistant",
            model=DashScopeChatModel(
                "qwen-turbo",
                api_key=os.getenv("DASHSCOPE_API_KEY"),
                enable_thinking=True,
                stream=True,
            ),
            sys_prompt="You're a helpful assistant.",
            toolkit=toolkit,
            memory=InMemoryMemory(),
            formatter=DashScopeChatFormatter(),
        )
        agent.set_console_output_enabled(False)
        await agent_app.state.session.load_session_state(
            session_id=session_id,
            user_id=user_id,
            agent=agent,
        )
        async for msg, last in stream_printing_messages(
            agents=[agent],
            coroutine_task=agent(msgs),
        ):
            yield msg, last
        await agent_app.state.session.save_session_state(
            session_id=session_id,
            user_id=user_id,
            agent=agent,
        )


    if __name__ == "__main__":
        agent_app.run()

Step 2: Deploy the agent application

Use the agentscope deploy knative command to deploy your local agent application as a Knative service.

  1. Run the deployment command.

    Navigate to the my-agent-project directory and run the following command to deploy the service.

    Replace sk-xxx, the value of DASHSCOPE_API_KEY, with your actual API key.

    agentscope deploy knative app_agent.py \
      --image-name agent_app \
      --env DASHSCOPE_API_KEY=sk-xxx \
      --image-tag linux-amd64-18 \
      --registry-url registry.cn-hangzhou.aliyuncs.com \
      --base-image registry.cn-hangzhou.aliyuncs.com/knative-sample/python:3.10-slim-bookworm \
      --registry-namespace knative-sample \
      --namespace default \
      --push

    Command syntax

    agentscope deploy knative SOURCE [OPTIONS]

    • SOURCE: The path to the Python file, such as app_agent.py.

    • [OPTIONS]: Common parameters are described below.

      For more parameters, run agentscope deploy knative --help.

      | Parameter | Type | Default | Description |
      | --- | --- | --- | --- |
      | --namespace | string | agentscope-runtime | The namespace where you deploy the agent. |
      | --kube-config-path, -c | path | None | The path to the KubeConfig file used to connect to the cluster. If you do not specify this parameter, the command uses the default path. |
      | --port | integer | 8080 | The port that the agent application listens on inside the container. |
      | --image-name | string | agent_app | The name of the container image for the agent application. |
      | --image-tag | string | linux-amd64 | The tag of the container image for the agent application. |
      | --registry-url | string | localhost | The URL of the target container registry to push the image to, for example, registry.cn-hangzhou.aliyuncs.com. |
      | --registry-namespace | string | agentscope-runtime | The namespace (or project) used in the container registry. |
      | --push | flag | False | Add this flag to push the built image to a remote registry. This is typically required for deployments to a remote cluster. |
      | --base-image | string | python:3.10-slim-bookworm | The base image used to build the application. It must include a Python environment. |
      | --requirements | string | None | The Python dependencies for the application. This can be a path to a requirements.txt file or a comma-separated list of packages. |
      | --cpu-request | string | 200m | The CPU resource request for the Pod, in millicores (m) or whole cores, for example, 200m or 1. |
      | --cpu-limit | string | 1000m | The CPU resource limit for the Pod, for example, 1000m or 2. |
      | --memory-request | string | 512Mi | The memory resource request for the Pod. Common units are Mi and Gi, for example, 512Mi or 1Gi. |
      | --memory-limit | string | 2Gi | The memory resource limit for the Pod, for example, 2Gi or 4Gi. |
      | --image-pull-policy | choice | IfNotPresent | The image pull policy for the Pod. Valid values are Always, IfNotPresent, and Never. |
      | --deploy-timeout | integer | 300 | The time in seconds to wait for the Knative service to become ready. |
      | --health-check | flag | None | Add this flag to enable a health check for the Knative service. |
      | --platform | string | linux/amd64 | The target hardware platform for building the image, for example, linux/amd64 or linux/arm64. |
      | --pypi-mirror | string | None | The PyPI mirror to use when installing Python packages, for example, https://pypi.tuna.tsinghua.edu.cn/simple. |

  2. View the deployment result.

    After the deployment succeeds, the terminal outputs information such as the service URL. Record this URL to use in the next step.

    Deployment successful!
    Deployment ID: d4b4a54d-9976-443c-a1da-a77643******
    Resource Name: agent-d4b4*****
    URL: http://agent-03e*****.default.example.com
    Namespace: default
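If you script the deployment, you may want to extract the service URL from the terminal output instead of copying it by hand. The following sketch assumes the summary is printed as "Key: value" lines exactly as in the sample above; the real output format may differ between versions. The helper names parse_deploy_output and host_header, and the sample host name, are hypothetical.

```python
from urllib.parse import urlsplit


def parse_deploy_output(text):
    """Collect 'Key: value' lines from the deployment summary into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so URLs such as http://... stay intact.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def host_header(fields):
    """Return the host name part of the service URL, for use as a Host header."""
    return urlsplit(fields["URL"]).hostname


if __name__ == "__main__":
    sample = "\n".join(
        [
            "Deployment successful!",
            "Resource Name: agent-demo",  # hypothetical value
            "URL: http://agent-demo.default.example.com",  # hypothetical value
            "Namespace: default",
        ]
    )
    fields = parse_deploy_output(sample)
    print(fields["URL"])
    print(host_header(fields))
```

The host name returned by host_header is the value that the Host header needs when you send requests through the gateway IP address.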

Step 3: Access the deployed agent

  1. Obtain the access gateway.

    1. Open the Knative page for your cluster.

      • ACK cluster: On the ACK Clusters page, click the name of your cluster. In the left navigation pane, choose Applications > Knative.

      • ACS cluster: Log on to the ACS console. In the left navigation pane, click Clusters. On the Clusters page, click the name of the target cluster. In the left navigation pane, choose Applications > Knative.

    2. On the Services or Component management page, obtain the Gateway.

      The IP address of the access gateway (for example, 120.xxx.159) is displayed at the bottom of the page. To access the service by its domain name, you must first resolve the service domain to this IP address. Alternatively, send requests to the IP address and pass the service domain in the Host header.

  2. Send a request to the agent service using the curl command.

    Replace 115.29.xxx.xxx with the access gateway IP address, and replace the Host value with the host name from the service URL that you recorded in Step 2 (the URL without the http:// prefix).

    curl -i -X POST "http://115.29.xxx.xxx:80/process" \
      -H "Content-Type: application/json" \
      -H "Host: agent-03e*****.default.example.com" \
      -d '{
        "input": [
          {
            "role": "user",
            "content": [
              {
                "type": "text",
                "text": "Hello, how are you?"
              }
            ]
          }
        ],
        "session_id": "123"
      }'

  3. View the result. The response streams the agent's thinking process and the final output.
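The same request can be sent from Python. The following sketch mirrors the curl command above using only the standard library. The function names build_agent_request and stream_response are hypothetical, and the code assumes the response arrives as text lines, which may not match the exact streaming format of your service.

```python
import json
import urllib.request


def build_agent_request(gateway_ip, host, text, session_id="123"):
    """Build a POST request for the /process endpoint, routed by Host header."""
    payload = {
        "input": [{"role": "user", "content": [{"type": "text", "text": text}]}],
        "session_id": session_id,
    }
    return urllib.request.Request(
        url=f"http://{gateway_ip}:80/process",
        data=json.dumps(payload).encode("utf-8"),
        # The Host header tells the gateway which Knative service to route to.
        headers={"Content-Type": "application/json", "Host": host},
        method="POST",
    )


def stream_response(request, timeout=60):
    """Yield non-empty response lines as they arrive."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        for raw_line in response:
            line = raw_line.decode("utf-8").rstrip()
            if line:
                yield line


if __name__ == "__main__":
    # Replace both placeholders with your gateway IP address and service host name.
    request = build_agent_request(
        "<gateway-ip>", "<service-host>", "Hello, how are you?"
    )
    for line in stream_response(request):
        print(line)
```

Reusing the same session_id across requests lets the example agent load the saved session state and continue the conversation.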

Step 4: Observe auto scaling

  • Scale-down: After a period of inactivity, Knative automatically scales the service's Pods down to zero to save resources. You can observe the changes in the Pods by running the following command:

    # Continuously watch the Pods in the specified namespace
    kubectl get pods -n default -w

    Alternatively, on the Knative page, click Services, and then click the service name. You can view the running status of the Pods for the current revision at the bottom of the page.

    On the Revision information tab, the Ready/requested Pods column for the current revision shows 0/0, which confirms that the Pods have scaled to zero.

  • Scale-up: When a new request arrives, Knative automatically starts new Pods within seconds to handle the request.
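To observe scaling from a script instead of watching the terminal, you can count the Pods that report a Ready condition. The following sketch shells out to kubectl get pods -o json and inspects the standard Pod status fields. The function names are hypothetical, and the optional label selector relies on the serving.knative.dev/service label that Knative Serving sets on its Pods.

```python
import json
import subprocess


def count_ready_pods(pod_list):
    """Count Pods whose 'Ready' condition has status 'True'."""
    ready = 0
    for pod in pod_list.get("items", []):
        conditions = pod.get("status", {}).get("conditions", [])
        if any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        ):
            ready += 1
    return ready


def fetch_pod_list(namespace="default", service=None):
    """Return the parsed JSON output of 'kubectl get pods' for a namespace."""
    command = ["kubectl", "get", "pods", "-n", namespace, "-o", "json"]
    if service:
        # Restrict the query to the Pods of one Knative service.
        command += ["-l", f"serving.knative.dev/service={service}"]
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return json.loads(result.stdout)


if __name__ == "__main__":
    print("Ready Pods:", count_ready_pods(fetch_pod_list("default")))
```

Running the script once after a period of inactivity and again right after sending a request should show the count move from zero to at least one.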

Production recommendations

  • Persist state: The example keeps conversation memory in InMemoryMemory and writes session files to the container's local directory through JSONSession. Both are lost when a Pod restarts or scales to zero, which makes them unsuitable for production. For production deployments, switch to an implementation backed by Redis or another persistent storage solution.

  • Manage secrets: Do not pass sensitive information such as DASHSCOPE_API_KEY directly on the command line (as the --env option in the example does) or hard-code it in your application. Use Kubernetes Secrets and expose them as environment variables in the container through the Knative Service configuration.

  • Plan resources: Based on your expected workload, stress-test and configure parameters such as --cpu-request, --cpu-limit, --memory-request, and --memory-limit to ensure service performance and stability.

  • Improve observability: Configure log collection (such as Log Service) and monitoring with alerts (such as Application Real-Time Monitoring Service (ARMS)) to troubleshoot issues and monitor service health.
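As an illustration of the secrets recommendation, the following sketch shows how an environment variable can reference a Kubernetes Secret inside a Knative Service manifest, using the standard valueFrom.secretKeyRef field. The helper names and the Secret name dashscope-credentials are hypothetical, and the Secret itself must already exist in the service's namespace. How you apply the modified manifest is up to you; this sketch does not use any agentscope option.

```python
import copy


def secret_env_var(env_name, secret_name, secret_key):
    """Build an env entry that reads its value from a Kubernetes Secret."""
    return {
        "name": env_name,
        "valueFrom": {"secretKeyRef": {"name": secret_name, "key": secret_key}},
    }


def inject_secret_env(manifest, env_var):
    """Return a copy of a Knative Service manifest with env_var set on the first container."""
    updated = copy.deepcopy(manifest)
    container = updated["spec"]["template"]["spec"]["containers"][0]
    # Drop any existing entry with the same name, such as a plain-text value.
    env = [e for e in container.get("env", []) if e.get("name") != env_var["name"]]
    env.append(env_var)
    container["env"] = env
    return updated


if __name__ == "__main__":
    manifest = {
        "apiVersion": "serving.knative.dev/v1",
        "kind": "Service",
        "metadata": {"name": "my-agent", "namespace": "default"},  # hypothetical
        "spec": {"template": {"spec": {"containers": [{"image": "example/agent:v1"}]}}},
    }
    env_var = secret_env_var(
        "DASHSCOPE_API_KEY", "dashscope-credentials", "api-key"  # hypothetical Secret
    )
    print(inject_secret_env(manifest, env_var))
```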

Related operations

  • Update the agent: After modifying the code, rerun the agentscope deploy knative command with a new image tag (for example, --image-tag v1.1) to perform a rolling update.

  • Uninstall the agent:

    # Replace <resource-name> and <namespace> with your actual values
    kubectl delete ksvc <resource-name> -n <namespace>

    This operation only deletes the service in Kubernetes. It does not delete the container image that was pushed to the container registry.