Use the high-code mode to create an agent when you need to customize complex business logic, integrate development frameworks such as LangChain, or have fine-grained control over the runtime environment. This mode provides maximum development freedom and engineering capabilities.
Overview
Function Introduction
The high-code mode lets you customize an agent's implementation logic using code. It is ideal for experienced developers who need fine-grained control over behavior and performance.
In this mode, you can use your own code or frameworks, such as LangChain, to implement complex inference processes, tool orchestration, and enterprise-level integrations.
Key features include the following:
Multiple code sources and deployment methods: Upload code packages, deploy code from local storage or OSS, edit code online, or deploy agent services using custom container images.
Multiple language runtimes and custom startup methods: Choose from various runtimes, such as Python, Node.js, and Java, and configure the start command, listening port, and dependency structure.
Flexible resource and environment configuration: Configure CPU, memory, maximum concurrent sessions per instance, and session idle timeout to meet your requirements. Configure advanced parameters such as VPC networks, execution roles, access credentials, environment variables, and health checks.
Deep customization with SDKs and frameworks: Use the AgentRun SDK in your code to connect to models, Sandbox, tools, and external systems. Integrate with existing agent frameworks, such as LangChain, to implement complex workflows and business logic. For more information, see AgentRun SDK Documentation.
Deep customization with SDKs and frameworks: You can use the AgentRun SDK in your code to connect to models, Sandbox, tools, and external systems. You can also integrate with existing agent frameworks, such as LangChain, to implement complex workflows and business logic. For more information, see AgentRun SDK Documentation.
Complete development, debugging, and operations and maintenance (O&M) workflow: Use the online WebIDE, real-time logs, and a debugging interface based on the OpenAI Chat Completions protocol. This mode also supports version management and canary releases.
Core concepts
The following concepts are essential for creating a code-based agent. Skip this section if you are already familiar with them.
Core concept | Description |
Code source | The high-code mode supports multiple code sources for deploying an agent:
The method you choose depends on your team's habits and engineering practices, such as whether you already have a CI/CD pipeline for building images. |
Runtime type | When creating an agent, you need to select a runtime language environment. The following are currently supported:
Note Currently, official sample templates are only provided for Python 3.10. For other runtimes, use a custom code package or image. |
Sample code package | To help users get started quickly, the platform provides sample templates written based on the AgentRun software development kit (SDK). The official sample is a LangChain Agent example based on the AgentRun SDK. It includes sample code for model calls and Sandbox tool calls. |
Startup configuration | For code package modes (local upload, OSS, or online editing), you must specify:
If you use the official template, you can usually keep the default configurations. If you use custom code, specify the actual entry file and service port. For the container image mode, you must specify at least:
The start command is usually determined by the |
Resource configuration | Resource configuration is used to control the resource specifications and concurrent behavior of an agent instance. It includes:
Configuring these parameters correctly helps you balance performance and cost. |
Advanced configuration | Advanced configuration includes options related to the runtime environment, such as:
These configurations determine which resources the agent can access at runtime and how it accesses them. |
Observability | Supports one-click activation of agent observability to monitor and analyze the agent's operational status:
Note In application monitoring, token statistics are sourced from the Large Language Model (LLM) response. When you configure the code, ensure that the model response includes token information. |
Procedure
Preparations
Prepare the code-based project
Ensure you have runnable service code. The following steps use the official sample code package as an example. You can download the package here: Official sample code package.
NoteYou can also obtain the code package by creating an agent in no-code mode and then converting it to code mode. For more information, see Create an agent without code (no-code). After the conversion, select Code and Debug from the navigation pane on the left of the agent page to view and download the code.
The code can be packaged locally as a .zip file, or stored as a container image or an OSS code package.
Confirm the exposed service endpoint (start command, listening port).
Prepare runtime dependencies
Determine the runtime language version to use, such as Python 3.10, Python 3.12, Node.js 18/20, or Java 8/11/17.
NoteThe official sample code uses the Python 3.10 runtime.
If you use the code package mode:
Package the code and dependencies together in a .zip file.
For Python projects, place dependencies in a directory such as
/pythonor/code/python. Then, set thePYTHONPATHenvironment variable to/opt/python:/code/python.
If you use the container image mode:
Include the complete runtime environment and dependencies in the image.
Correctly configure the entry command and listening port in the Dockerfile.
Prepare models, tools, and Sandbox
You have created a model in Model Management and recorded its name.
If the agent needs to use tools, such as business APIs, web scraping, or online search, create them in Tool Management.
If you need code execution or browser capabilities, create a Sandbox service (BrowserTool, Code Interpreter) and record the service name.
Step 1: Go to the Create Agent page for code-based creation
Go to the AgentRun console. On the Agent Runtime page, click Create Agent. In the dialog box, select Create with Code.
The code-based agent creation requires the
AliyunDevsFCServicesDeployPolicypermission. If the current account lacks this permission, a dialog box appears. You can choose to grant permissions with one click or skip.Enter an Agent Name and Description. Use a meaningful name for easy management and identification.
Step 2: Configure the code
Select a code source
Select a code source (5 options):
Source
Description
Upload a code package
Upload a local .zip file
Object Storage Service (OSS)
Select the Bucket Name and OSS Directory
Container Image
Select an image from Alibaba Cloud Container Registry. Image address format:
registry-intl-vpc.{Region}.aliyuncs.comOnline Coding
Write and modify code in the WebIDE
Official Template
Use the official LangChain Agent sample code
Startup configuration
Upload Code Package/Object Storage Service (OSS)/Online Editing Mode
Start command: For example:
python3 main.pynode app.jsjava -jar app.jar
Listening port: For example,
9000. This must match the port your service listens on in the code.
NoteIf you use the official template, such as the Python sample above, the default settings are usually sufficient.
Container image mode
The start command is usually determined by the
CMD/ENTRYPOINTin the image's Dockerfile.You must specify the listening port in the configuration. It must match the port the application listens on inside the container.
Step 3: Configure resources, environment variables, and advanced settings
Set the runtime parameters in the Resource Configuration and Advanced Configuration sections.
Compute resources
CPU (Cores) and Memory: The compute resources allocated to each agent instance. More resources allow for faster processing of complex tasks but also increase costs.
Maximum concurrent sessions per instance: The number of sessions a single agent instance can handle simultaneously. Increasing this value can reduce the number of required instances and save costs, but you must ensure your code is thread-safe and that a single instance has enough CPU and memory to support the load.
Session idle timeout (seconds): If an instance is idle for longer than this time after the last session ends, the instance is automatically released.
Environment variables:
Configure runtime environment variables for the agent to pass parameters to the code as needed. If you use the official sample, you must configure the following environment variables:
MODEL_NAME: Required. Enter the name of the model that you created in Model Management.SANDBOX_NAME: Optional. The name of the Sandbox that you created.
Network configuration: Select a network type:
Network type
Description
PUBLIC
The default NIC is allowed to access the public network
PRIVATE
Access only through VPC. You must configure VPC, VSwitch, and security group
PUBLIC_AND_PRIVATE
Both public and VPC access are supported
NONE
No network access
When you select PRIVATE or PUBLIC_AND_PRIVATE, select a VPC, VSwitch (multiple selections supported), and Security Group from the drop-down lists.
Optional: Enable Fixed Public IP (requires feature access).
Log configuration
Enable the Log Feature to save the agent's operational logs to Simple Log Service. You can specify a Log Project and a Logstore. Select Auto Config or use the One-click Config feature in Custom Config to use default values.
Health check configuration
When you Enable Health Check, you can configure parameters such as the Health Check Path (for example,
/health), Health Check Interval (seconds), Timeout (seconds), and Failure Threshold. The system periodically sends HTTP requests to the specified path to check the service status. If the number of consecutive failures reaches the threshold, the system automatically restarts the service instance, which improves service availability and provides automatic fault recovery.
Step 4: Configure the execution role as needed
This configuration grants your agent code permission to access other Alibaba Cloud services, such as model services and OSS. AgentRun uses an execution role (a RAM role) to manage permissions. Your code assumes this role at runtime to obtain the corresponding operation permissions.
Trusted entity: Function Compute (fc.aliyuncs.com)
Required permissions: Add policies based on the services you use.
Operation: Select a role from the Execution Role ARN list. If no suitable role exists, follow these steps to add one.
Click the add button
to the right of the Execution Role ARN drop-down list to go to the RAM > Roles list.Click Create Role.
On the Create Role page, set Trusted Entity Type to Alibaba Cloud Service.
For Trusted Service, select Function Compute/FC.
Click OK, enter a Role Name, click OK, and complete the security authentication as prompted.
After the role is created, you are redirected to the role details page. Click Add Permissions.
Add policies based on the services you use. Common policy examples:
AliyunOSSFullAccess: If the agent needs to use resources in OSS, you must configure permissions to manage Object Storage Service (OSS).
AliyunAgentRunFullAccess: If you are creating the agent using the official sample, you must configure permissions to manage the AgentRun service.
AliyunDevsFullAccess: Permissions to manage the Serverless Devs platform.
Step 5: Configure access credentials
Configure credentials for the agent's endpoint, such as an API key, to protect your agent from unauthorized calls. Credentials are managed and injected by AgentRun's Credential Management.
In the Access Credentials module, click Inbound: Access Credentials.
Select a credential mode:
No credentials (not recommended): The agent's call address can be accessed anonymously from the internet, which poses a security risk. This mode is for functional testing only and must not be used in a production environment.
<<<<<<< HEAD
Use existing credentials (recommended): To ensure agent security, we recommend this option. Select an existing credential. If you do not have one, click the
icon and see Credential Management to create one.=======
Use existing credentials (recommended): To ensure agent security, we recommend this option. Select an existing credential. If you do not have one, you can click the
icon and see Credential Management to create one.>>>>>>> 5a60e6f (docs: update 265 file(s) from ICMS)
Step 6: Complete the configuration and test
After completing the steps above, click Start Deployment in the upper-right corner to finish creating the agent. You are automatically redirected to the agent details page. Select Code and Debug and test the agent in the Debugging Tool.
Step 7: Publish a version and perform a canary release
AgentRun supports version management and canary releases. After making changes to an agent's prompt, tools, or model, we recommend that you first publish a version. Then, create an endpoint and enable a secondary version (canary release) to allocate a small amount of traffic to the new version. Monitor the new version to confirm it is stable and reliable. Gradually increase the traffic percentage until it is fully online.
In the navigation pane on the left, select Versions and Canary Releases.
Publish the current version: Click Publish Version, enter a Version Description that describes the main changes and features of this version, and then click Publish Version.
Create an endpoint:
Enter an Endpoint Name. In the Primary Version drop-down list, select the version number you just published.
Enable secondary version (canary release): Select the Enable secondary version (canary release) checkbox, select a secondary version, and configure the Traffic Allocation percentage for the primary and secondary versions.
Next steps
Integrate the agent into your application
Go to the Integration and Publishing module to integrate the agent into your frontend web pages, backend applications, and other services. UI Integration, Code Integration, and Ecosystem Integration methods are supported. For more information, see Agent Integration and Publishing.
Enable agent observability
After creating the agent, configure observability capabilities as needed for subsequent debugging and O&M.
View infrastructure monitoring: Enabled by default. In the navigation pane on the left, select Observability > Infrastructure Monitoring to directly view:
Call count, success rate, and average/maximum latency.
Call trends and latency curves for the recent period.
Enable log collection (recommended)
On the agent details page, select Observability > Logs from the navigation pane on the left.
The first time you open this page, a message is displayed indicating that Logs are not enabled. Click Click to enable logs.
Select a configuration method:
Auto Config: Create a default SLS Project and Logstore with one click. Suitable for quick integration.
Custom Config: Select an existing Project and Logstore. Suitable for enterprise-level unified log management.
After saving, the agent's operational logs are automatically written to the corresponding Logstore. You can search, analyze, and set up alerts in the SLS console.
Connect to Application Monitoring and Tracing Analysis (advanced, optional)
You must activate ARMS when you first access this feature. On the Observability > Application Monitoring (or Tracing Analysis) page, click one-click activation.
Follow the prompts or see Agent LLM Observability to learn how to integrate application monitoring with your agent code. This enables application monitoring and tracing analysis.
After the configuration is complete, you can view business metrics and trends, such as token consumption analysis and performance analysis, on the Application Monitoring panel. On the Tracing Analysis panel, you can view Span details and call topology for each request to quickly locate slow requests and abnormal nodes.
FAQ and troubleshooting
Q1: When starting the agent, the log shows ModuleNotFoundError: No module named 'xxx'.
A: This is a typical missing Python dependency issue. Check the following:
If you are using the Upload Code Package mode, ensure that you followed the guidance in the Prepare runtime dependencies subsection of the Preparations section to package all dependency libraries into the
/pythondirectory of the code package.If the dependencies include C/C++ extensions and the compilation environment is complex, we recommend switching to the container image mode.
Q2: The agent fails to start, and the log shows a port listening error or timeout.
A: Check your code:
Ensure that the web service is listening on
0.0.0.0and not127.0.0.1.Ensure that the port number that the code listens on matches the one you configured for Listening Port in the console. A best practice is to dynamically retrieve the port number from the
AGENT_PORTenvironment variable.
Q3: Are official templates only provided for Python 3.10? How do I start with other languages?
A: Yes, they are. The official LangChain sample template currently only supports Python 3.10. For other runtimes, such as Node.js and Java, you must select the high-code mode and upload your own custom code package or use a custom image. You can refer to the official Python code sample and migrate its logic to your chosen language and framework. The core task is to implement an HTTP service that meets the specifications.