The high-code mode offers flexible code customization and resource control to support complex business logic and enterprise-level integrations. It is ideal for experienced developers who want to implement fine-grained, engineered agents., The high-code mode offers flexible code customization and resource control to support complex business logic and enterprise-level integrations. It is ideal for experienced developers who want to implement fine-grained, engineered Agents. - Function Compute

Use the high-code mode to create an agent when you need to customize complex business logic, integrate development frameworks such as LangChain, or have fine-grained control over the runtime environment. This mode provides maximum development freedom and engineering capabilities.

Overview

Function Introduction

The high-code mode lets you customize an agent's implementation logic using code. It is ideal for experienced developers who need fine-grained control over behavior and performance.
In this mode, you can use your own code or frameworks, such as LangChain, to implement complex inference processes, tool orchestration, and enterprise-level integrations.

Key features include the following:

Multiple code sources and deployment methods: Upload code packages, deploy code from local storage or OSS, edit code online, or deploy agent services using custom container images.
Multiple language runtimes and custom startup methods: Choose from various runtimes, such as Python, Node.js, and Java, and configure the start command, listening port, and dependency structure.
Flexible resource and environment configuration: Configure CPU, memory, maximum concurrent sessions per instance, and session idle timeout to meet your requirements. Configure advanced parameters such as VPC networks, execution roles, access credentials, environment variables, and health checks.
Deep customization with SDKs and frameworks: Use the AgentRun SDK in your code to connect to models, Sandbox, tools, and external systems. Integrate with existing agent frameworks, such as LangChain, to implement complex workflows and business logic. For more information, see AgentRun SDK Documentation.
Deep customization with SDKs and frameworks: You can use the AgentRun SDK in your code to connect to models, Sandbox, tools, and external systems. You can also integrate with existing agent frameworks, such as LangChain, to implement complex workflows and business logic. For more information, see AgentRun SDK Documentation.
Complete development, debugging, and operations and maintenance (O&M) workflow: Use the online WebIDE, real-time logs, and a debugging interface based on the OpenAI Chat Completions protocol. This mode also supports version management and canary releases.

Core concepts

The following concepts are essential for creating a code-based agent. Skip this section if you are already familiar with them.

Core concept	Description
Code source	The high-code mode supports multiple code sources for deploying an agent: Upload local code package: Directly upload a packaged .zip file. OSS code package: Configure the address of the code package on OSS. The platform pulls the package from OSS. Online editing: Write and modify code online in the WebIDE without local packaging. Custom container image: Select an existing image from an Alibaba Cloud image repository. Use the code and runtime environment from the image. The method you choose depends on your team's habits and engineering practices, such as whether you already have a CI/CD pipeline for building images.
Runtime type	When creating an agent, you need to select a runtime language environment. The following are currently supported: Python 3.10 Python 3.12 Node.js 20 Node.js 18 Java 17 Java 11 Java 8 Note Currently, official sample templates are only provided for Python 3.10. For other runtimes, use a custom code package or image.
Sample code package	To help users get started quickly, the platform provides sample templates written based on the AgentRun software development kit (SDK). The official sample is a LangChain Agent example based on the AgentRun SDK. It includes sample code for model calls and Sandbox tool calls.
Startup configuration	For code package modes (local upload, OSS, or online editing), you must specify: Start command (for example, `python3 main.py`) Listening port (for example, `9000`) If you use the official template, you can usually keep the default configurations. If you use custom code, specify the actual entry file and service port. For the container image mode, you must specify at least: Listening port (the port the application listens on inside the image) The start command is usually determined by the `CMD` or `ENTRYPOINT` instruction in the image's Dockerfile and is not configured on the platform.
Resource configuration	Resource configuration is used to control the resource specifications and concurrent behavior of an agent instance. It includes: CPU (cores): The number of CPU cores allocated to a single instance. Memory (MB): The amount of memory allocated to a single instance. Maximum concurrent sessions per instance: The number of sessions that each instance can process in parallel. Session idle timeout (seconds): If an instance is idle for longer than this time after the last session ends, the instance is automatically destroyed. Configuring these parameters correctly helps you balance performance and cost.
Advanced configuration	Advanced configuration includes options related to the runtime environment, such as: Environment variables (such as model name and business configuration) VPC network configuration (to access internal network resources) Execution role (permissions to access resources such as models, Sandbox, and OSS) Access credentials (integrated with AgentRun credential management) Health check configuration (to determine if an instance is available) These configurations determine which resources the agent can access at runtime and how it accesses them.
Observability	Supports one-click activation of agent observability to monitor and analyze the agent's operational status: View operational metrics such as call volume, response time, and error rate. Troubleshoot issues and reconstruct failure scenarios using logs. Analyze model calls and token consumption for cost optimization. Identify performance bottlenecks and abnormal nodes in complex call chains. Note In application monitoring, token statistics are sourced from the Large Language Model (LLM) response. When you configure the code, ensure that the model response includes token information.

Procedure

Preparations

Prepare the code-based project
- Ensure you have runnable service code. The following steps use the official sample code package as an example. You can download the package here: Official sample code package.
  Note
  You can also obtain the code package by creating an agent in no-code mode and then converting it to code mode. For more information, see Create an agent without code (no-code). After the conversion, select Code and Debug from the navigation pane on the left of the agent page to view and download the code.
- The code can be packaged locally as a .zip file, or stored as a container image or an OSS code package.
- Confirm the exposed service endpoint (start command, listening port).

Prepare runtime dependencies
- Determine the runtime language version to use, such as Python 3.10, Python 3.12, Node.js 18/20, or Java 8/11/17.
  Note
  The official sample code uses the Python 3.10 runtime.
- If you use the code package mode:
  - Package the code and dependencies together in a .zip file.
  - For Python projects, place dependencies in a directory such as /python or /code/python. Then, set the PYTHONPATH environment variable to /opt/python:/code/python.

If you use the container image mode:
- Include the complete runtime environment and dependencies in the image.
- Correctly configure the entry command and listening port in the Dockerfile.
Prepare models, tools, and Sandbox
- You have created a model in Model Management and recorded its name.
- If the agent needs to use tools, such as business APIs, web scraping, or online search, create them in Tool Management.
- If you need code execution or browser capabilities, create a Sandbox service (BrowserTool, Code Interpreter) and record the service name.

Step 1: Go to the Create Agent page for code-based creation

Go to the AgentRun console. On the Agent Runtime page, click Create Agent. In the dialog box, select Create with Code.
The code-based agent creation requires the AliyunDevsFCServicesDeployPolicy permission. If the current account lacks this permission, a dialog box appears. You can choose to grant permissions with one click or skip.
Enter an Agent Name and Description. Use a meaningful name for easy management and identification.

Step 2: Configure the code

Select a code source

Select a code source (5 options):

Source	Description
Upload a code package	Upload a local .zip file
Object Storage Service (OSS)	Select the Bucket Name and OSS Directory
Container Image	Select an image from Alibaba Cloud Container Registry. Image address format: `registry-intl-vpc.{Region}.aliyuncs.com`
Online Coding	Write and modify code in the WebIDE
Official Template	Use the official LangChain Agent sample code

Startup configuration
Upload Code Package/Object Storage Service (OSS)/Online Editing Mode
- Start command: For example:
  - python3 main.py
  - node app.js
  - java -jar app.jar
- Listening port: For example, 9000. This must match the port your service listens on in the code.
Note
If you use the official template, such as the Python sample above, the default settings are usually sufficient.
Container image mode
- The start command is usually determined by the CMD/ENTRYPOINT in the image's Dockerfile.
- You must specify the listening port in the configuration. It must match the port the application listens on inside the container.

Step 3: Configure resources, environment variables, and advanced settings

Set the runtime parameters in the Resource Configuration and Advanced Configuration sections.

Compute resources
- CPU (Cores) and Memory: The compute resources allocated to each agent instance. More resources allow for faster processing of complex tasks but also increase costs.
- Maximum concurrent sessions per instance: The number of sessions a single agent instance can handle simultaneously. Increasing this value can reduce the number of required instances and save costs, but you must ensure your code is thread-safe and that a single instance has enough CPU and memory to support the load.
- Session idle timeout (seconds): If an instance is idle for longer than this time after the last session ends, the instance is automatically released.
Environment variables:
Configure runtime environment variables for the agent to pass parameters to the code as needed. If you use the official sample, you must configure the following environment variables:
- MODEL_NAME: Required. Enter the name of the model that you created in Model Management.
- SANDBOX_NAME: Optional. The name of the Sandbox that you created.

Network configuration: Select a network type:

Network type	Description
PUBLIC	The default NIC is allowed to access the public network
PRIVATE	Access only through VPC. You must configure VPC, VSwitch, and security group
PUBLIC_AND_PRIVATE	Both public and VPC access are supported
NONE	No network access

When you select PRIVATE or PUBLIC_AND_PRIVATE, select a VPC, VSwitch (multiple selections supported), and Security Group from the drop-down lists.

Optional: Enable Fixed Public IP (requires feature access).

Log configuration
Enable the Log Feature to save the agent's operational logs to Simple Log Service. You can specify a Log Project and a Logstore. Select Auto Config or use the One-click Config feature in Custom Config to use default values.
Health check configuration
When you Enable Health Check, you can configure parameters such as the Health Check Path (for example, /health), Health Check Interval (seconds), Timeout (seconds), and Failure Threshold. The system periodically sends HTTP requests to the specified path to check the service status. If the number of consecutive failures reaches the threshold, the system automatically restarts the service instance, which improves service availability and provides automatic fault recovery.

Step 4: Configure the execution role as needed

This configuration grants your agent code permission to access other Alibaba Cloud services, such as model services and OSS. AgentRun uses an execution role (a RAM role) to manage permissions. Your code assumes this role at runtime to obtain the corresponding operation permissions.

Trusted entity: Function Compute (fc.aliyuncs.com)

Required permissions: Add policies based on the services you use.

Operation: Select a role from the Execution Role ARN list. If no suitable role exists, follow these steps to add one.

Click the add button to the right of the Execution Role ARN drop-down list to go to the RAM > Roles list.
Click Create Role.
On the Create Role page, set Trusted Entity Type to Alibaba Cloud Service.
For Trusted Service, select Function Compute/FC.
Click OK, enter a Role Name, click OK, and complete the security authentication as prompted.
After the role is created, you are redirected to the role details page. Click Add Permissions.
Add policies based on the services you use. Common policy examples:
- AliyunOSSFullAccess: If the agent needs to use resources in OSS, you must configure permissions to manage Object Storage Service (OSS).
- AliyunAgentRunFullAccess: If you are creating the agent using the official sample, you must configure permissions to manage the AgentRun service.
- AliyunDevsFullAccess: Permissions to manage the Serverless Devs platform.

Step 5: Configure access credentials

Configure credentials for the agent's endpoint, such as an API key, to protect your agent from unauthorized calls. Credentials are managed and injected by AgentRun's Credential Management.

In the Access Credentials module, click Inbound: Access Credentials.
Select a credential mode:
- No credentials (not recommended): The agent's call address can be accessed anonymously from the internet, which poses a security risk. This mode is for functional testing only and must not be used in a production environment.
- <<<<<<< HEAD
  Use existing credentials (recommended): To ensure agent security, we recommend this option. Select an existing credential. If you do not have one, click the icon and see Credential Management to create one.
  =======
  Use existing credentials (recommended): To ensure agent security, we recommend this option. Select an existing credential. If you do not have one, you can click the icon and see Credential Management to create one.
  >>>>>>> 5a60e6f (docs: update 265 file(s) from ICMS)

Step 6: Complete the configuration and test

After completing the steps above, click Start Deployment in the upper-right corner to finish creating the agent. You are automatically redirected to the agent details page. Select Code and Debug and test the agent in the Debugging Tool.

Step 7: Publish a version and perform a canary release

AgentRun supports version management and canary releases. After making changes to an agent's prompt, tools, or model, we recommend that you first publish a version. Then, create an endpoint and enable a secondary version (canary release) to allocate a small amount of traffic to the new version. Monitor the new version to confirm it is stable and reliable. Gradually increase the traffic percentage until it is fully online.

In the navigation pane on the left, select Versions and Canary Releases.
Publish the current version: Click Publish Version, enter a Version Description that describes the main changes and features of this version, and then click Publish Version.
Create an endpoint:
1. Enter an Endpoint Name. In the Primary Version drop-down list, select the version number you just published.
2. Enable secondary version (canary release): Select the Enable secondary version (canary release) checkbox, select a secondary version, and configure the Traffic Allocation percentage for the primary and secondary versions.

Next steps

Integrate the agent into your application

Go to the Integration and Publishing module to integrate the agent into your frontend web pages, backend applications, and other services. UI Integration, Code Integration, and Ecosystem Integration methods are supported. For more information, see Agent Integration and Publishing.

Enable agent observability

After creating the agent, configure observability capabilities as needed for subsequent debugging and O&M.

View infrastructure monitoring: Enabled by default. In the navigation pane on the left, select Observability > Infrastructure Monitoring to directly view:
- Call count, success rate, and average/maximum latency.
- Call trends and latency curves for the recent period.
Enable log collection (recommended)
- On the agent details page, select Observability > Logs from the navigation pane on the left.
- The first time you open this page, a message is displayed indicating that Logs are not enabled. Click Click to enable logs.
- Select a configuration method:
  - Auto Config: Create a default SLS Project and Logstore with one click. Suitable for quick integration.
  - Custom Config: Select an existing Project and Logstore. Suitable for enterprise-level unified log management.
- After saving, the agent's operational logs are automatically written to the corresponding Logstore. You can search, analyze, and set up alerts in the SLS console.
Connect to Application Monitoring and Tracing Analysis (advanced, optional)
- You must activate ARMS when you first access this feature. On the Observability > Application Monitoring (or Tracing Analysis) page, click one-click activation.
- Follow the prompts or see Agent LLM Observability to learn how to integrate application monitoring with your agent code. This enables application monitoring and tracing analysis.
- After the configuration is complete, you can view business metrics and trends, such as token consumption analysis and performance analysis, on the Application Monitoring panel. On the Tracing Analysis panel, you can view Span details and call topology for each request to quickly locate slow requests and abnormal nodes.

FAQ and troubleshooting

Q1: When starting the agent, the log shows ModuleNotFoundError: No module named 'xxx'.

A: This is a typical missing Python dependency issue. Check the following:

If you are using the Upload Code Package mode, ensure that you followed the guidance in the Prepare runtime dependencies subsection of the Preparations section to package all dependency libraries into the /python directory of the code package.
If the dependencies include C/C++ extensions and the compilation environment is complex, we recommend switching to the container image mode.

Q2: The agent fails to start, and the log shows a port listening error or timeout.

A: Check your code:

Ensure that the web service is listening on 0.0.0.0 and not 127.0.0.1.
Ensure that the port number that the code listens on matches the one you configured for Listening Port in the console. A best practice is to dynamically retrieve the port number from the AGENT_PORT environment variable.

Q3: Are official templates only provided for Python 3.10? How do I start with other languages?

A: Yes, they are. The official LangChain sample template currently only supports Python 3.10. For other runtimes, such as Node.js and Java, you must select the high-code mode and upload your own custom code package or use a custom image. You can refer to the official Python code sample and migrate its logic to your chosen language and framework. The core task is to implement an HTTP service that meets the specifications.

Function Compute:Create an agent using code (high-code)

Overview

Function Introduction

Core concepts

Code source

Runtime type

Sample code package

Startup configuration

Resource configuration

Advanced configuration

Observability