BrowserTool is a high-performance, cloud-native headless browser sandbox service built with the Go language. It lets you remotely control a headless browser instance that runs in an isolated cloud container using the standard Chrome DevTools Protocol (CDP) over WebSocket. The service is natively compatible with popular automation frameworks such as Puppeteer and Playwright.
Primary use cases
-
AI agent integration: It acts as the "eyes" and "hands" for large language models (LLMs), giving them the ability to perform complex tasks such as web browsing, information extraction, and online operations.
-
Automated testing: You can run end-to-end (E2E) and visual regression tests on demand in the cloud without needing to maintain a local testing environment.
-
Data collection: You can scrape web pages with stability and efficiency, and easily handle dynamic content and anti-scraping challenges.
-
Content generation: You can automatically generate PDFs or screenshots from dynamic web pages or data dashboards for reports and archives.
Core value
-
Fully managed: You do not need to install, configure, or maintain Chrome browsers and their complex dependencies on your servers or local machines.
-
Serverless architecture: The service uses a pay-as-you-go billing model and scales elastically with demand, which helps you control costs.
-
Native compatibility: You can smoothly migrate your existing Puppeteer or Playwright scripts, usually without any modifications.
Quick start
This guide walks you through the entire process, from creating a service to running your first automation script.
Step 1: Create a BrowserTool service
-
Log on to the AgentRun console.
-
The first time you use the service, you are prompted to grant the AgentRun service-linked role (AliyunServiceRoleForAgentRun) to your Alibaba Cloud account. Follow the on-screen instructions to complete the authorization. This step is required for the service to function correctly.
-
On the Runtimes and Sandboxes tab, select Sandbox. Click Create Sandbox Template, select Browser Sandbox, and then click Create Now to create the BrowserTool service.
You can also create the service using OpenAPI or an SDK.
-
OpenAPI documentation: CreateBrowser API
-
SDKs for multiple languages: AgentRun SDK Center
Step 2: Get the connection endpoint
-
After the service is created, find the BrowserTool service in the service list on the console. Click the service to open the Details page.
-
On the VNC Debugging tab on the details page, click Create Sandbox.
-
After the sandbox is created, a WebSocket endpoint (CDP Endpoint) for automated connections is provided. You can copy it directly from the console. The endpoint format is similar to the following:
wss://1234567890.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/br-abcdef123456/ws/automation?tenantId=1234567890
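If you would rather assemble the endpoint in code than copy it from the console, the sample URL suggests a simple layout. The following is a minimal sketch, assuming the host and path pattern always match the sample above; the function name `build_cdp_endpoint` and its `region` parameter are illustrative and not part of any AgentRun SDK. The value shown in the console remains the authoritative source.

```python
from urllib.parse import urlencode


def build_cdp_endpoint(account_id: str, sandbox_id: str,
                       region: str = "ap-southeast-1") -> str:
    """Return the CDP WebSocket URL for a sandbox.

    Assumption: the host and path layout match the sample endpoint
    shown in the console; this is not an official helper.
    """
    if not account_id or not sandbox_id:
        raise ValueError("account_id and sandbox_id are required")
    host = f"{account_id}.agentrun-data.{region}.aliyuncs.com"
    # The sample URL repeats the account ID as the tenantId query parameter.
    query = urlencode({"tenantId": account_id})
    return f"wss://{host}/sandboxes/{sandbox_id}/ws/automation?{query}"


if __name__ == "__main__":
    print(build_cdp_endpoint("1234567890", "br-abcdef123456"))
```

Called with the sample account and sandbox IDs, the helper reproduces the sample endpoint shown above.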
Step 3: Connect and run an automation script
Use the full connection endpoint URL from the previous step in your automation framework script to connect to BrowserTool and run your task.
Puppeteer connection example
const puppeteer = require('puppeteer-core');

// Obtain this endpoint from the BrowserTool service details page.
const BROWSERTOOL_CDP_ENDPOINT = 'wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/automation?tenantId={accountID}';

async function main() {
  // Attach to the remote browser instead of launching a local one.
  const browser = await puppeteer.connect({
    browserWSEndpoint: BROWSERTOOL_CDP_ENDPOINT,
  });
  try {
    const page = await browser.newPage();
    await page.goto('https://example.com');
    await page.screenshot({ path: 'example.png' });
    console.log('Screenshot taken!');
  } finally {
    // Close the remote browser even if a step above fails.
    await browser.close();
  }
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
Playwright connection example
const { chromium } = require('playwright-core');

// Obtain this endpoint from the BrowserTool service details page.
const BROWSERTOOL_CDP_ENDPOINT = 'wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/automation?tenantId={accountID}';

async function main() {
  // Attach to the remote browser over CDP. The endpoint URL is the first argument.
  const browser = await chromium.connectOverCDP(BROWSERTOOL_CDP_ENDPOINT);
  try {
    const page = await browser.newPage();
    await page.goto('https://example.com');
    await page.screenshot({ path: 'example.png' });
    console.log('Screenshot taken!');
  } finally {
    // Close the remote browser even if a step above fails.
    await browser.close();
  }
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
Feature guide
WebSocket automation endpoints
BrowserTool provides two main WebSocket endpoints for different automation scenarios.
-
CDP automation endpoint (/ws/automation)
This endpoint is for browser automation and is compatible with Puppeteer and Playwright. The endpoint format is as follows:
wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/automation?tenantId={accountID}
You can use the wscat tool to interact directly with the CDP endpoint and send raw CDP commands for debugging.
# Install wscat
npm install -g wscat
# Connect to the CDP proxy
wscat -c "wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/automation/?tenantId={accountID}"
# Send a CDP command
{"id":1,"method":"Runtime.evaluate","params":{"expression":"navigator.userAgent"}}
-
VNC live stream endpoint (/ws/livestream)
This WebSocket endpoint provides a real-time view of the browser's desktop environment. It supports viewing through a NoVNC client. For more information, see View the browser interface in real time (VNC debugging). The endpoint format is as follows:
wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/livestream?tenantId={accountID}
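A raw CDP command sent to the automation endpoint is a JSON object with an `id`, a `method`, and optional `params`; the `id` lets you match each response to its request. The sketch below only builds such messages and does not open a connection. The class name `CdpMessageBuilder` is illustrative, not part of BrowserTool or any CDP library.

```python
import itertools
import json
from typing import Optional


class CdpMessageBuilder:
    """Builds JSON-encoded CDP commands with unique, increasing ids."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)

    def command(self, method: str, params: Optional[dict] = None) -> str:
        # Every command needs its own id so the reply can be correlated.
        message = {"id": next(self._ids), "method": method}
        if params:
            message["params"] = params
        # Compact separators produce the same shape as a hand-typed command.
        return json.dumps(message, separators=(",", ":"))


if __name__ == "__main__":
    builder = CdpMessageBuilder()
    print(builder.command("Runtime.evaluate",
                          {"expression": "navigator.userAgent"}))
```

You can paste the printed string into a wscat session, or send it through any WebSocket client connected to the automation endpoint.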
View the browser interface in real time (VNC debugging)
BrowserTool supports real-time viewing of the remote browser's desktop environment through Virtual Network Computing (VNC). This makes it easy to monitor the execution of automation tasks during development and debugging.
Recommended method: Use the online noVNC client
-
You can access the official online noVNC client at: https://novnc.com/noVNC/vnc.html
-
In the connection settings, enter the following connection information:
-
Host: {accountID}.agentrun-data.ap-southeast-1.aliyuncs.com
-
Port: 443
-
Path: sandboxes/{sandboxID}/ws/livestream?tenantId={accountID}
-
Click Connect to view the browser interface.
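The Host, Port, and Path values are the three parts of the livestream endpoint URL. As a rough sketch, the hypothetical helper below (`novnc_settings` is an illustrative name) splits a full livestream URL into those fields, assuming the default port for the scheme (443 for wss, 80 for ws) when the URL does not name one.

```python
from urllib.parse import urlsplit


def novnc_settings(livestream_url: str) -> dict:
    """Split a livestream WebSocket URL into noVNC connection fields."""
    parts = urlsplit(livestream_url)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError("expected a ws:// or wss:// URL")
    # Fall back to the standard port for the scheme.
    default_port = 443 if parts.scheme == "wss" else 80
    # noVNC expects the path without a leading slash, query string included.
    path = parts.path.lstrip("/")
    if parts.query:
        path = f"{path}?{parts.query}"
    return {
        "host": parts.hostname,
        "port": parts.port or default_port,
        "path": path,
    }


if __name__ == "__main__":
    url = ("wss://1234567890.agentrun-data.ap-southeast-1.aliyuncs.com"
           "/sandboxes/br-abcdef123456/ws/livestream?tenantId=1234567890")
    print(novnc_settings(url))
```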
If you want to create a custom frontend integration for full control over the VNC view's presentation and interaction logic, you must import the core noVNC JavaScript library and initialize it in your frontend code. For an example, see https://github.com/novnc/noVNC/blob/master/vnc_lite.html.
Alternative method: Use a Docker image
To use the client in a local or offline environment, you can run our pre-configured noVNC Docker image.
-
Run the following Docker command in your local terminal to start the noVNC client container:
docker run -p 8184:80 -d --rm registry.cn-shanghai.aliyuncs.com/fc-demo2/custom-container-repository:browsertool-sandbox-vnc-client_v0.3.1
-
Open your browser and navigate to http://localhost:8184.
-
In the connection settings, enter the following connection information:
-
Host: {accountID}.agentrun-data.ap-southeast-1.aliyuncs.com
-
Port: 80
-
Path: sandboxes/{sandboxID}/ws/livestream?tenantId={accountID}
-
Click Connect to see the browser interface.
-
After a successful connection, the initial interface may be a black or gray screen. This is normal because the browser is waiting for instructions. Content is displayed after your automation script runs an operation such as page.goto().
Framework integration examples
BrowserTool integrates easily with AI agent frameworks or automation libraries in various programming languages.
-
BrowserUse operation example
from browser_use import Agent, BrowserSession
from browser_use.llm import ChatDeepSeek
from browser_use.browser import BrowserProfile
from dotenv import load_dotenv
import os
import asyncio

load_dotenv()

async def main():
    # Enter the CDP URL
    browser_session_wss_url = "wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/automation?tenantId={accountID}"
    browser_session = BrowserSession(
        cdp_url=browser_session_wss_url,
        browser_profile=BrowserProfile(
            headless=False,
            user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.X.X Safari/537.36",
            timeout=3000000,
            keep_alive=True,
        )
    )
    # Enter the API key for DeepSeek. If you use a different model, modify this accordingly.
    llm = ChatDeepSeek(api_key="sk-your-deepseek-sk")
    agent = Agent(
        task="Visit https://www.aliyun.com/product/list and analyze the products that Alibaba Cloud currently offers.",
        llm=llm,
        browser_session=browser_session,
        use_vision=True
    )
    result = await agent.run()
    print(result)

if __name__ == "__main__":
    asyncio.run(main())
-
Puppeteer connection example
const puppeteer = require('puppeteer-core');

// CommonJS modules do not support top-level await, so the calls run inside an async function.
async function main() {
  const browser = await puppeteer.connect({
    browserWSEndpoint: 'wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/automation?tenantId={accountID}'
  });
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.screenshot({ path: 'screenshot.png' });
  await browser.close();
}

main().catch(console.error);
Operation recording
1. List all recordings
This operation retrieves a list of all VNC recording files in the system. Paged queries are supported.
Basic information
-
Endpoint: GET /recordings/
-
Function: Returns a list of VNC recording files.
-
Tag: Recording Management
Request parameters
| Parameter name | Location | Type | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| page | query | integer | No | 1 | The page number. Starts from 1. |
| page_size | query | integer | No | 20 | The number of items per page. Minimum: 1. Maximum: 100. |
Features
-
Paged queries are supported.
-
The recording status (recording or completed) is automatically detected.
-
The list is sorted by creation time in descending order (newest first).
-
Detailed file information is returned, such as the filename, size, and creation time.
Response format
Successful response (200 OK)
{
"recordings": [
{
"filename": "vnc_global_20251116_103022_seg000.mkv",
"sessionId": "vnc_global_20251116_103022",
"segment": 0,
"size": 15728640,
"createdAt": "2025-11-16T10:30:22+08:00",
"format": "mkv",
"downloadUrl": "/recordings/vnc_global_20251116_103022_seg000.mkv",
"recordingType": "vnc",
"status": "completed"
}
],
"total": 1,
"page": 1,
"pageSize": 20
}
Response field descriptions
| Field name | Type | Description |
| --- | --- | --- |
| recordings | array | A list of recording files, sorted by creation time in descending order. |
| recordings[].filename | string | The filename. |
| recordings[].sessionId | string | The session ID. |
| recordings[].segment | integer | The segment index. Starts from 0. Used for segmented recordings. |
| recordings[].size | integer | The file size in bytes. |
| recordings[].createdAt | string | The file creation time in ISO 8601 format. |
| recordings[].format | string | The file format. Fixed as mkv. |
| recordings[].downloadUrl | string | The download URL. MKV supports stream writing, so all files can be downloaded. |
| recordings[].recordingType | string | The recording type. Fixed as vnc. |
| recordings[].status | string | The recording status. MKV supports stream writing, so the status for all files is completed. |
| total | integer | The total number of recording files, regardless of paging. |
| page | integer | The current page number. Returned only for paged queries. |
| pageSize | integer | The number of items per page. Returned only for paged queries. |
Error responses
-
400 Bad Request: Invalid request (parameter error).
-
500 Internal Server Error: Internal server error.
Request examples
# Get the first page with 20 records per page (default)
curl -X GET "http://localhost:3000/recordings/"
# Get the second page with 10 records per page
curl -X GET "http://localhost:3000/recordings/?page=2&page_size=10"
# Get up to 100 recording files in one request (100 is the maximum page_size)
curl -X GET "http://localhost:3000/recordings/?page_size=100"
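Because `page_size` is capped at 100, retrieving every recording can take more than one request. The sketch below walks the pages until the number of items seen reaches the reported `total`. It assumes the same `http://localhost:3000` base URL as the curl examples; `list_url`, `fetch_json`, and `iter_recordings` are illustrative names, and the fetch function is injectable so the loop can be exercised without a running service.

```python
import json
from typing import Callable, Iterator
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:3000"  # assumption: same host as the curl examples


def list_url(page: int = 1, page_size: int = 20) -> str:
    """Build the list URL, enforcing the documented parameter limits."""
    if page < 1:
        raise ValueError("page starts from 1")
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    query = urlencode({"page": page, "page_size": page_size})
    return f"{BASE_URL}/recordings/?{query}"


def fetch_json(url: str) -> dict:
    """Default fetcher: perform a GET and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def iter_recordings(fetch: Callable[[str], dict] = fetch_json,
                    page_size: int = 100) -> Iterator[dict]:
    """Yield every recording, requesting pages until 'total' is reached."""
    page, seen = 1, 0
    while True:
        body = fetch(list_url(page, page_size))
        items = body.get("recordings", [])
        yield from items
        seen += len(items)
        # Stop on an empty page as well, in case 'total' changes mid-scan.
        if not items or seen >= body.get("total", 0):
            break
        page += 1
```

A call such as `for rec in iter_recordings(): print(rec["filename"])` would list every file against a live service.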
2. Download a recording file
This operation downloads a specified VNC recording file.
Basic information
-
Endpoint: GET /recordings/{filename}
-
Function: Downloads the specified recording file.
-
Tag: Recording Management
Request parameters
| Parameter name | Location | Type | Required | Description |
| --- | --- | --- | --- | --- |
| filename | path | string | Yes | The recording filename. Must be an .mkv file. |
Features
-
The MKV format supports stream writing.
-
Files can be downloaded and played during recording.
-
Only .mkv files can be downloaded.
-
The correct Content-Type (video/x-matroska) is automatically set.
Response format
Successful response (200 OK)
-
Content-Type: video/x-matroska
-
Body: Binary video data.
Error responses
-
400 Bad Request: Invalid request. Example response body:
{ "error": "Invalid filename" }
Possible causes:
-
Invalid filename.
-
Not an MKV file.
-
404 Not Found: File not found.
-
500 Internal Server Error: Internal server error.
Request examples
# Download the recording file
curl -X GET "http://localhost:3000/recordings/vnc_global_20251116_103022_seg000.mkv" \
-o vnc_global_20251116_103022_seg000.mkv
# Access directly in a browser
http://localhost:3000/recordings/vnc_global_20251116_103022_seg000.mkv
Notes
-
File format restriction: Only .mkv files can be downloaded.
-
Streaming support: The MKV format supports stream writing, which allows downloading during recording.
-
Player compatibility: MKV files can be played with mainstream players such as VLC, Chrome, and Firefox.
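The download call can also be scripted. The following is a minimal sketch, assuming the same `http://localhost:3000` base URL as the curl examples. The client-side filename check is an assumption meant to catch obvious mistakes before the request is sent; the server's exact validation rules are not documented here. `recording_url` and `download_recording` are illustrative names.

```python
import shutil
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:3000"  # assumption: same host as the curl examples


def recording_url(filename: str) -> str:
    """Return the download URL after a basic client-side sanity check."""
    if not filename.lower().endswith(".mkv"):
        raise ValueError("only .mkv files can be downloaded")
    # Assumption: a recording name is a bare filename, never a path.
    if "/" in filename or "\\" in filename:
        raise ValueError("invalid filename")
    return f"{BASE_URL}/recordings/{quote(filename)}"


def download_recording(filename: str, dest_path: str) -> None:
    """Stream the recording to disk without loading it all into memory."""
    with urlopen(recording_url(filename)) as response, \
            open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

Streaming with `shutil.copyfileobj` keeps memory use flat, which matters because recording files can be large.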
3. Delete a recording file
This operation deletes a specified VNC recording file.
Basic information
-
Endpoint: DELETE /recordings/{filename}
-
Function: Deletes the specified recording file.
-
Tag: Recording Management
Request parameters
| Parameter name | Location | Type | Required | Description |
| --- | --- | --- | --- | --- |
| filename | path | string | Yes | The recording filename. Must be an .mkv file. |
Features
-
Deletion of .mkv files is supported.
-
Because the MKV format supports stream writing, files can be deleted during recording.
-
The service looks for the file in the VNC folder.
-
Use caution when deleting a file that is being recorded. Ensure that the recording has stopped.
Response format
Successful response (200 OK)
{
"message": "Recording file deleted successfully",
"filename": "vnc_global_20251116_103022_seg000.mkv"
}
Response field descriptions
| Field name | Type | Description |
| --- | --- | --- |
| message | string | A message indicating the result of the operation. |
| filename | string | The name of the deleted file. |
Error responses
-
400 Bad Request: Invalid request. Possible causes:
-
Invalid filename.
-
Unsupported file type.
-
404 Not Found: File not found.
-
500 Internal Server Error: Internal server error.
Request examples
# Delete the specified recording file
curl -X DELETE "http://localhost:3000/recordings/vnc_global_20251116_103022_seg000.mkv"
# Use jq to format the output
curl -X DELETE "http://localhost:3000/recordings/vnc_global_20251116_103022_seg000.mkv" | jq
Notes
-
Irreversible: The delete operation is irreversible. Proceed with caution.
-
Files being recorded: When deleting a file that is being recorded, make sure the recording has stopped.
-
File format restriction: Only .mkv files can be deleted.
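Because deletion cannot be undone, a script that calls this endpoint may benefit from an explicit confirmation step. The sketch below is one way to do that, assuming the same `http://localhost:3000` base URL as the curl examples. The `confirm` flag and both function names are illustrative additions, not part of the API.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:3000"  # assumption: same host as the curl examples


def build_delete_request(filename: str) -> Request:
    """Prepare a DELETE request for one recording file."""
    if not filename.lower().endswith(".mkv"):
        raise ValueError("only .mkv files can be deleted")
    url = f"{BASE_URL}/recordings/{quote(filename, safe='')}"
    return Request(url, method="DELETE")


def delete_recording(filename: str, confirm: bool = False) -> dict:
    """Delete a recording; callers must opt in with confirm=True."""
    if not confirm:
        # Hypothetical guard: the API itself has no confirmation step.
        raise RuntimeError("deletion is irreversible; pass confirm=True")
    with urlopen(build_delete_request(filename)) as response:
        return json.load(response)
```

On success, the returned dictionary would carry the `message` and `filename` fields described in the response format above.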
Data models
RecordingInfo
This object contains information about a recording file.
interface RecordingInfo {
  filename: string;        // The filename.
  sessionId: string;       // The session ID.
  segment: number;         // The segment index. Starts from 0. Used for segmented recordings.
  size: number;            // The file size in bytes.
  createdAt: string;       // The file creation time in ISO 8601 format.
  format: "mkv";           // The file format. Fixed as mkv.
  downloadUrl: string;     // The download URL.
  recordingType: "vnc";    // The recording type. Fixed as vnc.
  status: "completed";     // The recording status. Fixed as completed.
}
RecordingListResponse
This object contains the response for a recording file list request.
interface RecordingListResponse {
  recordings: RecordingInfo[];  // A list of recording files, sorted by creation time in descending order.
  total: number;                // The total number of recording files, regardless of paging.
  page?: number;                // The current page number. Returned only for paged queries.
  pageSize?: number;            // The number of items per page. Returned only for paged queries.
}
RecordingDeleteResponse
This object contains the response for a recording file deletion request.
interface RecordingDeleteResponse {
  message: string;   // A message indicating the result of the operation.
  filename: string;  // The name of the deleted file.
}
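For Python clients, the same models can be mirrored as dataclasses. This is a sketch rather than an official SDK type: the snake_case attribute names and the `from_dict` constructors are choices made here for illustration, while the JSON keys follow the interfaces above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class RecordingInfo:
    filename: str
    session_id: str
    segment: int
    size: int
    created_at: datetime
    format: str
    download_url: str
    recording_type: str
    status: str

    @classmethod
    def from_dict(cls, item: dict) -> "RecordingInfo":
        return cls(
            filename=item["filename"],
            session_id=item["sessionId"],
            segment=int(item["segment"]),
            size=int(item["size"]),
            # Parses offsets such as +08:00; a trailing "Z" needs Python 3.11+.
            created_at=datetime.fromisoformat(item["createdAt"]),
            format=item["format"],
            download_url=item["downloadUrl"],
            recording_type=item["recordingType"],
            status=item["status"],
        )


@dataclass(frozen=True)
class RecordingListResponse:
    recordings: List[RecordingInfo]
    total: int
    page: Optional[int] = None       # present only for paged queries
    page_size: Optional[int] = None  # present only for paged queries

    @classmethod
    def from_dict(cls, body: dict) -> "RecordingListResponse":
        return cls(
            recordings=[RecordingInfo.from_dict(r) for r in body["recordings"]],
            total=int(body["total"]),
            page=body.get("page"),
            page_size=body.get("pageSize"),
        )
```

Parsing `createdAt` into a timezone-aware `datetime` makes it straightforward to sort or filter recordings by age on the client side.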
Service details
Limits
-
Session lifecycle: The default maximum lifecycle of a single sandbox session is 6 hours. The session is automatically destroyed when it expires.
-
Hibernation (formerly idle) timeout: You can set the hibernation timeout for a session when you create the service using the sandboxIdleTimeoutSeconds parameter. If a session has no activity during this period, the session is terminated early to save costs.
-
Browser support: The service currently has built-in support for Chromium/Chrome browsers. Support for Firefox and Edge is planned for the future.
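The two time limits interact: a session ends at whichever comes first, the 6-hour maximum lifecycle or the end of the idle window. The sketch below estimates that deadline under two assumptions that the text above does not spell out: the idle window restarts on every activity, and the 6-hour limit is measured from session creation. `session_deadline` is an illustrative name.

```python
from datetime import datetime, timedelta, timezone

MAX_SESSION_LIFETIME = timedelta(hours=6)  # default maximum stated above


def session_deadline(created_at: datetime, last_activity_at: datetime,
                     idle_timeout_seconds: int) -> datetime:
    """Estimate when a session ends: the earlier of the two limits."""
    hard_limit = created_at + MAX_SESSION_LIFETIME
    # Assumption: the idle window restarts at the most recent activity.
    idle_limit = last_activity_at + timedelta(seconds=idle_timeout_seconds)
    return min(hard_limit, idle_limit)


if __name__ == "__main__":
    created = datetime(2025, 11, 16, 0, 0, tzinfo=timezone.utc)
    last_seen = created + timedelta(hours=1)
    print(session_deadline(created, last_seen, idle_timeout_seconds=1800))
```

A long-running script can use an estimate like this to decide when to reconnect or create a new sandbox before the current one is destroyed.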
Security notes
-
Environment isolation: Each browser sandbox instance runs in an independent container environment. This ensures complete isolation of the file system and process space between tasks and users, protecting task security.
-
Permission management: The service follows the principle of least privilege. It uses a service-linked role (AliyunServiceRoleForAgentRun) to ensure that it can only access necessary cloud resources.
-
Encryption in transit: All data plane access endpoints (CDP and VNC) use the WebSocket Secure (WSS) protocol. This ensures that your instructions and the data returned by the browser are encrypted during transmission.
Error handling
Common error response format
{
"code": 1001,
"error": "invalid request"
}
Common error codes
| HTTP status code | Error scenario | Recommended action |
| --- | --- | --- |
| 400 | Parameter error or unsupported file format. | Check the request parameters and filename. |
| 404 | File not found. | Confirm that the file exists. |
| 500 | Internal server error. | Check the server logs and contact technical support. |
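A client can combine the common error body with the table above to produce actionable messages. The sketch below is one possible approach: `describe_error` is an illustrative name, and the fallback text for undocumented status codes is an assumption added here.

```python
import json

# Recommended actions taken from the table above.
RECOMMENDED_ACTIONS = {
    400: "Check the request parameters and filename.",
    404: "Confirm that the file exists.",
    500: "Check the server logs and contact technical support.",
}


def describe_error(status: int, body: str) -> str:
    """Turn an HTTP status and raw response body into a readable message."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    message = payload.get("error", "unknown error")
    code = payload.get("code")
    # Assumption: statuses outside the table get a generic fallback.
    action = RECOMMENDED_ACTIONS.get(
        status, "No recommended action is documented for this status.")
    prefix = f"HTTP {status}" if code is None else f"HTTP {status}, code {code}"
    return f"{prefix}: {message}. {action}"


if __name__ == "__main__":
    print(describe_error(400, '{"code": 1001, "error": "invalid request"}'))
```

The function tolerates bodies that are missing the `code` field or are not JSON at all, since some error examples in this document show only an `error` field.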
Billing
BrowserTool uses a serverless billing model. You pay for what you use. Billing is mainly based on the runtime duration of the sandbox instance. No fees are incurred when no instances are running.
For more information about billing rules and prices, see Function Compute Billing overview.
Technical architecture
BrowserTool uses a layered architecture to ensure high performance, high stability, and easy scalability.
Overall architecture diagram
Core components
-
Protocol proxy layer (Go)
-
CDP WebSocket proxy: This proxy acts as a bridge between the client and the browser sandbox. It securely forwards CDP commands from Puppeteer or Playwright and handles authentication and session management. The core service is built with the Go language and goroutines. Compared to traditional Node.js implementations, this approach optimizes memory usage and concurrent connection handling.
-
VNC WebSocket proxy: This proxy transcodes the native VNC stream from TigerVNC into a WebSocket stream in real time. This allows standard web clients such as noVNC to render the remote desktop directly on a web page, providing a real-time view.
-
Browser runtime
-
Containerization solution: This solution uses Docker multi-stage builds to generate lightweight runtime images, which speeds up deployment and startup. The images have all necessary dependencies and browser drivers pre-installed.
-
VNC Remote Desktop Service: Inside the container, a combination of Xvfb (virtual screen), Openbox (minimalist window manager), and TigerVNC (VNC service) provides a graphical interface for the headless browser that can be viewed remotely.