Function Compute: BrowserTool Browser

Last Updated: Jun 03, 2026

BrowserTool is a high-performance, cloud-native headless browser sandbox service written in Go. It lets you remotely control a headless browser instance that runs in an isolated cloud container, using the standard Chrome DevTools Protocol (CDP) over WebSocket. The service is natively compatible with popular automation frameworks such as Puppeteer and Playwright.

Primary use cases

  • AI agent integration: It acts as the "eyes" and "hands" for large language models (LLMs), giving them the ability to perform complex tasks such as web browsing, information extraction, and online operations.

  • Automated testing: You can run end-to-end (E2E) and visual regression tests on demand in the cloud without needing to maintain a local testing environment.

  • Data collection: You can scrape web pages reliably and efficiently, including pages that use dynamic content or anti-scraping measures.

  • Content generation: You can automatically generate PDFs or screenshots from dynamic web pages or data dashboards for reports and archives.

Core value

  • Fully managed: You do not need to install, configure, or maintain Chrome browsers and their complex dependencies on your servers or local machines.

  • Serverless architecture: The service uses a pay-as-you-go billing model and scales elastically with demand, so you pay only for the sandbox instances that are running.

  • Native compatibility: You can smoothly migrate your existing Puppeteer or Playwright scripts, usually without any modifications.

Quick start

This guide walks you through the entire process, from creating a service to running your first automation script.

Step 1: Create a BrowserTool service

  1. Log on to the AgentRun console.

  2. The first time you use the service, you are prompted to grant the AgentRun service-linked role (AliyunServiceRoleForAgentRun) to your Alibaba Cloud account. Follow the on-screen instructions to complete the authorization. This step is required for the service to function correctly.

  3. On the Runtimes and Sandboxes tab, select Sandbox. Click Create Sandbox Template, select Browser Sandbox, and then click Create Now to create the BrowserTool service.

You can also create the service using OpenAPI or an SDK.

Step 2: Get the connection endpoint

  1. After the service is created, find the BrowserTool service in the service list on the console. Click the service to open the Details page.

  2. On the VNC Debugging tab on the details page, click Create Sandbox.

  3. After the sandbox is created, a WebSocket endpoint (CDP Endpoint) for automated connections is provided. You can copy it directly from the console. The endpoint format is similar to the following:

    wss://1234567890.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/br-abcdef123456/ws/automation?tenantId=1234567890
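Before you pass the endpoint to an automation framework, it can help to check that the copied URL has the expected shape. The following is a minimal sketch that assumes the endpoint format shown above. `parseCdpEndpoint` is a hypothetical helper written for this guide and is not part of any BrowserTool SDK.

```javascript
// Hypothetical helper: splits a BrowserTool CDP endpoint into its parts.
// Assumes the URL shape shown in this guide; adjust it if your endpoint differs.
function parseCdpEndpoint(endpoint) {
  const url = new URL(endpoint);
  if (url.protocol !== 'wss:') {
    throw new Error(`Expected a wss:// URL, got ${url.protocol}//`);
  }
  // The path is expected to look like /sandboxes/{sandboxID}/ws/automation
  const match = url.pathname.match(/^\/sandboxes\/([^/]+)\/ws\/automation\/?$/);
  if (!match) {
    throw new Error(`Unexpected endpoint path: ${url.pathname}`);
  }
  const tenantId = url.searchParams.get('tenantId');
  if (!tenantId) {
    throw new Error('Missing tenantId query parameter');
  }
  return { host: url.host, sandboxId: match[1], tenantId };
}
```

Calling the helper on the sample endpoint above returns the host name, the sandbox ID `br-abcdef123456`, and the tenant ID `1234567890`. A malformed URL raises an error before any connection attempt is made.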

Step 3: Connect and run an automation script

Use the full connection endpoint URL from the previous step in your automation framework script to connect to BrowserTool and run your task.

Puppeteer connection example

const puppeteer = require('puppeteer-core');

// Obtain this endpoint from the BrowserTool service details page.
const BROWSERTOOL_CDP_ENDPOINT = 'wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/automation?tenantId={accountID}';

async function main() {
  const browser = await puppeteer.connect({
    browserWSEndpoint: BROWSERTOOL_CDP_ENDPOINT,
  });

  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.screenshot({ path: 'example.png' });
  console.log('Screenshot taken!');
  await browser.close();
}

main();

Playwright connection example

const { chromium } = require('playwright-core');

// Obtain this endpoint from the BrowserTool service details page.
const BROWSERTOOL_CDP_ENDPOINT = 'wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/automation?tenantId={accountID}';

async function main() {
  // Pass the endpoint as the first argument; the endpointURL option is deprecated in Playwright.
  const browser = await chromium.connectOverCDP(BROWSERTOOL_CDP_ENDPOINT);

  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.screenshot({ path: 'example.png' });
  console.log('Screenshot taken!');
  await browser.close();
}

main();
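Both examples above skip `browser.close()` if a step such as `page.goto()` throws. The following sketch shows one way to guarantee cleanup. `withBrowser` is a hypothetical wrapper, not a Puppeteer or Playwright API; it only assumes that the connected object exposes `close()`, as both frameworks do.

```javascript
// Hypothetical wrapper: runs a task against a connected browser and always
// closes the browser handle afterwards, even if the task throws.
// `connect` is any function that resolves to an object with a close() method,
// for example: () => puppeteer.connect({ browserWSEndpoint: endpoint })
async function withBrowser(connect, task) {
  const browser = await connect();
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}

// Possible usage with the Puppeteer example above (not executed here):
//
//   await withBrowser(
//     () => puppeteer.connect({ browserWSEndpoint: BROWSERTOOL_CDP_ENDPOINT }),
//     async (browser) => {
//       const page = await browser.newPage();
//       await page.goto('https://example.com');
//       await page.screenshot({ path: 'example.png' });
//     },
//   );
```

Taking the connect function as a parameter keeps the wrapper independent of the framework, so the same helper can serve Puppeteer and Playwright scripts.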

Feature guide

WebSocket automation endpoints

BrowserTool provides two main WebSocket endpoints for different automation scenarios.

  1. CDP automation endpoint (/ws/automation)

    This endpoint is for browser automation and is compatible with Puppeteer and Playwright. The endpoint format is as follows:

    wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/automation?tenantId={accountID}

    You can use the wscat tool to interact directly with the CDP endpoint and send raw CDP commands for debugging.

    # Install wscat
    npm install -g wscat
    
    # Connect to the CDP proxy
    wscat -c "wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/automation?tenantId={accountID}"
    
    # Send a CDP command
    {"id":1,"method":"Runtime.evaluate","params":{"expression":"navigator.userAgent"}}

  2. VNC live stream endpoint (/ws/automation is for CDP; this one is /ws/livestream)

    This WebSocket endpoint provides a real-time view of the browser's desktop environment. It supports viewing through a NoVNC client. For more information, see View the browser interface in real time (VNC debugging). The endpoint format is as follows:

    wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/livestream?tenantId={accountID}
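The raw CDP command sent through wscat above is a JSON object with a numeric `id`, a `method`, and `params`, and the response to a command carries the same `id`. If you talk to the automation endpoint without Puppeteer or Playwright, you need to match responses to requests yourself. The following is a minimal sketch of that bookkeeping. `createCdpSession` and its `send` callback are hypothetical names; `send` stands for whatever function writes a text frame to your WebSocket client.

```javascript
// Minimal sketch of CDP request/response bookkeeping.
// `send` is a caller-supplied function that writes one text frame to the socket.
function createCdpSession(send) {
  let nextId = 1;
  const pending = new Map(); // id -> { resolve, reject }

  return {
    // Serializes a command, sends it, and returns a promise for its result.
    call(method, params = {}) {
      const id = nextId++;
      const promise = new Promise((resolve, reject) => {
        pending.set(id, { resolve, reject });
      });
      send(JSON.stringify({ id, method, params }));
      return promise;
    },

    // Feed every incoming text frame to this function.
    // Returns true if the frame settled a pending command, false otherwise.
    handleMessage(raw) {
      const msg = JSON.parse(raw);
      if (msg.id === undefined) return false; // an event, not a response
      const entry = pending.get(msg.id);
      if (!entry) return false;
      pending.delete(msg.id);
      if (msg.error) {
        entry.reject(new Error(msg.error.message));
      } else {
        entry.resolve(msg.result);
      }
      return true;
    },
  };
}
```

With this helper, `session.call('Runtime.evaluate', { expression: 'navigator.userAgent' })` produces the same frame as the wscat example and resolves once a frame with the matching `id` arrives.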

View the browser interface in real time (VNC debugging)

BrowserTool supports real-time viewing of the remote browser's desktop environment through Virtual Network Computing (VNC). This makes it easy to monitor the execution of automation tasks during development and debugging.

Recommended method: Use the online noVNC client

  1. You can access the official online noVNC client at: https://novnc.com/noVNC/vnc.html

  2. In the connection settings, go to Advanced > WebSocket and enter the following connection information:

    • Host: {accountID}.agentrun-data.ap-southeast-1.aliyuncs.com

    • Port: 443

    • Path: sandboxes/{sandboxID}/ws/livestream?tenantId={accountID}

  3. Click Connect to view the browser interface.
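The Host, Port, and Path values above can be derived from the CDP endpoint that you copied from the console, because the two endpoints differ only in the last path segment. The following sketch shows that derivation. `toNoVncSettings` is a hypothetical helper, and it assumes the endpoint formats documented in this guide.

```javascript
// Hypothetical helper: derives the noVNC Host/Port/Path fields from a
// BrowserTool CDP automation endpoint by swapping the path suffix.
function toNoVncSettings(cdpEndpoint) {
  const url = new URL(cdpEndpoint);
  const path = url.pathname
    .replace(/^\//, '') // the noVNC Path field has no leading slash
    .replace(/\/ws\/automation\/?$/, '/ws/livestream');
  return {
    host: url.hostname,
    port: 443, // the port listed above for the online noVNC client
    path: `${path}${url.search}`, // keeps the ?tenantId=... query string
  };
}
```

For the sample endpoint in the quick start, the resulting path is `sandboxes/br-abcdef123456/ws/livestream?tenantId=1234567890`.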

If you want to create a custom frontend integration for full control over the VNC view's presentation and interaction logic, you must import the core noVNC JavaScript library and initialize it in your frontend code. For an example, see https://github.com/novnc/noVNC/blob/master/vnc_lite.html.

Alternative method: Use a Docker image

To use the client in a local or offline environment, you can run our pre-configured noVNC Docker image.

  1. You can run the following Docker command in your local terminal to start the noVNC client container:

    docker run -p 8184:80 -d --rm registry.cn-shanghai.aliyuncs.com/fc-demo2/custom-container-repository:browsertool-sandbox-vnc-client_v0.3.1

  2. Open your browser and navigate to http://localhost:8184.

  3. In the connection settings, go to Advanced > WebSocket and enter the following connection information:

    • Host: {accountID}.agentrun-data.ap-southeast-1.aliyuncs.com

    • Port: 80

    • Path: sandboxes/{sandboxID}/ws/livestream?tenantId={accountID}

  4. Click Connect to view the browser interface.

Note

After a successful connection, the initial interface may be a black or gray screen. This is normal because the browser is waiting for instructions. Content is displayed after your automation script runs an operation such as page.goto().

Framework integration examples

BrowserTool integrates easily with AI agent frameworks or automation libraries in various programming languages.

  • BrowserUse operation example

    from browser_use import Agent, BrowserSession
    from browser_use.llm import ChatDeepSeek
    from browser_use.browser import BrowserProfile
    
    from dotenv import load_dotenv
    import os
    import asyncio
    
    load_dotenv()
    
    async def main():
        # Enter the CDP URL
        browser_session_wss_url = "wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/automation?tenantId={accountID}"
        
        browser_session = BrowserSession(
            cdp_url=browser_session_wss_url, 
            browser_profile=BrowserProfile(
                headless=False,
                user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.X.X Safari/537.36",
                timeout=3000000,
                keep_alive=True,
            )
        )
        
        # Enter the API key for DeepSeek. If you use a different model, modify this accordingly.
        llm = ChatDeepSeek(api_key="sk-your-deepseek-sk")
    
        agent = Agent(
            task="Visit https://www.aliyun.com/product/list and analyze the products that Alibaba Cloud currently offers.",
            llm=llm,
            browser_session=browser_session,
            use_vision=True
        )
        result = await agent.run()
        print(result)
    
    
    if __name__ == "__main__":
        asyncio.run(main())

  • Puppeteer connection example

    const puppeteer = require('puppeteer-core');
    
    // CommonJS modules do not support top-level await, so wrap the calls in an async function.
    async function main() {
      const browser = await puppeteer.connect({
        browserWSEndpoint: 'wss://{accountID}.agentrun-data.ap-southeast-1.aliyuncs.com/sandboxes/{sandboxID}/ws/automation?tenantId={accountID}'
      });
    
      const page = await browser.newPage();
      await page.goto('https://example.com');
      await page.screenshot({ path: 'screenshot.png' });
      await browser.close();
    }
    
    main();

Operation recording

1. List all recordings

This operation retrieves a list of all VNC recording files in the system. Paged queries are supported.

Basic information

  • Endpoint: GET /recordings/

  • Function: Returns a list of VNC recording files.

  • Tag: Recording Management

Request parameters

| Parameter name | Location | Type | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| page | query | integer | No | 1 | The page number. Starts from 1. |
| page_size | query | integer | No | 20 | The number of items per page. Maximum: 100. Minimum: 1. |

Features

  • Paged queries are supported.

  • The recording status (recording or completed) is automatically detected.

  • The list is sorted by creation time in descending order (newest first).

  • Detailed file information is returned, such as the filename, size, and creation time.

Response format

Successful response (200 OK)

{
  "recordings": [
    {
      "filename": "vnc_global_20251116_103022_seg000.mkv",
      "sessionId": "vnc_global_20251116_103022",
      "segment": 0,
      "size": 15728640,
      "createdAt": "2025-11-16T10:30:22+08:00",
      "format": "mkv",
      "downloadUrl": "/recordings/vnc_global_20251116_103022_seg000.mkv",
      "recordingType": "vnc",
      "status": "completed"
    }
  ],
  "total": 1,
  "page": 1,
  "pageSize": 20
}

Response field descriptions

| Field name | Type | Description |
| --- | --- | --- |
| recordings | array | A list of recording files, sorted by creation time in descending order. |
| recordings[].filename | string | The filename. |
| recordings[].sessionId | string | The session ID. |
| recordings[].segment | integer | The segment index. Starts from 0. Used for segmented recordings. |
| recordings[].size | integer | The file size in bytes. |
| recordings[].createdAt | string | The file creation time in ISO 8601 format. |
| recordings[].format | string | The file format. Fixed as mkv. |
| recordings[].downloadUrl | string | The download URL. MKV supports stream writing, so all files can be downloaded. |
| recordings[].recordingType | string | The recording type. Fixed as vnc. |
| recordings[].status | string | The recording status. MKV supports stream writing, so the status for all files is completed. You can play the file during recording. |
| total | integer | The total number of recording files, regardless of paging. |
| page | integer | The current page number. Returned only for paged queries. |
| pageSize | integer | The number of items per page. Returned only for paged queries. |

Error responses

  • 400 Bad Request: Invalid request (parameter error).

  • 500 Internal Server Error: Internal server error.

Request examples

# Get the first page with 20 records per page (default)
curl -X GET "http://localhost:3000/recordings/"

# Get the second page with 10 records per page
curl -X GET "http://localhost:3000/recordings/?page=2&page_size=10"

# Get up to 100 recording files in one request (100 is the maximum page_size)
curl -X GET "http://localhost:3000/recordings/?page_size=100"
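Because `page_size` is capped at 100, retrieving every recording may take several requests. The following sketch walks the pages until the number of collected items reaches `total`. `listAllRecordings` is a hypothetical helper, and `baseUrl` and `fetchImpl` are parameters introduced for this example; in Node.js 18 or later you can pass the global `fetch` as `fetchImpl`.

```javascript
// Sketch: collects every recording by walking the paged GET /recordings/ API.
// `fetchImpl` is injected so that the function can be exercised without a server.
async function listAllRecordings(baseUrl, fetchImpl, pageSize = 100) {
  if (!Number.isInteger(pageSize) || pageSize < 1 || pageSize > 100) {
    throw new RangeError('page_size must be an integer between 1 and 100');
  }
  const all = [];
  for (let page = 1; ; page += 1) {
    const url = new URL('/recordings/', baseUrl);
    url.searchParams.set('page', String(page));
    url.searchParams.set('page_size', String(pageSize));
    const res = await fetchImpl(url.toString());
    if (!res.ok) {
      throw new Error(`GET ${url} failed with HTTP ${res.status}`);
    }
    const body = await res.json();
    all.push(...body.recordings);
    // Stop when everything is collected or the server returns an empty page.
    if (all.length >= body.total || body.recordings.length === 0) {
      return all;
    }
  }
}
```

A call such as `listAllRecordings('http://localhost:3000', fetch)` mirrors the curl examples above. The empty-page check guards against looping forever if files are deleted while the pages are being read.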

2. Download a recording file

This operation downloads a specified VNC recording file.

Basic information

  • Endpoint: GET /recordings/{filename}

  • Function: Downloads the specified recording file.

  • Tag: Recording Management

Request parameters

| Parameter name | Location | Type | Required | Description |
| --- | --- | --- | --- | --- |
| filename | path | string | Yes | The recording filename. Must be an .mkv file. |

Features

  • The MKV format supports stream writing.

  • Files can be downloaded and played during recording.

  • Only .mkv files can be downloaded.

  • The correct Content-Type (video/x-matroska) is automatically set.

Response format

Successful response (200 OK)

  • Content-Type: video/x-matroska

  • Body: Binary video data.

Error responses

  • 400 Bad Request: Invalid request.

    • Invalid filename.

    • Not an MKV file.

    Example response body:

    {
      "error": "Invalid filename"
    }

  • 404 Not Found: File not found.

  • 500 Internal Server Error: Internal server error.

Request examples

# Download the recording file
curl -X GET "http://localhost:3000/recordings/vnc_global_20251116_103022_seg000.mkv" \
  -o vnc_global_20251116_103022_seg000.mkv

# Access directly in a browser
http://localhost:3000/recordings/vnc_global_20251116_103022_seg000.mkv

Notes

  1. File format restriction: Only .mkv files can be downloaded.

  2. Streaming support: The MKV format supports stream writing, which allows downloading during recording.

  3. Player compatibility: MKV files can be played with mainstream players such as VLC, Chrome, and Firefox.
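You can check a filename on the client before requesting a download, which avoids a round trip that would end in a 400 response. The following sketch is illustrative only. The server's exact validation rules are not documented here, so `isDownloadableRecording` is an assumption based on the restrictions listed above, and `parseRecordingFilename` assumes the `{sessionId}_seg{NNN}.mkv` pattern seen in the sample response.

```javascript
// Hypothetical client-side check that mirrors the documented restriction:
// the name must be a plain .mkv filename without path separators.
function isDownloadableRecording(filename) {
  return typeof filename === 'string' && /^[^/\\]+\.mkv$/.test(filename);
}

// Assumption: filenames follow the pattern seen in the sample response,
// "{sessionId}_seg{NNN}.mkv". Returns null when the name does not match.
function parseRecordingFilename(filename) {
  const match = /^(.+)_seg(\d+)\.mkv$/.exec(filename);
  if (!match) return null;
  return { sessionId: match[1], segment: Number(match[2]) };
}
```

For `vnc_global_20251116_103022_seg000.mkv`, the parser yields the session ID `vnc_global_20251116_103022` and segment `0`, which matches the `sessionId` and `segment` fields in the list response.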

3. Delete a recording file

This operation deletes a specified VNC recording file.

Basic information

  • Endpoint: DELETE /recordings/{filename}

  • Function: Deletes the specified recording file.

  • Tag: Recording Management

Request parameters

| Parameter name | Location | Type | Required | Description |
| --- | --- | --- | --- | --- |
| filename | path | string | Yes | The recording filename. Must be an .mkv file. |

Features

  • Deletion of .mkv files is supported.

  • Because the MKV format supports stream writing, files can be deleted during recording.

  • The file is looked up in the VNC folder.

  • Use caution when deleting a file that is still being recorded. Where possible, stop the recording first.

Response format

Successful response (200 OK)

{
  "message": "Recording file deleted successfully",
  "filename": "vnc_global_20251116_103022_seg000.mkv"
}

Response field descriptions

| Field name | Type | Description |
| --- | --- | --- |
| message | string | A message indicating the result of the operation. |
| filename | string | The name of the deleted file. |

Error responses

  • 400 Bad Request: Invalid request.

    • Invalid filename.

    • Unsupported file type.

  • 404 Not Found: File not found.

  • 500 Internal Server Error: Internal server error.

Request examples

# Delete the specified recording file
curl -X DELETE "http://localhost:3000/recordings/vnc_global_20251116_103022_seg000.mkv"

# Use jq to format the output
curl -X DELETE "http://localhost:3000/recordings/vnc_global_20251116_103022_seg000.mkv" | jq

Notes

  1. Irreversible: The delete operation is irreversible. Proceed with caution.

  2. Files being recorded: Before you delete a file that is still being recorded, make sure that the recording has stopped.

  3. File format restriction: Only .mkv files can be deleted.
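Because deletion is irreversible, a script that calls this operation may want to validate the filename first and handle each documented status code explicitly. The following is a minimal sketch. `deleteRecording` is a hypothetical helper, `fetchImpl` is injected for testability, and returning `null` for a 404 response is a design choice of this example rather than documented behavior.

```javascript
// Sketch: deletes one recording and returns the parsed success body.
// Pass the global fetch as `fetchImpl` in Node.js 18 or later.
async function deleteRecording(baseUrl, filename, fetchImpl) {
  if (typeof filename !== 'string' || !filename.endsWith('.mkv')) {
    throw new Error('Only .mkv files can be deleted');
  }
  const url = new URL(`/recordings/${encodeURIComponent(filename)}`, baseUrl);
  const res = await fetchImpl(url.toString(), { method: 'DELETE' });
  if (res.status === 404) {
    return null; // the file is already gone; this sketch treats that as a no-op
  }
  if (!res.ok) {
    throw new Error(`DELETE ${filename} failed with HTTP ${res.status}`);
  }
  return res.json(); // expected shape: { message, filename }
}
```

On success, the resolved value has the `message` and `filename` fields described in the response field table above.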

Data models

RecordingInfo

This object contains information about a recording file.

interface RecordingInfo {
  filename: string;           // The filename.
  sessionId: string;          // The session ID.
  segment: number;            // The segment index. Starts from 0. Used for segmented recordings.
  size: number;               // The file size in bytes.
  createdAt: string;          // The file creation time in ISO 8601 format.
  format: "mkv";              // The file format. Fixed as mkv.
  downloadUrl: string;        // The download URL.
  recordingType: "vnc";       // The recording type. Fixed as vnc.
  status: "completed";        // The recording status. Fixed as completed.
}
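TypeScript interfaces are erased at compile time, so a client that parses JSON from the recordings API may still want a runtime check. The following sketch validates a parsed value against the RecordingInfo fields listed above. `isRecordingInfo` is a hypothetical helper, and it follows the interface in treating `status` as fixed to `completed`.

```javascript
// Runtime check that a parsed JSON value has the RecordingInfo shape.
function isRecordingInfo(value) {
  if (value === null || typeof value !== 'object') return false;
  return typeof value.filename === 'string'
    && typeof value.sessionId === 'string'
    && Number.isInteger(value.segment) && value.segment >= 0
    && Number.isInteger(value.size) && value.size >= 0
    && typeof value.createdAt === 'string'
    && !Number.isNaN(Date.parse(value.createdAt)) // must be a parseable timestamp
    && value.format === 'mkv'
    && typeof value.downloadUrl === 'string'
    && value.recordingType === 'vnc'
    && value.status === 'completed';
}
```

A typical use is `body.recordings.filter(isRecordingInfo)`, which drops malformed entries before the rest of the script relies on their fields.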

RecordingListResponse

This object contains the response for a recording file list request.

interface RecordingListResponse {
  recordings: RecordingInfo[];  // A list of recording files, sorted by creation time in descending order.
  total: number;                // The total number of recording files, regardless of paging.
  page?: number;                // The current page number. Returned only for paged queries.
  pageSize?: number;            // The number of items per page. Returned only for paged queries.
}

RecordingDeleteResponse

This object contains the response for a recording file deletion request.

interface RecordingDeleteResponse {
  message: string;   // A message indicating the result of the operation.
  filename: string;  // The name of the deleted file.
}

Service details

Limits

  • Session lifecycle: The default maximum lifecycle of a single sandbox session is 6 hours. The session is automatically destroyed when it expires.

  • Hibernation (formerly idle) timeout: You can set the hibernation timeout for a session when you create the service using the sandboxIdleTimeoutSeconds parameter. If a session has no activity during this period, the session is terminated early to save costs.

  • Browser support: The service currently has built-in support for Chromium/Chrome browsers. Support for Firefox and Edge is planned for the future.
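The two time limits above interact: a session ends at whichever comes first, the maximum lifecycle or the hibernation timeout. The following sketch estimates that moment on the client side. `estimateSessionEnd` is a hypothetical helper; it assumes the default 6-hour lifecycle and that you track the last activity time yourself, because how the service measures activity is not specified here.

```javascript
const MAX_SESSION_MS = 6 * 60 * 60 * 1000; // default maximum lifecycle: 6 hours

// Estimates when a session will be destroyed: whichever comes first, the
// maximum lifecycle or the idle timeout measured from the last activity.
// createdAt and lastActivityAt are Date objects.
function estimateSessionEnd(createdAt, lastActivityAt, idleTimeoutSeconds) {
  const hardLimit = createdAt.getTime() + MAX_SESSION_MS;
  const idleLimit = lastActivityAt.getTime() + idleTimeoutSeconds * 1000;
  return new Date(Math.min(hardLimit, idleLimit));
}
```

For example, with a 1,800-second timeout, a session created at 00:00 whose last activity was at 01:00 is estimated to end at 01:30. If the last activity was at 05:50, the 6-hour limit applies and the estimate is 06:00.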

Security notes

  • Environment isolation: Each browser sandbox instance runs in an independent container environment. This ensures complete isolation of the file system and process space between tasks and users, protecting task security.

  • Permission management: The service follows the principle of least privilege. It uses a service-linked role (AliyunServiceRoleForAgentRun) to ensure that it can only access necessary cloud resources.

  • Encryption in transit: All data plane access endpoints (CDP and VNC) use the WebSocket Secure (WSS) protocol. This ensures that your instructions and the data returned by the browser are encrypted during transmission.

Error handling

Common error response format

{
  "code": 1001,
  "error": "invalid request"
}

Common error codes

| HTTP status code | Error scenario | Recommended action |
| --- | --- | --- |
| 400 | Parameter error, unsupported file format. | Check the request parameters and filename. |
| 404 | File not found. | Confirm that the file exists. |
| 500 | Internal server error. | Check the server logs and contact technical support. |
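A client can turn the common error format and the recommended actions above into a single error object, so that log messages say what to do next. The following is a minimal sketch. `toRecordingError` and `RECOMMENDED_ACTIONS` are hypothetical names, and the sketch assumes the response body has the `{ code, error }` form shown above.

```javascript
// Maps the documented HTTP status codes to the recommended actions above.
const RECOMMENDED_ACTIONS = {
  400: 'Check the request parameters and filename.',
  404: 'Confirm that the file exists.',
  500: 'Check the server logs and contact technical support.',
};

// Builds an Error from a status code and a parsed error body of the
// documented form { code, error }. The body may be null or undefined.
function toRecordingError(status, body) {
  const detail = body && body.error ? body.error : 'unknown error';
  const action = RECOMMENDED_ACTIONS[status] || 'No documented action for this status.';
  const err = new Error(`HTTP ${status}: ${detail}. ${action}`);
  err.status = status;
  err.code = body ? body.code : undefined;
  return err;
}
```

Attaching `status` and `code` to the error lets calling code branch on them without parsing the message text.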

Billing

BrowserTool uses a serverless billing model. You pay for what you use. Billing is mainly based on the runtime duration of the sandbox instance. No fees are incurred when no instances are running.

For more information about billing rules and prices, see Function Compute Billing overview.

Technical architecture

BrowserTool uses a layered architecture to ensure high performance, high stability, and easy scalability.

Overall architecture diagram

[Figure: overall architecture of BrowserTool]

Core components

  • Protocol proxy layer (Go)

    • CDP WebSocket proxy: This proxy acts as a bridge between the client and the browser sandbox. It securely forwards CDP commands from Puppeteer or Playwright and handles authentication and session management. The core service is built with the Go language and goroutines. Compared to traditional Node.js implementations, this approach optimizes memory usage and concurrent connection handling.

    • VNC WebSocket proxy: This proxy transcodes the native VNC stream from TigerVNC into a WebSocket stream in real time. This allows standard web clients such as noVNC to render the remote desktop directly on a web page, providing a real-time view.

  • Browser runtime

    • Containerization solution: This solution uses Docker multi-stage builds to generate lightweight runtime images, which speeds up deployment and startup. The images have all necessary dependencies and browser drivers pre-installed.

    • VNC Remote Desktop Service: Inside the container, a combination of Xvfb (virtual screen), Openbox (minimalist window manager), and TigerVNC (VNC service) provides a graphical interface for the headless browser that can be viewed remotely.