Overview of open capabilities
Tool | Name | Description | Support status |
| Creates a new AgentBay sandbox and returns its ID. | Supported | |
| Gets the runtime URL for accessing the Wuying MCP runtime. The URL can be used only once and becomes invalid after use. | Supported | |
| Captures a full-screen screenshot of the current screen and returns a shareable URL. The screenshot is automatically processed and stored securely. For security, the generated URL expires after 64 minutes. | Supported | |
| Releases resources after the task is complete. | Supported | |
| Executes shell commands on the Linux platform with timeout control. Returns the command output or an error message. | Supported | |
| Closes the page. | Supported | |
| Resizes the browser window. | Supported | |
| Returns all console messages. | Supported | |
| Handles a dialog box. | Supported | |
| Uploads one or more files. | Supported | |
| Installs the browser specified in the configuration. Call this tool if you receive an error that the browser is not installed. | Supported | |
| Presses a key on the keyboard. | Supported | |
| Navigates to a URL. | Supported | |
| Navigates to the previous page. | Supported | |
| Navigates to the next page. | Supported | |
| Returns all network requests since the page was loaded. | Supported | |
| Saves the page as a PDF file. | Supported | |
| Takes a screenshot of the current page. Actions cannot be performed based on the screenshot. To perform actions, use browser_snapshot. | Supported | |
| Captures an accessibility snapshot of the current page. | Supported | |
| Performs a click operation on the web page. | Supported | |
| Performs a drag-and-drop operation between two elements. | Supported | |
| Hovers the mouse over a page element. | Supported | |
| Enters text into an editable element. | Supported | |
| Selects an option in a drop-down menu. | Supported | |
| Lists the browser tabs. | Supported | |
| Opens a new tab. | Supported | |
| Selects a tab by its index. | Supported | |
| Closes a tab. | Supported | |
| Generates a Playwright test for a specified scenario. | Supported | |
| Waits for text to appear or disappear, or for a specified time to pass. | Supported | |
| The command entered by the client. For example: | Supported | |
| The command execution timeout in milliseconds. If not specified, the default value of 1000 milliseconds is used. | Supported | |
| Retrieves a list of applications installed on the system. Supports filtering by Start menu entries and desktop shortcuts, and lets you exclude system applications. Returns application details, including the name, start command, optional stop command, and working directory. The start command can contain placeholders, such as %F and %U, which follow standard Linux desktop entry rules. When an application starts, these placeholders must be replaced with the appropriate arguments, file paths, or URLs. | Supported | |
| Starts a specified application using the provided command and optional working directory. Returns a list of processes associated with the launched application, including the process name, PID, and start command. | Supported | |
| Terminates all processes associated with a specified process name. Use with caution, as this forcefully terminates the specified process. | Supported | |
| Terminates a specific process identified by its process ID. Use with caution, as this forcefully terminates the specified process. | Supported | |
| Terminates an application using the provided stop command. Use with caution, as this forcefully terminates the specified process. | Supported | |
| Lists all applications with visible windows, including their associated process information. Returns a list of processes that have visible windows, including the process name, PID, and start command. | Supported | |
| Creates a new directory or ensures a directory exists. You can create multiple nested directories in one operation. If the directory already exists, this operation succeeds silently. This is ideal for setting up directory structures for projects or ensuring required paths exist. Note Works only in allowed directories. | Supported | |
| Performs line-based edits on a text file. Each edit replaces an exact sequence of lines with new content. Returns a git-style diff to show the changes made. Note Works only in allowed directories. | Supported | |
| Retrieves detailed metadata for a file or directory. Returns comprehensive information including size, creation time, last modified time, permissions, and type. This tool is ideal for understanding file characteristics without reading the actual content. Note Works only in allowed directories. | Supported | |
| Reads the content of a file from the file system. You can specify an optional Note Works only in allowed directories. | Supported | |
| Reads the contents of multiple files at the same time. This is more efficient than reading files one by one when you need to analyze or compare multiple files. The content of each file is returned with its path as a reference. A failure to read a single file does not stop the entire operation. Note Works only in allowed directories. | Supported | |
| Gets a detailed list of all files and directories in the specified path. The results clearly distinguish between files and directories with [FILE] and [DIR] prefixes. This tool is useful for understanding the directory structure and finding specific files. Note Works only in allowed directories. | Supported | |
| Moves or renames files and directories. You can move a file to a different directory and rename it in a single operation. If the destination already exists, the operation fails. It works between different directories and can be used for simple renaming within the same directory. Note The source and destination must both be in allowed directories. | Supported | |
| Recursively searches for files and directories that match a pattern. It searches all subdirectories from the starting path. The search is case-sensitive and matches partial names. It returns the full paths of all matching items. This is ideal for finding files when you do not know their exact location. Note Searches only in allowed directories. | Supported | |
| Creates a new file or writes to the content of an existing file. You can choose to completely overwrite the file or append to the end of the file by specifying the Note Works only in allowed directories. | Supported | |
| Lists all root windows and their related information. Returns a list of root windows, including the window ID, window title, process ID, and process name. | Supported | |
| Retrieves information about the current active window. Returns details including the window ID, title, process ID (pid), and process name (pname). | Supported | |
| Activates a specific window by its window ID. | Supported | |
| Maximizes a specific window by its window ID. | Supported | |
| Minimizes a specific window by its window ID. | Supported | |
| Restores a specific window to its normal state by its window ID. | Supported | |
| Closes a specific window by its window ID. | Supported | |
| Resizes a specific window by its window ID. | Supported | |
| Sets a specific window to full-screen mode by its window ID. | Supported | |
| Enables or disables focus mode. When focus mode is enabled, only windows from the currently active process and its child processes can remain in the foreground. If other processes attempt to bring their windows to the foreground, those windows are closed. | Supported | |
| Closes the current browser proxy session. Terminates the browser process managed by the proxy and cleans up related resources. | Supported | |
| Extracts information from a web page based on the provided instructions and returns structured results in the specified schema format. | Supported | |
| Navigates to the specified URL in the browser. | Supported | |
| An asynchronous function to get the result of a task. | Supported | |
| Captures a screenshot of the current web page. Provides flexible screenshot features, including full-page screenshots, cropping of specified areas, and image quality settings. The result is returned in data URL format (data:image/png;base64,...), which can be used directly in frontend applications. | Supported | |
| Identifies and locates interactive UI elements. Discovers and describes elements that can be interacted with, such as buttons or input boxes, for subsequent operations. Use the observe tool to find actionable elements. To extract structured data or text content, use the extract tool. | Supported | |
| Gets the current progress or final result of an asynchronous operation task. This method must be used with the | Supported | |
| Asynchronously extracts information from a web page based on the provided instructions and returns structured results in the specified schema format. | Supported | |
| Asynchronously starts the execution of one or more operations on the current web page and returns a Unlike | Supported | |
| Executes one or more operations on the current web page and blocks until all operations are complete. This method immediately executes the specified operation or sequence of operations through the current Agent and returns the final result after the entire sequence is complete. It can handle a single interaction or a series of ordered steps (an action sequence). The call blocks until one of the following conditions is met:
Note Unlike To monitor or check the result of each step, use the asynchronous version. | Supported |
System
The System tool builds a secure and isolated cloud sandbox environment. It supports dynamic resource creation and destruction, command execution, retrieval of visualization results, and controlled access.
Name | Description | Parameters |
| Creates a new AgentBay sandbox and returns its ID. | |
| Gets the runtime URL for accessing the Wuying MCP runtime. The URL can be used only once and becomes invalid after use. | |
| Captures a full-screen screenshot of the current screen and returns a shareable URL. The screenshot is automatically processed and stored securely. For security, the generated URL expires after 64 minutes. | |
| Releases resources after the task is complete. | |
| Executes shell commands on the Linux platform with timeout control. Returns the command output or an error message. | |
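The two URL rules in the table above (the runtime URL is single-use, and the screenshot URL expires after 64 minutes) can be tracked on the client side. The following is a minimal sketch under stated assumptions: `IssuedUrl` is a hypothetical helper, not part of AgentBay, and treating the URL as expired at exactly 64 minutes is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Lifetime of a screenshot URL as stated in the table above.
SCREENSHOT_URL_TTL = timedelta(minutes=64)


@dataclass
class IssuedUrl:
    """Client-side record of a URL returned by a sandbox tool (hypothetical helper)."""

    url: str
    issued_at: datetime
    ttl: Optional[timedelta] = None  # None: no time-based expiry is tracked
    single_use: bool = False
    used: bool = False

    def is_usable(self, now: datetime) -> bool:
        """Return False once a single-use URL was consumed or the TTL has elapsed."""
        if self.single_use and self.used:
            return False
        # Assumption: the URL counts as expired at exactly the TTL boundary.
        if self.ttl is not None and now - self.issued_at >= self.ttl:
            return False
        return True

    def consume(self, now: datetime) -> str:
        """Hand out the URL once; raise if it can no longer be used."""
        if not self.is_usable(now):
            raise ValueError("URL is no longer valid; request a new one")
        self.used = True
        return self.url


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
screenshot = IssuedUrl("https://example.invalid/shot.png", issued, ttl=SCREENSHOT_URL_TTL)
runtime = IssuedUrl("https://example.invalid/runtime", issued, single_use=True)
```

A caller would check `is_usable` before reusing a stored URL and request a fresh one when it returns `False`.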
Playwright
The Playwright tool offers an end-to-end automation solution for developers and testers. It features fine-grained browser control, rich element interaction, and full debugging support. Its modular design lets you combine tools to handle everything from simple page operations to complex test scenarios. This helps you efficiently build and validate modern web application test flows.
Name | Description | Parameters |
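browser_close | Closes the page. | None |
browser_resize | Resizes the browser window. | width, height |
browser_console_messages | Returns all console messages. | None |
browser_handle_dialog | Handles a dialog box. | accept, promptText (optional) |
browser_file_upload | Uploads one or more files. | paths |
browser_install | Installs the browser specified in the configuration. Call this tool if you receive an error that the browser is not installed. | None |
browser_press_key | Presses a key on the keyboard. | key |
browser_navigate | Navigates to a URL. | url |
browser_navigate_back | Navigates to the previous page. | None |
browser_navigate_forward | Navigates to the next page. | None |
browser_network_requests | Returns all network requests since the page was loaded. | None |
browser_pdf_save | Saves the page as a PDF file. | filename (optional) |
browser_take_screenshot | Takes a screenshot of the current page. Actions cannot be performed based on the screenshot. To perform actions, use browser_snapshot. | raw, filename, element, ref (all optional) |
browser_snapshot | Captures an accessibility snapshot of the current page. | None |
browser_click | Performs a click operation on the web page. | element, ref |
browser_drag | Performs a drag-and-drop operation between two elements. | startElement, startRef, endElement, endRef |
browser_hover | Hovers the mouse over a page element. | element, ref |
browser_type | Enters text into an editable element. | element, ref, text, submit (optional), slowly (optional) |
browser_select_option | Selects an option in a drop-down menu. | element, ref, values |
browser_tab_list | Lists the browser tabs. | None |
browser_tab_new | Opens a new tab. | url (optional) |
browser_tab_select | Selects a tab by its index. | index |
browser_tab_close | Closes a tab. | index (optional) |
browser_generate_playwright_test | Generates a Playwright test for a specified scenario. | name, description, steps |
browser_wait_for | Waits for text to appear or disappear, or for a specified time to pass. | time, text, textGone (all optional) |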
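To illustrate how these tools combine, the sketch below builds the JSON-RPC 2.0 `tools/call` requests that an MCP client might send for a navigate, snapshot, click sequence. Tool names and required arguments are taken from the MCP Tool List section of this document. The helper `build_tool_call`, the example URL, and the element reference `e12` are assumptions for illustration only; a real reference comes from the snapshot result.

```python
import json
from typing import Any, Dict, List

# Required-argument lists copied from the inputSchema entries in the MCP Tool List.
REQUIRED_ARGS: Dict[str, List[str]] = {
    "browser_navigate": ["url"],
    "browser_snapshot": [],
    "browser_click": ["element", "ref"],
}


def build_tool_call(request_id: int, name: str, arguments: Dict[str, Any]) -> str:
    """Return a JSON-RPC 2.0 `tools/call` request for one tool invocation."""
    if name not in REQUIRED_ARGS:
        raise KeyError(f"unknown tool: {name}")
    missing = [key for key in REQUIRED_ARGS[name] if key not in arguments]
    if missing:
        raise ValueError(f"{name} is missing required arguments: {missing}")
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


requests = [
    build_tool_call(1, "browser_navigate", {"url": "https://example.com"}),
    build_tool_call(2, "browser_snapshot", {}),
    # "e12" is a made-up element reference used only for this example.
    build_tool_call(3, "browser_click", {"element": "Submit button", "ref": "e12"}),
]
```

Checking required arguments before sending keeps schema errors on the client side; the snapshot step comes before the click because `browser_click` needs a `ref` from the page snapshot.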
Shell
The Shell tool executes terminal commands on the Linux platform with timeout control and returns the command output or an error message.
Name | Description | Parameters |
| The command entered by the client. For example: | |
| The command execution timeout in milliseconds. If not specified, the default value of 1000 milliseconds is used. | |
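The timeout behavior described above can be approximated locally. This is a rough sketch, not the sandbox implementation: it runs the command on the local machine with Python's `subprocess` module, and the function name `run_shell` and the error message formats are assumptions.

```python
import subprocess

DEFAULT_TIMEOUT_MS = 1000  # default stated in the table above


def run_shell(command: str, timeout_ms: int = DEFAULT_TIMEOUT_MS) -> str:
    """Run a shell command locally and return its output or an error message."""
    try:
        completed = subprocess.run(
            command,
            shell=True,
            capture_output=True,
            text=True,
            timeout=timeout_ms / 1000.0,  # subprocess expects seconds
        )
    except subprocess.TimeoutExpired:
        return f"error: command timed out after {timeout_ms} ms"
    if completed.returncode != 0:
        return f"error: exit code {completed.returncode}: {completed.stderr.strip()}"
    return completed.stdout
```

Because the default is only one second, long-running commands need an explicit timeout, for example `run_shell("sleep 5 && echo done", timeout_ms=10000)`.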
Application
The Application tool provides core features for application management. It covers the complete application lifecycle, from discovery and startup to shutdown.
Name | Description | Parameters |
| Retrieves a list of applications installed on the system. Supports filtering by Start menu entries and desktop shortcuts, and lets you exclude system applications. Returns application details, including the name, start command, optional stop command, and working directory. The start command can contain placeholders, such as %F and %U, which follow standard Linux desktop entry rules. When an application starts, these placeholders must be replaced with the appropriate arguments, file paths, or URLs. | |
| Starts a specified application using the provided command and optional working directory. Returns a list of processes associated with the launched application, including the process name, PID, and start command. | |
| Terminates all processes associated with a specified process name. Use with caution, as this forcefully terminates the specified process. | |
| Terminates a specific process identified by its process ID. Use with caution, as this forcefully terminates the specified process. | |
| Terminates an application using the provided stop command. Use with caution, as this forcefully terminates the specified process. | |
| Lists all applications with visible windows, including their associated process information. Returns a list of processes that have visible windows, including the process name, PID, and start command. | |
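The placeholder replacement mentioned for start commands can be sketched as follows. This is a simplified reading of the Linux desktop entry field codes, written as an assumption-laden example: `expand_start_command` is a hypothetical helper, only `%f`, `%F`, `%u`, `%U`, and `%%` are handled, and other field codes are dropped.

```python
import shlex
from typing import List, Sequence


def expand_start_command(
    command: str,
    files: Sequence[str] = (),
    urls: Sequence[str] = (),
) -> List[str]:
    """Expand %f/%F/%u/%U placeholders in a start command into an argv list."""
    argv: List[str] = []
    for token in shlex.split(command):
        if token == "%F":
            argv.extend(files)       # list of files
        elif token == "%U":
            argv.extend(urls)        # list of URLs
        elif token == "%f":
            argv.extend(files[:1])   # at most one file
        elif token == "%u":
            argv.extend(urls[:1])    # at most one URL
        elif len(token) == 2 and token.startswith("%") and token != "%%":
            continue                 # other field codes are dropped in this sketch
        else:
            argv.append(token.replace("%%", "%"))  # "%%" is a literal percent sign
    return argv
```

For example, `expand_start_command("/usr/bin/example-editor %U", urls=["file:///tmp/a.txt"])` yields an argv list with the URL in place of `%U`, and a placeholder with no matching arguments simply disappears.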
FileSystem
The FileSystem tool allows the Agent to perform secure file operations through the MCP protocol. It supports core features such as reading and writing files, managing directory structures, searching across paths, and querying metadata. Sandboxed path permissions ensure file operations are secure and reliable within a defined scope. The tool includes a precise content editing mode and a dry-run validation mechanism, making it ideal for automated file management.
Name | Description | Parameters |
| Creates a new directory or ensures a directory exists. You can create multiple nested directories in one operation. If the directory already exists, this operation succeeds silently. This is ideal for setting up directory structures for projects or ensuring required paths exist. Note Works only in allowed directories. | |
| Performs line-based edits on a text file. Each edit replaces an exact sequence of lines with new content. Returns a git-style diff to show the changes made. Note Works only in allowed directories. | |
| Retrieves detailed metadata for a file or directory. Returns comprehensive information including size, creation time, last modified time, permissions, and type. This tool is ideal for understanding file characteristics without reading the actual content. Note Works only in allowed directories. | |
| Reads the content of a file from the file system. You can specify an optional Note Works only in allowed directories. | |
| Reads the contents of multiple files at the same time. This is more efficient than reading files one by one when you need to analyze or compare multiple files. The content of each file is returned with its path as a reference. A failure to read a single file does not stop the entire operation. Note Works only in allowed directories. | |
| Gets a detailed list of all files and directories in the specified path. The results clearly distinguish between files and directories with [FILE] and [DIR] prefixes. This tool is useful for understanding the directory structure and finding specific files. Note Works only in allowed directories. | |
| Moves or renames files and directories. You can move a file to a different directory and rename it in a single operation. If the destination already exists, the operation fails. It works between different directories and can be used for simple renaming within the same directory. Note The source and destination must both be in allowed directories. | |
| Recursively searches for files and directories that match a pattern. It searches all subdirectories from the starting path. The search is case-sensitive and matches partial names. It returns the full paths of all matching items. This is ideal for finding files when you do not know their exact location. Note Searches only in allowed directories. | |
| Creates a new file or writes to the content of an existing file. You can choose to completely overwrite the file or append to the end of the file by specifying the Note Works only in allowed directories. | |
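Two ideas from this table, the allowed-directory restriction and line-based editing that returns a diff, can be illustrated together. The sketch below is an approximation under stated assumptions: the function names are hypothetical, the `dry_run` flag is assumed, and `difflib.unified_diff` stands in for the git-style diff.

```python
import difflib
from pathlib import Path
from typing import Iterable, List


def ensure_allowed(path: Path, allowed_dirs: Iterable[Path]) -> Path:
    """Resolve the path and reject it unless it lies inside an allowed directory."""
    resolved = path.resolve()
    for root in allowed_dirs:
        root = root.resolve()
        if resolved == root or root in resolved.parents:
            return resolved
    raise PermissionError(f"{resolved} is outside the allowed directories")


def edit_lines(
    path: Path,
    old: List[str],
    new: List[str],
    allowed_dirs: Iterable[Path],
    dry_run: bool = False,
) -> str:
    """Replace one exact run of lines and return a unified diff of the change."""
    if not old:
        raise ValueError("old must contain at least one line")
    target = ensure_allowed(path, allowed_dirs)
    before = target.read_text().splitlines()
    # Find the first position where the exact sequence of old lines occurs.
    start = next(
        (i for i in range(len(before) - len(old) + 1) if before[i:i + len(old)] == old),
        None,
    )
    if start is None:
        raise ValueError("the lines to replace were not found")
    after = before[:start] + new + before[start + len(old):]
    if not dry_run:
        # The file is rewritten with a trailing newline in this sketch.
        target.write_text("\n".join(after) + "\n")
    return "\n".join(
        difflib.unified_diff(before, after, str(target), str(target), lineterm="")
    )
```

Calling `edit_lines(..., dry_run=True)` first shows the diff without touching the file, which makes it possible to review a change before applying it.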
UI
The UI tool manages windows in a graphical user interface (GUI) environment. It covers retrieving window information, operating on windows (activate, close), adjusting windows (maximize, minimize, restore, resize, full screen), and managing focus mode.
Name | Description | Parameters |
| Lists all root windows and their related information. Returns a list of root windows, including the window ID, window title, process ID, and process name. | |
| Retrieves information about the current active window. Returns details including the window ID, title, process ID (pid), and process name (pname). | |
| Activates a specific window by its window ID. | |
| Maximizes a specific window by its window ID. | |
| Minimizes a specific window by its window ID. | |
| Restores a specific window to its normal state by its window ID. | |
| Closes a specific window by its window ID. | |
| Resizes a specific window by its window ID. | |
| Sets a specific window to full-screen mode by its window ID. | |
| Enables or disables focus mode. When focus mode is enabled, only windows from the currently active process and its child processes can remain in the foreground. If other processes attempt to bring their windows to the foreground, those windows are closed. | |
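The focus mode rule can be expressed as a small decision function. This is only a sketch of the rule as described in the table: the process table is passed in as a plain dictionary, and the names `is_in_focus_group` and `foreground_decision` are hypothetical.

```python
from typing import Dict, Optional, Set


def is_in_focus_group(pid: int, active_pid: int, parent_of: Dict[int, Optional[int]]) -> bool:
    """Return True if pid is the active process or one of its descendants."""
    seen: Set[int] = set()
    current: Optional[int] = pid
    # Walk up the parent chain; the seen set guards against cycles in bad input.
    while current is not None and current not in seen:
        if current == active_pid:
            return True
        seen.add(current)
        current = parent_of.get(current)
    return False


def foreground_decision(
    pid: int,
    active_pid: int,
    parent_of: Dict[int, Optional[int]],
    focus_mode: bool,
) -> str:
    """Decide what happens when a window of process pid asks for the foreground."""
    if not focus_mode or is_in_focus_group(pid, active_pid, parent_of):
        return "allow"
    return "close"  # per the table, windows of other processes are closed


# Example process table: 101 and 102 descend from 100; 200 is unrelated.
PARENTS: Dict[int, Optional[int]] = {100: None, 101: 100, 102: 101, 200: None}
```

With process 100 active and focus mode on, a window from process 102 is allowed while a window from process 200 is closed.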
PageUseAgent
The PageUseAgent tool enables the Agent to perform automated browser page operations through the MCP protocol. It supports features such as asynchronous and synchronous action execution, interactive element recognition, web page information extraction, and screenshot capture. The tool uses a task-based execution model and a progress polling mechanism to achieve precise page interaction and data retrieval in a controlled environment. It is suitable for scenarios such as web automation, RPA, and dynamic content collection.
Name | Description | Parameters |
| Closes the current browser proxy session. Terminates the browser process managed by the proxy and cleans up related resources. | |
| Extracts information from a web page based on the provided instructions and returns structured results in the specified schema format. | |
| Navigates to the specified URL in the browser. | |
| An asynchronous function to get the result of a task. | |
| Captures a screenshot of the current web page. Provides flexible screenshot features, including full-page screenshots, cropping of specified areas, and image quality settings. The result is returned in data URL format (data:image/png;base64,...), which can be used directly in frontend applications. | |
| Identifies and locates interactive UI elements. Discovers and describes elements that can be interacted with, such as buttons or input boxes, for subsequent operations. Use the observe tool to find actionable elements. To extract structured data or text content, use the extract tool. | |
| Gets the current progress or final result of an asynchronous operation task. This method must be used with the | |
| Asynchronously extracts information from a web page based on the provided instructions and returns structured results in the specified schema format. | |
| Asynchronously starts the execution of one or more operations on the current web page and returns a Unlike | |
page_use_act | Executes one or more operations on the current web page and blocks until all operations are complete. This method immediately executes the specified operation or sequence of operations through the current Agent and returns the final result after the entire sequence is complete. It can handle a single interaction or a series of ordered steps (an action sequence). The call blocks until one of the following conditions is met:
Note Unlike To monitor or check the result of each step, use the asynchronous version. | |
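The task-based model with progress polling could look like the loop below on the client side. It is a sketch under stated assumptions: `get_status` stands for whatever call returns the task state, the `done` key in its result is a hypothetical field, and the interval and timeout defaults are arbitrary.

```python
import time
from typing import Any, Callable, Dict


def wait_for_task(
    get_status: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval_s: float = 0.5,
    timeout_s: float = 30.0,
    sleep: Callable[[float], Any] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll an asynchronous task until it reports completion or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        status = get_status(task_id)
        if status.get("done"):  # "done" is a hypothetical field name
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s} s")
        sleep(interval_s)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real waiting; in normal use the defaults apply.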
MCP Tool List
{
"playwright": {
"tools": [
{
"name": "browser_close",
"description": "Close the page",
"inputSchema": {
"type": "object",
"properties": {
},
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_resize",
"description": "Resize the browser window",
"inputSchema": {
"type": "object",
"properties": {
"width": {
"type": "number",
"description": "Width of the browser window"
},
"height": {
"type": "number",
"description": "Height of the browser window"
}
},
"required": [
"width",
"height"
],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_console_messages",
"description": "Returns all console messages",
"inputSchema": {
"type": "object",
"properties": {
},
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_handle_dialog",
"description": "Handle a dialog",
"inputSchema": {
"type": "object",
"properties": {
"accept": {
"type": "boolean",
"description": "Whether to accept the dialog."
},
"promptText": {
"type": "string",
"description": "The text of the prompt in case of a prompt dialog."
}
},
"required": [
"accept"
],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_file_upload",
"description": "Upload one or multiple files",
"inputSchema": {
"type": "object",
"properties": {
"paths": {
"type": "array",
"items": {
"type": "string"
},
"description": "The absolute paths to the files to upload. Can be a single file or multiple files."
}
},
"required": [
"paths"
],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_install",
"description": "Install the browser specified in the config. Call this if you get an error about the browser not being installed.",
"inputSchema": {
"type": "object",
"properties": {
},
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_press_key",
"description": "Press a key on the keyboard",
"inputSchema": {
"type": "object",
"properties": {
"key": {
"type": "string",
"description": "Name of the key to press or a character to generate, such as `ArrowLeft` or `a`"
}
},
"required": [
"key"
],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_navigate",
"description": "Navigate to a URL",
"inputSchema": {
"type": "object",
"properties": {
"url": {
"type": "string",
"description": "The URL to navigate to"
}
},
"required": [
"url"
],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_navigate_back",
"description": "Go back to the previous page",
"inputSchema": {
"type": "object",
"properties": {
},
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_navigate_forward",
"description": "Go forward to the next page",
"inputSchema": {
"type": "object",
"properties": {
},
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_network_requests",
"description": "Returns all network requests since loading the page",
"inputSchema": {
"type": "object",
"properties": {
},
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_pdf_save",
"description": "Save page as PDF",
"inputSchema": {
"type": "object",
"properties": {
"filename": {
"type": "string",
"description": "File name to save the pdf to. Defaults to `page-{timestamp}.pdf` if not specified."
}
},
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_take_screenshot",
"description": "Take a screenshot of the current page. You can't perform actions based on the screenshot, use browser_snapshot for actions.",
"inputSchema": {
"type": "object",
"properties": {
"raw": {
"type": "boolean",
"description": "Whether to return without compression (in PNG format). Default is false, which returns a JPEG image."
},
"filename": {
"type": "string",
"description": "File name to save the screenshot to. Defaults to `page-{timestamp}.{png|jpeg}` if not specified."
},
"element": {
"type": "string",
"description": "Human-readable element description used to obtain permission to screenshot the element. If not provided, the screenshot will be taken of viewport. If element is provided, ref must be provided too."
},
"ref": {
"type": "string",
"description": "Exact target element reference from the page snapshot. If not provided, the screenshot will be taken of viewport. If ref is provided, element must be provided too."
}
},
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_snapshot",
"description": "Captures an accessibility snapshot of the current page. This is better than a screenshot.",
"inputSchema": {
"type": "object",
"properties": {
},
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_click",
"description": "Perform click on a web page",
"inputSchema": {
"type": "object",
"properties": {
"element": {
"type": "string",
"description": "Human-readable element description used to obtain permission to interact with the element"
},
"ref": {
"type": "string",
"description": "Exact target element reference from the page snapshot"
}
},
"required": [
"element",
"ref"
],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_drag",
"description": "Perform drag and drop between two elements",
"inputSchema": {
"type": "object",
"properties": {
"startElement": {
"type": "string",
"description": "Human-readable source element description used to obtain the permission to interact with the element"
},
"startRef": {
"type": "string",
"description": "Exact source element reference from the page snapshot"
},
"endElement": {
"type": "string",
"description": "Human-readable target element description used to obtain the permission to interact with the element"
},
"endRef": {
"type": "string",
"description": "Exact target element reference from the page snapshot"
}
},
"required": [
"startElement",
"startRef",
"endElement",
"endRef"
],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_hover",
"description": "Hover over element on page",
"inputSchema": {
"type": "object",
"properties": {
"element": {
"type": "string",
"description": "Human-readable element description used to obtain permission to interact with the element"
},
"ref": {
"type": "string",
"description": "Exact target element reference from the page snapshot"
}
},
"required": [
"element",
"ref"
],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_type",
"description": "Type text into editable element",
"inputSchema": {
"type": "object",
"properties": {
"element": {
"type": "string",
"description": "Human-readable element description used to obtain permission to interact with the element"
},
"ref": {
"type": "string",
"description": "Exact target element reference from the page snapshot"
},
"text": {
"type": "string",
"description": "Text to type into the element"
},
"submit": {
"type": "boolean",
"description": "Whether to submit entered text (press Enter after)"
},
"slowly": {
"type": "boolean",
"description": "Whether to type one character at a time. Useful for triggering key handlers in the page. By default entire text is filled in at once."
}
},
"required": [
"element",
"ref",
"text"
],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_select_option",
"description": "Select an option in a dropdown",
"inputSchema": {
"type": "object",
"properties": {
"element": {
"type": "string",
"description": "Human-readable element description used to obtain permission to interact with the element"
},
"ref": {
"type": "string",
"description": "Exact target element reference from the page snapshot"
},
"values": {
"type": "array",
"items": {
"type": "string"
},
"description": "Array of values to select in the dropdown. This can be a single value or multiple values."
}
},
"required": [
"element",
"ref",
"values"
],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_tab_list",
"description": "List browser tabs",
"inputSchema": {
"type": "object",
"properties": {
},
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_tab_new",
"description": "Open a new tab",
"inputSchema": {
"type": "object",
"properties": {
"url": {
"type": "string",
"description": "The URL to navigate to in the new tab. If not provided, the new tab will be blank."
}
},
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_tab_select",
"description": "Select a tab by index",
"inputSchema": {
"type": "object",
"properties": {
"index": {
"type": "number",
"description": "The index of the tab to select"
}
},
"required": [
"index"
],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_tab_close",
"description": "Close a tab",
"inputSchema": {
"type": "object",
"properties": {
"index": {
"type": "number",
"description": "The index of the tab to close. Closes current tab if not provided."
}
},
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_generate_playwright_test",
"description": "Generate a Playwright test for given scenario",
"inputSchema": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "The name of the test"
},
"description": {
"type": "string",
"description": "The description of the test"
},
"steps": {
"type": "array",
"items": {
"type": "string"
},
"description": "The steps of the test"
}
},
"required": [
"name",
"description",
"steps"
],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
},
{
"name": "browser_wait_for",
"description": "Wait for text to appear or disappear or a specified time to pass",
"inputSchema": {
"type": "object",
"properties": {
"time": {
"type": "number",
"description": "The time to wait in seconds"
},
"text": {
"type": "string",
"description": "The text to wait for"
},
"textGone": {
"type": "string",
"description": "The text to wait for to disappear"
}
},
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
}
]
},
"oss": {
"tools": [
{
"name": "oss_env_init",
"description": "Create and initialize OSS environment variables with the specified endpoint, access key ID, access key secret, security token, and region. The temporary security credentials are obtained from STS (Security Token Service). For more information, see: https://www.alibabacloud.com/help/zh/oss/developer-reference/use-temporary-access-credentials-provided-by-sts-to-access-oss?spm=a2c4g.11186623.help-menu-search-31815.d_1#9ab17afd7cs4t .",
"inputSchema": {
"properties": {
"access_key_id": {
"description": "The Access Key ID for OSS authentication",
"type": "string"
},
"access_key_secret": {
"description": "The Access Key Secret for OSS authentication",
"type": "string"
},
"endpoint": {
"description": "The OSS service endpoint, e.g., If not specified, the default is https://oss-cn-hangzhou.aliyuncs.com",
"type": "string"
},
"region": {
"description": "The OSS region, e.g., cn-hangzhou. If not specified, the default is cn-hangzhou",
"type": "string"
},
"security_token": {
"description": "The Security Token for OSS authentication",
"type": "string"
}
},
"required": [
"access_key_id",
"access_key_secret",
"security_token"
],
"type": "object"
}
},
{
"name": "oss_upload",
"description": "Upload a local file or directory to the specified OSS bucket. If a directory is specified, it will be compressed into a zip file before uploading. The object name in OSS can be specified; if not, the file or zip name will be used by default. Note: Before using this tool, call `oss_env_init` to initialize OSS environment variables.",
"inputSchema": {
"properties": {
"bucket": {
"description": "OSS bucket name",
"type": "string"
},
"object": {
"description": "Object path in OSS bucket, e.g., test/test.txt",
"type": "string"
},
"path": {
"description": "Local file or not empty directory full path to upload, e.g., /tmp/test.txt /tmp on Linux or C:/tmp/test.txt C:/tmp on Windows",
"type": "string"
}
},
"required": [
"bucket",
"object",
"path"
],
"type": "object"
}
},
{
"name": "oss_download",
"description": "Download an object from the specified OSS bucket to the given local path. If the parent directory does not exist, it will be created automatically. If the target file already exists, it will be overwritten. Note: Before using this tool, call `oss_env_init` to initialize OSS environment variables.",
"inputSchema": {
"properties": {
"bucket": {
"description": "OSS bucket name",
"type": "string"
},
"object": {
"description": "Object path in OSS bucket, e.g., test/test.txt",
"type": "string"
},
"path": {
"description": "Local full path to save the downloaded file, e.g., /tmp/test.txt on Linux or C:/tmp/test.txt on Windows",
"type": "string"
}
},
"required": [
"bucket",
"object",
"path"
],
"type": "object"
}
},
{
"name": "oss_upload_annon",
"description": "Upload a local file or directory to the specified URL using HTTP PUT. If a directory is specified, it will be compressed into a zip file before uploading. If the upload target already exists, it will be overwritten.",
"inputSchema": {
"properties": {
"path": {
"description": "Local file or not empty directory full path to upload, e.g., /tmp/test.txt /tmp on Linux or C:/tmp/test.txt C:/tmp on Windows",
"type": "string"
},
"url": {
"description": "The HTTP/HTTPS URL to upload the file to",
"type": "string"
}
},
"required": [
"url",
"path"
],
"type": "object"
}
},
{
"name": "oss_download_annon",
"description": "Download a file from the specified URL to the given local path. If the parent directory does not exist, it will be created automatically. If the target file already exists, it will be overwritten.",
"inputSchema": {
"properties": {
"path": {
"description": "The full local file path to save the downloaded file, e.g., /tmp/test.txt on Linux or C:/tmp/test.txt on Windows",
"type": "string"
},
"url": {
"description": "The HTTP/HTTPS URL to download the file from",
"type": "string"
}
},
"required": [
"url",
"path"
],
"type": "object"
}
}
]
},
"shell": {
"tools": [
{
"name": "shell",
"description": "Executes a shell command on the Linux platform with a timeout and returns the output or an error.",
"inputSchema": {
"properties": {
"command": {
"description": "client input command",
"type": "string"
},
"timeout_ms": {
"default": 1000,
"description": "Command execution timeout (unit: milliseconds). If not specified, the default value (such as 1000 milliseconds) is used",
"type": "integer"
}
},
"required": [
"command",
"timeout_ms"
],
"type": "object"
}
}
]
},
"application": {
"tools": [
{
"name": "get_installed_apps",
"description": "Retrieve a list of installed applications on the system. Supports filtering by Start Menu entries and Desktop shortcuts, with an option to exclude system applications. Returns application details including name, start command, optional stop command, and working directory. Start commands may include placeholders (e.g., %F, %U) that follow Linux standard desktop entry rules. These placeholders should be replaced with appropriate arguments, file paths, or URLs when starting the application.",
"inputSchema": {
"properties": {
"desktop": {
"default": false,
"description": "Include Desktop shortcuts (default: false)",
"type": "boolean"
},
"ignore_system_app": {
"default": true,
"description": "Exclude system applications (default: true)",
"type": "boolean"
},
"start_menu": {
"default": true,
"description": "Include Start Menu applications (default: true)",
"type": "boolean"
}
},
"required": [
],
"type": "object"
}
},
{
"name": "start_app",
"description": "Start a specified application using the provided command and optional working directory. Returns a list of processes associated with the launched application, including their process names, PIDs, and startup commands.\n\n",
"inputSchema": {
"properties": {
"activity": {
"description": "The activity name on android,like com.xxActivity or com.xx/com.xxActivity.",
"type": "string"
},
"start_cmd": {
"description": "The command to start the application. This can include placeholders like %F (file path) or %U (URL/file path), which must be replaced with actual values before execution.",
"type": "string"
},
"work_directory": {
"default": "",
"description": "The directory from which the application should be launched. If omitted, the default directory is used.",
"type": "string"
}
},
"required": [
"start_cmd"
],
"type": "object"
}
},
{
"name": "stop_app_by_pname",
"description": "Stop all processes associated with a specified process name. Use with caution as this will forcefully terminate the specified process.",
"inputSchema": {
"properties": {
"pname": {
"description": "The name of the process to terminate.",
"type": "string"
}
},
"required": [
"pname"
],
"type": "object"
}
},
{
"name": "stop_app_by_pid",
"description": "Terminate a specific process identified by its Process ID. Use with caution as this will forcefully terminate the specified process.",
"inputSchema": {
"properties": {
"pid": {
"description": "The Process ID of the process to terminate.",
"type": "integer"
}
},
"required": [
"pid"
],
"type": "object"
}
},
{
"name": "stop_app_by_cmd",
"description": "Terminate an application using the provided stop command. Use with caution as this will forcefully terminate the specified process.",
"inputSchema": {
"properties": {
"stop_cmd": {
"description": "The command used to terminate the application.",
"type": "string"
}
},
"required": [
"stop_cmd"
],
"type": "object"
}
},
{
"name": "list_visible_apps",
"description": "List all applications with visible windows, including their associated process information. Returns a list of processes that have visible windows, including their process names, PIDs, and startup commands.",
"inputSchema": {
"properties": {
},
"required": [
],
"type": "object"
}
}
]
},
"filesystem": {
"tools": [
{
"name": "create_directory",
"description": "Create a new directory or ensure a directory exists. Can create multiple nested directories in one operation. If the directory already exists, this operation will succeed silently. Ideal for setting up directory structures for projects or ensuring required paths exist. Only works within allowed directories.",
"inputSchema": {
"properties": {
"path": {
"description": "Directory path to create.",
"type": "string"
}
},
"required": [
"path"
],
"type": "object"
}
},
{
"name": "edit_file",
"description": "Make line-based edits to a text file. Each edit replaces exact line sequences with new content. Returns a git-style diff showing the changes made. Only works within allowed directories.",
"inputSchema": {
"properties": {
"dryRun": {
"default": false,
"description": "Preview changes using git-style diff format",
"type": "boolean"
},
"edits": {
"items": {
"properties": {
"newText": {
"description": "Text to replace with",
"type": "string"
},
"oldText": {
"description": "Text to search for - must match exactly",
"type": "string"
}
},
"required": [
"oldText",
"newText"
],
"type": "object"
},
"type": "array"
},
"path": {
"description": "File path to edit.",
"type": "string"
}
},
"required": [
"path",
"edits"
],
"type": "object"
}
},
{
"name": "get_file_info",
"description": "Retrieve detailed metadata about a file or directory. Returns comprehensive information including size, creation time, last modified time, permissions, and type. This tool is perfect for understanding file characteristics without reading the actual content. Only works within allowed directories.",
"inputSchema": {
"properties": {
"path": {
"description": "File or directory path to inspect.",
"type": "string"
}
},
"required": [
"path"
],
"type": "object"
}
},
{
"name": "read_file",
"description": "Read the contents of a file from the file system. You can specify an optional 'offset' (in bytes) to start reading from a specific position, and an optional 'length' (in bytes) to limit how many bytes to read. If 'length' is omitted or 0, the file will be read to the end. Handles various text encodings and provides detailed error messages if the file cannot be read. Only works within allowed directories.",
"inputSchema": {
"properties": {
"length": {
"description": "Number of bytes to read. If omitted or 0, read to end of file.",
"minimum": 0,
"type": "integer"
},
"offset": {
"default": 0,
"description": "Start reading from this byte offset.",
"minimum": 0,
"type": "integer"
},
"path": {
"description": "File path to read.",
"type": "string"
}
},
"required": [
"path"
],
"type": "object"
}
},
{
"name": "read_multiple_files",
"description": "Read the contents of multiple files simultaneously. This is more efficient than reading files one by one when you need to analyze or compare multiple files. Each file's content is returned with its path as a reference. Failed reads for individual files won't stop the entire operation. Only works within allowed directories.",
"inputSchema": {
"properties": {
"paths": {
"description": "Array of file paths to read.",
"items": {
"type": "string"
},
"type": "array"
}
},
"required": [
"paths"
],
"type": "object"
}
},
{
"name": "list_directory",
"description": "Get a detailed listing of all files and directories in a specified path. Results clearly distinguish between files and directories with [FILE] and [DIR] prefixes. This tool is essential for understanding directory structure and finding specific files within a directory. Only works within allowed directories.",
"inputSchema": {
"properties": {
"path": {
"description": "Directory path to list.",
"type": "string"
}
},
"required": [
"path"
],
"type": "object"
}
},
{
"name": "move_file",
"description": "Move or rename files and directories. Can move files between directories and rename them in a single operation. If the destination exists, the operation will fail. Works across different directories and can be used for simple renaming within the same directory. Both source and destination must be within allowed directories.",
"inputSchema": {
"properties": {
"destination": {
"description": "Destination file or directory path.",
"type": "string"
},
"source": {
"description": "Source file or directory path.",
"type": "string"
}
},
"required": [
"source",
"destination"
],
"type": "object"
}
},
{
"name": "search_files",
"description": "Recursively search for files and directories matching a pattern. Searches through all subdirectories from the starting path. The search is case-sensitive and matches partial names. Returns full paths to all matching items. Great for finding files when you don't know their exact location. Only searches within allowed directories.",
"inputSchema": {
"properties": {
"excludePatterns": {
"default": [
],
"description": "Patterns to exclude (optional).",
"items": {
"type": "string"
},
"type": "array"
},
"path": {
"description": "Directory path to start search.",
"type": "string"
},
"pattern": {
"description": "Pattern to match.",
"type": "string"
}
},
"required": [
"path",
"pattern"
],
"type": "object"
}
},
{
"name": "write_file",
"description": "Create a new file or write content to an existing file. You can choose to completely overwrite the file or append to the end by specifying the 'mode' parameter. Use 'overwrite' mode (default) to clear the file before writing, or 'append' mode to add content to the end of the file. Handles text content with proper encoding. Only works within allowed directories.",
"inputSchema": {
"properties": {
"content": {
"description": "Content to write.",
"type": "string"
},
"mode": {
"default": "overwrite",
"description": "Write mode: 'overwrite' to clear file, 'append' to add to end.",
"enum": [
"overwrite",
"append"
],
"type": "string"
},
"path": {
"description": "File path to write.",
"type": "string"
}
},
"required": [
"path",
"content"
],
"type": "object"
}
}
]
},
"ui": {
"tools": [
{
"name": "list_root_windows",
"description": "List all root windows with their associated information. Returns a list of root windows, including their window IDs, window titles, process IDs, and process names.",
"inputSchema": {
"properties": {
},
"required": [
],
"type": "object"
}
},
{
"name": "get_active_window",
"description": "Retrieve information about the currently active window. Returns details including window ID, title, process ID (pid), and process name (pname).",
"inputSchema": {
"properties": {
},
"required": [
],
"type": "object"
}
},
{
"name": "activate_window",
"description": "Activate a specific window by its window ID.",
"inputSchema": {
"properties": {
"window_id": {
"description": "The unique identifier of the window to activate.",
"type": "integer"
}
},
"required": [
"window_id"
],
"type": "object"
}
},
{
"name": "maximize_window",
"description": "Maximize a specific window by its window ID.",
"inputSchema": {
"properties": {
"window_id": {
"description": "The unique identifier of the window to maximize.",
"type": "integer"
}
},
"required": [
"window_id"
],
"type": "object"
}
},
{
"name": "minimize_window",
"description": "Minimize a specific window by its window ID.",
"inputSchema": {
"properties": {
"window_id": {
"description": "The unique identifier of the window to minimize.",
"type": "integer"
}
},
"required": [
"window_id"
],
"type": "object"
}
},
{
"name": "restore_window",
"description": "Restore a specific window to its normal state by its window ID.",
"inputSchema": {
"properties": {
"window_id": {
"description": "The unique identifier of the window to restore.",
"type": "integer"
}
},
"required": [
"window_id"
],
"type": "object"
}
},
{
"name": "close_window",
"description": "Close a specific window by its window ID.",
"inputSchema": {
"properties": {
"window_id": {
"description": "The unique identifier of the window to close.",
"type": "integer"
}
},
"required": [
"window_id"
],
"type": "object"
}
},
{
"name": "resize_window",
"description": "Resize a specific window by its window ID.",
"inputSchema": {
"properties": {
"height": {
"description": "The new height of the window (in pixels).",
"type": "integer"
},
"width": {
"description": "The new width of the window (in pixels).",
"type": "integer"
},
"window_id": {
"description": "The unique identifier of the window to resize.",
"type": "integer"
}
},
"required": [
"window_id",
"width",
"height"
],
"type": "object"
}
},
{
"name": "fullscreen_window",
"description": "Set a specific window to fullscreen mode by its window ID.",
"inputSchema": {
"properties": {
"window_id": {
"description": "The unique identifier of the window to set to fullscreen.",
"type": "integer"
}
},
"required": [
"window_id"
],
"type": "object"
}
},
{
"name": "focus_mode",
"description": "Enable or disable focus mode. When focus mode is enabled, only windows from the currently active process and its child processes are allowed to remain in the foreground. Attempts by other processes to bring their windows to the foreground will result in those windows being closed.",
"inputSchema": {
"properties": {
"on": {
"description": "Whether to enable (true) or disable (false) focus mode.",
"type": "boolean"
}
},
"required": [
"on"
],
"type": "object"
}
}
]
}
}
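
Each entry above pairs a tool name with an inputSchema that lists the accepted properties and the required ones. The following Python sketch shows one way a client could check arguments against such an entry before sending a call. It assumes the tools are invoked through the standard MCP `tools/call` JSON-RPC method; the helper names `check_arguments` and `build_call` are hypothetical, the check is deliberately shallow (required keys and top-level types only), and how the request is transported to the sandbox is not covered here.

```python
# Abridged copy of the "shell" entry from the listing above.
SHELL_TOOL = {
    "name": "shell",
    "inputSchema": {
        "properties": {
            "command": {"type": "string"},
            "timeout_ms": {"type": "integer", "default": 1000},
        },
        "required": ["command", "timeout_ms"],
        "type": "object",
    },
}

# Mapping from JSON Schema type names to Python types.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_arguments(tool, arguments):
    """Return a list of problems; an empty list means the shallow check passed."""
    schema = tool["inputSchema"]
    props = schema.get("properties", {})
    problems = []
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        if name not in props:
            # Only reject unknown keys when the schema forbids them.
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected argument: {name}")
            continue
        type_name = props[name].get("type")
        expected = _JSON_TYPES.get(type_name)
        if expected is None:
            continue
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_bool_as_number = isinstance(value, bool) and type_name in ("integer", "number")
        if is_bool_as_number or not isinstance(value, expected):
            problems.append(f"argument {name} should be of type {type_name}")
    return problems


def build_call(tool, arguments, request_id=1):
    """Build an MCP tools/call request body (assumed invocation method)."""
    problems = check_arguments(tool, arguments)
    if problems:
        raise ValueError("; ".join(problems))
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool["name"], "arguments": arguments},
    }
```

For example, `build_call(SHELL_TOOL, {"command": "ls -l", "timeout_ms": 2000})` returns a request body, while omitting `timeout_ms` raises a `ValueError` because the schema lists it as required even though it declares a default.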