Description
The User Interface (UI) class provides methods for interacting with UI elements in the AgentBay cloud environment. You can use these methods to retrieve UI elements, send key events, enter text, perform gestures, and take screenshots.
Method | Description | Environment | ||||
Linux Computer Use | Windows Computer Use | Browser Use | Mobile Use | Code Space | ||
| Clicks at specified coordinates on the screen. Supports left, middle, and right clicks. | Not supported | Not supported | Not supported | Supported | Not supported |
| Enters text. | Not supported | Not supported | Not supported | Supported | Not supported |
| Sends a key press. Supports Android-specific keys such as HOME, BACK, and volume keys. | Not supported | Not supported | Not supported | Supported | Not supported |
| Performs a swipe gesture. Set the start coordinates, end coordinates, and duration. | Not supported | Not supported | Not supported | Supported | Not supported |
| Takes a screenshot of the session and saves it to a specified OSS URL. | Supported | Supported | Supported | Supported | Not supported |
| Lists all root windows and their information, including window ID, title, process ID, and process name. | Not supported | Supported | Not supported | Not supported | Not supported |
| Gets information about the active window. | Not supported | Supported | Not supported | Not supported | Not supported |
| Activates a specified window. | Not supported | Supported | Not supported | Not supported | Not supported |
| Maximizes a window. | Not supported | Supported | Not supported | Not supported | Not supported |
| Minimizes a window. | Not supported | Supported | Not supported | Not supported | Not supported |
| Restores a window to its normal state. | Not supported | Supported | Not supported | Not supported | Not supported |
| Closes a window. | Not supported | Supported | Not supported | Not supported | Not supported |
| Resizes a window. Specify the width and height. | Not supported | Supported | Not supported | Not supported | Not supported |
| Sets a window to full screen mode. | Not supported | Supported | Not supported | Not supported | Not supported |
| Enables or disables focus mode. | Not supported | Supported | Not supported | Not supported | Not supported |
| Gets all UI elements on the device, including non-interactive elements. | Not supported | Not supported | Not supported | Supported | Not supported |
| Gets all clickable UI elements. | Not supported | Not supported | Not supported | Supported | Not supported |
Properties
Property name | Description |
HOME | Home key (3) |
BACK | Back key (4) |
VOLUME_UP | Volume up key (24) |
VOLUME_DOWN | Volume down key (25) |
POWER | Power key (26) |
MENU | Menu key (82) |
Methods
getClickableUIElements - Gets clickable UI elements
Golang
func (ui *UI) GetClickableUIElements(timeoutMs int) (*UIElementsResult, error)Parameter:
timeoutMs (int): The timeout period in milliseconds. If the value is
<= 0, the default is2000 ms.
Return values:
*UIElementsResult: A result object that contains the clickable UI elements and the request ID.error: An error message if the operation fails.
UIElementsResult struct:
type UIElementsResult struct {
RequestID string // A unique request identifier for debugging.
Elements []*UIElement // An array of UI elements.
}
type UIElement struct {
Bounds string // The bounds of the element.
ClassName string // The CSS class name.
Text string // The text content.
Type string // The element type.
ResourceId string // The resource ID.
Index int // The element index.
IsParent bool // Indicates whether the element is a parent element.
Children []*UIElement // The child elements.
}Python
def get_all_ui_elements(timeout_ms: int = 2000) -> List[Dict[str, Any]]Parameter:
timeout_ms (int, optional): The timeout period in milliseconds. The default is
2000ms.
Return value:
List[Dict[str, Any]]: A list of all UI elements, including parsing details.
Exception:
AgentBayError: Raised if the operation fails.
TypeScript
getClickableUIElements(timeoutMs?: number): Promise<string>Parameter:
timeoutMs (number, optional): The timeout period in milliseconds. The default is
2000ms.
Return value:
Promise<string>: A string representation of the clickable UI elements if the operation is successful.
Exception:
APIError: Thrown if the operation fails.
getAllUIElements - Gets all UI elements
Golang
func (ui *UI) GetAllUIElements(timeoutMs int) (*UIElementsResult, error)Parameter:
timeoutMs (int): The timeout period in milliseconds. If the value is
<= 0, the default value is2000 ms.
Return values:
*UIElementsResult: A result object that contains all UI elements and the request ID.error: An error message if the operation fails.
Python
def get_all_ui_elements(timeout_ms: int = 2000) -> List[Dict[str, Any]]Parameter:
timeout_ms (int, optional): The timeout period in milliseconds. The default is
2000ms.
Return value:
List[Dict[str, Any]]: A list of all UI elements, including parsing details.
Exception:
AgentBayError: Raised if the operation fails.
TypeScript
getAllUIElements(timeoutMs?: number): Promise<string>Parameter:
timeoutMs (number, optional): The timeout period in milliseconds. The default is
2000ms.
Return value:
Promise<string>: A string representation of all UI elements if the operation is successful.
Exception:
APIError: Thrown if the operation fails.
sendKey - Sends a key press
Golang
func (ui *UI) SendKey(key int) (*KeyActionResult, error)Parameter:
key (int): The key code to send. Use the
KeyCodeconstant.
Return values:
*KeyActionResult: A result object that contains the operation status and the request ID.error: An error message if the operation fails.
KeyActionResult struct:
type KeyActionResult struct {
RequestID string // A unique request identifier for debugging.
Success bool // Indicates whether the key was sent successfully.
}Python
def send_key(key: int) -> boolParameter:
key (int): The key code to send. Use the
KeyCodeconstant.
Return value:
bool: ReturnsTrueif the operation is successful.
Exception:
AgentBayError: Raised if the operation fails.
Example:
# Send the Home keypress
success = agent_bay.ui.send_key(KeyCode.HOME)
print("Keypress send result:", success)TypeScript
sendKey(key: number): Promise<string>Parameter:
key (number): The key code to send. Use the
KeyCodeconstant.
Return value:
Promise<string>: The response text if the operation is successful.
Exception:
APIError: Thrown if the operation fails.
Example:
// Send the Home keypress
const result = await agentBay.ui.sendKey(KeyCode.HOME);
console.log("Keypress send result:", result);inputText - Enters text
Golang
func (ui *UI) InputText(text string) (*TextInputResult, error)Parameter:
text (string): The text to enter.
Return values:
*TextInputResult: A result object that contains the entered text and the request ID.error: An error message if the operation fails.
TextInputResult struct:
type TextInputResult struct {
RequestID string // A unique request identifier for debugging.
Text string // The entered text.
}Python
def input_text(text: str) -> NoneParameter:
text (str): The text to enter.
Exception:
AgentBayError: Raised if the operation fails.
Example:
# Enter the text "Hello"
agent_bay.ui.input_text("Hello")TypeScript
inputText(text: string): Promise<string>Parameter:
text (string): The text to enter.
Return value:
Promise<string>: The response text if the operation is successful.
Exception:
APIError: Thrown if the operation fails.
Example:
// Enter the text "Hello"
const result = await agentBay.ui.inputText("Hello");
console.log("Text input result:", result);swipe - Performs a swipe
Golang
func (ui *UI) Swipe(startX, startY, endX, endY, durationMs int) (*SwipeResult, error)Parameters:
startX (int): The starting X coordinate.
startY (int): The starting Y coordinate.
endX (int): The ending X coordinate.
endY (int): The ending Y coordinate.
durationMs (int): The duration of the swipe in milliseconds.
Return values:
*SwipeResult: A result object that contains the operation status and the request ID.error: An error message if the operation fails.
SwipeResult struct:
type SwipeResult struct {
RequestID string // A unique request identifier for debugging.
Success bool // Indicates whether the swipe was successful.
}Python
def swipe(start_x: int, start_y: int, end_x: int, end_y: int, duration_ms: int = 300) -> NoneParameters:
start_x (int): The starting X coordinate.
start_y (int): The starting Y coordinate.
end_x (int): The ending X coordinate.
end_y (int): The ending Y coordinate.
duration_ms (int, optional): The duration of the swipe in milliseconds. The default is
300ms.
Exception:
AgentBayError: Raised if the operation fails.
Example:
# Swipe from (100, 200) to (300, 400)
agent_bay.ui.swipe(100, 200, 300, 400)TypeScript
swipe(
startX: number,
startY: number,
endX: number,
endY: number,
durationMs?: number
): Promise<string>Parameters:
startX (number): The starting X coordinate.
startY (number): The starting Y coordinate.
endX (number): The ending X coordinate.
endY (number): The ending Y coordinate.
durationMs (number, optional): The duration of the swipe in milliseconds. The default is
300ms.
Return value:
Promise<string>: The response text if the operation is successful.
Exception:
APIError: Thrown if the operation fails.
Example:
// Swipe from (100, 200) to (300, 400)
const result = await agentBay.ui.swipe(100, 200, 300, 400);
console.log("Swipe operation result:", result);click - Clicks a coordinate point
Golang
func (ui *UI) Click(x, y int, button string) (*UIResult, error)Parameters:
x (int): The X coordinate.
y (int): The Y coordinate.
button (string): The mouse button to use. If this parameter is empty, the default is
left.
Return values:
*UIResult: A result object that contains the operation status, component ID, and request ID.error: An error message if the operation fails.
UIResult struct:
type UIResult struct {
RequestID string // A unique request identifier for debugging.
ComponentID string // The component ID, if applicable.
Success bool // Indicates whether the operation was successful.
}Python
def click(x: int, y: int, button: str = "left") -> NoneParameters:
x (int): The X coordinate.
y (int): The Y coordinate.
button (str, optional): The mouse button to use. The default is
left.
Exception:
AgentBayError: Raised if the operation fails.
Example:
# Left-click the coordinates (200, 300)
agent_bay.ui.click(200, 300)TypeScript
click(x: number, y: number, button?: string): Promise<string>Parameters:
x (number): The X coordinate.
y (number): The Y coordinate.
button (string, optional): The mouse button to use. The default is
left.
Return value:
Promise<string>: The response text if the operation is successful.
Exception:
APIError: Thrown if the operation fails.
Example:
// Left-click the coordinates (500, 800)
const result = await agentBay.ui.click(500, 800);
console.log("Click operation result:", result);screenshot - Takes a screenshot
Golang
func (ui *UI) Screenshot() (*UIResult, error)Return values:
*UIResult: A result object that contains the operation status and the request ID.error: An error message if the operation fails.
Python
def screenshot() -> strReturn value:
str: The screenshot data.
Exception:
AgentBayError: Raised if the operation fails.
Example:
# Take a screenshot
image_data = agent_bay.ui.screenshot()
print("Screenshot data:", image_data[:100]) # Print the first 100 charactersTypeScript
screenshot(): Promise<string>Return value:
Promise<string>: The screenshot data if the operation is successful.
Exception:
APIError: Thrown if the operation fails.
Example:
// Take a screenshot
const imageData = await agentBay.ui.screenshot();
console.log("Screenshot data:", imageData);