Window management

Last Updated: Feb 27, 2026

Discover, control, and manage application windows on an AgentBay cloud computer. List open windows, change their state, resize them, and control focus behavior.

All window management methods belong to the Computer Use module, accessed through session.computer. Each method returns a result object with success, error_message, and operation-specific fields.

Workflow

  1. Create a session with a cloud computer desktop environment.

  2. List windows to discover open applications.

  3. Select a target window by title, process name, or window ID.

  4. Control the window -- activate, maximize, minimize, resize, or close it.

  5. Manage focus to prevent other windows from interrupting automation.

Prerequisites

  • An AgentBay session with a desktop environment

import os
from agentbay import AgentBay, CreateSessionParams

api_key = os.getenv("AGENTBAY_API_KEY")
if not api_key:
    raise ValueError("The AGENTBAY_API_KEY environment variable is required")

agent_bay = AgentBay(api_key=api_key)

params = CreateSessionParams(image_id="linux_latest")
result = agent_bay.create(params)

if result.success:
    session = result.session
    print(f"Session created: {session.session_id}")
else:
    print(f"Failed to create session: {result.error_message}")
    exit(1)

List windows

Call list_root_windows() to get all top-level application windows on the desktop. Here, "root windows" means top-level application windows, not the X11 root window.

Parameters

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| timeout_ms | int | No | 3000 | Timeout in milliseconds. |

Returns: WindowListResult

result = session.computer.list_root_windows(timeout_ms=5000)

if result.success:
    windows = result.windows
    print(f"Found {len(windows)} windows")

    for window in windows:
        print(f"Title: {window.title}")
        print(f"Window ID: {window.window_id}")
        print(f"Process: {window.pname if window.pname else 'N/A'}")
        print(f"PID: {window.pid if window.pid else 'N/A'}")
        print(f"Position: ({window.absolute_upper_left_x}, {window.absolute_upper_left_y})")
        print(f"Size: {window.width}x{window.height}")
        print(f"Child windows: {len(window.child_windows)}")
        print("---")
else:
    print(f"Error listing windows: {result.error_message}")
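Because each window carries its own child_windows list, a listing can be walked as a tree. The following is a minimal sketch of a depth-first traversal, assuming child windows can themselves have children. FakeWindow and iter_windows are hypothetical names used for illustration, not part of the AgentBay SDK; with a live session, pass result.windows instead of the demo data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeWindow:
    """Hypothetical stand-in exposing only the attributes used below."""
    window_id: int
    title: str
    child_windows: List["FakeWindow"] = field(default_factory=list)


def iter_windows(windows, depth=0):
    """Yield (depth, window) pairs for each window and its descendants, depth-first."""
    for window in windows:
        yield depth, window
        # Treat a missing or empty child_windows attribute as "no children".
        children = getattr(window, "child_windows", None) or []
        yield from iter_windows(children, depth + 1)


# Demo with stand-in data (assumption: real Window objects expose the same attributes).
demo = [
    FakeWindow(1, "Editor", [FakeWindow(2, "Find dialog")]),
    FakeWindow(3, "Terminal"),
]
flattened = [(depth, w.window_id) for depth, w in iter_windows(demo)]
print(flattened)  # [(0, 1), (1, 2), (0, 3)]
```

The depth value can be used to indent output when printing a window hierarchy.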

Window object attributes

| Attribute | Type | Description |
| --- | --- | --- |
| window_id | int | Unique identifier of the window. |
| title | str | Window title or description text. |
| absolute_upper_left_x | Optional[int] | X-coordinate of the upper-left corner. |
| absolute_upper_left_y | Optional[int] | Y-coordinate of the upper-left corner. |
| width | Optional[int] | Window width in pixels. |
| height | Optional[int] | Window height in pixels. |
| pid | Optional[int] | Process ID of the window owner. |
| pname | Optional[str] | Process name of the window owner. |
| child_windows | List[Window] | Child windows nested under this window. |
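Selecting a target window by title, process name, or window ID comes down to filtering on these attributes. The sketch below shows one way to do it. find_window and FakeWindow are hypothetical helpers, not SDK functions, and the code assumes pname and title may be missing, so it guards against None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeWindow:
    """Hypothetical stand-in exposing only the attributes used below."""
    window_id: int
    title: str
    pname: Optional[str] = None


def find_window(windows, title_contains=None, pname=None, window_id=None):
    """Return the first window matching every criterion given, or None.

    With no criteria, the first window in the list is returned.
    """
    for window in windows:
        if window_id is not None and window.window_id != window_id:
            continue
        if title_contains is not None and title_contains.lower() not in (window.title or "").lower():
            continue
        # pname is Optional, so compare against an empty string when it is missing.
        if pname is not None and (window.pname or "").lower() != pname.lower():
            continue
        return window
    return None


# Demo with stand-in data; with a live session, pass result.windows instead.
windows = [
    FakeWindow(101, "Untitled - Text Editor", "gedit"),
    FakeWindow(102, "Terminal", None),
]
by_title = find_window(windows, title_contains="terminal")
by_process = find_window(windows, pname="gedit")
missing = find_window(windows, pname="chrome")
print(by_title.window_id, by_process.window_id, missing)  # 102 101 None
```

Matching is case-insensitive: titles are matched by substring, process names by exact name.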

Control window state

All window control methods take a window_id parameter (int) and return a BoolResult. Get the window_id from list_root_windows() first:

result = session.computer.list_root_windows()
if result.success and result.windows:
    window_id = result.windows[0].window_id
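Since every control method returns the same kind of result, the repeated success check can be factored into a small helper. The following is only a sketch: require_success and WindowOperationError are hypothetical names, and the SimpleNamespace objects stand in for the documented BoolResult fields (success, data, error_message).

```python
from types import SimpleNamespace


class WindowOperationError(RuntimeError):
    """Hypothetical exception type; not part of the AgentBay SDK."""


def require_success(result, action):
    """Return result unchanged if it succeeded, otherwise raise with its error message."""
    if not result.success:
        raise WindowOperationError(f"{action} failed: {result.error_message}")
    return result


# Stand-in results mimicking the documented BoolResult fields.
ok = SimpleNamespace(success=True, data=True, error_message="")
failed = SimpleNamespace(success=False, data=False, error_message="window not found")

require_success(ok, "activate_window")  # passes through without raising

try:
    require_success(failed, "maximize_window")
except WindowOperationError as exc:
    message = str(exc)
    print(message)  # maximize_window failed: window not found
```

With a live session, a call might look like require_success(session.computer.activate_window(window_id), "activate_window").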

Activate a window

Bring a window to the foreground and give it input focus.

activate_result = session.computer.activate_window(window_id)

if activate_result.success:
    print("Window activated successfully")
else:
    print(f"Failed to activate window: {activate_result.error_message}")

Maximize a window

Expand a window to fill the entire screen area.

maximize_result = session.computer.maximize_window(window_id)

if maximize_result.success:
    print("Window maximized successfully")
else:
    print(f"Failed to maximize window: {maximize_result.error_message}")

Minimize a window

Collapse a window to the taskbar.

minimize_result = session.computer.minimize_window(window_id)

if minimize_result.success:
    print("Window minimized successfully")
else:
    print(f"Failed to minimize window: {minimize_result.error_message}")

Restore a window

Return a maximized or minimized window to its previous size and position.

restore_result = session.computer.restore_window(window_id)

if restore_result.success:
    print("Window restored successfully")
else:
    print(f"Failed to restore window: {restore_result.error_message}")

Make a window full screen

Set a window to full screen mode.

fullscreen_result = session.computer.fullscreen_window(window_id)

if fullscreen_result.success:
    print("Window set to full screen")
else:
    print(f"Failed to set window to full screen: {fullscreen_result.error_message}")

Resize a window

Change a window to specific pixel dimensions.

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| window_id | int | Yes | Target window identifier. |
| width | int | Yes | New width in pixels. |
| height | int | Yes | New height in pixels. |

resize_result = session.computer.resize_window(window_id, 800, 600)

if resize_result.success:
    print("Window resized to 800x600")
else:
    print(f"Failed to resize window: {resize_result.error_message}")

Close a window

Permanently close a window. Use with caution: any unsaved data in the window may be lost.

close_result = session.computer.close_window(window_id)

if close_result.success:
    print("Window closed successfully")
else:
    print(f"Failed to close window: {close_result.error_message}")

Manage focus

Call focus_mode() to prevent windows from stealing focus from the active window. This is useful during automation when background processes might open dialogs or notifications that interrupt the workflow.

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| on | bool | Yes | True to enable focus mode, False to disable it. |

Returns: BoolResult

# Enable focus mode to prevent focus stealing
enable_result = session.computer.focus_mode(True)
if enable_result.success:
    print("Focus mode enabled - windows will not steal focus")
else:
    print(f"Failed to enable focus mode: {enable_result.error_message}")

# Disable focus mode
disable_result = session.computer.focus_mode(False)
if disable_result.success:
    print("Focus mode disabled")
else:
    print(f"Failed to disable focus mode: {disable_result.error_message}")
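Focus mode is typically switched on for the duration of an automation step and off again afterward. One way to guarantee the "off" half is a context manager, sketched below. focus_mode_enabled and FakeComputer are hypothetical names; the only SDK call assumed is focus_mode(on) returning a result with success and error_message, as described above.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def focus_mode_enabled(computer):
    """Enable focus mode for the duration of a with-block, then disable it."""
    result = computer.focus_mode(True)
    if not result.success:
        raise RuntimeError(f"Failed to enable focus mode: {result.error_message}")
    try:
        yield
    finally:
        # Always switch focus mode back off, even if the body raised.
        computer.focus_mode(False)


class FakeComputer:
    """Hypothetical stand-in that records focus_mode() calls."""

    def __init__(self):
        self.calls = []

    def focus_mode(self, on):
        self.calls.append(on)
        return SimpleNamespace(success=True, data=True, error_message="")


computer = FakeComputer()
try:
    with focus_mode_enabled(computer):
        raise ValueError("simulated automation failure")
except ValueError:
    pass

print(computer.calls)  # [True, False]
```

With a live session, the block would be entered as with focus_mode_enabled(session.computer): followed by the automation steps.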

Get the active window

Call get_active_window() to retrieve the window that currently has focus. This method may fail if no window is active.

Returns: WindowInfoResult

result = session.computer.get_active_window()

if result.success:
    active_window = result.window
    print("Active window:")
    print(f"  Title: {active_window.title}")
    print(f"  Window ID: {active_window.window_id}")
    print(f"  Process: {active_window.pname}")
    print(f"  PID: {active_window.pid}")
    print(f"  Position: ({active_window.absolute_upper_left_x}, {active_window.absolute_upper_left_y})")
    print(f"  Size: {active_window.width}x{active_window.height}")
else:
    print(f"Failed to get active window: {result.error_message}")
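A common use of get_active_window() is to confirm that an earlier activate_window() call took effect. The sketch below compares window IDs and treats a failed lookup as "not active". is_window_active and FakeComputer are hypothetical names, and the stand-in results mirror the documented WindowInfoResult fields.

```python
from types import SimpleNamespace


def is_window_active(computer, window_id):
    """Return True only if the currently focused window has the given ID."""
    result = computer.get_active_window()
    # get_active_window() may fail when no window has focus; treat that as "not active".
    if not result.success or result.window is None:
        return False
    return result.window.window_id == window_id


class FakeComputer:
    """Hypothetical stand-in that reports a fixed active window (or none)."""

    def __init__(self, active_id=None):
        self.active_id = active_id

    def get_active_window(self):
        if self.active_id is None:
            return SimpleNamespace(success=False, window=None, error_message="no active window")
        window = SimpleNamespace(window_id=self.active_id, title="Demo")
        return SimpleNamespace(success=True, window=window, error_message="")


focused = is_window_active(FakeComputer(active_id=42), 42)
other = is_window_active(FakeComputer(active_id=42), 7)
none_active = is_window_active(FakeComputer(), 42)
print(focused, other, none_active)  # True False False
```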

Complete example

Find an application, launch it, locate its window, and control it.

import os
import time
from agentbay import AgentBay, CreateSessionParams

api_key = os.getenv("AGENTBAY_API_KEY")
if not api_key:
    raise ValueError("The AGENTBAY_API_KEY environment variable is required")

agent_bay = AgentBay(api_key=api_key)

params = CreateSessionParams(image_id="linux_latest")
result = agent_bay.create(params)

if not result.success:
    print(f"Failed to create session: {result.error_message}")
    exit(1)

session = result.session
print(f"Session created: {session.session_id}")

# Step 1: Find installed applications
print("Step 1: Find installed applications...")
apps_result = session.computer.get_installed_apps(
    start_menu=True,
    desktop=False,
    ignore_system_apps=True
)

if not apps_result.success:
    print(f"Failed to get applications: {apps_result.error_message}")
    agent_bay.delete(session)
    exit(1)

target_app = None
for app in apps_result.data:
    if "chrome" in app.name.lower():
        target_app = app
        break

if not target_app:
    print("Google Chrome not found")
    agent_bay.delete(session)
    exit(1)

print(f"Found application: {target_app.name}")

# Step 2: Start the application
print("Step 2: Start the application...")
start_result = session.computer.start_app(target_app.start_cmd)

if not start_result.success:
    print(f"Failed to start application: {start_result.error_message}")
    agent_bay.delete(session)
    exit(1)

print(f"Application started, {len(start_result.data)} processes launched")

# Step 3: Wait for the window to load
print("Step 3: Wait for the application window to load...")
time.sleep(5)

# Step 4: Find the application window
print("Step 4: Find the application window...")
windows_result = session.computer.list_root_windows()

if not windows_result.success:
    print(f"Failed to list windows: {windows_result.error_message}")
    agent_bay.delete(session)
    exit(1)

app_window = None
for window in windows_result.windows:
    if target_app.name.lower() in window.title.lower():
        app_window = window
        break

if not app_window and windows_result.windows:
    app_window = windows_result.windows[0]
    print("Using the first available window")

if app_window:
    print(f"Found window: {app_window.title}")

    # Step 5: Control the window
    print("Step 5: Control the window...")
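    window_id = app_window.window_id
    try:
        # Each call returns a BoolResult, so check success before reporting it
        activate_result = session.computer.activate_window(window_id)
        if activate_result.success:
            print("Window activated")
        else:
            print(f"Failed to activate window: {activate_result.error_message}")

        time.sleep(1)
        maximize_result = session.computer.maximize_window(window_id)
        if maximize_result.success:
            print("Window maximized")
        else:
            print(f"Failed to maximize window: {maximize_result.error_message}")

        time.sleep(1)
        resize_result = session.computer.resize_window(window_id, 1024, 768)
        if resize_result.success:
            print("Window resized to 1024x768")
        else:
            print(f"Failed to resize window: {resize_result.error_message}")

    except Exception as e:
        print(f"Window control failed: {e}")
else:
    print("No application window found")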

# Clean up
print("Cleaning up session...")
agent_bay.delete(session)
print("Workflow complete!")
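The fixed time.sleep(5) in Step 3 either waits too long or not long enough, depending on how quickly the application starts. A polling loop is a common alternative; the following is a minimal sketch of one. wait_for_window and FakeComputer are hypothetical names, the timeout and interval values are arbitrary, and the only SDK call assumed is list_root_windows() as documented above.

```python
import time
from types import SimpleNamespace


def wait_for_window(computer, title_contains, timeout_s=10.0, interval_s=0.5):
    """Poll list_root_windows() until a title matches, or return None on timeout."""
    deadline = time.monotonic() + timeout_s
    needle = title_contains.lower()
    while True:
        result = computer.list_root_windows()
        if result.success:
            for window in result.windows:
                if needle in (window.title or "").lower():
                    return window
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval_s)


class FakeComputer:
    """Hypothetical stand-in whose window appears on the Nth listing call."""

    def __init__(self, appears_on_call):
        self.calls = 0
        self.appears_on_call = appears_on_call

    def list_root_windows(self):
        self.calls += 1
        windows = []
        if self.calls >= self.appears_on_call:
            windows = [SimpleNamespace(window_id=7, title="New Tab - Google Chrome")]
        return SimpleNamespace(success=True, windows=windows, error_message="")


fake = FakeComputer(appears_on_call=3)
found = wait_for_window(fake, "chrome", timeout_s=2.0, interval_s=0.01)
print(found.window_id, fake.calls)  # 7 3

never = wait_for_window(FakeComputer(appears_on_call=10**6), "chrome", timeout_s=0.05, interval_s=0.01)
print(never)  # None
```

In the workflow above, a call such as wait_for_window(session.computer, target_app.name) could stand in for Steps 3 and 4, with a None result handled the same way as a missing window.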

API reference

Methods

| Method | Parameters | Return type | Description |
| --- | --- | --- | --- |
| list_root_windows() | timeout_ms: int = 3000 | WindowListResult | List all top-level application windows. |
| get_active_window() | None | WindowInfoResult | Get the currently active window. |
| activate_window() | window_id: int | BoolResult | Activate a window and give it focus. |
| maximize_window() | window_id: int | BoolResult | Maximize a window. |
| minimize_window() | window_id: int | BoolResult | Minimize a window. |
| restore_window() | window_id: int | BoolResult | Restore a window from maximized or minimized state. |
| close_window() | window_id: int | BoolResult | Close a window. |
| fullscreen_window() | window_id: int | BoolResult | Set a window to full screen mode. |
| resize_window() | window_id: int, width: int, height: int | BoolResult | Resize a window to specified dimensions. |
| focus_mode() | on: bool | BoolResult | Enable or disable focus stealing prevention. |

Return types

WindowListResult

| Attribute | Type | Description |
| --- | --- | --- |
| success | bool | Whether the operation succeeded. |
| windows | List[Window] | List of window objects. |
| error_message | str | Error message if the operation failed. |
| request_id | str | Unique request identifier. |

WindowInfoResult

| Attribute | Type | Description |
| --- | --- | --- |
| success | bool | Whether the operation succeeded. |
| window | Window | The window object. |
| error_message | str | Error message if the operation failed. |
| request_id | str | Unique request identifier. |

BoolResult

| Attribute | Type | Description |
| --- | --- | --- |
| success | bool | Whether the operation succeeded. |
| data | bool | Result data of the operation. |
| error_message | str | Error message if the operation failed. |
| request_id | str | Unique request identifier. |

Window

| Attribute | Type | Description |
| --- | --- | --- |
| window_id | int | Unique identifier of the window. |
| title | str | Window title or description text. |
| absolute_upper_left_x | Optional[int] | X-coordinate of the upper-left corner. |
| absolute_upper_left_y | Optional[int] | Y-coordinate of the upper-left corner. |
| width | Optional[int] | Window width in pixels. |
| height | Optional[int] | Window height in pixels. |
| pid | Optional[int] | Process ID of the window owner. |
| pname | Optional[str] | Process name of the window owner. |
| child_windows | List[Window] | List of child windows. |