Cloud Phone: MobileAgentSDK for Android

Last Updated: May 22, 2026

MobileAgentSDK is an Android library that wraps the C++ MobileAgent SDK and provides a Kotlin/Java-friendly interface. It connects to the MobileAgent server via WebSocket to execute, manage, and monitor agent tasks.

Download the SDK

MobileAgentSDK for Android

Usage

Import the AAR

Copy the AAR file into your module's libs directory and add the following dependency to the module's build.gradle file:

dependencies {
    implementation(files('libs/sdk-release.aar'))
}
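
If the module uses the Gradle Kotlin DSL (build.gradle.kts) instead of Groovy, the same dependency could be declared as sketched below. This assumes the AAR keeps the file name shown above and sits in the module's libs directory.

```kotlin
// build.gradle.kts (module level); assumes libs/sdk-release.aar exists.
dependencies {
    implementation(files("libs/sdk-release.aar"))
}
```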

Basic usage

Option 1: Use Flow (recommended)

Flow provides a reactive programming style that is well-suited for Kotlin coroutines:

import android.util.Log
import androidx.lifecycle.lifecycleScope
import com.wuying.mobileagentsdk.MobileAgentSdk
import com.wuying.mobileagentsdk.MobileAgentEvent
import com.wuying.mobileagentsdk.MobileAgentState
import kotlinx.coroutines.flow.collectLatest
import kotlinx.coroutines.flow.filterIsInstance
import kotlinx.coroutines.launch

// use closes the SDK when the block returns (Kotlin's counterpart to try-with-resources).
MobileAgentSdk().use { sdk ->
    // Collect events in a coroutine.
    lifecycleScope.launch {
        sdk.events.collectLatest { event ->
            when (event) {
                is MobileAgentEvent.StateChanged -> {
                    when (event.state) {
                        MobileAgentState.CONNECTED -> {
                            Log.d("SDK", "Connected")
                            // Execute a task.
                            sdk.executeTask("Please help me open settings", 10)
                        }
                        MobileAgentState.DISCONNECTED -> {
                            Log.d("SDK", "Disconnected")
                        }
                        else -> {}
                    }
                }
                is MobileAgentEvent.TaskResult -> {
                    Log.d("SDK", "Task complete: ${event.resultText}")
                }
                is MobileAgentEvent.Stream -> {
                    Log.d("SDK", "Stream data: ${event.content}")
                }
                else -> {}
            }
        }
    }

    // You can also collect only specific types of events.
    lifecycleScope.launch {
        sdk.events
            .filterIsInstance<MobileAgentEvent.StateChanged>()
            .collect { event ->
                Log.d("SDK", "State: ${event.state}")
            }
    }

    lifecycleScope.launch {
        sdk.events
            .filterIsInstance<MobileAgentEvent.TaskResult>()
            .collect { event ->
                Log.d("SDK", "Task result: ${event.resultText}")
            }
    }

    // Connect to the server.
    sdk.connect("wss://localhost:30005/ws")

    // The SDK is automatically destroyed at the end of the use block.
}

Option 2: Use a callback

This approach is for projects that do not use coroutines:

import android.util.Log
import com.wuying.mobileagentsdk.MobileAgentSdk
import com.wuying.mobileagentsdk.MobileAgentCallback
import com.wuying.mobileagentsdk.MobileAgentState

MobileAgentSdk().use { sdk ->
    // Set the callback.
    sdk.setCallback(object : MobileAgentCallback {
        override fun onStateChanged(state: MobileAgentState) {
            when (state) {
                MobileAgentState.CONNECTED -> {
                    Log.d("SDK", "Connected")
                    sdk.executeTask("Please help me open settings", 10)
                }
                MobileAgentState.DISCONNECTED -> {
                    Log.d("SDK", "Disconnected")
                }
                else -> {}
            }
        }

        override fun onTaskResult(
            taskId: String,
            result: String,
            resultText: String,
            success: Boolean,
            actualSteps: Int
        ) {
            Log.d("SDK", "Task complete: $resultText")
        }

        override fun onStream(
            taskId: String,
            step: Int,
            agent: String,
            contentType: String,
            content: String
        ) {
            Log.d("SDK", "Stream data: $content")
        }

        // Implement other callbacks as needed.
    })

    // Connect to the server.
    sdk.connect("wss://localhost:30005/ws")
}

Lifecycle management

The SDK implements the AutoCloseable interface, so you can scope an instance with Kotlin's use function, the counterpart of Java's try-with-resources:

// Recommended: Automatically manage the lifecycle.
MobileAgentSdk().use { sdk ->
    sdk.connect("wss://localhost:30005/ws")
    // Use the SDK...
} // Automatically calls close() to release resources.

// Or, manage the lifecycle manually.
val sdk = MobileAgentSdk()
try {
    sdk.connect("wss://localhost:30005/ws")
    // Use the SDK...
} finally {
    sdk.close()  // or sdk.destroy()
}
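
To illustrate the guarantee that use gives, here is a minimal sketch with a stand-in class. FakeSdk and closedAfterUse are hypothetical names that model only the AutoCloseable behavior; they are not part of the SDK.

```kotlin
// Hypothetical stand-in for MobileAgentSdk: only the AutoCloseable behavior is modeled.
class FakeSdk : AutoCloseable {
    var closed = false
        private set

    override fun close() {
        closed = true
    }
}

// Runs a use block that optionally fails, then reports whether close() was called.
fun closedAfterUse(fail: Boolean): Boolean {
    val sdk = FakeSdk()
    try {
        sdk.use {
            if (fail) throw IllegalStateException("simulated failure")
        }
    } catch (e: IllegalStateException) {
        // use rethrows the exception after closing; it is swallowed here for the demo.
    }
    return sdk.closed
}
```

close() runs whether the block completes normally or throws. Because use closes the instance as soon as the block returns, coroutines launched inside the block do not keep it open. For an instance that should live as long as an Activity or ViewModel, the manual pattern, with close() called in the owner's teardown, is probably the better fit.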

API

Connection management

connect(url: String)

Connects to the server using a WebSocket URL:

sdk.connect("wss://localhost:30005/ws")
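
Whether connect validates its argument is not stated here, so an app that builds the URL from user input or configuration may want to check it first. The sketch below uses a hypothetical helper, isWebSocketUrl, which is not part of the SDK and relies only on java.net.URI.

```kotlin
import java.net.URI
import java.net.URISyntaxException

// Hypothetical helper, not part of the SDK: accepts only ws:// or wss:// URLs that have a host.
fun isWebSocketUrl(url: String): Boolean {
    val uri = try {
        URI(url)
    } catch (e: URISyntaxException) {
        return false
    }
    val scheme = uri.scheme?.lowercase() ?: return false
    return (scheme == "ws" || scheme == "wss") && !uri.host.isNullOrEmpty()
}
```

A caller could then guard the connection with something like if (isWebSocketUrl(url)) sdk.connect(url).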

connectWithTicket(ticket: String)

Connects using a ticket (requires server-side support):

sdk.connectWithTicket("your-ticket-here")

disconnect()

Disconnects from the server:

sdk.disconnect()

Task management

executeTask(task: String, maxSteps: Int): String

Executes a task and returns the task ID:

val taskId = sdk.executeTask("Please help me open the settings app", 10)

pauseTask(taskId: String)

Pauses a task:

sdk.pauseTask(taskId)

resumeTask(taskId: String)

Resumes a task:

sdk.resumeTask(taskId)

cancelTask(taskId: String)

Cancels a task:

sdk.cancelTask(taskId)

State query

getState(): MobileAgentState

Gets the current connection state:

val state = sdk.getState()
when (state) {
    MobileAgentState.CONNECTED -> { /* Connected */ }
    MobileAgentState.CONNECTING -> { /* Connecting */ }
    MobileAgentState.DISCONNECTED -> { /* Disconnected */ }
}
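
As a sketch of how the queried state might drive UI logic, the helpers below map a state to a label and decide whether a task may be submitted. statusLabel and canSubmitTask are hypothetical names. The rule that tasks are submitted only while connected is an assumption drawn from the usage examples, and the enum is a local copy so that the block compiles on its own.

```kotlin
// Local copy of the enum from the "Data classes" section, for self-containment only.
enum class MobileAgentState { DISCONNECTED, CONNECTING, CONNECTED }

// Hypothetical helper: maps a connection state to a label for a status view.
fun statusLabel(state: MobileAgentState): String = when (state) {
    MobileAgentState.CONNECTED -> "Connected"
    MobileAgentState.CONNECTING -> "Connecting..."
    MobileAgentState.DISCONNECTED -> "Disconnected"
}

// Hypothetical helper: assumes tasks should be submitted only while connected.
fun canSubmitTask(state: MobileAgentState): Boolean =
    state == MobileAgentState.CONNECTED
```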

Flow API

The SDK provides an events Flow that emits all event types. You can use filterIsInstance to filter for specific events:

// Collect all events (call from a coroutine).
sdk.events.collect { event ->
    when (event) {
        is MobileAgentEvent.StateChanged -> { /* Handle state changes. */ }
        is MobileAgentEvent.TaskResult -> { /* Handle task results. */ }
        is MobileAgentEvent.Stream -> { /* Handle stream data. */ }
        else -> { /* Handle or ignore the remaining event types. */ }
    }
}

// Collect only specific types of events.
sdk.events
    .filterIsInstance<MobileAgentEvent.StateChanged>()
    .collect { event ->
        // Handle only state changes.
    }

Event types:

| Event type | Description |
| --- | --- |
| MobileAgentEvent.StateChanged | Connection state changed. |
| MobileAgentEvent.TodoWrite | To-do list updated. |
| MobileAgentEvent.TaskStart | Task started. |
| MobileAgentEvent.StepStart | Step started. |
| MobileAgentEvent.Stream | Stream data received. |
| MobileAgentEvent.StepEnd | Step ended. |
| MobileAgentEvent.TaskResult | Task result received. |

Callback API

The MobileAgentCallback interface provides the following callback methods. All methods have default empty implementations, allowing you to override only the ones you need:

| Method | Description |
| --- | --- |
| onStateChanged(state: MobileAgentState) | Connection state changed. |
| onTodoWrite(todos: Array<TodoItem>, count: Int) | To-do list updated. |
| onTaskStart(taskId: String, task: String, maxSteps: Int) | Task started. |
| onStepStart(taskId: String, step: Int) | Step started. |
| onStream(taskId, step, agent, contentType, content) | Stream data received. |
| onStepEnd(taskId, step, status, summary) | Step ended. |
| onTaskResult(taskId, result, resultText, success, actualSteps) | Task result received. |
| onRawMessage(rawJson: String) | Raw, unparsed JSON message. |
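
The callbacks can be combined to follow a task's progress. The sketch below is a hypothetical helper, TaskProgressTracker, that is not part of the SDK. Its method names and parameters mirror the callback signatures listed above, so a MobileAgentCallback implementation could forward to it. The parameter types of onStepEnd are assumed from the StepEnd event fields.

```kotlin
// Hypothetical helper, not part of the SDK. A MobileAgentCallback implementation
// could forward the matching callbacks to these methods.
class TaskProgressTracker {
    // taskId -> (last finished step, step budget passed to onTaskStart)
    private val running = mutableMapOf<String, Pair<Int, Int>>()

    fun onTaskStart(taskId: String, task: String, maxSteps: Int) {
        running[taskId] = 0 to maxSteps
    }

    fun onStepEnd(taskId: String, step: Int, status: String, summary: String) {
        val current = running[taskId] ?: return // Ignore steps of unknown tasks.
        running[taskId] = step to current.second
    }

    fun onTaskResult(
        taskId: String,
        result: String,
        resultText: String,
        success: Boolean,
        actualSteps: Int
    ) {
        running.remove(taskId)
    }

    // Returns a string such as "3/10" while the task is running, or null otherwise.
    fun progress(taskId: String): String? =
        running[taskId]?.let { (step, max) -> "$step/$max" }
}
```

Which thread invokes the callbacks is not stated here, so a real implementation may need to synchronize access to the map.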

Data classes

MobileAgentState

An enum for the connection state:

enum class MobileAgentState {
    DISCONNECTED,  // Disconnected
    CONNECTING,    // Connecting
    CONNECTED      // Connected
}

TodoItem

A data class for a to-do item:

data class TodoItem(
    val id: String,
    val title: String,
    val status: String,
    val details: String
)
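
A to-do list received through TodoWrite or onTodoWrite could be rendered as a plain-text checklist, as sketched below. renderChecklist is a hypothetical helper. The status strings the server sends are not documented here, so the "completed" default is an assumed value used only for illustration.

```kotlin
// Local copy of the data class shown above, for self-containment only.
data class TodoItem(
    val id: String,
    val title: String,
    val status: String,
    val details: String
)

// Hypothetical helper. doneStatus defaults to "completed", which is an assumption:
// the actual status values are not documented here.
fun renderChecklist(todos: Array<TodoItem>, doneStatus: String = "completed"): String =
    todos.joinToString("\n") { item ->
        val mark = if (item.status == doneStatus) "[x]" else "[ ]"
        "$mark ${item.title}"
    }
```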

MobileAgentEvent

A sealed class representing all possible event types:

sealed class MobileAgentEvent {
    data class StateChanged(val state: MobileAgentState) : MobileAgentEvent()
    data class TodoWrite(val todos: Array<TodoItem>, val count: Int) : MobileAgentEvent()
    data class TaskStart(val taskId: String, val task: String, val maxSteps: Int) : MobileAgentEvent()
    data class StepStart(val taskId: String, val step: Int) : MobileAgentEvent()
    data class Stream(val taskId: String, val step: Int, val agent: String, val contentType: String, val content: String) : MobileAgentEvent()
    data class StepEnd(val taskId: String, val step: Int, val status: String, val summary: String) : MobileAgentEvent()
    data class TaskResult(val taskId: String, val result: String, val resultText: String, val success: Boolean, val actualSteps: Int) : MobileAgentEvent()
}
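
Because the event type is a sealed class, a when expression over it can be exhaustive without an else branch. The sketch below shows this with a stand-in reduced to three subtypes so that it compiles without the SDK. describe is a hypothetical helper.

```kotlin
// Local stand-ins for the SDK types, reduced to three subtypes for self-containment.
enum class MobileAgentState { DISCONNECTED, CONNECTING, CONNECTED }

sealed class MobileAgentEvent {
    data class StateChanged(val state: MobileAgentState) : MobileAgentEvent()
    data class StepStart(val taskId: String, val step: Int) : MobileAgentEvent()
    data class TaskResult(
        val taskId: String,
        val result: String,
        val resultText: String,
        val success: Boolean,
        val actualSteps: Int
    ) : MobileAgentEvent()
}

// when used as an expression over a sealed type needs no else branch;
// adding a subtype then fails compilation until it is handled here.
fun describe(event: MobileAgentEvent): String = when (event) {
    is MobileAgentEvent.StateChanged -> "state=${event.state}"
    is MobileAgentEvent.StepStart -> "task ${event.taskId}: step ${event.step} started"
    is MobileAgentEvent.TaskResult -> {
        val outcome = if (event.success) "succeeded" else "failed"
        "task ${event.taskId}: $outcome after ${event.actualSteps} steps"
    }
}
```

With the real SDK class, the when would have to cover all seven subtypes or include an else branch.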

Advanced usage

Parse agent thoughts

The MobileAgent uses the <conclusion> tag in its stream output to wrap the agent's thought process. You can extract this content from Stream events.

<conclusion> format

The agent uses XML tags to mark its thought process in its output:

<conclusion>This is the agent's thought process...</conclusion>

Kotlin implementation

The following code demonstrates how to parse the thought content as it arrives:

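import androidx.lifecycle.lifecycleScope
import com.wuying.mobileagentsdk.MobileAgentSdk
import com.wuying.mobileagentsdk.MobileAgentEvent
import kotlinx.coroutines.flow.filterIsInstance
import kotlinx.coroutines.launch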

class ConclusionParser {
    // Text received so far that has been neither emitted nor discarded.
    private val pending = StringBuilder()
    private var inConclusion = false

    /**
     * Processes one chunk of stream data and returns the <conclusion> text
     * that can be emitted so far, or null if there is nothing to emit yet.
     * Tags that are split across chunks are handled.
     */
    fun processStream(content: String): String? {
        pending.append(content)
        val result = StringBuilder()

        while (true) {
            val tag = if (inConclusion) CLOSE_TAG else OPEN_TAG
            val index = pending.indexOf(tag)

            if (index < 0) {
                // No complete tag yet. Hold back a trailing partial tag, if any.
                val safeEnd = pending.length - partialTagLength(tag)
                if (inConclusion) {
                    result.append(pending, 0, safeEnd)
                }
                pending.delete(0, safeEnd)
                break
            }

            if (inConclusion) {
                // Emit the rest of the thought and end the line.
                result.append(pending, 0, index).append('\n')
            } else {
                // Text outside <conclusion> is dropped; start a new thought.
                result.append("[Thought] ")
            }
            pending.delete(0, index + tag.length)
            inConclusion = !inConclusion
        }

        return if (result.isNotEmpty()) result.toString() else null
    }

    /**
     * Returns the length of the longest suffix of the pending text
     * that is a proper prefix of the given tag.
     */
    private fun partialTagLength(tag: String): Int {
        val max = minOf(tag.length - 1, pending.length)
        for (length in max downTo 1) {
            if (pending.endsWith(tag.substring(0, length))) return length
        }
        return 0
    }

    /**
     * Resets the parser state.
     */
    fun reset() {
        pending.clear()
        inConclusion = false
    }

    private companion object {
        const val OPEN_TAG = "<conclusion>"
        const val CLOSE_TAG = "</conclusion>"
    }
}

// Sample usage
MobileAgentSdk().use { sdk ->
    val parser = ConclusionParser()

    lifecycleScope.launch {
        sdk.events
            .filterIsInstance<MobileAgentEvent.Stream>()
            .collect { event ->
                // Parse and print the output.
                val output = parser.processStream(event.content)
                if (output != null) {
                    print(output)
                }
            }
    }

    sdk.connect("wss://localhost:30005/ws")
}

Simplified version

If real-time output is not required, you can parse the content after it has been fully received:

/**
 * Extracts all content from <conclusion> tags in a complete string.
 */
fun extractConclusions(content: String): List<String> {
    val conclusions = mutableListOf<String>()
    val regex = "<conclusion>(.*?)</conclusion>".toRegex(RegexOption.DOT_MATCHES_ALL)

    for (match in regex.findAll(content)) {
        val conclusion = match.groupValues[1].trim()
        if (conclusion.isNotEmpty()) {
            conclusions.add(conclusion)
        }
    }

    return conclusions
}

// Sample usage
lifecycleScope.launch {
    sdk.events
        .filterIsInstance<MobileAgentEvent.TaskResult>()
        .collect { event ->
            // Extract the thought process from the task result.
            val conclusions = extractConclusions(event.resultText)

            println("Agent's thought process:")
            conclusions.forEachIndexed { index, conclusion ->
                println("${index + 1}. $conclusion")
            }
        }
}

Output

[Thought] The user wants to open the Settings app. I need to find its package name and Activity.
[Thought] By querying the system app list, I found the Settings app: com.android.settings/.Settings
[Thought] Now I will start the Settings app.

Task complete: Successfully opened the settings app.

Using Flow and callbacks together

You can use both approaches at the same time without interference:

MobileAgentSdk().use { sdk ->
    // Use Flow to handle the main logic.
    lifecycleScope.launch {
        sdk.events
            .filterIsInstance<MobileAgentEvent.TaskResult>()
            .collect { event ->
                // Handle the task result.
            }
    }

    // Use a callback for logging.
    sdk.setCallback(object : MobileAgentCallback {
        override fun onRawMessage(rawJson: String) {
            // Log raw messages for debugging.
            Log.d("SDK_RAW", rawJson)
        }
    })

    sdk.connect("wss://localhost:30005/ws")
}

Error handling

try {
    MobileAgentSdk().use { sdk ->
        sdk.connect("wss://localhost:30005/ws")
    }
} catch (e: IllegalStateException) {
    Log.e("SDK", "Failed to create SDK", e)
} catch (e: Exception) {
    Log.e("SDK", "Connection error", e)
}

Check SDK status

val sdk = MobileAgentSdk()

// Check if the SDK is destroyed.
if (sdk.isDestroyed()) {
    Log.w("SDK", "The SDK has been destroyed.")
}

// Check if the SDK is valid.
sdk.checkNotDestroyed()  // Throws an exception if destroyed.

ProGuard configuration

If you enable code obfuscation, you must preserve the SDK's native methods. The SDK already includes a proguard-rules.pro file, so you typically do not need extra configuration.

If you need to configure it manually:

# MobileAgentSDK
-keep class com.wuying.mobileagentsdk.** { *; }
-keepclassmembers class com.wuying.mobileagentsdk.MobileAgentSdk {
    private native <methods>;
}

Dependencies

The SDK has minimal runtime dependencies:

dependencies {
    // The only runtime dependency
    implementation("org.jetbrains.kotlinx:kotlinx-coroutines-android:1.7.3")
}

To remain lightweight, the SDK does not include any UI frameworks, such as AppCompat or Material.