Fun-ASR real-time speech recognition server-side events - Alibaba Cloud Model Studio

The Fun-ASR real-time speech recognition service delivers four types of server-side events to the client over WebSocket: task-started, result-generated, task-finished, and task-failed. The following sections describe the data structure and fields of each event.

User guide: For model details and selection guidance, see Speech-to-text.

Event flow: For the event interaction sequence, see WebSocket API.
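Because every event carries its type in header.event, a client can route incoming messages with a single check on that field. The following Python sketch illustrates the idea. It assumes that each WebSocket text frame holds exactly one JSON event; the function name classify_event is illustrative and not part of the API.

```python
import json

# The four server-side event types described in this document.
KNOWN_EVENTS = {"task-started", "result-generated", "task-finished", "task-failed"}


def classify_event(raw_message: str) -> tuple[str, str, dict]:
    """Parse one server message and return (event, task_id, payload).

    Assumption: each WebSocket text frame carries exactly one JSON event.
    Raises ValueError for an event type not listed in this document.
    """
    message = json.loads(raw_message)
    header = message.get("header", {})
    event = header.get("event")
    if event not in KNOWN_EVENTS:
        raise ValueError(f"unexpected event type: {event!r}")
    # Defensive: treat a missing or null payload as an empty object.
    payload = message.get("payload") or {}
    return event, header.get("task_id", ""), payload
```

The caller can then branch on the returned event name and pass the payload to a handler for that event type.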

task-started

Description: The task has started successfully. The client can begin sending audio data.

header object

Properties

task_id string

Task ID generated by the client (UUID format).

event string

Event type. Fixed value: task-started.

attributes object

Additional attributes. Typically empty.

{
    "header": {
        "task_id": "2bf83b9a-baeb-4fda-8d9a-xxxxxxxxxxxx",
        "event": "task-started",
        "attributes": {}
    },
    "payload": {}
}

payload object

Always {}.
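Since audio should be sent only after task-started arrives, a client may want to hold captured audio until then. The sketch below shows one possible way to do this in Python. The AudioGate class and its send callable are hypothetical helpers, not part of any SDK; send stands in for whatever function writes one binary frame to the WebSocket.

```python
import json
from typing import Callable, List


class AudioGate:
    """Holds audio chunks until the server confirms the task with task-started."""

    def __init__(self, send: Callable[[bytes], None]) -> None:
        self._send = send  # hypothetical callable that writes one binary frame
        self._started = False
        self._pending: List[bytes] = []

    def on_message(self, raw_message: str) -> None:
        """Inspect a server message and open the gate on task-started."""
        header = json.loads(raw_message).get("header", {})
        if header.get("event") == "task-started" and not self._started:
            self._started = True
            # Flush everything captured while waiting for the confirmation.
            for chunk in self._pending:
                self._send(chunk)
            self._pending.clear()

    def push_audio(self, chunk: bytes) -> None:
        """Send the chunk immediately if the task has started, else buffer it."""
        if self._started:
            self._send(chunk)
        else:
            self._pending.append(chunk)
```

Buffering keeps the first moments of speech from being lost; a client that can simply delay audio capture until task-started does not need it.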

result-generated

Description: Carries a recognition result, which is either an interim result (sentence_end=false) or a final result (sentence_end=true).

header object

Properties

task_id string

Task ID generated by the client (UUID format).

event string

Event type. Fixed value: result-generated.

attributes object

Additional attributes. Typically empty.

{
  "header": {
    "task_id": "2bf83b9a-baeb-4fda-8d9a-xxxxxxxxxxxx",
    "event": "result-generated",
    "attributes": {}
  },
  "payload": {
    "output": {
      "sentence": {
        "begin_time": 170,
        "end_time": 920,
        "text": "Okay, I got it.",
        "heartbeat": false,
        "sentence_end": true,
        "sentence_id": 1,
        "words": [
          {
            "begin_time": 170,
            "end_time": 295,
            "text": "Okay",
            "punctuation": ","
          },
          {
            "begin_time": 295,
            "end_time": 503,
            "text": "I",
            "punctuation": ""
          },
          {
            "begin_time": 503,
            "end_time": 711,
            "text": "got",
            "punctuation": ""
          },
          {
            "begin_time": 711,
            "end_time": 920,
            "text": "it",
            "punctuation": ""
          }
        ]
      }
    },
    "usage": {
      "duration": 3
    }
  }
}

payload object

Properties

usage object

When payload.output.sentence.sentence_end is false (the current sentence has not yet ended), usage is null.

When payload.output.sentence.sentence_end is true (the current sentence has ended), usage.duration reports the billable duration of the current task.

Properties

duration integer

Billable duration of the task, in seconds.

output object

Properties

sentence object

Properties

begin_time integer

Sentence start time, in milliseconds.

end_time integer

Sentence end time, in milliseconds.

text string

Recognized text.

heartbeat boolean

If true, the result is a heartbeat packet and can be skipped.

sentence_end boolean

Indicates whether the sentence has ended. true = final result; false = interim result.

sentence_id integer

Sentence sequence identifier. For regular recognition results, sentence_id starts at 1 and increments. When heartbeat is true (heartbeat packet), sentence_id is fixed at 0.

words array[object]

Word-level timestamp information.

Properties

begin_time integer

Word start time, in milliseconds.

end_time integer

Word end time, in milliseconds.

text string

Recognized text.

punctuation string

Punctuation mark that follows the word. Empty if there is none.
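The fields above can be combined into a small handler for result-generated payloads. The following Python sketch skips heartbeat packets, separates interim from final results, and reads usage.duration only for final results. The function names and the shape of the returned dictionary are illustrative. Joining words with a single space is an assumption that suits space-delimited languages such as English.

```python
from typing import Optional


def render_words(words: list) -> str:
    """Join word texts, appending each word's punctuation mark.

    Assumption: words are separated by a single space.
    """
    return " ".join(
        (w.get("text") or "") + (w.get("punctuation") or "") for w in words
    )


def handle_result(payload: dict) -> Optional[dict]:
    """Summarize one result-generated payload; return None for heartbeats."""
    sentence = (payload.get("output") or {}).get("sentence") or {}
    if sentence.get("heartbeat"):
        return None  # heartbeat packet: carries no recognition result
    is_final = bool(sentence.get("sentence_end"))
    usage = payload.get("usage") or {}  # usage is null for interim results
    return {
        "sentence_id": sentence.get("sentence_id"),
        "final": is_final,
        "text": sentence.get("text", ""),
        "begin_ms": sentence.get("begin_time"),
        "end_ms": sentence.get("end_time"),
        "billable_seconds": usage.get("duration") if is_final else None,
    }
```

A typical client would display interim results as they arrive and replace them with the final result that has the same sentence_id.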

task-finished

Description: The task ended normally. The connection can be closed or reused.

header object

Properties

task_id string

Task ID generated by the client (UUID format).

event string

Event type. Fixed value: task-finished.

attributes object

Additional attributes. Typically empty.

{
    "header": {
        "task_id": "2bf83b9a-baeb-4fda-8d9a-xxxxxxxxxxxx",
        "event": "task-finished",
        "attributes": {}
    },
    "payload": {
        "output": {},
        "usage": null
    }
}

payload object

Carries no recognition data. In the example above, output is empty and usage is null. The client can ignore the contents.
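When the connection is reused after task-finished, the client has to decide when the current task is over and prepare an ID for the next one. The Python sketch below shows one way to do that. Both function names are illustrative, and generating a fresh UUID for each new task is an assumption based on the task_id description, not a documented requirement.

```python
import json
import uuid


def is_task_finished(raw_message: str, task_id: str) -> bool:
    """Return True if the message is the task-finished event for task_id."""
    header = json.loads(raw_message).get("header", {})
    return header.get("event") == "task-finished" and header.get("task_id") == task_id


def new_task_id() -> str:
    """Generate a client-side task ID in UUID format for the next task."""
    return str(uuid.uuid4())
```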

task-failed

Description: The task failed. The connection is closed and cannot be reused.

header object

Properties

task_id string

Task ID generated by the client (UUID format).

event string

Event type. Fixed value: task-failed.

error_code string

Error code that identifies the type of error.

error_message string

Detailed error message.

attributes object

Additional attributes. Typically empty.

{
    "header": {
        "task_id": "2bf83b9a-baeb-4fda-8d9a-xxxxxxxxxxxx",
        "event": "task-failed",
        "error_code": "CLIENT_ERROR",
        "error_message": "request timeout after 23 seconds.",
        "attributes": {}
    },
    "payload": {}
}

payload object

Always {}.
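A client can surface task-failed as an exception so that calling code cannot overlook it. The following Python sketch reads error_code and error_message from the header, as described above. TaskFailedError and raise_if_failed are hypothetical names chosen for this illustration.

```python
import json


class TaskFailedError(Exception):
    """Raised when the server reports task-failed (hypothetical helper)."""

    def __init__(self, task_id: str, error_code: str, error_message: str) -> None:
        super().__init__(f"[{error_code}] {error_message} (task_id={task_id})")
        self.task_id = task_id
        self.error_code = error_code
        self.error_message = error_message


def raise_if_failed(raw_message: str) -> None:
    """Raise TaskFailedError if the message is a task-failed event."""
    header = json.loads(raw_message).get("header", {})
    if header.get("event") == "task-failed":
        raise TaskFailedError(
            header.get("task_id", ""),
            header.get("error_code", ""),
            header.get("error_message", ""),
        )
```

Because the connection cannot be reused after this event, the caller would open a new connection before retrying.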