
Cloud Monitor: LLM trace field definitions

Last Updated: Sep 05, 2025

Alibaba Cloud defines its Large Language Model (LLM) trace fields based on the OpenTelemetry open standard and on concepts from the LLM application domain. By extending Attributes, Resources, and Events, these fields describe the semantics of LLM application call chain data and capture key information such as LLM inputs and outputs and token consumption. The fields provide rich, context-aware semantic data for scenarios such as Completion, Chat, retrieval-augmented generation (RAG), Agent, and Tool to simplify data tracking and reporting. These semantic fields are continuously updated and optimized as the community evolves.

The definitions of level-1 span fields are based on the OpenTelemetry open standard. For more information about the underlying level-1 trace fields stored in Alibaba Cloud Managed Service for OpenTelemetry, see Trace analysis parameters.

Note

The LLM-related SpanKind is an attribute and is different from the Span kind defined in OpenTelemetry traces.
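
For example, a minimal sketch with the OpenTelemetry Python SDK (the tracer name and span name are illustrative) shows that the OpenTelemetry SpanKind is set on the span itself, while the LLM operation type is reported as the gen_ai.span.kind attribute:

```python
from opentelemetry import trace
from opentelemetry.trace import SpanKind  # OpenTelemetry's own span kind

tracer = trace.get_tracer("llm-demo")  # tracer name is illustrative

# The OpenTelemetry SpanKind (INTERNAL here) is set on the span itself;
# the LLM operation type is carried by the gen_ai.span.kind attribute.
with tracer.start_as_current_span("chat", kind=SpanKind.INTERNAL) as span:
    span.set_attribute("gen_ai.span.kind", "LLM")
```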

Common fields

Attributes

| Attribute key | Description | Type | Example | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.session.id | The session ID. | String | ddde34343-f93a-4477-33333-sdfsdaf | Conditionally required |
| gen_ai.user.id | The ID of the end user of the application. | String | u-lK8JddD | Conditionally required |
| gen_ai.span.kind | The operation type. | String | See LLM Span Kind | Required |
| gen_ai.framework | The type of framework used. | String | langchain; llama_index | Conditionally required |

Resources

| Resource key | Description | Type | Example | Requirement level |
| --- | --- | --- | --- | --- |
| service.name | The application name. | String | test-easy-rag | Required |
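
The following sketch, which assumes the OpenTelemetry Python SDK with an exporter or span processor configured elsewhere, shows how the service.name resource and the common attributes might be reported. All values are illustrative:

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider

# service.name becomes the application name associated with the trace.
resource = Resource.create({"service.name": "test-easy-rag"})
trace.set_tracer_provider(TracerProvider(resource=resource))
tracer = trace.get_tracer("llm-demo")

with tracer.start_as_current_span("rag-query") as span:
    span.set_attributes({
        "gen_ai.span.kind": "CHAIN",          # required: operation type
        "gen_ai.session.id": "session-1234",  # illustrative session ID
        "gen_ai.user.id": "u-lK8JddD",        # ID of the end user
        "gen_ai.framework": "llama_index",    # framework in use
    })
```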

Chain

A Chain is a tool that connects an LLM with multiple other components to perform complex tasks. It can be nested and may contain Retrieval, Embedding, and LLM calls.

Attributes

| Attribute key | Description | Type | Example | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | The operation type. This is an enumeration specific to the LLM SpanKind. For a Chain, the value must be CHAIN. | String | CHAIN | Required |
| gen_ai.operation.name | The sub-type of the operation. | String | WORKFLOW; TASK | Conditionally required |
| input.value | The input content. | String | Who Are You! | Recommended |
| output.value | The returned content. | String | I am ChatBot | Recommended |
| gen_ai.user.time_to_first_token | The time to first token (TTFT). This is the latency for the first packet of the overall response to a query. It measures the time from when the server receives the user request to when the first packet is returned. The unit is nanoseconds. | Integer | 1000000 | Recommended |
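
A hypothetical Chain instrumentation might look like the following sketch. The function name, span name, and placeholder answer are assumptions, and the nested Retrieval, Embedding, and LLM calls are omitted:

```python
import time
from opentelemetry import trace

tracer = trace.get_tracer("llm-demo")

def run_workflow(question: str) -> str:
    with tracer.start_as_current_span("workflow") as span:
        span.set_attribute("gen_ai.span.kind", "CHAIN")
        span.set_attribute("gen_ai.operation.name", "WORKFLOW")
        span.set_attribute("input.value", question)

        start = time.time_ns()
        answer = "I am ChatBot"  # placeholder for the nested retrieval/LLM calls
        # Time to first token, in nanoseconds, measured from request receipt.
        span.set_attribute("gen_ai.user.time_to_first_token", time.time_ns() - start)

        span.set_attribute("output.value", answer)
        return answer
```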

Retriever

A Retriever typically accesses a vector store or database to retrieve data. This data is used to supplement context to improve the accuracy and efficiency of the LLM's response.

Attributes

| Attribute key | Description | Type | Example | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | The operation type. This is an enumeration specific to the LLM SpanKind. For a Retriever, the value must be RETRIEVER. | String | RETRIEVER | Required |
| retrieval.query | The query string used for retrieval. | String | what is the topic in xxx? | Recommended |
| retrieval.document | A list of retrieved documents. | JSON array | [{"document":{"content":"This is a sample document content.","metadata":{"source":"https://aliyun.com/xxx/wiki","title":"How LLM Works"},"score":0.7680862242896571,"id":"7af0e529-2531-42d9-bf3a-d5074a73c184"}}] | Required |
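
A possible Retriever instrumentation is sketched below. The document structure follows the retrieval.document example above; the function name and the way documents are passed in are assumptions:

```python
import json
from opentelemetry import trace

tracer = trace.get_tracer("llm-demo")

def retrieve(query: str, documents: list) -> list:
    with tracer.start_as_current_span("retrieve") as span:
        span.set_attribute("gen_ai.span.kind", "RETRIEVER")
        span.set_attribute("retrieval.query", query)
        # retrieval.document is a JSON array of the retrieved documents.
        span.set_attribute("retrieval.document", json.dumps([
            {"document": {"content": doc.get("content", ""),
                          "metadata": doc.get("metadata", {}),
                          "score": doc.get("score"),
                          "id": doc.get("id")}}
            for doc in documents
        ]))
        return documents
```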

Reranker

A Reranker sorts multiple input documents based on their relevance to the query content and may return the top K documents for the LLM.

Attributes

| Attribute key | Description | Type | Example | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | The operation type. This is an enumeration specific to the LLM SpanKind. For a Reranker, the value must be RERANKER. | String | RERANKER | Required |
| reranker.query | The query input for the Reranker request. | String | How to format timestamp? | Optional |
| reranker.model_name | The name of the model used by the Reranker. | String | cross-encoder/ms-marco-MiniLM-L-12-v2 | Optional |
| reranker.top_k | The number of top-ranked documents returned after reranking. | Integer | 3 | Optional |
| reranker.input_document | Metadata for the input documents to be reranked, as a JSON array. The metadata contains basic document information, such as the path, filename, and source. | String | - | Required |
| reranker.output_document | Metadata for the output documents after reranking, as a JSON array. The metadata contains basic document information, such as the path, filename, and source. | String | - | Required |
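
A Reranker call might be instrumented as in the following sketch. The model name reuses the example above, and the reranking itself is replaced with a placeholder:

```python
import json
from opentelemetry import trace

tracer = trace.get_tracer("llm-demo")

def rerank(query: str, documents: list, top_k: int = 3) -> list:
    with tracer.start_as_current_span("rerank") as span:
        span.set_attribute("gen_ai.span.kind", "RERANKER")
        span.set_attribute("reranker.query", query)
        span.set_attribute("reranker.model_name",
                           "cross-encoder/ms-marco-MiniLM-L-12-v2")
        span.set_attribute("reranker.top_k", top_k)
        span.set_attribute("reranker.input_document", json.dumps(documents))

        reranked = documents[:top_k]  # placeholder for the real reranking call
        span.set_attribute("reranker.output_document", json.dumps(reranked))
        return reranked
```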

LLM

An LLM span identifies a call to a large model, such as requesting inference or text generation using an SDK or OpenAPI.

Attributes

| Attribute key | Description | Type | Example | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | The operation type. This is an enumeration specific to the LLM SpanKind. For an LLM, the value must be LLM. | String | LLM | Required |
| gen_ai.operation.name | The sub-type of the operation. | String | chat; completion | Optional |
| gen_ai.prompt_template.template | The prompt template. | String | Weather forecast for {city} on {date} | Optional |
| gen_ai.prompt_template.variables | The specific values for the prompt template. | String | { context: "<context from retrieval>", subject: "math" } | Optional |
| gen_ai.prompt_template.version | The version number of the prompt template. | String | 1.0 | Optional |
| gen_ai.system | The provider of the large model. | String | openai | Required |
| gen_ai.request.parameters | The input parameters for the LLM call. | String | {"temperature": 0.7} | Optional |
| gen_ai.model_name | The model name. | String | gpt-4 | Optional |
| gen_ai.conversation.id | The unique ID of the conversation. Collect this value if the instrumentation can easily obtain the session ID. | String | conv_5j66UpCpwteGg4YSxUnt7lPY | Conditionally required |
| gen_ai.output.type | The output type specified in the LLM request. Collect this value if it is available and the request specifies a type, such as an output format. | String | text; json; image; audio | Conditionally required |
| gen_ai.request.choice.count | The number of candidate generations requested from the LLM. | Integer | 3 | Conditionally required (only when the value is not 1) |
| gen_ai.request.model | The model name specified in the LLM request. | String | gpt-4 | Required |
| gen_ai.request.seed | The seed specified in the LLM request. | Integer | 100 | Conditionally required |
| gen_ai.request.frequency_penalty | The frequency penalty set in the LLM request. | Float | 0.1 | Recommended |
| gen_ai.request.max_tokens | The maximum number of tokens specified in the LLM request. | Integer | 100 | Recommended |
| gen_ai.request.presence_penalty | The presence penalty set in the LLM request. | Float | 0.1 | Recommended |
| gen_ai.request.temperature | The temperature specified in the LLM request. | Float | 0.1 | Recommended |
| gen_ai.request.top_p | The top_p value specified in the LLM request. | Float | 1.0 | Recommended |
| gen_ai.request.top_k | The top_k value specified in the LLM request. | Float | 1.0 | Recommended |
| gen_ai.request.is_stream | Indicates whether the response is streamed. If this attribute is not present, the value is considered false. | Boolean | false | Recommended |
| gen_ai.request.stop_sequences | The stop sequences for the LLM. | String[] | ["stop"] | Recommended |
| gen_ai.request.tool_calls | The content of the tool calls. | String | [{"tool_call.function.name": "get_current_weather"}] | Recommended |
| gen_ai.response.id | The unique ID of the response generated by the LLM. | String | chatcmpl-123 | Recommended |
| gen_ai.response.model | The name of the model used for the LLM generation. | String | gpt-4-0613 | Recommended |
| gen_ai.response.finish_reason | The reason why the LLM stopped generating. | String[] | ["stop"] | Recommended |
| gen_ai.response.time_to_first_token | The time to first token of the large model itself in a streaming scenario. It represents the latency for the first packet of the overall response to a query, measured from when the server receives the user request to when the first packet is returned. The unit is nanoseconds. | Integer | 1000000 | Recommended |
| gen_ai.response.reasoning_time | The inference time of the reasoning model. It represents the duration of the response reasoning process. The unit is milliseconds. | Integer | 1248 | Recommended |
| gen_ai.usage.input_tokens | The number of tokens used for the input. | Integer | 100 | Recommended |
| gen_ai.usage.output_tokens | The number of tokens used for the output. | Integer | 200 | Recommended |
| gen_ai.usage.total_tokens | The total number of tokens used. | Integer | 300 | Recommended |
| gen_ai.input.messages_ref | A link to the model's input content. | String | s3://acme.prod.support_bot.chats.2025/conv_1234/run_42.json | Recommended |
| gen_ai.output.messages_ref | A link to the model's output content. | String | s3://acme.prod.support_bot.chats.2025/conv_1234/run_42.json | Recommended |
| gen_ai.system.instructions_ref | An external link to the content of the system prompt (system instruction). If the system prompt content can be obtained separately, record it using this field. If the system prompt content is part of the model call, record it in the link referenced by the gen_ai.input.messages_ref attribute. | String | s3://acme.prod.support_bot.chats.2025/conv_1234/invocation_42.json | Recommended if available |
| gen_ai.input.messages | The model's input content. Messages must be provided in the order they were sent to the model or agent. By default, this information should not be collected unless the user explicitly enables it. | String | [{"role": "user", "parts": [{"type": "text", "content": "Weather in Paris?"}]}, {"role": "assistant", "parts": [{"type": "tool_call", "id": "call_VSPygqKTWdrhaFErNvMV18Yl", "name":"get_weather", "arguments":{"location":"Paris"}}]}, {"role": "tool", "parts": [{"type": "tool_call_response", "id":"call_VSPygqKTWdrhaFErNvMV18Yl", "result":"rainy, 57°F"}]}] | Optional |
| gen_ai.output.messages | The model's output content. Messages must be provided in the order they were returned by the model or agent. By default, this information should not be collected unless the user explicitly enables it. | String | [{"role":"assistant","parts":[{"type":"text","content":"The weather in Paris is currently rainy with a temperature of 57°F."}],"finish_reason":"stop"}] | Optional |
| gen_ai.system.instructions | The content of the system prompt (system instruction), recorded as a JSON string. If the system prompt content can be obtained separately, record it using this field. If the system prompt content is part of the model call, record it in the gen_ai.input.messages attribute. By default, this information should not be collected unless the user explicitly enables it. | String | {"role": "system", "message": {"type": "text", "content": "You are a helpful assistant"}} | Optional |
| gen_ai.response.reasoning_content | The reasoning content from the reasoning model. This represents the content of the response reasoning process. The default length is limited to 1,024 characters. Any content exceeding this limit should be truncated. | String | Okay, let's tackle this question about xxx. | Optional |
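
The sketch below shows how an LLM call could report the request, response, and usage attributes. The model name, parameter values, and token counts reuse the examples above; a real integration would copy them from the provider's request options and response object:

```python
import json
import time
from opentelemetry import trace

tracer = trace.get_tracer("llm-demo")

def call_llm(messages: list) -> str:
    with tracer.start_as_current_span("chat gpt-4") as span:
        # Request-side attributes, taken from the call parameters.
        span.set_attributes({
            "gen_ai.span.kind": "LLM",
            "gen_ai.operation.name": "chat",
            "gen_ai.system": "openai",
            "gen_ai.request.model": "gpt-4",
            "gen_ai.request.temperature": 0.7,
            "gen_ai.request.max_tokens": 100,
            "gen_ai.request.is_stream": False,
            "gen_ai.request.parameters": json.dumps({"temperature": 0.7}),
        })

        start = time.time_ns()
        # Placeholder for the real SDK/OpenAPI call; a real integration would
        # read these values from the provider's response object.
        completion_text = "The weather in Paris is currently rainy."
        span.set_attributes({
            "gen_ai.response.model": "gpt-4-0613",
            "gen_ai.response.finish_reason": ["stop"],
            "gen_ai.response.time_to_first_token": time.time_ns() - start,
            "gen_ai.usage.input_tokens": 100,
            "gen_ai.usage.output_tokens": 200,
            "gen_ai.usage.total_tokens": 300,
        })
        return completion_text
```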

Embedding

An Embedding span identifies an embedding process, such as an operation on a text embedding model. The resulting embeddings can later be used for similarity-based queries to improve answers to questions.

Attributes

| Attribute key | Description | Type | Example | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | The operation type. This is an enumeration specific to the LLM SpanKind. For an Embedding, the value must be EMBEDDING. | String | EMBEDDING | Required |
| gen_ai.usage.input_tokens | The token consumption of the input text. | Integer | 10 | Optional |
| gen_ai.usage.total_tokens | The total token consumption for the embedding. | Integer | 10 | Optional |
| embedding.model_name | The name of the embedding model. | String | text-embedding-v1 | Optional |
| embedding.embedding_output | The embedding result. | String | - | Optional |
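
An Embedding call might be instrumented as follows; the model name reuses the example above, and the vector and token counts are placeholders:

```python
from opentelemetry import trace

tracer = trace.get_tracer("llm-demo")

def embed(text: str) -> list:
    with tracer.start_as_current_span("embedding") as span:
        span.set_attribute("gen_ai.span.kind", "EMBEDDING")
        span.set_attribute("embedding.model_name", "text-embedding-v1")

        vector = [0.0] * 8  # placeholder for the embedding model call
        span.set_attribute("gen_ai.usage.input_tokens", 10)
        span.set_attribute("gen_ai.usage.total_tokens", 10)
        return vector
```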

Tool

A Tool span identifies a call to an external tool, such as calling a calculator or requesting the latest weather conditions from a weather API.

Attributes

| Attribute key | Description | Type | Example | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | The operation type. This is an enumeration specific to the LLM SpanKind. For a Tool, the value must be TOOL. | String | TOOL | Required |
| tool.name | The tool name. | String | WeatherAPI | Required |
| tool.description | The tool description. | String | An API to get weather data. | Required |
| tool.parameters | The input parameters for the tool. | String | {'a': 'int' } | Required |
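
A Tool call could be instrumented as in this sketch; the tool name and description reuse the examples above, and the weather lookup is a placeholder:

```python
import json
from opentelemetry import trace

tracer = trace.get_tracer("llm-demo")

def get_weather(city: str) -> str:
    with tracer.start_as_current_span("tool get_weather") as span:
        span.set_attribute("gen_ai.span.kind", "TOOL")
        span.set_attribute("tool.name", "WeatherAPI")
        span.set_attribute("tool.description", "An API to get weather data.")
        span.set_attribute("tool.parameters", json.dumps({"city": city}))
        return "rainy, 57°F"  # placeholder for the real API call
```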

Agent

An Agent represents an agent scenario. It is a more complex Chain that decides the next step based on the inference results of an LLM. For example, it may involve multiple calls to LLMs and Tools, making decisions step-by-step to produce a final answer.

Attributes

| Attribute key | Description | Type | Example | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | The operation type. This is an enumeration specific to the LLM SpanKind. For an Agent, the value must be AGENT. | String | AGENT | Required |
| input.value | The input parameters, recording the original input to the agent. | String | Please help me plan xxxx | Required |
| input.mime_type | The MIME type of the input. | String | text/plain; application/json | Optional |
| output.value | The returned result, recording the final output of the agent. | String | Planning complete, please check the result xxx | Required |
| output.mime_type | The MIME type of the output. | String | text/plain; application/json | Optional |
| gen_ai.response.time_to_first_token | The time to first token for the Agent. It represents the latency for the first packet of the overall response to a query, measured from when the server receives the user request to when the first packet is returned. The unit is nanoseconds. | Integer | 1000000 | Recommended |
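
An Agent run might be instrumented as sketched below; the loop of LLM and Tool calls that the agent performs is omitted, and the input and output values are illustrative:

```python
import time
from opentelemetry import trace

tracer = trace.get_tracer("llm-demo")

def run_agent(request: str) -> str:
    with tracer.start_as_current_span("agent") as span:
        span.set_attribute("gen_ai.span.kind", "AGENT")
        span.set_attribute("input.value", request)
        span.set_attribute("input.mime_type", "text/plain")

        start = time.time_ns()
        # Placeholder for the step-by-step LLM and Tool calls the agent makes.
        answer = "Planning complete, please check the result."
        span.set_attribute("gen_ai.response.time_to_first_token",
                           time.time_ns() - start)

        span.set_attribute("output.value", answer)
        span.set_attribute("output.mime_type", "text/plain")
        return answer
```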

Task

A Task span identifies an internal custom method, such as calling a local function to apply custom logic.

Attributes

| Attribute key | Description | Type | Example | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | The operation type. This is an enumeration specific to the LLM SpanKind. For a Task, the value must be TASK. | String | TASK | Required |
| input.value | The input parameters. | String | Custom JSON format | Optional |
| input.mime_type | The MIME type of the input. | String | text/plain; application/json | Optional |
| output.mime_type | The MIME type of the output. | String | text/plain; application/json | Optional |
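
A Task wrapping a local function could be instrumented as in the following sketch; the function and payload are hypothetical:

```python
import json
from opentelemetry import trace

tracer = trace.get_tracer("llm-demo")

def post_process(payload: dict) -> dict:
    with tracer.start_as_current_span("task post_process") as span:
        span.set_attribute("gen_ai.span.kind", "TASK")
        span.set_attribute("input.value", json.dumps(payload))
        span.set_attribute("input.mime_type", "application/json")

        result = payload  # placeholder for the custom local logic
        span.set_attribute("output.mime_type", "application/json")
        return result
```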