Application Real-Time Monitoring Service:LLM Trace field definitions

Last Updated: Oct 16, 2025

Alibaba Cloud defines LLM Trace fields based on the OpenTelemetry standard and concepts from the large language model (LLM) application domain. These fields extend Attributes, Resources, and Events to describe the semantics of LLM application trace data. They reflect key operations such as LLM input and output requests and token consumption. They provide rich, context-aware semantic data for scenarios such as Completion, Chat, retrieval-augmented generation (RAG), Agent, and Tool to facilitate data tracing and reporting. These semantic fields will be continuously updated and optimized as the community evolves.

Top-level Span field definitions are based on the OpenTelemetry open standard. For more information about the top-level Trace fields stored by Managed Service for OpenTelemetry, see Trace analysis parameters.

Note

The LLM-related SpanKind is an Attribute. It is different from the Span kind defined in OpenTelemetry Traces.

Common fields

Attributes

| AttributeKey | Description | Type | Example value | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.session.id | Session ID | String | ddde34343-f93a-4477-33333-sdfsdaf | Conditionally required |
| gen_ai.user.id | The ID of the end user of the application. | String | u-lK8JddD | Conditionally required |
| gen_ai.span.kind | Operation type | String | See LLM Span Kind | Required |
| gen_ai.framework | The type of framework used. | String | langchain; llama_index | Conditionally required |
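A minimal sketch (plain Python, no SDK) of how these common attributes might be assembled before being attached to a span through set_attribute-style calls. The validator and its constant are hypothetical helpers for illustration, not part of any Alibaba Cloud or OpenTelemetry API.

```python
REQUIRED_COMMON_KEYS = {"gen_ai.span.kind"}  # the only unconditionally required key

def validate_common_attributes(attrs):
    """Return True if every unconditionally required common key is present."""
    return REQUIRED_COMMON_KEYS.issubset(attrs)

common_attrs = {
    "gen_ai.session.id": "ddde34343-f93a-4477-33333-sdfsdaf",
    "gen_ai.user.id": "u-lK8JddD",
    "gen_ai.span.kind": "CHAIN",      # one of the LLM span kinds defined below
    "gen_ai.framework": "langchain",
}

print(validate_common_attributes(common_attrs))  # True
```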

Resources

| ResourceKey | Description | Type | Example value | Requirement level |
| --- | --- | --- | --- | --- |
| service.name | Application name | String | test-easy-rag | Required |

Chain

A Chain connects an LLM with multiple other components to perform complex tasks. It can include Retrieval, Embedding, and LLM calls, and even nested Chains.

Attributes

| AttributeKey | Description | Type | Example value | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | Operation type. A dedicated enumeration for the LLM spanKind. For a Chain, the value must be CHAIN. | String | CHAIN | Required |
| gen_ai.operation.name | Secondary operation type | String | WORKFLOW; TASK | Conditionally required |
| input.value | Input content | String | Who Are You! | Recommended |
| output.value | Returned content | String | I am ChatBot | Recommended |
| gen_ai.user.time_to_first_token | Time to first token: the overall time from when the server receives a user's request to when the first packet of the response is returned, in nanoseconds. | Integer | 1000000 | Recommended |
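The time-to-first-token field is defined in nanoseconds, so it maps directly onto a monotonic nanosecond clock. A hedged sketch of how an instrumentation might derive it (the chain invocation itself is stubbed out here):

```python
import time

# Time from request receipt to the first response packet, for
# gen_ai.user.time_to_first_token. A real instrumentation would take the
# second reading when the first packet actually arrives.
request_received_ns = time.monotonic_ns()
# ... run the chain and block until the first packet of the response ...
first_packet_ns = time.monotonic_ns()

chain_attrs = {
    "gen_ai.span.kind": "CHAIN",
    "gen_ai.operation.name": "WORKFLOW",
    "input.value": "Who Are You!",
    "output.value": "I am ChatBot",
    # the field is defined in nanoseconds, so monotonic_ns needs no conversion
    "gen_ai.user.time_to_first_token": first_packet_ns - request_received_ns,
}

print(chain_attrs["gen_ai.user.time_to_first_token"] >= 0)  # True
```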

Retriever

A Retriever typically accesses a vector store or database to retrieve data. This is often used to supplement context to improve the accuracy and efficiency of the LLM response.

Attributes

| AttributeKey | Description | Type | Example value | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | Operation type. A dedicated enumeration for the LLM spanKind. For a Retriever, the value must be RETRIEVER. | String | RETRIEVER | Required |
| retrieval.query | The query used for retrieval. | String | what is the topic in xxx? | Recommended |
| retrieval.document | A list of retrieved documents. | JSON array | [{"document":{"content":"This is a sample document content.","metadata":{"source":"https://aliyun.com/xxx/wiki","title":"How LLM Works"},"score":0.7680862242896571,"id":"7af0e529-2531-42d9-bf3a-d5074a73c184"}}] | Required |
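A sketch of serializing retrieved documents into the JSON-array shape shown for retrieval.document. The content, URL, score, and ID are the sample values from the table, not real data; attribute values are recorded as strings, so the list is passed through json.dumps.

```python
import json

docs = [{
    "document": {
        "content": "This is a sample document content.",
        "metadata": {"source": "https://aliyun.com/xxx/wiki", "title": "How LLM Works"},
        "score": 0.7680862242896571,
        "id": "7af0e529-2531-42d9-bf3a-d5074a73c184",
    }
}]

retriever_attrs = {
    "gen_ai.span.kind": "RETRIEVER",
    "retrieval.query": "what is the topic in xxx?",
    "retrieval.document": json.dumps(docs),  # serialized JSON array
}

print(json.loads(retriever_attrs["retrieval.document"])[0]["document"]["metadata"]["title"])  # How LLM Works
```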

Reranker

A Reranker sorts multiple input documents based on their relevance to a query. It may return the top-K documents for the LLM.

Attributes

| AttributeKey | Description | Type | Example value | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | Operation type. A dedicated enumeration for the LLM spanKind. For a Reranker, the value must be RERANKER. | String | RERANKER | Required |
| reranker.query | The input parameter for the Reranker request. | String | How to format timestamp? | Optional |
| reranker.model_name | The name of the model used by the Reranker. | String | cross-encoder/ms-marco-MiniLM-L-12-v2 | Optional |
| reranker.top_k | The number of top-ranked documents to return after reranking. | Integer | 3 | Optional |
| reranker.input_document | Metadata related to the input documents for reranking. It is a JSON array structure. The metadata contains basic document information, such as path, file name, and source. | String | - | Required |
| reranker.output_document | Metadata related to the output documents after reranking. It is a JSON array structure. The metadata contains basic document information, such as path, file name, and source. | String | - | Required |
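A sketch of the rerank step these attributes describe: sort candidates by a relevance score, keep the top-k, and record input/output metadata as JSON-array strings. Titles and scores are made-up sample data; real metadata would also carry path, file name, and source.

```python
import json

def rerank(documents, top_k):
    """Sort by descending relevance score and keep the top_k documents."""
    return sorted(documents, key=lambda d: d["score"], reverse=True)[:top_k]

candidates = [
    {"title": "doc-a", "score": 0.41},
    {"title": "doc-b", "score": 0.93},
    {"title": "doc-c", "score": 0.77},
    {"title": "doc-d", "score": 0.12},
]

top_docs = rerank(candidates, top_k=3)

reranker_attrs = {
    "gen_ai.span.kind": "RERANKER",
    "reranker.query": "How to format timestamp?",
    "reranker.top_k": 3,
    "reranker.input_document": json.dumps(candidates),
    "reranker.output_document": json.dumps(top_docs),
}

print([d["title"] for d in top_docs])  # ['doc-b', 'doc-c', 'doc-a']
```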

LLM

An LLM span indicates a call to a large language model, such as using an SDK or OpenAPI to request inference or text generation from different large models.

Attributes

| AttributeKey | Description | Type | Example value | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | Operation type. A dedicated enumeration for the LLM spanKind. For an LLM, the value must be LLM. | String | LLM | Required |
| gen_ai.operation.name | Secondary operation type | String | chat; completion | Optional |
| gen_ai.prompt_template.template | Prompt template | String | Weather forecast for {city} on {date} | Optional |
| gen_ai.prompt_template.variables | The specific values for the prompt template. | String | { context: "<context from retrieval>", subject: "math" } | Optional |
| gen_ai.prompt_template.version | The version number of the prompt template. | String | 1.0 | Optional |
| gen_ai.system | The provider of the large model. | String | openai | Required |
| gen_ai.request.parameters | Input parameters for the LLM call. | String | {"temperature": 0.7} | Optional |
| gen_ai.model_name | Model name | String | gpt-4 | Optional |
| gen_ai.conversation.id | The unique ID of the conversation. Collect this if the session ID can be easily obtained during instrumentation. | String | conv_5j66UpCpwteGg4YSxUnt7lPY | Conditionally required |
| gen_ai.output.type | The output type specified in the LLM request. Collect this if it is available and the request specifies a type, such as an output format. | String | text; json; image; audio | Conditionally required |
| gen_ai.request.choice.count | The number of candidate generations requested from the LLM. | Integer | 3 | Conditionally required if the value is not 1 |
| gen_ai.request.model | The model name specified in the LLM request. | String | gpt-4 | Required |
| gen_ai.request.seed | The seed specified in the LLM request. | String | 100 | Conditionally required |
| gen_ai.request.frequency_penalty | The frequency penalty set in the LLM request. | Float | 0.1 | Recommended |
| gen_ai.request.max_tokens | The maximum number of tokens specified in the LLM request. | Integer | 100 | Recommended |
| gen_ai.request.presence_penalty | The presence penalty set in the LLM request. | Float | 0.1 | Recommended |
| gen_ai.request.temperature | The temperature specified in the LLM request. | Float | 0.1 | Recommended |
| gen_ai.request.top_p | The top_p value specified in the LLM request. | Float | 1.0 | Recommended |
| gen_ai.request.top_k | The top_k value specified in the LLM request. | Float | 1.0 | Recommended |
| gen_ai.request.is_stream | Indicates whether the response is streamed. If this field is absent, the value is considered false. | Boolean | false | Recommended |
| gen_ai.request.stop_sequences | The stop sequences for the LLM. | String[] | ["stop"] | Recommended |
| gen_ai.request.tool_calls | The content of tool calls. (To be deprecated and replaced with gen_ai.tool.definitions.) | String | [{"tool_call.function.name": "get_current_weather"}] | Recommended |
| gen_ai.response.id | The unique ID generated by the LLM. | String | chatcmpl-123 | Recommended |
| gen_ai.response.model | The name of the model used for LLM generation. | String | gpt-4-0613 | Recommended |
| gen_ai.response.finish_reason | The reason why the LLM stopped generating. | String[] | ["stop"] | Recommended |
| gen_ai.response.time_to_first_token | The time to first token of the model itself in a streaming response scenario: the overall time from when the server receives a user's request to when the first packet of the response is returned, in nanoseconds. | Integer | 1000000 | Recommended |
| gen_ai.response.reasoning_time | The inference time of the reasoning model: the duration of the reasoning process in the response, in milliseconds. | Integer | 1248 | Recommended |
| gen_ai.usage.input_tokens | The number of tokens used for the input. | Integer | 100 | Recommended |
| gen_ai.usage.output_tokens | The number of tokens used for the output. | Integer | 200 | Recommended |
| gen_ai.usage.total_tokens | The total number of tokens used. | Integer | 300 | Recommended |
| gen_ai.input.messages_ref | A link to the model input content. | String | s3://acme.prod.support_bot.chats.2025/conv_1234/run_42.json | Recommended |
| gen_ai.output.messages_ref | A link to the model output content. | String | s3://acme.prod.support_bot.chats.2025/conv_1234/run_42.json | Recommended |
| gen_ai.system.instructions_ref | An external link to the content of the system prompt (system instruction). If the system prompt content can be obtained separately, record the link in this field; if it is part of the model call, record it in the link referenced by gen_ai.input.messages_ref. | String | s3://acme.prod.support_bot.chats.2025/conv_1234/invocation_42.json | Recommended if available |
| gen_ai.input.messages | Model input content. Messages must be provided in the order they are sent to the model or agent. By default, this information should not be collected unless the user explicitly enables it. | String | [{"role": "user", "parts": [{"type": "text", "content": "Weather in Paris?"}]}, {"role": "assistant", "parts": [{"type": "tool_call", "id": "call_VSPygqKTWdrhaFErNvMV18Yl", "name":"get_weather", "arguments":{"location":"Paris"}}]}, {"role": "tool", "parts": [{"type": "tool_call_response", "id":"call_VSPygqKTWdrhaFErNvMV18Yl", "result":"rainy, 57°F"}]}] | Optional |
| gen_ai.output.messages | Model output content. Messages must be provided in the order they are returned by the model or agent. By default, this information should not be collected unless the user explicitly enables it. | String | [{"role":"assistant","parts":[{"type":"text","content":"The weather in Paris is currently rainy with a temperature of 57°F."}],"finish_reason":"stop"}] | Optional |
| gen_ai.system.instructions | The content of the system prompt (system instruction), recorded as a JSON string. If the system prompt content can be obtained separately, record it in this field; if it is part of the model call, record it in gen_ai.input.messages. By default, this information should not be collected unless the user explicitly enables it. | String | {"role": "system", "message": {"type": "text", "content": "You are a helpful assistant"}} | Optional |
| gen_ai.response.reasoning_content | The reasoning content from the reasoning model: the content of the reasoning process in the response. The default length limit is 1024 characters; content that exceeds this limit should be truncated. | String | Okay, let's tackle this question about xxx. | Optional |
| gen_ai.tool.definitions | Tool definitions | String | [{"type":"function","name":"get_current_weather","description": "Get the current weather in a given location","parameters":{"type":"object","properties":{"location":{"type":"string","description":"The city and state, e.g. San Francisco, CA"},"unit": {"type":"string","enum":["celsius","fahrenheit"]}},"required":["location","unit"]}}] | Required |
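A sketch of assembling an LLM span's attributes: the messages use the role/parts shape shown in the table, and the token-usage fields satisfy the natural invariant that input plus output tokens equal the total. The messages and token counts are sample values, not output from a real model call.

```python
import json

input_messages = [
    {"role": "user", "parts": [{"type": "text", "content": "Weather in Paris?"}]},
]
output_messages = [
    {"role": "assistant",
     "parts": [{"type": "text", "content": "The weather in Paris is currently rainy."}],
     "finish_reason": "stop"},
]

llm_attrs = {
    "gen_ai.span.kind": "LLM",
    "gen_ai.system": "openai",
    "gen_ai.request.model": "gpt-4",
    "gen_ai.request.parameters": json.dumps({"temperature": 0.7}),
    "gen_ai.input.messages": json.dumps(input_messages),
    "gen_ai.output.messages": json.dumps(output_messages),
    "gen_ai.usage.input_tokens": 100,
    "gen_ai.usage.output_tokens": 200,
    "gen_ai.usage.total_tokens": 300,
}

# input + output = total
print(llm_attrs["gen_ai.usage.input_tokens"]
      + llm_attrs["gen_ai.usage.output_tokens"]
      == llm_attrs["gen_ai.usage.total_tokens"])  # True
```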

Embedding

An Embedding span represents an embedding operation, such as converting text into vectors with an embedding model. The resulting vectors can later be used in similarity queries to optimize subsequent operations.

Attributes

| AttributeKey | Description | Type | Example value | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | Operation type. A dedicated enumeration for the LLM spanKind. For an Embedding, the value must be EMBEDDING. | String | EMBEDDING | Required |
| gen_ai.usage.input_tokens | Token consumption for the input text. | Integer | 10 | Optional |
| gen_ai.usage.total_tokens | Total token consumption for the embedding. | Integer | 10 | Optional |
| embedding.model_name | The name of the embedding model. (To be deprecated and replaced with gen_ai.request.model.) | String | text-embedding-v1 | Optional |
| embedding.embedding_output | Embedding result. (To be deprecated.) | String | - | Optional |
| gen_ai.operation.name | Secondary operation type | String | embeddings | Conditionally required |
| gen_ai.encoding.formats | Encoding format | String | ["base64"] | Recommended |
| gen_ai.embeddings.dimension.count | Number of embedding dimensions. | Integer | 100 | Recommended |
| gen_ai.request.model | The model name specified in the Embedding request. | String | text-embedding-v1 | Conditionally required |
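A sketch of the attributes for one embedding call. The four-element vector is a made-up stand-in for a real embedding; its length supplies gen_ai.embeddings.dimension.count.

```python
import json

vector = [0.01, -0.12, 0.33, 0.07]  # stand-in for a real embedding vector

embedding_attrs = {
    "gen_ai.span.kind": "EMBEDDING",
    "gen_ai.operation.name": "embeddings",
    "gen_ai.request.model": "text-embedding-v1",
    "gen_ai.encoding.formats": json.dumps(["base64"]),
    "gen_ai.embeddings.dimension.count": len(vector),
}

print(embedding_attrs["gen_ai.embeddings.dimension.count"])  # 4
```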

Tool

A Tool span indicates a call to an external tool. For example, it might involve calling a calculator or requesting the latest weather information from a weather API.

Attributes

| AttributeKey | Description | Type | Example value | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | Operation type. A dedicated enumeration for the LLM spanKind. For a Tool, the value must be TOOL. | String | TOOL | Required |
| tool.name | Tool name. (To be deprecated and replaced with gen_ai.tool.name.) | String | WeatherAPI | Required |
| tool.description | Tool description. (To be deprecated and replaced with gen_ai.tool.description.) | String | An API to get weather data. | Required |
| tool.parameters | Tool input parameters. (To be deprecated and replaced with gen_ai.tool.call.arguments.) | String | {'a': 'int' } | Required |
| gen_ai.operation.name | Secondary operation type | String | execute_tool | Conditionally required |
| gen_ai.tool.call.id | Tool ID | String | call_mszuSIzqtI65i1wAUOE8w5H4 | Recommended |
| gen_ai.tool.description | Tool description | String | Multiply two numbers | Recommended |
| gen_ai.tool.name | Tool name | String | get_weather | Recommended |
| gen_ai.tool.type | Tool type | String | function; extension; datastore | Recommended |
| gen_ai.tool.call.arguments | Input arguments for the tool call. | String | {"location": "San Francisco","date": "2025-10-01"} | Optional |
| gen_ai.tool.call.result | Return value of the tool call. | String | {"temperature_range": {"high": 75,"low": 60},"conditions": "sunny"} | Optional |
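A sketch of one tool call recorded with these attributes, reusing the gen_ai.tool.definitions shape from the LLM table. The weather values are made-up sample data; as elsewhere, structured values are serialized to JSON strings before being set as attributes.

```python
import json

tool_definitions = [{
    "type": "function",
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string",
                         "description": "The city and state, e.g. San Francisco, CA"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location", "unit"],
    },
}]

tool_attrs = {
    "gen_ai.span.kind": "TOOL",
    "gen_ai.operation.name": "execute_tool",
    "gen_ai.tool.name": "get_current_weather",
    "gen_ai.tool.type": "function",
    "gen_ai.tool.call.id": "call_mszuSIzqtI65i1wAUOE8w5H4",
    "gen_ai.tool.call.arguments": json.dumps({"location": "San Francisco, CA",
                                              "unit": "fahrenheit"}),
    "gen_ai.tool.call.result": json.dumps({"temperature_range": {"high": 75, "low": 60},
                                           "conditions": "sunny"}),
}

print(json.loads(tool_attrs["gen_ai.tool.call.result"])["conditions"])  # sunny
```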

Agent

An Agent span represents an agent scenario. It is a more complex type of Chain that makes decisions for the next step based on the inference results of a large model. For example, it might involve multiple calls to LLMs and Tools, making step-by-step decisions to reach a final answer.

Attributes

| AttributeKey | Description | Type | Example value | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | Operation type. A dedicated enumeration for the LLM spanKind. For an Agent, the value must be AGENT. | String | AGENT | Required |
| input.value | Input parameter. Records the original input. | String | Please help me plan xxxx | Required |
| input.mime_type | Input MIME type. | String | text/plain; application/json | Optional |
| output.value | Return result. Returns the final output. | String | Planning is complete. Please check the result xxx | Required |
| output.mime_type | Output MIME type. | String | text/plain; application/json | Optional |
| gen_ai.response.time_to_first_token | The time to first token for the Agent: the overall time from when the server receives a user's request to when the first packet of the response is returned, in nanoseconds. | Integer | 1000000 | Recommended |

Task

A Task span indicates an internal custom method. For example, it might involve calling a local function or other application-defined logic.

Attributes

| AttributeKey | Description | Type | Example value | Requirement level |
| --- | --- | --- | --- | --- |
| gen_ai.span.kind | Operation type. A dedicated enumeration for the LLM spanKind. For a Task, the value must be TASK. | String | TASK | Required |
| input.value | Input parameters | String | Custom JSON format | Optional |
| input.mime_type | Input MIME type. | String | text/plain; application/json | Optional |
| output.mime_type | Output MIME type. | String | text/plain; application/json | Optional |