LLM Trace fields are defined by Alibaba Cloud based on the OpenTelemetry standard and concepts from the large language model (LLM) application domain. These fields extend Attributes, Resource, and Event to describe the semantics of LLM application trace data. They capture key operations such as LLM input and output requests and token consumption. They provide rich, context-aware semantic data for scenarios such as Completion, Chat, Retrieval-Augmented Generation (RAG), Agent, and Tool Calling—enabling effective data tracing and reporting.
This semantic specification evolves with the community. Python applications must collect observability data manually. You can use the loongsuite-util-genai component to help integrate data collection. For details, see the README.
Span-level field definitions follow the OpenTelemetry open standard. For detailed descriptions of top-level trace fields stored in Alibaba Cloud Managed Service for OpenTelemetry, see Trace Analysis Parameter Definitions.
The LLM-specific SpanKind is an Attribute—not the Span kind defined in the OpenTelemetry Trace specification. This semantic specification extends the OpenTelemetry GenAI Semantic Conventions. That specification is under active development and may change in future maintenance releases.
Common Section
Attributes
AttributeKey | Description | Type | Example Value | Requirement Level |
- | Session ID | string | - | Required if available |
- | End-user identifier | string | - | Required if available |
gen_ai.span.kind | Operation type [1] | string | LLM | Required |
gen_ai.operation.name | Secondary operation type [2] | string | chat | Required |
- | Framework type used | string | - | Required if available |
[1] gen_ai.span.kind: Maps to gen_ai.operation.name as follows:
Value | gen_ai.operation.name | Description |
RETRIEVER | retrieval | Document retrieval |
LLM | chat, generate_content, text_completion | Model invocation |
EMBEDDING | embeddings | Embedding |
TOOL | execute_tool | Tool calling |
AGENT | invoke_agent, create_agent | Agent invocation |
RERANKER | - | Reranking invocation |
CHAIN | - | Chain (invocation unit) |
TASK | - | Task invocation |
ENTRY | - | Entry invocation marker |
STEP | - | ReAct round marker |
[2] gen_ai.operation.name: Secondary operation type. Use one of the following enumerations or define a custom value:
Value | Description |
chat | Chat completion operation |
create_agent | Create GenAI agent operation |
embeddings | Word embedding operation |
execute_tool | Tool calling operation |
generate_content | Multimodal content generation operation |
invoke_agent | Invoke GenAI agent operation |
retrieval | Document retrieval operation |
text_completion | Text completion operation |
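The mapping between the two operation-type attributes can be sketched as a small helper. This is illustrative only; the fallback value for custom operation names is an assumption, since RERANKER, CHAIN, TASK, ENTRY, and STEP have no standard gen_ai.operation.name in the table above.

```python
# Derive gen_ai.span.kind from gen_ai.operation.name, per the mapping table above.
OPERATION_TO_SPAN_KIND = {
    "retrieval": "RETRIEVER",
    "chat": "LLM",
    "generate_content": "LLM",
    "text_completion": "LLM",
    "embeddings": "EMBEDDING",
    "execute_tool": "TOOL",
    "invoke_agent": "AGENT",
    "create_agent": "AGENT",
}

def infer_span_kind(operation_name, default="CHAIN"):
    """Return the gen_ai.span.kind implied by an operation name.

    Custom operation names fall back to `default` (an assumption,
    not part of this specification).
    """
    return OPERATION_TO_SPAN_KIND.get(operation_name, default)
```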
Resources
ResourceKey | Description | Type | Example Value | Requirement Level |
service.name | Application name | string | - | Required |
- | Cloud Monitor workspace | string | - | Required if available |
- | Cloud Monitor service ID | string | - | Required if available |
- | Application source | string | - | Required if available |
- | Application feature | string | - | Required |
Chain
A Chain connects LLMs and other components (such as Retrieval, Embedding, LLM invocations, and nested Chains) to perform complex tasks.
Name the span chain {chain_name}. If chain_name cannot be obtained, name it chain.
The OpenTelemetry community has not yet defined a semantic convention for this span type. Currently, Chain spans apply only to the LangChain framework.
Attributes
AttributeKey | Description | Type | Example | Requirement Level |
gen_ai.span.kind | Operation type [1] | string | CHAIN | Required |
gen_ai.operation.name | Secondary operation type | string | - | Required if available |
- | Input content | string | - | Recommended |
- | Response content | string | - | Recommended |
gen_ai.user.time_to_first_token | Time to first token [2] | integer | 1000000 | Recommended |
[1] gen_ai.span.kind: Dedicated enumeration for LLM spanKind. In a Chain, this value must be CHAIN.
[2] gen_ai.user.time_to_first_token: Time from the server's receipt of the user request until the first response packet returns. Unit: nanoseconds.
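The naming rule above can be sketched as a small helper (illustrative only; chain_name is whatever identifier your framework exposes):

```python
def chain_span_name(chain_name=None):
    """Name a Chain span 'chain {chain_name}', or just 'chain'
    when chain_name cannot be obtained."""
    return f"chain {chain_name}" if chain_name else "chain"
```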
Retriever
A Retriever accesses vector stores or databases to retrieve data. It typically supplements context to improve LLM response accuracy and efficiency.
Set gen_ai.operation.name to retrieval. When gen_ai.operation.name is retrieval, infer gen_ai.span.kind as RETRIEVER.
Name the span {gen_ai.operation.name} {gen_ai.data_source.id}. Other naming formats are acceptable in special cases.
Attributes
AttributeKey | Description | Type | Example | Requirement Level |
gen_ai.span.kind | Operation type [1] | string | RETRIEVER | Required |
gen_ai.operation.name | Secondary operation type [2] | string | retrieval | Required |
gen_ai.data_source.id | Data source unique identifier [3] | string | - | Required if available |
- | Large language model provider | string | - | Required if available |
gen_ai.request.model | Model name specified in the request | string | - | Required if available |
gen_ai.request.top_k | Top-K value specified in the request | float | - | Recommended |
gen_ai.retrieval.documents | Retrieved document list [4] | string | - | Optional |
- | Query text snippet | string | - | Optional |
[1] gen_ai.span.kind: Dedicated enumeration for LLM spanKind. In a Retriever, this value must be RETRIEVER.
[2] gen_ai.operation.name: Secondary operation type.
[3] gen_ai.data_source.id: Unique data source ID. This is the data source that AI Agents or RAG applications depend on. It can be an external database, Object Storage Service, document set, website, or other storage system.
[4] gen_ai.retrieval.documents: Records the retrieved document list. Each document object must contain at least the following properties: id (string), a unique document identifier, and score (double-precision floating-point number), a relevance score.
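A sketch of serializing retrieved documents into the gen_ai.retrieval.documents attribute value. The id and score properties follow footnote [4]; any additional fields (such as content) are illustrative assumptions.

```python
import json

def format_retrieval_documents(documents):
    """Serialize retrieved documents for gen_ai.retrieval.documents.

    Each document object carries at least an `id` (string) and a
    `score` (float relevance score); extra fields pass through.
    """
    payload = [
        {
            "id": str(doc["id"]),
            "score": float(doc["score"]),
            **{k: v for k, v in doc.items() if k not in ("id", "score")},
        }
        for doc in documents
    ]
    return json.dumps(payload, ensure_ascii=False)
```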
Reranker
The Reranker assesses the relevance of multiple input documents based on the query, sorts them, and may return the top-K documents as input to the LLM.
Name the span rerank {reranker.model_name}. If reranker.model_name cannot be retrieved, name it rerank.
The OpenTelemetry community has not yet defined a semantic convention for this span type.
Attributes
AttributeKey | Description | Type | Example | Requirement Level |
gen_ai.span.kind | Operation type [1] | string | RERANKER | Required |
- | Reranker request parameter | string | - | Optional |
reranker.model_name | Model name used by the Reranker | string | - | Optional |
- | Rank after reranking | integer | - | Optional |
reranker.input_document | Input document metadata [2] | string | - | Required |
reranker.output_document | Output document metadata [3] | string | - | Required |
[1] gen_ai.span.kind: Dedicated enumeration for LLM spanKind. In a Reranker, this value must be RERANKER.
[2] reranker.input_document: Input documents for reranking. JSON array structure. Metadata contains basic document information such as path, filename, and source.
[3] reranker.output_document: Output documents after reranking. JSON array structure. Metadata contains basic document information such as path, filename, and source.
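A sketch of producing the reranked output document list: documents sorted by relevance score and truncated to the top K. The per-document shape (id, score, metadata with filename and so on) follows the footnotes above; everything else is an illustrative assumption.

```python
import json

def rerank_output_document(scored_docs, top_k):
    """Build a reranker.output_document JSON array: documents sorted by
    relevance score (descending), truncated to top_k. Each document's
    metadata carries basic information such as path, filename, and source."""
    ranked = sorted(scored_docs, key=lambda d: d["score"], reverse=True)[:top_k]
    return json.dumps(ranked, ensure_ascii=False)
```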
LLM
An LLM span represents an LLM call or inference process. Examples include using an SDK or OpenAPI to invoke different LLMs for inference or text generation.
Set gen_ai.operation.name to one of chat, generate_content, or text_completion. When gen_ai.operation.name is chat, generate_content, or text_completion, infer gen_ai.span.kind as LLM.
Name the span {gen_ai.operation.name} {gen_ai.request.model}. Other naming formats are acceptable in special cases.
Attributes
AttributeKey | Description | Type | Example | Requirement Level |
gen_ai.span.kind | Operation type [1] | string | LLM | Required |
gen_ai.operation.name | Secondary operation type [2] | string | chat | Required |
- | Large language model provider | string | - | Required |
gen_ai.conversation.id | Unique conversation ID [3] | string | - | Required if available |
gen_ai.output.type | Output type specified in the LLM request [4] | string | - | Required if available |
gen_ai.request.choice.count | Number of candidate generations requested in the LLM request | integer | - | Required if not 1 |
gen_ai.request.model | Model name specified in the LLM request | string | - | Required |
gen_ai.request.seed | Seed specified in the LLM request | string | - | Required if available |
gen_ai.request.frequency_penalty | Frequency penalty set in the LLM request | float | - | Recommended |
gen_ai.request.max_tokens | Maximum token count specified in the LLM request | integer | - | Recommended |
gen_ai.request.presence_penalty | Presence penalty set in the LLM request | float | - | Recommended |
gen_ai.request.temperature | Temperature specified in the LLM request | float | - | Recommended |
gen_ai.request.top_p | Top-P value specified in the LLM request | float | - | Recommended |
gen_ai.request.top_k | Top-K value specified in the LLM request | float | - | Recommended |
gen_ai.request.stop_sequences | Stop sequences for the LLM | string[] | - | Recommended |
gen_ai.response.id | Unique ID generated by the LLM | string | - | Recommended |
gen_ai.response.model | Model name used for LLM generation | string | - | Recommended |
gen_ai.response.finish_reasons | Reason the LLM stopped generating | string[] | - | Recommended |
gen_ai.user.time_to_first_token | First-token latency for the LLM in streaming-response scenarios [5] | integer | - | Recommended |
gen_ai.response.reasoning_time | Inference time for reasoning models [6] | integer | - | Recommended |
gen_ai.usage.input_tokens | Number of input tokens used | integer | - | Recommended |
gen_ai.usage.output_tokens | Number of output tokens used | integer | - | Recommended |
gen_ai.usage.total_tokens | Total number of tokens used | integer | - | Recommended |
gen_ai.usage.cache_creation.input_tokens | Number of tokens written to the model provider's cache [7] | integer | - | Recommended |
gen_ai.usage.cache_read.input_tokens | Number of tokens read from the model provider's cache [8] | integer | - | Recommended |
gen_ai.input.messages | Model input content [9] | string | - | Optional |
gen_ai.output.messages | Model output content [10] | string | - | Optional |
gen_ai.system_instructions | System prompt content [11] | string | - | Optional |
gen_ai.tool.definitions | Tool definition list [12] | string | - | Optional |
- | LLM prefill latency. Unit: nanoseconds | integer | - | Recommended |
- | LLM decode latency. Unit: nanoseconds | integer | - | Recommended |
- | LLM inference time. Equals the sum of prefill and decode time. Unit: nanoseconds | integer | - | Recommended |
gen_ai.input.multimodal_metadata | Multi-modal data involved in the LLM input content [13] | string[] | - | Recommended |
gen_ai.output.multimodal_metadata | Multi-modal data involved in the LLM output content [14] | string[] | - | Recommended |
[1] gen_ai.span.kind: Dedicated enumeration for LLM spanKind. In an LLM span, this value must be LLM.
[2] gen_ai.operation.name: Secondary operation type.
[3] gen_ai.conversation.id: Unique conversation ID. Collect this if you can easily obtain it.
[4] gen_ai.output.type: Collect this if the request specifies an output type (such as an output format). Values must be one of the following enumerations or a custom value:
Value | Description |
image | Image |
json | Well-formatted JSON object |
speech | Voice |
text | Plain text |
[5] gen_ai.user.time_to_first_token: Time from the server's receipt of the user request until the first response packet returns. Unit: nanoseconds.
[6] gen_ai.response.reasoning_time: Duration of the reasoning process. Unit: milliseconds.
[7] gen_ai.usage.cache_creation.input_tokens: This value must already be included in gen_ai.usage.input_tokens.
[8] gen_ai.usage.cache_read.input_tokens: This value must already be included in gen_ai.usage.input_tokens.
[9] gen_ai.input.messages: Records the input content for the LLM call. Messages must be provided in the order sent to the model or agent. Follow gen_ai.input.messages.json.
Collect only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
[10] gen_ai.output.messages: Records the model's output content. Messages must be provided in the order sent to the model or agent. Follow gen_ai.output.messages.json.
Collect only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
[11] gen_ai.system_instructions: Records system prompt or instruction content separately. Use this field if you can obtain the system prompt or instruction content independently. Otherwise, record it in the gen_ai.input.messages attribute. Follow gen_ai.system_instructions.json.
Collect only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
[12] gen_ai.tool.definitions: Records tool definitions passed to the LLM. This attribute may be very large. By default, collect only the type and name fields. Collect all other fields only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
[13] gen_ai.input.multimodal_metadata: Aggregates multi-modal data referenced in the model's input content. Only includes UriPart messages. Follow gen_ai.input.messages.json.
Collect only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
[14] gen_ai.output.multimodal_metadata: Aggregates multi-modal data referenced in the model's output content. Only includes UriPart messages. Follow gen_ai.output.messages.json.
Collect only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
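The token-accounting rules in the table and in footnotes [7] and [8] can be checked with a small validator. Note the assumption that the total token count equals input plus output tokens; the footnotes only state that cache counts are already included in gen_ai.usage.input_tokens.

```python
def validate_llm_usage(attrs):
    """Check token-accounting invariants for an LLM span's attributes:
    total tokens = input + output (assumed), and cache creation/read
    counts never exceed gen_ai.usage.input_tokens, since they are
    already included in it."""
    input_t = attrs.get("gen_ai.usage.input_tokens", 0)
    output_t = attrs.get("gen_ai.usage.output_tokens", 0)
    total_t = attrs.get("gen_ai.usage.total_tokens", input_t + output_t)
    cached = (
        attrs.get("gen_ai.usage.cache_creation.input_tokens", 0)
        + attrs.get("gen_ai.usage.cache_read.input_tokens", 0)
    )
    return total_t == input_t + output_t and cached <= input_t
```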
Record Prompts, Inputs, and Outputs
You can omit recording user inputs and model responses, record them in the span's attributes, or record them as events (logs). For details, see Control LLM Application Conversation History Collection Behavior.
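Several attributes above are collected only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled, and the flag is enabled by default. A minimal gate might look like the following; the exact set of accepted false-like values is an assumption.

```python
import os

def capture_message_content():
    """Return True when OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT
    allows recording message content. Enabled by default; only an explicit
    false-like value disables it (assumed values shown below)."""
    value = os.environ.get(
        "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT", "true"
    )
    return value.strip().lower() not in ("false", "0", "no")
```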
Embedding
An Embedding span represents a single embedding operation, such as converting text into a vector with a large language model (LLM). The resulting vectors can then be used in similarity queries to optimize solutions to problems.
Set gen_ai.operation.name to embeddings. When gen_ai.operation.name is embeddings, infer gen_ai.span.kind as EMBEDDING.
Name the span {gen_ai.operation.name} {gen_ai.request.model}. Other naming formats are acceptable in special cases.
Attributes
AttributeKey | Description | Type | Example | Requirement Level |
gen_ai.span.kind | Operation type [1] | string | EMBEDDING | Required |
gen_ai.operation.name | Secondary operation type [2] | string | embeddings | Required |
- | Large language model provider | string | - | Required |
gen_ai.request.model | Model name specified in the request | string | - | Required if available |
- | Number of dimensions expected for the embedding operation | integer | - | Recommended |
gen_ai.request.encoding_formats | Encoding formats requested for the embedding operation | string[] | - | Recommended |
gen_ai.usage.input_tokens | Token count consumed by the input text | integer | - | Optional |
gen_ai.usage.total_tokens | Total token count consumed by the embedding | integer | - | Optional |
[1] gen_ai.span.kind: Dedicated enumeration for LLM spanKind. In an Embedding span, this value must be EMBEDDING.
[2] gen_ai.operation.name: Secondary operation type.
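The similarity queries mentioned above typically compare embedding vectors with cosine similarity. A stdlib-only sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors: 1.0 for identical
    directions, 0.0 for orthogonal vectors (or when either norm is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```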
Tool
Tool spans represent calls to external tools. Examples include calling a calculator or requesting the latest weather from a weather API.
Set gen_ai.operation.name to execute_tool. When gen_ai.operation.name is execute_tool, infer gen_ai.span.kind as TOOL.
Name the span {gen_ai.operation.name} {gen_ai.tool.name}. Other naming formats are acceptable in special cases.
Attributes
AttributeKey | Description | Type | Example | Requirement Level |
gen_ai.span.kind | Operation type [1] | string | TOOL | Required |
gen_ai.operation.name | Secondary operation type | string | execute_tool | Required |
gen_ai.tool.call.id | Tool ID | string | - | Recommended |
gen_ai.tool.description | Tool description | string | - | Recommended |
gen_ai.tool.name | Tool name | string | - | Recommended |
gen_ai.tool.type | Tool type | string | - | Recommended |
gen_ai.tool.call.arguments | Tool call input parameters [2] | string | - | Optional |
gen_ai.tool.call.result | Tool call return value [3] | string | - | Optional |
[1] gen_ai.span.kind: Dedicated enumeration for LLM spanKind. In a Tool span, this value must be TOOL.
[2] gen_ai.tool.call.arguments: Tool call input parameters, as a JSON string. Collect only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
[3] gen_ai.tool.call.result: Tool call return value, as a JSON string. Collect only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
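A sketch of assembling Tool span attributes, with the call arguments and return value recorded as JSON strings as the footnotes above describe. The weather-tool names and values are hypothetical.

```python
import json

def tool_call_attributes(name, arguments, result):
    """Assemble Tool span attributes. gen_ai.tool.call.arguments and
    gen_ai.tool.call.result are JSON strings per the footnotes above."""
    return {
        "gen_ai.span.kind": "TOOL",
        "gen_ai.operation.name": "execute_tool",
        "gen_ai.tool.name": name,
        "gen_ai.tool.call.arguments": json.dumps(arguments, ensure_ascii=False),
        "gen_ai.tool.call.result": json.dumps(result, ensure_ascii=False),
    }
```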
Agent
An Agent span represents an intelligent agent scenario—a more complex Chain. The agent uses LLM inference results to decide the next step. This may involve multiple LLM and Tool calls, progressing step-by-step to reach a final answer.
Set gen_ai.operation.name to invoke_agent or create_agent. When gen_ai.operation.name is invoke_agent or create_agent, infer gen_ai.span.kind as AGENT.
Name the span {gen_ai.operation.name} {gen_ai.agent.name}. Other naming formats are acceptable in special cases.
Attributes
AttributeKey | Description | Type | Example | Requirement Level |
gen_ai.span.kind | Operation type [1] | string | AGENT | Required |
gen_ai.operation.name | Secondary operation type [2] | string | invoke_agent | Required |
gen_ai.conversation.id | Unique conversation ID [3] | string | - | Required if available |
gen_ai.agent.description | Agent description | string | - | Required if available |
gen_ai.agent.id | Unique agent identifier | string | - | Required if available |
gen_ai.agent.name | Agent name | string | - | Required if available |
gen_ai.data_source.id | Data source unique identifier [4] | string | - | Required if available |
gen_ai.usage.input_tokens | Number of input tokens used | integer | - | Recommended |
gen_ai.usage.output_tokens | Number of output tokens used | integer | - | Recommended |
gen_ai.usage.total_tokens | Total number of tokens used | integer | - | Recommended |
gen_ai.usage.cache_creation.input_tokens | Number of tokens written to the model provider's cache [5] | integer | - | Recommended |
gen_ai.usage.cache_read.input_tokens | Number of tokens read from the model provider's cache [6] | integer | - | Recommended |
gen_ai.input.messages | Model input content [7] | string | - | Optional |
gen_ai.output.messages | Model output content [8] | string | - | Optional |
gen_ai.system_instructions | System prompt content [9] | string | - | Optional |
gen_ai.tool.definitions | Tool definition list [10] | string | - | Optional |
- | Agent's first-token response latency | integer | - | Recommended |
[1] gen_ai.span.kind: Dedicated enumeration for LLM spanKind. In an Agent span, this value must be AGENT.
[2] gen_ai.operation.name: Secondary operation type.
[3] gen_ai.conversation.id: Unique conversation ID. Collect this if you can easily obtain it.
[4] gen_ai.data_source.id: Unique data source ID. This is the data source that AI Agents or RAG applications depend on. It can be an external database, Object Storage Service, document set, website, or other storage system.
[5] gen_ai.usage.cache_creation.input_tokens: This value must already be included in gen_ai.usage.input_tokens.
[6] gen_ai.usage.cache_read.input_tokens: This value must already be included in gen_ai.usage.input_tokens.
[7] gen_ai.input.messages: Records the input content for the LLM call. Messages must be provided in the order sent to the model or agent. Follow gen_ai.input.messages.json.
Collect only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
[8] gen_ai.output.messages: Records the model's output content. Messages must be provided in the order sent to the model or agent. Follow gen_ai.output.messages.json.
Collect only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
[9] gen_ai.system_instructions: Records system prompt or instruction content separately. Use this field if you can obtain the system prompt or instruction content independently. Otherwise, record it in the gen_ai.input.messages attribute. Follow gen_ai.system_instructions.json.
Collect only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
[10] gen_ai.tool.definitions: Records tool definitions passed to the LLM. This attribute may be very large. By default, collect only the type and name fields. Collect all other fields only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
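The default collection rule for gen_ai.tool.definitions (keep only each definition's type and name fields unless full content capture is enabled) can be sketched as follows:

```python
import json

def redact_tool_definitions(definitions_json, capture_full=False):
    """Apply the default collection rule for gen_ai.tool.definitions:
    unless full content capture is enabled, keep only each definition's
    `type` and `name` fields, since the attribute may be very large."""
    if capture_full:
        return definitions_json
    definitions = json.loads(definitions_json)
    redacted = [
        {k: d[k] for k in ("type", "name") if k in d} for d in definitions
    ]
    return json.dumps(redacted, ensure_ascii=False)
```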
Task
A Task span represents a custom internal method call, such as invoking a local function or other application-defined logic.
Name the span run_task {gen_ai.task.name}. Other naming formats are acceptable in special cases.
The OpenTelemetry community has not yet defined a semantic convention for this span type. Therefore, gen_ai.operation.name may change.
Attributes
AttributeKey | Description | Type | Example | Requirement Level |
gen_ai.span.kind | Operation type [1] | string | TASK | Required |
gen_ai.operation.name | Secondary operation type | string | - | Required |
- | Input parameters | string | - | Optional |
- | Input MIME type | string | - | Optional |
- | Output MIME type | string | - | Optional |
[1] gen_ai.span.kind: Dedicated enumeration for LLM spanKind. In a Task span, this value must be TASK.
Entry
An Entry span marks the entry point for a call to an AI application system.
Name the span enter_ai_application_system. Other naming formats are acceptable in special cases.
The OpenTelemetry community has not yet defined a semantic convention for this span type. Therefore, gen_ai.operation.name may change.
Attributes
AttributeKey | Description | Type | Example | Requirement Level |
gen_ai.span.kind | Operation type [1] | string | ENTRY | Required |
gen_ai.operation.name | Secondary operation type | string | - | Recommended |
- | Session ID | string | - | Required if available |
- | End-user identifier | string | - | Required if available |
gen_ai.input.messages | Model input content [2] | string | - | Optional |
gen_ai.output.messages | Model output content [3] | string | - | Optional |
gen_ai.response.time_to_first_token | First-token response latency in streaming-response scenarios [4] | integer | - | Recommended |
[1] gen_ai.span.kind: Dedicated enumeration for LLM spanKind. In an Entry span, this value must be ENTRY.
[2] gen_ai.input.messages: Records the input content for the LLM call. Messages must be provided in the order sent to the model or agent. Follow gen_ai.input.messages.json.
Collect only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
[3] gen_ai.output.messages: Records the model's output content. Messages must be provided in the order sent to the model or agent. Follow gen_ai.output.messages.json.
Collect only when the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT flag is enabled. This flag is enabled by default.
[4] gen_ai.response.time_to_first_token: Time from the server's receipt of the user request until the first response packet returns. Unit: nanoseconds.
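First-token latency is recorded in nanoseconds. A minimal measurement sketch over a streaming response (the stream contents are hypothetical):

```python
import time

def measure_time_to_first_token(stream):
    """Measure first-token latency in nanoseconds, matching the unit used
    above: the interval from receiving the request to yielding the first
    response chunk. Returns (first_chunk, latency_ns)."""
    start = time.monotonic_ns()
    first_chunk = next(iter(stream))
    return first_chunk, time.monotonic_ns() - start
```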
ReAct Step
A Step span marks a Reasoning-Acting iteration process within an Agent.
Name the span react step. Other naming formats are acceptable in special cases.
The OpenTelemetry community has not yet defined a semantic convention for this span type. Therefore, gen_ai.operation.name may change.
Attributes
AttributeKey | Description | Type | Example | Requirement Level |
gen_ai.span.kind | Operation type [1] | string | STEP | Required |
gen_ai.operation.name | Secondary operation type | string | - | Recommended |
- | Reason for this ReAct round's termination | string | - | Recommended |
gen_ai.react.round | Round number for this ReAct iteration [2] | integer | 1 | Recommended |
[1] gen_ai.span.kind: Dedicated enumeration for LLM spanKind. In a ReAct Step span, this value must be STEP.
[2] gen_ai.react.round: ReAct round numbers should start from 1 and increment by 1 for each iteration.
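The round-numbering rule (start at 1, increment by 1 per iteration) can be sketched as follows; the per-step payload shape is illustrative:

```python
def react_rounds(steps):
    """Assign gen_ai.react.round values to a sequence of ReAct iterations:
    rounds start at 1 and increment by 1 per iteration."""
    return [
        {"step": step, "gen_ai.react.round": i}
        for i, step in enumerate(steps, start=1)
    ]
```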