Mock Data Demo - Cloud Monitor

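This template uses mock data to simulate AI Agent logs and provides four Pipeline demo scenarios of increasing complexity, so that you can get started quickly with the core capabilities of Pipeline.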

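Introduction to the mock data template

This template uses mock data to simulate four AI Agent scenarios (security audit, data query assistant, log analysis, and intelligent customer service) for demonstrating Pipeline features and validating operators.

Unlike the other templates in the template overview, which are based on OT-AI Trace (nested JSON attributes), this template uses a flat SLS event model: 22 top-level text fields with no JSON nesting. The project mapping therefore needs no extraction logic, which makes the template a good starting point for new users.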

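Intended audience

New users: work through Demos 1 and 2 to understand the basic capabilities of Pipeline.
Algorithm engineers: use Demo 3 to evaluate the quality of Agent output.
Data platform teams: use Demo 4 to learn the end-to-end orchestration pattern.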

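Prerequisites

Before you use this template, make sure that the following preparations are complete:

Log Service (SLS) is activated. If it is not, see Pipeline overview for how to activate the service.
A target Project and Logstore are created to store the mock data. This template uses the Project ali-pub-cn-hangzhou-staging-sls-admin and the Logstore ai_test.
The Pipeline feature is activated. Pipeline must be activated separately in the SLS console. For details, see Pipeline overview.
(Optional, required only for the Demo 4 API configuration) A Dataset Workspace is created. In the Demo 4 API configuration, set the sink.dataset.workspace parameter to your Workspace name. You can view or create Workspaces on the dataset management page of the SLS console.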

Data source

Target resources

Resource	Value
SLS Project	`ali-pub-cn-hangzhou-staging-sls-admin`
SLS Logstore	`ai_test`

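Differences from the OT-AI Trace format

The mock data in this template differs in format from the OT-AI Trace data used by the other templates in the following ways:

Dimension	OT-AI Trace	Mock data (this template)
Data format	OpenTelemetry Span (nested JSON attributes)	Flat SLS events (22 top-level text fields)
Field reference	`json_extract_scalar(attributes, '$.xxx')`	Direct reference: `question`, `model`, `output`
Aggregation key	`spanId` / `traceId`	`trace_id` (single turn) / `conversation_id` (multi-turn)
Event distinction	`gen_ai.span.kind` = LLM / TOOL / AGENT	`event_type` = user_query / tool_call / assistant_content, and so on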
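
To make the field-reference difference concrete, the following Python sketch contrasts the two access patterns on hand-written records. The record contents, and in particular the `question` key inside `attributes`, are illustrative assumptions rather than rows from the `ai_test` Logstore, and `extract_scalar` is only a rough stand-in for `json_extract_scalar`.

```python
import json

# Hypothetical OT-AI Trace style record: attributes is a nested JSON string.
otel_record = {
    "traceId": "t-001",
    "spanId": "s-001",
    "attributes": json.dumps({
        "gen_ai.span.kind": "LLM",
        "question": "What was the total GMV last quarter?",  # assumed key
    }),
}

# Hypothetical flat mock-data record: every value is a top-level text field.
flat_record = {
    "trace_id": "t-001",
    "event_type": "user_query",
    "question": "What was the total GMV last quarter?",
}


def extract_scalar(record, key):
    """Parse the attributes JSON and return one top-level value (or None)."""
    return json.loads(record["attributes"]).get(key)


nested_question = extract_scalar(otel_record, "question")  # needs a parse step
flat_question = flat_record["question"]                    # direct reference
print(nested_question == flat_question)
```

The flat model trades expressiveness for simplicity: there is nothing to parse, but every value arrives as text.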

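Field availability matrix

The following table shows whether a field carries a value under each event_type. Only six of the 22 fields are listed.

Field	user_query	system_prompt	tool_call	tool_result	assistant_content	completion
question	Set (user question)	-	-	-	-	-
output	-	-	-	Set (tool result)	Set (Agent answer)	-
model	-	Set	Set	-	Set	Set
tool_name	-	-	Set	Set	-	-
latency_ms	-	-	-	Set (tool latency)	-	Set (total latency)
agent_name	Set	Set	Set	Set	Set	Set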

Note

All fields are of type text. An empty value is the empty string "", not NULL. Use CAST for numeric operations and NULLIF to filter out empty values.

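Sample data overview

Scenario	agent_name	Typical question	Tools
Security audit	security_audit_agent	"Are there any IPs with abnormally frequent failed logins in the SSH login logs from the past 24 hours?"	sls_execute_sql, get_threat_intel, create_alert
Data query assistant	text2sql_agent	"What was the total GMV last quarter, and how much did it grow compared with the same period last year?"	execute_sql, create_chart
Log analysis	logexplorer_sql_agent	"The payment-service pod keeps going into CrashLoopBackOff. Can you help me find out why it won't start?"	sls_execute_sql, search_logs
Intelligent customer service	customer_service_agent	"The system shows 'Delivered', but I never received the package!"	query_order, search_knowledge_base, create_ticket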

Demo scenarios

#	Scenario	Complexity	Operator chain	Data granularity	Uses make-instance
1	Three-level deduplication of user questions	Low	project > where > dedup-exact > dedup-fuzzy > dedup-semantic	Event level	No
2	Question clustering + scenario annotation	Medium	project > where > dedup-exact > dedup-semantic > semantic-cluster > sample > llm-call	Event level	No
3	Agent answer quality evaluation	Medium	project > where > make-instance > where > dedup-exact > sample > llm-call > doc-stats	Trace level	Yes
4	End-to-end pipeline	High	project > where > make-instance > where > extend > dedup-exact > dedup-fuzzy > dedup-semantic > semantic-cluster > sample > llm-call x2 > doc-stats	Trace level	Yes

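Demo 1: Three-level deduplication of user questions

The mock data contains duplicates marked with a dedup_tag value (exact / fuzzy / semantic). This demo uses them to show how the three-level deduplication chain filters rows stage by stage. It incurs no LLM or GPU cost.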

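Customization suggestions

Customization point	Action
Deduplication strictness	`threshold` of `dedup-fuzzy`: 1 = very strict, 5 = fairly loose
Semantic threshold	`threshold` of `dedup-semantic`: 0.05 = strict, 0.15 = loose
Global deduplication across batches	Add the parameters `"global": true, "workspace": "...", "dataset": "..."` to dedup-fuzzy/semantic
Verify a single deduplication level	Delete the dedup nodes that you do not need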
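
As an illustration of the global deduplication option, the Python sketch below adds the three parameters to the fuzzy and semantic dedup nodes of an API-style node list. The node layout mirrors the Demo 4 API configuration on this page. Placing the new keys inside `parameters` is an assumption, `enable_global_dedup` is a hypothetical helper, and the workspace and dataset values are placeholders to replace with your own.

```python
import copy

# Node layout follows the Demo 4 API configuration shown on this page.
nodes = [
    {"id": "exact_dedup", "type": "dedup-exact",
     "parameters": {"field": "question"}},
    {"id": "fuzzy_dedup", "type": "dedup-fuzzy",
     "parameters": {"field": "question", "threshold": "3"}},
    {"id": "semantic_dedup", "type": "dedup-semantic",
     "parameters": {"field": "question", "threshold": "0.1"}},
]


def enable_global_dedup(nodes, workspace, dataset):
    """Return a copy of nodes with global-dedup parameters on fuzzy/semantic nodes."""
    updated = copy.deepcopy(nodes)  # leave the caller's list untouched
    for node in updated:
        if node["type"] in ("dedup-fuzzy", "dedup-semantic"):
            node["parameters"].update(
                {"global": True, "workspace": workspace, "dataset": dataset}
            )
    return updated


# Placeholder names: substitute a real Workspace and dataset.
global_nodes = enable_global_dedup(
    nodes, "<your-workspace-name>", "<your-dataset-name>"
)
for node in global_nodes:
    print(node["type"], node["parameters"])
```

The exact-dedup node is left unchanged because the table above names only dedup-fuzzy and dedup-semantic for this option.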

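Demo 2: Question clustering + scenario annotation

This demo clusters the deduplicated user questions semantically, samples from each cluster, and then uses an LLM to annotate dimensions such as intent and complexity, revealing how Agent usage is distributed across scenarios. It includes one round of LLM calls.

SPL statement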

* | project question=question,
          trace_id=trace_id,
          agent=agent_name,
          user_id=user_id,
          event_type=event_type
  | where event_type = 'user_query' AND length(question) > 0
  | dedup-exact -field=question
  | dedup-semantic -field=question -threshold='0.1'
  | semantic-cluster -field=__dedup_emb -n=5
  | sample -n=3 by __cluster_id
  | llm-call -prompt='@anno/scene-label.md' -fields=question -format=json as anno

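Sample annotation output

{
  "Intent type": "Problem diagnosis",
  "Task complexity": "Medium",
  "Business scenario": "Security audit",
  "Additional tags": ["Log analysis", "Anomaly detection", "IP dimension"]
}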

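Customization suggestions

Customization point	Action
Number of clusters	`n` of `semantic-cluster`: 3~5 for small datasets, 50~200 for very large datasets
Samples per cluster	`n` of `sample`: 1 = minimal representation, 5 = fuller coverage
Annotation dimensions	Edit the annotation dimensions and allowed values in `prompts/scene-label.md`
Skip deduplication	Delete the dedup nodes to cluster and annotate all questions directly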
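
The per-cluster sampling step can be pictured with the following Python sketch. It is a rough analogue of `sample -n=3 by __cluster_id`, not the operator's implementation: the rows are made up, `sample_per_group` is a hypothetical helper, and random selection within a cluster is an assumption because this page does not say how the operator picks rows.

```python
import random
from collections import defaultdict


def sample_per_group(rows, key, n, seed=0):
    """Pick up to n rows from each group identified by rows[i][key]."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    sampled = []
    for group_rows in groups.values():
        # Small clusters contribute all of their rows.
        sampled.extend(rng.sample(group_rows, min(n, len(group_rows))))
    return sampled


# Hypothetical clustered rows: cluster 0 has 5 questions, cluster 1 has 2.
rows = [{"question": f"q{i}", "__cluster_id": 0} for i in range(5)]
rows += [{"question": f"q{i}", "__cluster_id": 1} for i in range(5, 7)]

picked = sample_per_group(rows, "__cluster_id", n=3)
print(len(picked))  # 3 rows from cluster 0 plus 2 rows from cluster 1
```

Because small clusters cap out below `n`, the number of rows that reach `llm-call` can be lower than clusters times `n`.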

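Demo 3: Agent answer quality evaluation

This demo aggregates discrete event records by trace_id into "one question, one answer" instances and uses an LLM to evaluate Agent answer quality along multiple dimensions. It includes one round of LLM calls.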

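Aggregating text fields

All fields in the mock data are of type text, and an empty value is "" rather than NULL. Keep the following pitfalls in mind when you aggregate with make-instance:

Problem	Cause	Correct approach
`count(tool_name)` overcounts	`""` is not NULL, so COUNT does not skip empty strings	`count_if(event_type = 'tool_call')`
`sum(token_input)` raises a type error	text values cannot be summed directly	`sum(cast(NULLIF(token_input, '') as bigint))`
`max(latency_ms)` returns the largest string	text is compared lexicographically, not numerically	`max(cast(NULLIF(latency_ms, '') as bigint))`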

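Important

For numeric fields stored as text, always wrap the column as cast(NULLIF(col, '') as bigint): NULLIF first turns empty strings into NULL, and CAST then converts the remaining values to numbers.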
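
The pitfalls above can be reproduced outside the Pipeline. The following Python sketch uses made-up text values for a single trace and a hypothetical `to_bigint` helper that mimics `cast(NULLIF(col, '') as bigint)`. It only illustrates the semantics and says nothing about how the Pipeline engine is implemented.

```python
def to_bigint(text):
    """Mimic cast(NULLIF(col, '') as bigint): the empty string becomes None."""
    return None if text == "" else int(text)


# Hypothetical text values for one trace; "" marks events where a field is unset.
event_type = ["user_query", "tool_call", "tool_result",
              "assistant_content", "completion"]
tool_name = ["", "execute_sql", "execute_sql", "", ""]
latency_ms = ["", "", "950", "", "1200"]

# count(tool_name): "" is a value, not NULL, so every row is counted.
naive_count = len([v for v in tool_name if v is not None])
# count_if(event_type = 'tool_call'): counts only real tool calls.
tool_calls = sum(1 for e in event_type if e == "tool_call")

# max() on text compares character by character, so "950" beats "1200".
string_max = max(v for v in latency_ms if v != "")
# Converting first gives the numeric maximum.
numeric_max = max(v for v in map(to_bigint, latency_ms) if v is not None)

print(naive_count, tool_calls, string_max, numeric_max)
```

Note that the string maximum is not just differently typed but a different row, which is why the wrapper matters even when no type error is raised.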

SPL statement

* | project question=question,
          output=output,
          model=model,
          tool_name=tool_name,
          token_input=token_input,
          token_output=token_output,
          latency_ms=latency_ms,
          event_type=event_type,
          trace_id=trace_id,
          session_id=session_id,
          agent_name=agent_name
  | where event_type IN ('user_query','tool_call','tool_result','assistant_content','completion')
  | make-instance
      question=first(question),
      answer=last(output),
      model=any(model),
      agent=any(agent_name),
      tool_chain=join(tool_name, ' → '),
      tool_count=count_if(event_type = 'tool_call'),
      total_tokens=sum(cast(NULLIF(token_input, '') as bigint)),
      latency=max(cast(NULLIF(latency_ms, '') as bigint))
      by session_id,trace_id
  | where question IS NOT NULL AND answer IS NOT NULL
  | dedup-exact -field=question
  | sample -n=20
  | llm-call -prompt='@eval/agent-quality.md' -fields=question,answer,tool_chain -format=json as eval
  | doc-stats -field=answer
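
To show what the aggregation step produces, here is a minimal Python sketch of a make-instance style roll-up. It is a simplification under stated assumptions: events arrive in time order, grouping uses `trace_id` only (the statement above groups by `session_id,trace_id`), the tool chain is built from `tool_call` events only, and the events themselves are made up. `make_instances` and its helpers are hypothetical names.

```python
from collections import defaultdict


def first_non_empty(values):
    """First value that is not the empty string, else None."""
    return next((v for v in values if v != ""), None)


def last_non_empty(values):
    """Last value that is not the empty string, else None."""
    return next((v for v in reversed(values) if v != ""), None)


def make_instances(events):
    """Group events by trace_id and build one question/answer row per trace."""
    groups = defaultdict(list)
    for event in events:  # assumed to be in time order
        groups[event["trace_id"]].append(event)
    instances = []
    for trace_id, evs in groups.items():
        tools = [e["tool_name"] for e in evs if e["event_type"] == "tool_call"]
        instances.append({
            "trace_id": trace_id,
            "question": first_non_empty([e["question"] for e in evs]),
            "answer": last_non_empty([e["output"] for e in evs]),
            "tool_chain": " → ".join(tools),
            "tool_count": len(tools),
        })
    return instances


def ev(event_type, question="", output="", tool_name=""):
    """Build one hypothetical flat event for trace t1."""
    return {"trace_id": "t1", "event_type": event_type,
            "question": question, "output": output, "tool_name": tool_name}


events = [
    ev("user_query", question="Total GMV last quarter?"),
    ev("tool_call", tool_name="execute_sql"),
    ev("tool_result", output="gmv=42", tool_name="execute_sql"),
    ev("tool_call", tool_name="create_chart"),
    ev("assistant_content", output="Total GMV was 42."),
    ev("completion"),
]
instances = make_instances(events)
print(instances[0])
```

Six event rows collapse into a single instance row, which is the event-level to trace-level change that the demo table describes.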

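Sample evaluation output

{
  "Requirement understanding": {"score": 5, "reason": "Accurately understood the user's request to create an alert rule based on an IP range"},
  "Answer quality": {"score": 4, "reason": "The alert rule configuration is complete, but the trigger frequency limit is not explained"},
  "Logical coherence": {"score": 5, "reason": "The steps from query to verification to creation are clear"},
  "Format compliance": {"score": 5, "reason": "The alert configuration is presented in a structured format"},
  "Security compliance": {"score": 5, "reason": "No sensitive information was leaked"}
}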
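
Downstream of the Pipeline, an evaluation result of this shape can be reduced to a single score. The Python sketch below is one possible post-processing step, not part of the template. It assumes every dimension carries an integer `score`, uses English renderings of the dimension names, and shortens the reasons.

```python
import json

# Same scores as the sample on this page; names and reasons are abbreviated.
eval_json = """
{
  "Requirement understanding": {"score": 5, "reason": "request understood"},
  "Answer quality": {"score": 4, "reason": "frequency limit not explained"},
  "Logical coherence": {"score": 5, "reason": "clear steps"},
  "Format compliance": {"score": 5, "reason": "structured output"},
  "Security compliance": {"score": 5, "reason": "nothing leaked"}
}
"""

result = json.loads(eval_json)
scores = {dimension: item["score"] for dimension, item in result.items()}

mean_score = sum(scores.values()) / len(scores)  # unweighted average
weakest = min(scores, key=scores.get)            # lowest-scoring dimension

print(round(mean_score, 2), weakest)
```

An unweighted mean is the simplest choice. Weighting some dimensions more heavily, such as security, would be an equally reasonable design.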

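Customization suggestions

Customization point	Action
Aggregation granularity	Change `by` to `conversation_id` to aggregate by multi-turn conversation (`conversation_id` must also be selected in `project`)
Additional statistics columns	Add expressions such as `err_count=count_if(status = 'error')` to make-instance (`status` must also be selected in `project`)
Evaluation dimensions	Edit `prompts/eval-prompt.md`
Sample size	`n` of `sample`: controls the LLM call cost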

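Demo 4: End-to-end pipeline

A complete data governance pipeline: field extraction, event aggregation, metric derivation, three-level deduplication, clustering and sampling, quality evaluation plus scenario annotation, and document statistics. It covers all operators and includes two rounds of LLM calls.

SPL statement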

* | project question=question,
          output=output,
          model=model,
          tool_name=tool_name,
          tool_args=tool_args,
          tool_success=tool_success,
          token_input=token_input,
          token_output=token_output,
          latency_ms=latency_ms,
          status=status,
          event_type=event_type,
          trace_id=trace_id,
          session_id=session_id,
          conversation_id=conversation_id,
          agent_name=agent_name,
          user_id=user_id,
          region_id=region_id
  | where event_type IN ('user_query','system_prompt','tool_call','tool_result','assistant_content','completion')
  | make-instance
      question=first(question),
      answer=last(output),
      model=any(model),
      agent=any(agent_name),
      user_id=any(user_id),
      region=any(region_id),
      tool_chain=join(tool_name, ' → '),
      tools=array_distinct(tool_name),
      tool_count=count_if(event_type = 'tool_call'),
      has_error=bool_or(status = 'error'),
      total_input_tokens=sum(cast(NULLIF(token_input, '') as bigint)),
      total_output_tokens=sum(cast(NULLIF(token_output, '') as bigint)),
      latency=max(cast(NULLIF(latency_ms, '') as bigint))
      by session_id,trace_id,conversation_id
  | where question IS NOT NULL AND answer IS NOT NULL
  | extend token_total=total_input_tokens + total_output_tokens,
          answer_preview=substr(answer, 1, 500)
  | dedup-exact -field=question
  | dedup-fuzzy -field=question -threshold='3'
  | dedup-semantic -field=question -threshold='0.1'
  | semantic-cluster -field=__dedup_emb -n=5
  | sample -n=3 by __cluster_id
  | llm-call -prompt='@eval/agent-quality.md' -fields=question,answer,tool_chain -format=json as eval
  | llm-call -prompt='@anno/scene-label.md' -fields=question -format=json as anno
  | doc-stats -field=answer

API configuration (JSON)

{
  "name": "mock_data_demo_full",
  "description": "End-to-end mock data demo: aggregation, deduplication, clustering, evaluation, and annotation, covering all operators",
  "source": {
    "type": "logstore",
    "logstore": {
      "project": "ali-pub-cn-hangzhou-staging-sls-admin",
      "logstore": "ai_test",
      "query": "*"
    }
  },
  "pipeline": {
    "nodes": [
      {"id": "extract", "type": "project", "parameters": {"question": "question", "output": "output", "model": "model", "tool_name": "tool_name", "tool_args": "tool_args", "tool_success": "tool_success", "token_input": "token_input", "token_output": "token_output", "latency_ms": "latency_ms", "status": "status", "event_type": "event_type", "trace_id": "trace_id", "session_id": "session_id", "conversation_id": "conversation_id", "agent_name": "agent_name", "user_id": "user_id", "region_id": "region_id"}},
      {"id": "filter_events", "type": "where", "parameters": {"filter": "event_type IN ('user_query','system_prompt','tool_call','tool_result','assistant_content','completion')"}},
      {"id": "assemble", "type": "make-instance", "parameters": {"question": "first(question)", "answer": "last(output)", "model": "any(model)", "agent": "any(agent_name)", "user_id": "any(user_id)", "region": "any(region_id)", "tool_chain": "join(tool_name, ' → ')", "tools": "array_distinct(tool_name)", "tool_count": "count_if(event_type = 'tool_call')", "has_error": "bool_or(status = 'error')", "total_input_tokens": "sum(cast(NULLIF(token_input, '') as bigint))", "total_output_tokens": "sum(cast(NULLIF(token_output, '') as bigint))", "latency": "max(cast(NULLIF(latency_ms, '') as bigint))", "by": "session_id,trace_id,conversation_id"}},
      {"id": "filter_valid", "type": "where", "parameters": {"filter": "question IS NOT NULL AND answer IS NOT NULL"}},
      {"id": "derive_metrics", "type": "extend", "parameters": {"token_total": "total_input_tokens + total_output_tokens", "answer_preview": "substr(answer, 1, 500)"}},
      {"id": "exact_dedup", "type": "dedup-exact", "parameters": {"field": "question"}},
      {"id": "fuzzy_dedup", "type": "dedup-fuzzy", "parameters": {"field": "question", "threshold": "3"}},
      {"id": "semantic_dedup", "type": "dedup-semantic", "parameters": {"field": "question", "threshold": "0.1"}},
      {"id": "cluster", "type": "semantic-cluster", "parameters": {"field": "__dedup_emb", "n": 5}},
      {"id": "sample_per_cluster", "type": "sample", "parameters": {"n": 3, "by": "__cluster_id"}},
      {"id": "evaluate", "type": "llm-call", "parameters": {"prompt": "@eval/agent-quality.md", "fields": "question,answer,tool_chain", "format": "json", "as": "eval"}},
      {"id": "annotate", "type": "llm-call", "parameters": {"prompt": "@anno/scene-label.md", "fields": "question", "format": "json", "as": "anno"}},
      {"id": "text_stats", "type": "doc-stats", "parameters": {"field": "answer"}}
    ]
  },
  "sink": {
    "type": "dataset",
    "dataset": {"workspace": "<your-workspace-name>", "dataset": "mock_demo_full"}
  },
  "executePolicy": {
    "mode": "run_once",
    "run_once": {"fromTime": 1772150000, "toTime": 1772240000}
  }
}

Note

Replace <your-workspace-name> in the JSON configuration above with your actual Workspace name. You can view existing Workspaces or create a new one on the dataset management page of the SLS console.
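
When the configuration is generated by a script, the sink and time window can be built programmatically. The Python sketch below assumes that `fromTime` and `toTime` are Unix timestamps in seconds, which the 10-digit sample values suggest but this page does not state. `run_once_policy` and `dataset_sink` are hypothetical helpers, and `my_workspace` is a made-up name.

```python
import json
from datetime import datetime, timedelta, timezone


def run_once_policy(start, duration):
    """Build an executePolicy block; times are assumed to be Unix seconds."""
    end = start + duration
    return {
        "mode": "run_once",
        "run_once": {
            "fromTime": int(start.timestamp()),
            "toTime": int(end.timestamp()),
        },
    }


def dataset_sink(workspace, dataset):
    """Build the sink block and refuse an unreplaced placeholder."""
    if workspace.startswith("<"):
        raise ValueError("replace the placeholder with a real Workspace name")
    return {"type": "dataset",
            "dataset": {"workspace": workspace, "dataset": dataset}}


# Reproduce the window from the sample configuration (90000 seconds long).
start = datetime.fromtimestamp(1772150000, tz=timezone.utc)
policy = run_once_policy(start, timedelta(seconds=90000))
sink = dataset_sink("my_workspace", "mock_demo_full")  # made-up Workspace name

print(json.dumps({"sink": sink, "executePolicy": policy}, indent=2))
```

Using timezone-aware datetimes avoids an accidental shift by the local UTC offset when the values are converted to timestamps.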

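Data volume across the pipeline

Step	Operator	Rows	Columns	Description
1	project	140	17	Field selection
2	where	About 120	17	Drops events whose event_type is not in the filter list
3	make-instance	About 12	15	Aggregates by trace (event level to instance level)
4	where	About 12	15	Filters out instances with an empty question or answer
5	extend	About 12	+2	Derives token_total and answer_preview
6~8	dedup x3	About 8	+5	Exact / fuzzy / semantic three-level deduplication
9~10	cluster + sample	About 8	+1	5 clusters, up to 3 rows per cluster
11~13	llm-call x2 + doc-stats	About 8	+3	Quality scoring + scenario annotation + text statistics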

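FAQ

You may run into the following common problems when using Demo 3 and Demo 4:

Problem	Possible cause	Troubleshooting
LLM call times out or fails	When the llm-call operator calls the model, the request times out because of network jitter or high load on the model service.	Pipeline has a built-in retry mechanism and retries failed LLM requests by default. If the request still fails after several retries, check the availability of the model service and network connectivity, or reduce the sample size to lower the number of concurrent calls.
make-instance returns an empty result	The input data lacks required event types, or the trace_id field is empty so events cannot be aggregated by trace.	Check that the input contains user_query and assistant_content events and that trace_id is not empty. You can first run only the project + where operators to verify that the filtered data is as expected.
Too few rows after deduplication	The dedup-semantic threshold is too strict (the value is too small), so semantically similar questions are over-deduplicated.	Increase the dedup-semantic threshold (suggested range 0.05~0.15). You can also temporarily remove the dedup-semantic node, keep only dedup-exact and dedup-fuzzy, and observe how the row count changes.
"Logstore does not exist" error	The Project or Logstore name in the API configuration or SPL is misspelled, or the resource has not been created.	Check that source.logstore.project and source.logstore.logstore match the resource names created in the SLS console. Names are case-sensitive.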
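
The make-instance check described above can be scripted against exported events. The following Python sketch is one way to do it, assuming the events are available as dictionaries with `trace_id` and `event_type` keys. `check_traces` is a hypothetical helper and the sample events are made up.

```python
from collections import defaultdict

# Event types that a trace needs to yield a non-empty question and answer.
REQUIRED_TYPES = {"user_query", "assistant_content"}


def check_traces(events):
    """Return (number of events with empty trace_id, {trace_id: missing types})."""
    empty_trace_id = 0
    seen = defaultdict(set)
    for event in events:
        if event.get("trace_id", "") == "":
            empty_trace_id += 1  # cannot be assigned to any trace
            continue
        seen[event["trace_id"]].add(event["event_type"])
    missing = {
        trace_id: REQUIRED_TYPES - types
        for trace_id, types in seen.items()
        if REQUIRED_TYPES - types
    }
    return empty_trace_id, missing


events = [
    {"trace_id": "t1", "event_type": "user_query"},
    {"trace_id": "t1", "event_type": "assistant_content"},
    {"trace_id": "t2", "event_type": "user_query"},
    {"trace_id": "", "event_type": "tool_call"},
]
empty_count, missing = check_traces(events)
print(empty_count, missing)
```

Here trace t2 would be dropped by the `question IS NOT NULL AND answer IS NOT NULL` filter because it has no assistant_content event.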

Operator coverage matrix

Operator	Demo 1	Demo 2	Demo 3	Demo 4	Operator documentation
project	Used	Used	Used	Used	project
where	Used	Used	Used	Used	where
make-instance	-	-	Used	Used	make-instance
extend	-	-	-	Used	extend
dedup-exact	Used	Used	Used	Used	dedup-exact
dedup-fuzzy	Used	-	-	Used	dedup-fuzzy
dedup-semantic	Used	Used	-	Used	dedup-semantic
semantic-cluster	-	Used	-	Used	semantic-cluster
sample	-	Used	Used	Used	sample
llm-call	-	Used	Used	Used x2	llm-call
doc-stats	-	-	Used	Used	doc-stats

Note

The embedding operator is not used explicitly in the demos because dedup-semantic and semantic-cluster compute embeddings internally.

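Pipeline orchestration principles

Principle	Description
project first	The first operator, `project`, declares the Pipeline schema and decouples it from the raw log column names.
Filter events early	`where` immediately follows `project` so that irrelevant events are filtered out before aggregation.
CAST text fields	All mock data fields are text, so numeric operations must use `cast(NULLIF(col, '') as bigint)`.
Empty string is not NULL	Empty values in the mock data are `""`; use NULLIF to convert them to NULL before aggregating.
Reduce before you enrich	Deduplicate and sample first (fewer rows), then run LLM processing (more columns).
Reuse extended columns	The `__dedup_emb` column produced by `dedup-semantic` is reused directly by `semantic-cluster`.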

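Differences between mock data and production

Aspect	Description
Small mock data volume (about 140 rows)	Clustering and sampling have limited effect; keep the `n` parameters small
Production field types	If production fields are bigint/double, the CAST wrapper is unnecessary
LLM call cost	About 12 calls for Demo 3 and about 16 for Demo 4 (2 rounds x 8 rows); the cost is very low
`dedup_tag` field	Present only when the data is generated with `--dedup-ratio > 0`; production data has no such field
Empty-value handling in make-instance	The `any`/`first`/`last` syntactic sugar applies NULLIF to empty strings automatically; SQL functions such as `sum`/`count`/`max` require the manual `cast(NULLIF(col, '') as bigint)` wrapper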
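
The LLM call estimates in the table follow from simple arithmetic, sketched below in Python. The model assumes one request per row per `llm-call` round and ignores retries. The row counts are the approximate figures quoted on this page, and `llm_calls` is a hypothetical helper.

```python
def llm_calls(rows_before_sample, sample_cap, llm_rounds):
    """Rows reaching llm-call are capped by sample; each round calls once per row."""
    return min(rows_before_sample, sample_cap) * llm_rounds


# Demo 3: about 12 instances, sample -n=20, one llm-call.
demo3 = llm_calls(rows_before_sample=12, sample_cap=20, llm_rounds=1)

# Demo 4: about 8 rows after dedup, 5 clusters x 3 per cluster, two llm-calls.
demo4 = llm_calls(rows_before_sample=8, sample_cap=5 * 3, llm_rounds=2)

print(demo3, demo4)
```

With production data the sample cap, not the row count, becomes the binding term, so `sample` is the main lever for controlling LLM cost.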
