
Cloud Monitor: Custom evaluation tasks

Last Updated: Sep 29, 2025

AI Application Observability provides built-in evaluation templates for common scenarios. If you need custom evaluation metrics, you can use the real-time computing capabilities of the observability service to call a model with your own evaluation instructions.

Query built-in evaluation templates

The following statement queries all built-in evaluation templates in Chinese (lang='cn'). You can adapt the built-in templates to specific scenarios.

* | select * from "resource.llm_evaluation" where lang='cn'

Access Alibaba Cloud Model Studio

http_call is a scalar function available in SQL and Structured Process Language (SPL). It takes request details, such as the headers and body, and sends an HTTP request to an external service. You can use this function to call Alibaba Cloud Model Studio and process data.

Syntax

http_call(url, method, headers, params, body, timeout)

Parameters

Note

The Alibaba Cloud observability product charges for the tokens consumed. To use your own Model Studio account instead, specify your key as a bearer token in the headers parameter.

| Parameter | Type | Description |
| --- | --- | --- |
| url | string | The URL to access. Only Alibaba Cloud Model Studio is supported. Enter the HTTPS address for Model Studio, for example https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions. |
| method | string | The HTTP request method, such as POST. |
| headers | string in JSON format | The request headers for accessing Alibaba Cloud Model Studio. To use your own account, include your Model Studio key here. |
| params | string in JSON format | The query parameters, used for GET requests. Leave this empty for POST requests. |
| body | string in JSON format | The request body for accessing Alibaba Cloud Model Studio. It contains the prompt messages and specifies the model to call. |
| timeout | number | The timeout period in milliseconds. |
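The note above says that a bearer key can be supplied in the header. The following Python sketch shows one way the headers argument could be assembled as a JSON string. It assumes that Model Studio accepts the key in the standard `Authorization: Bearer <key>` form; `build_headers` and the placeholder key are hypothetical names used only for illustration.

```python
import json


def build_headers(api_key: str) -> str:
    """Return the headers argument for http_call as a JSON-formatted string.

    Assumption: the key is passed as a standard bearer token in the
    Authorization header. Replace the placeholder with your own key.
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return json.dumps(headers)


if __name__ == "__main__":
    # "<YOUR_MODEL_STUDIO_KEY>" is a placeholder, not a real key.
    print(build_headers("<YOUR_MODEL_STUDIO_KEY>"))
```

The printed string can be pasted as the third argument of http_call in place of a headers value that only sets Content-Type.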

Example

http_call(
    'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions',
    'POST',
    '{"Content-Type":"application/json"}',
    '',
    '{
        "model": "qwen-plus",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    }',
    60 * 1000
)
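Writing the JSON arguments by hand is error-prone, so the following Python sketch renders an http_call expression from ordinary Python values. It is an illustration rather than part of the product: `sql_literal` and `render_http_call` are hypothetical helpers, and the sketch assumes the SQL dialect follows the standard convention of doubling a single quote inside a string literal and treating backslashes literally.

```python
import json


def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def render_http_call(url, method, headers, params, body, timeout_ms):
    """Render an http_call(...) expression from Python values.

    headers and body are dicts serialized with json.dumps; params is a
    string and stays empty for POST requests.
    """
    args = [
        sql_literal(url),
        sql_literal(method),
        sql_literal(json.dumps(headers)),
        sql_literal(params),
        sql_literal(json.dumps(body, ensure_ascii=False)),
        str(timeout_ms),
    ]
    return "http_call(\n    " + ",\n    ".join(args) + "\n)"


if __name__ == "__main__":
    body = {
        "model": "qwen-plus",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Who are you?"},
        ],
    }
    print(render_http_call(
        "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions",
        "POST",
        {"Content-Type": "application/json"},
        "",
        body,
        60 * 1000,  # timeout in milliseconds
    ))
```

Serializing with json.dumps handles quotes and line breaks inside the prompt, which is the same problem the chained replace calls solve in the custom evaluation queries.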

Custom evaluation solutions

The following SQL statement summarizes data based on a custom evaluation template. To change the evaluation instructions, edit the eval_prompt value in the statement.

(* and id: 999 and type: dca)| set
  session velox_use_io_executor = true;
with t1 as (
    select
      __time__,
      "sentence1" as trans,  -- The data to be processed.
      '{{query}}' as targets,  -- The placeholder in the evaluation template.
      '{"model":"<QWEN_MODEL>","input":{"messages":[{"role":"system","content":"<SYSTEM_PROMPT>"},{"role":"user","content":"<USER_PROMPT>"}]}}' as body_template, -- The body template for accessing Alibaba Cloud Model Studio.
      cast('Summarize the following content in one sentence: {{query}}' as varchar) as eval_prompt -- The evaluation template, which includes instructions and placeholders.
    from log
  ),
  t1_1 as (
    select
      __time__,
      body_template,
      eval_prompt,
      replace(eval_prompt, targets, trans) as eval_content,
      trans
    from t1
  ),
  t2 as (
    select
      __time__,
      body_template,
      eval_content,
      trans
    from t1_1
  ),
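  t3 as (
    -- Escape backslashes, double quotes, and control characters so that the prompt can be embedded in a JSON string.
    select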
      __time__,
      trans as oldeval,
      body_template,
      replace(
        replace(
          replace(
            replace(
              replace(
                replace(replace(eval_content, chr(92), '\\'), '"', '\"'),
                chr(8),
                '\b'
              ),
              chr(12),
              '\f'
            ),
            chr(10),
            '\n'
          ),
          chr(13),
          '\r'
        ),
        chr(9),
        '\t'
      ) as eval,
      trans
    from t2
  ),
  t4 as (
    -- Fill the model name, system prompt, and escaped user prompt into the body template.
    select
      __time__,
      replace(
        replace(
          replace(body_template, '<QWEN_MODEL>', 'qwen-turbo'),
          '<SYSTEM_PROMPT>',
          'You are a helpful assistant.'
        ),
        '<USER_PROMPT>',
        eval
      ) as body,
      oldeval,
      trans
    from t3
  ),
  t5 as (
    -- Send each body to Model Studio with a 60-second timeout.
    select
      __time__,
      http_call(
        'https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation',
        'POST',
        '{ "Content-Type":"application/json"}',
        '',
        body,
        60 * 1000
      ) as response,
      body,
      oldeval,
      trans
    from t4
  ),
  t6 as (
    select
      __time__,
      oldeval,
      body,
      response.code,
      response.header,
      response.body body_res,
      response.error_msg as error,
      trans
    from t5
  )
select
  __time__,
  trans as "Original Text",
  replace(
    replace(
      json_extract_scalar(body_res, '$.output.text'),
      '```json',
      ' '
    ),
    '```',
    ' '
  ) as "Summary",
  error,
  json_extract_scalar(body_res, '$.usage.total_tokens') as "Tokens Consumed"
from t6
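The request-building steps of the statement above (t1 through t4) can be mirrored in a few lines of Python to check that the chained replace calls produce valid JSON. This is a sketch for local experimentation only: `json_escape` and `build_body` are hypothetical names, and the sample text is made up.

```python
import json

BODY_TEMPLATE = (
    '{"model":"<QWEN_MODEL>","input":{"messages":['
    '{"role":"system","content":"<SYSTEM_PROMPT>"},'
    '{"role":"user","content":"<USER_PROMPT>"}]}}'
)
EVAL_PROMPT = "Summarize the following content in one sentence: {{query}}"


def json_escape(text: str) -> str:
    """Apply the same replacements, in the same order, as CTE t3."""
    replacements = [
        ("\\", "\\\\"),  # backslash first, so later escapes are not doubled
        ('"', '\\"'),
        ("\b", "\\b"),
        ("\f", "\\f"),
        ("\n", "\\n"),
        ("\r", "\\r"),
        ("\t", "\\t"),
    ]
    for raw, escaped in replacements:
        text = text.replace(raw, escaped)
    return text


def build_body(trans: str, model: str = "qwen-turbo") -> str:
    """Fill the prompt placeholder, escape it, and fill the body template."""
    eval_content = EVAL_PROMPT.replace("{{query}}", trans)
    return (
        BODY_TEMPLATE.replace("<QWEN_MODEL>", model)
        .replace("<SYSTEM_PROMPT>", "You are a helpful assistant.")
        .replace("<USER_PROMPT>", json_escape(eval_content))
    )


if __name__ == "__main__":
    sample = 'He said "hi"\nand left a path C:\\tmp'
    body = build_body(sample)
    # The body parses as JSON and the user message round-trips the text.
    print(json.loads(body)["input"]["messages"][1]["content"])
```

The backslash is replaced first because every later replacement introduces a backslash of its own. Like the SQL version, the sketch escapes only the characters listed; other control characters in the input would still need handling.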

You can also evaluate data using SPL:

.let t3 = .logstore with(query='attributes.gen_ai.span.kind: LLM and  attributes.input.value: *  and  *')
| extend trans = "attributes.input.value",
         targets = '{{query}}',
         body_template = '{"model":"<QWEN_MODEL>","input":{"messages":[{"role":"system","content":"<SYSTEM_PROMPT>"},{"role":"user","content":"<USER_PROMPT>"}]}}',
         eval_prompt = cast('Summarize the following content in one sentence: {{query}}' as varchar)
| project __time__, trans, targets, body_template, eval_prompt;
$t3
| extend eval_content = replace(eval_prompt, targets, trans)
| extend eval = replace(replace(replace(replace(replace(replace(replace(eval_content, chr(92), '\\'), '"', '\"'), chr(8), '\b'), chr(12), '\f'), chr(10), '\n'), chr(13), '\r'), chr(9), '\t')
| extend body = replace(
            replace(
                replace(body_template, '<QWEN_MODEL>', 'qwen-max'),
                '<SYSTEM_PROMPT>',
                'You are a helpful assistant.'
            ),
            '<USER_PROMPT>',
            eval
        ) 
| extend response = http_call(
            'https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation',
            'POST',
            '{ "Content-Type":"application/json"}',
            '',
             body,
            60 * 1000
        ) 
| extend code = response.code, header = response.header, body_res = response.body, error = response.error_msg
| extend evaluationResult = replace(replace(json_extract_scalar(body_res, '$.output.text'), '```json', ' '), '```', ' '), evaluationTemplate = 'mcp_tool_poisoning_attack_cn'
| limit 10000
| where evaluationResult <> 'null'
| project __time__, evaluationResult, evaluationTemplate, error, body_res
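The last steps of both queries read the model output from the response body and strip Markdown code-fence markers. The Python sketch below mirrors that post-processing so it can be tried locally. It assumes the response body is JSON with the text at output.text and the token count at usage.total_tokens, the paths used in the queries above; `extract_result` is a hypothetical helper and the sample response is invented for illustration.

```python
import json

# Three backticks, built at runtime so this example stays fence-safe.
FENCE = "`" * 3


def extract_result(body_res: str):
    """Return (evaluation_result, total_tokens) from a response body string.

    Returns (None, None) when the body has no output.text field, which
    corresponds to the rows the SPL pipeline filters out.
    """
    data = json.loads(body_res)
    text = data.get("output", {}).get("text")
    if text is None:
        return None, None
    # Same two replacements as the queries: the tagged fence first.
    cleaned = text.replace(FENCE + "json", " ").replace(FENCE, " ")
    return cleaned, data.get("usage", {}).get("total_tokens")


if __name__ == "__main__":
    # Hypothetical response body, made up for illustration only.
    sample = json.dumps({
        "output": {"text": FENCE + 'json\n{"score": 1}\n' + FENCE},
        "usage": {"total_tokens": 42},
    })
    print(extract_result(sample))
```

Replacing the tagged fence before the bare one matters: removing the bare marker first would leave a stray "json" in the result.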