Custom metric rules - DataWorks - Alibaba Cloud Documentation Center

Use custom metric rules when the built-in metric types don't cover your business logic. Define the metric's SQL query inline in the rule configuration and reference it by name in the assertion.

Define a custom metric rule

In the assertion field, write a condition using a custom metric name of your choice. Then, in the same rule block, add a key matching that name and provide its SQL query under query.

datasets:
  - type: Table
    tables:
      - tb_d_spec_demo
    filter: "dt='$[yyyymmdd]' AND hh='$[hh24-1/24]'"
    dataSource:
      name: odps_first
      envType: Dev

rules:
  - assertion: "avg_order_span between 5 and 10"
    avg_order_span:
      query: "SELECT COUNT(safety_stock_level - days_to_manufacture) FROM dim_product;"

The custom metric name avg_order_span appears twice and plays two roles:

Role	Location	Purpose
Assertion operand	`assertion: "avg_order_span between 5 and 10"`	Defines the pass/fail threshold
Configuration key	`avg_order_span:` (sibling of `assertion`)	Binds the SQL query to the metric

DataWorks runs the SQL query to produce a value, then evaluates whether that value satisfies the assertion condition. If it does, the check passes.

Store failed rows when the metric value is out of range

To retain the rows that caused a check failure for later inspection, set collectFailedRows: true and add a failedRowsQuery.

These two fields serve different purposes: query counts the metric value used in the assertion, while failedRowsQuery fetches the actual rows to store as samples. Both are required when you want to retain failed data.

datasets:
  - type: Table
    tables:
      - tb_d_spec_demo
    filter: "dt='$[yyyymmdd]' AND hh='$[hh24-1/24]'"
    dataSource:
      name: odps_first
      envType: Dev

rules:
  - assertion: "id_for_belgium between 5 and 10"
    id_for_belgium:
      query: "SELECT count(*) FROM product_b;"
    collectFailedRows: true
    failedRowsQuery: "SELECT id FROM product_b WHERE id IS NULL"