Use custom metric rules when the built-in metric types don't cover your business logic. Define the metric's SQL query inline in the rule configuration and reference it by name in the assertion.
Define a custom metric rule
In the assertion field, write a condition using a custom metric name of your choice. Then, in the same rule block, add a key matching that name and provide its SQL query under query.
datasets:
- type: Table
tables:
- tb_d_spec_demo
filter: "dt='$[yyyymmdd]' AND hh='$[hh24-1/24]'"
dataSource:
name: odps_first
envType: Dev
rules:
- assertion: "avg_order_span between 5 and 10"
avg_order_span:
query: "SELECT COUNT(safety_stock_level - days_to_manufacture) FROM dim_product;"The custom metric name avg_order_span appears twice and plays two roles:
| Role | Location | Purpose |
|---|---|---|
| Assertion operand | assertion: "avg_order_span between 5 and 10" | Defines the pass/fail threshold |
| Configuration key | avg_order_span: (sibling of assertion) | Binds the SQL query to the metric |
DataWorks runs the SQL query to produce a value, then evaluates whether that value satisfies the assertion condition. If it does, the check passes.
Store failed rows when the metric value is out of range
To retain the rows that caused a check failure for later inspection, set collectFailedRows: true and add a failedRowsQuery.
These two fields serve different purposes: query counts the metric value used in the assertion, while failedRowsQuery fetches the actual rows to store as samples. Both are required when you want to retain failed data.
datasets:
- type: Table
tables:
- tb_d_spec_demo
filter: "dt='$[yyyymmdd]' AND hh='$[hh24-1/24]'"
dataSource:
name: odps_first
envType: Dev
rules:
- assertion: "id_for_belgium between 5 and 10"
id_for_belgium:
query: "SELECT count(*) FROM product_b;"
collectFailedRows: true
failedRowsQuery: "SELECT id FROM product_b WHERE id IS NULL"