Creates a data quality monitoring run instance.
Operation description
DataWorks Basic Edition or a higher edition is required.
RAM authorization
Request syntax
POST HTTP/1.1
Request parameters
| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| Id | integer | No | The data quality monitoring run record ID. | 1006059507 |
Response elements
| Element | Type | Description | Example |
| --- | --- | --- | --- |
| | object | The response. | |
| RequestId | string | The request ID. | 0bc14115****159376359 |
| DataQualityScanRun | object | The data quality monitoring run record. | |
| DataQualityScanRun.Id | integer | The run record ID. | 1016440997 |
| DataQualityScanRun.CreateTime | integer | The time when the data quality monitor starts running. | 1706247622000 |
| DataQualityScanRun.FinishTime | integer | The time when the data quality monitor stops. | 1706247622000 |
| DataQualityScanRun.Status | string | The current running status. | Fail |
| DataQualityScanRun.Scan | object | The snapshot of the data quality monitor configuration at the start of the validation. | |
| DataQualityScanRun.Scan.Id | integer | The data quality monitor ID. | 21077 |
| DataQualityScanRun.Scan.Name | string | The name of the data quality validation task. It can contain digits, letters, Chinese characters, and both half-width and full-width punctuation marks, with a maximum length of 255 characters. | Hourly partition quality monitoring |
| DataQualityScanRun.Scan.Description | string | The description of the data quality validation task. Maximum length: 65,535 characters. | This is a hourly run data quality evaluation plan. |
| DataQualityScanRun.Scan.ProjectId | integer | The project ID. | 164024 |
| DataQualityScanRun.Scan.CreateTime | integer | The creation time of the data quality monitor. | 1706247622000 |
| DataQualityScanRun.Scan.ModifyTime | integer | The last update time of the data quality monitor. | 1706247622000 |
| DataQualityScanRun.Scan.CreateUser | string | The creator of the data quality monitor. | 7892346529452 |
| DataQualityScanRun.Scan.ModifyUser | string | The last updater of the data quality monitor. | 7892346529452 |
| DataQualityScanRun.Scan.Owner | string | The owner of the data quality monitor. | 7892346529452 |
| DataQualityScanRun.Scan.Spec | string | The data quality monitor Spec. For more information, see Data quality Spec configuration description. | { "datasets": [ { "type": "Table", "dataSource": { "name": "odps_first", "envType": "Prod" }, "tables": [ "ods_d_user_info" ], "filter": "pt = $[yyyymmdd-1]" } ], "rules": [ { "assertion": "row_count > 0" }, { "templateId": "SYSTEM:field:null_value:fixed", "pass": "when = 0", "name": "The id cannot be empty.", "severity": "High", "identity": "a-customized-data-quality-rule-uuid" } ] } |
| DataQualityScanRun.Scan.Parameters | array<object> | The parameter settings of the data quality monitor. | |
| DataQualityScanRun.Scan.Parameters[] | object | The parameter settings of the data quality monitor. | |
| DataQualityScanRun.Scan.Parameters[].Value | string | The parameter value. | $[yyyy-mm-dd-1] |
| DataQualityScanRun.Scan.Parameters[].Name | string | The parameter name. | dt |
| DataQualityScanRun.Scan.ComputeResource | object | The computing resource settings of the data quality monitor. | |
| DataQualityScanRun.Scan.ComputeResource.Name | string | The name of the computing resource, which corresponds to the Name attribute in the ComputeResource data structure of the computing resource API. | emr_cluster_001 |
| DataQualityScanRun.Scan.ComputeResource.Runtime | object | The additional runtime settings of the data quality monitor. | |
| DataQualityScanRun.Scan.ComputeResource.Runtime.Engine | string | The type of the compute engine. Only EMR compute engines support these settings. | Hive |
| DataQualityScanRun.Scan.ComputeResource.Runtime.SparkConf | object | Additional parameters for the Spark engine. Currently, only spark.yarn.queue is supported to specify the queue. | spark.yarn.queue=dq_queue |
| DataQualityScanRun.Scan.ComputeResource.Runtime.HiveConf | object | Additional parameters for the Hive engine. Currently, only mapreduce.job.queuename is supported to specify the queue. | mapreduce.job.queuename=dq_queue |
| DataQualityScanRun.Scan.ComputeResource.EnvType | string | The workspace environment to which the compute engine belongs. | Dev |
| DataQualityScanRun.Scan.RuntimeResource | object | The resource group used for running the data quality monitor. | |
| DataQualityScanRun.Scan.RuntimeResource.Id | string | The resource group ID. | 60597 |
| DataQualityScanRun.Scan.RuntimeResource.Cu | number | Reserved CUs for the resource group. | 1 |
| DataQualityScanRun.Scan.RuntimeResource.Image | string | The image ID of the run configuration. | i-xxxx |
| DataQualityScanRun.Scan.Trigger | object | The trigger configurations of the data quality monitor. | |
| DataQualityScanRun.Scan.Trigger.Type | string | The trigger method of the data quality monitor. | BySchedule |
| DataQualityScanRun.Scan.Trigger.TaskIds | array | If the trigger mode is set to BySchedule, the scheduling task ID must be specified. | |
| DataQualityScanRun.Scan.Trigger.TaskIds[] | integer | The scheduling task ID. | 1014217266 |
| DataQualityScanRun.Scan.Hooks | array<object> | The hook configurations after the data quality monitor stops. | |
| DataQualityScanRun.Scan.Hooks[] | object | The hook configurations after the data quality monitor stops. | |
| DataQualityScanRun.Scan.Hooks[].Condition | string | The hook trigger condition. Currently, only one type of expression syntax is supported (see the example). | results.any { r -> r.status == 'fail' && r.rule.severity == 'High' } |
| DataQualityScanRun.Scan.Hooks[].Type | string | The type of the hook. | BlockTaskInstance |
| DataQualityScanRun.Parameters | array<object> | The parameter settings used during the actual running. | |
| DataQualityScanRun.Parameters[] | object | Parameter settings. | |
| DataQualityScanRun.Parameters[].Value | string | The parameter value. | $[yyyy-mm-dd-1] |
| DataQualityScanRun.Parameters[].Name | string | The parameter name. | dt |
| DataQualityScanRun.Results | array<object> | The validation results of each rule. | |
| DataQualityScanRun.Results[] | object | The validation result of the rule. | |
| DataQualityScanRun.Results[].Status | string | The validation result status. | Fail |
| DataQualityScanRun.Results[].Details | array<object> | The information about the data quality check. | |
| DataQualityScanRun.Results[].Details[] | object | The information about the data quality check. | |
| DataQualityScanRun.Results[].Details[].Status | string | The final comparison result status. | Fail |
| DataQualityScanRun.Results[].Details[].ReferenceValue | string | The reference sample used as the baseline for calculating CheckValue. | 0.0 |
| DataQualityScanRun.Results[].Details[].CheckValue | string | The final value used for comparison with the threshold. | 100.0 |
| DataQualityScanRun.Results[].Sample | string | The sample value used in the validation. | { "value": "100.0" } |
| DataQualityScanRun.Results[].CreateTime | integer | The time when the validation result is generated. | 1725506795000 |
| DataQualityScanRun.Results[].Rule | string | The snapshot of the rule Spec at the start of the validation. | { "templateId": "SYSTEM:field:null_value:fixed", "pass": "when = 0", "name": "The id cannot be empty.", "severity": "High", "identity": "a-customized-data-quality-rule-uuid" } |
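The hook `Condition` example, `results.any { r -> r.status == 'fail' && r.rule.severity == 'High' }`, can be read as "at least one rule failed and that rule has High severity". The following Python sketch mirrors that logic on the client side for illustration only. The function name `should_trigger_hook` is hypothetical, and two points are assumptions rather than documented behavior: the status comparison is treated as case-insensitive (the condition uses `'fail'` while the response example uses `Fail`), and `Rule` is decoded from its JSON string before `severity` is read.

```python
import json


def should_trigger_hook(results):
    """Client-side rendering of the example hook condition (illustrative only)."""
    for result in results:
        # Rule is returned as a JSON-encoded string, so decode it first.
        rule = json.loads(result.get("Rule") or "{}")
        # Assumption: 'fail' in the condition matches "Fail" in the response.
        failed = str(result.get("Status", "")).lower() == "fail"
        if failed and rule.get("severity") == "High":
            return True
    return False


# Sample data shaped like one entry of DataQualityScanRun.Results.
sample_results = [
    {
        "Status": "Fail",
        "Rule": json.dumps(
            {"name": "The id cannot be empty.", "severity": "High"}
        ),
    }
]

print(should_trigger_hook(sample_results))
```

The service evaluates the actual condition; this sketch is only a way to reason about which results would satisfy it.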
Examples
Success response
JSON format
{
  "RequestId": "0bc14115****159376359",
  "DataQualityScanRun": {
    "Id": 1016440997,
    "CreateTime": 1706247622000,
    "FinishTime": 1706247622000,
    "Status": "Fail",
    "Scan": {
      "Id": 21077,
      "Name": "Hourly partition quality monitoring",
      "Description": "This is a hourly run data quality evaluation plan.",
      "ProjectId": 164024,
      "CreateTime": 1706247622000,
      "ModifyTime": 1706247622000,
      "CreateUser": "7892346529452",
      "ModifyUser": "7892346529452",
      "Owner": "7892346529452",
      "Spec": "{\n \"datasets\": [\n {\n \"type\": \"Table\",\n \"dataSource\": {\n \"name\": \"odps_first\",\n \"envType\": \"Prod\"\n },\n \"tables\": [\n \"ods_d_user_info\"\n ],\n \"filter\": \"pt = $[yyyymmdd-1]\"\n }\n ],\n \"rules\": [\n {\n \"assertion\": \"row_count > 0\"\n }, {\n \"templateId\": \"SYSTEM:field:null_value:fixed\",\n \"pass\": \"when = 0\",\n \"name\": \"The id cannot be empty.\",\n \"severity\": \"High\",\n \"identity\": \"a-customized-data-quality-rule-uuid\"\n }\n ]\n}",
      "Parameters": [
        {
          "Value": "$[yyyy-mm-dd-1]",
          "Name": "dt"
        }
      ],
      "ComputeResource": {
        "Name": "emr_cluster_001",
        "Runtime": {
          "Engine": "Hive",
          "SparkConf": {
            "test": "test",
            "test2": 1
          },
          "HiveConf": {
            "test": "test",
            "test2": 1
          }
        },
        "EnvType": "Dev"
      },
      "RuntimeResource": {
        "Id": "60597",
        "Cu": 1,
        "Image": "i-xxxx"
      },
      "Trigger": {
        "Type": "BySchedule",
        "TaskIds": [
          1014217266
        ]
      },
      "Hooks": [
        {
          "Condition": "results.any { r -> r.status == 'fail' && r.rule.severity == 'High' }",
          "Type": "BlockTaskInstance"
        }
      ]
    },
    "Parameters": [
      {
        "Value": "$[yyyy-mm-dd-1]",
        "Name": "dt"
      }
    ],
    "Results": [
      {
        "Status": "Fail",
        "Details": [
          {
            "Status": "Fail",
            "ReferenceValue": "0.0",
            "CheckValue": "100.0"
          }
        ],
        "Sample": "{\n \"value\": \"100.0\"\n}\n",
        "CreateTime": 1725506795000,
        "Rule": "{\n \"templateId\": \"SYSTEM:field:null_value:fixed\",\n \"pass\": \"when = 0\",\n \"name\": \"The id cannot be empty.\",\n \"severity\": \"High\",\n \"identity\": \"a-customized-data-quality-rule-uuid\"\n}"
      }
    ]
  }
}
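In the response above, `Spec` and `Rule` are JSON documents embedded as strings, so a client has to decode them a second time. The Python sketch below shows one way to summarize such a response once it has been parsed into a dictionary. The helper names `ms_to_utc` and `summarize_run` are hypothetical, the `response` dictionary is a trimmed copy of the example, and reading the time fields as Unix epoch milliseconds is an assumption based on the example values rather than something the reference states.

```python
import json
from datetime import datetime, timezone


def ms_to_utc(ms):
    # Assumption: time fields are Unix epoch milliseconds.
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def summarize_run(response):
    run = response["DataQualityScanRun"]
    # Spec is a JSON string inside the JSON response; decode it separately.
    spec = json.loads(run["Scan"]["Spec"])
    failed_rules = []
    for result in run.get("Results", []):
        if result["Status"] == "Fail":
            # Rule is also a JSON string; "name" may be absent on some rules.
            failed_rules.append(json.loads(result["Rule"]).get("name"))
    return {
        "run_id": run["Id"],
        "status": run["Status"],
        "started_at": ms_to_utc(run["CreateTime"]).isoformat(),
        "tables": [t for d in spec["datasets"] for t in d["tables"]],
        "failed_rules": failed_rules,
    }


# Trimmed copy of the example response, already parsed into a dict.
response = {
    "RequestId": "0bc14115****159376359",
    "DataQualityScanRun": {
        "Id": 1016440997,
        "CreateTime": 1706247622000,
        "Status": "Fail",
        "Scan": {
            "Spec": json.dumps(
                {"datasets": [{"type": "Table", "tables": ["ods_d_user_info"]}]}
            )
        },
        "Results": [
            {
                "Status": "Fail",
                "Rule": json.dumps(
                    {"name": "The id cannot be empty.", "severity": "High"}
                ),
            }
        ],
    },
}

print(summarize_run(response))
```

Keeping the second `json.loads` step explicit avoids treating `Spec` or `Rule` as nested objects, which they are not in this response format.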
Error codes
See Error Codes for a complete list.
Release notes
See Release Notes for a complete list.