DataWorks:GetDataQualityRule

Last Updated: Mar 14, 2025

Queries the information about a data quality monitoring rule.

Operation description

This API operation is available for all DataWorks editions.

Debugging

You can call this operation directly in OpenAPI Explorer, which saves you from calculating signatures. After a successful call, OpenAPI Explorer can automatically generate SDK code samples.

Authorization information

No authorization information is currently disclosed for this API operation.

Request parameters

Parameter | Type | Required | Description | Example
Id | long | Yes

The rule ID.

19715
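
The request takes a single required parameter, so assembling it is simple. The following is a minimal sketch of validating and packaging that parameter before handing it to whichever client you use. The helper name `build_get_rule_params` is hypothetical and not part of any SDK, and the positive-ID check is an assumption; only the `Id` parameter and its `long` type come from the table above.

```python
def build_get_rule_params(rule_id: int) -> dict:
    """Return the request parameters for GetDataQualityRule.

    Hypothetical helper: it only validates and packages the single
    required parameter, Id. It does not send any request.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(rule_id, bool) or not isinstance(rule_id, int):
        raise TypeError("Id must be an integer (long)")
    # Assumption: rule IDs are positive. The source only shows examples.
    if rule_id <= 0:
        raise ValueError("Id must be a positive rule ID")
    return {"Id": rule_id}


params = build_get_rule_params(19715)
print(params)
```

Rejecting non-integer input early keeps a malformed ID from reaching the service, where the failure would be harder to trace.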

Response parameters

Parameter | Type | Description | Example
object

The response parameters.

RequestId | string

The request ID.

691CA452-D37A-4ED0-9441
DataQualityRule | object

The information about the rule.

Id | long

The rule ID.

16033
Name | string

The rule name.

The table cannot be empty.
ProjectId | long

The DataWorks workspace ID.

1948
Enabled | boolean

Indicates whether the rule is enabled.

true
Severity | string

The severity of the rule for your business, which corresponds to the strong and weak rules on the page. Valid values:

  • Normal
  • High
High
Description | string

The description of the rule. The description can be up to 500 characters in length.

this is a odps _sql task
Target | object

The monitored object of the rule.

Type | string

The type of the monitored object. Valid values:

  • Table
Table
DatabaseType | string

The type of the database to which the table belongs. Valid values:

  • maxcompute
  • emr
  • cdh
  • hologres
  • analyticdb_for_postgresql
  • analyticdb_for_mysql
  • starrocks
maxcompute
TableGuid | string

The ID in Data Map of the table to which the rule applies.

odps.unit_test.tb_unit_test
PartitionSpec | string

The configuration of the partitioned table.

ds=$[yyyymmdd-1]
TemplateCode | string

The ID of the template used by the rule.

system::user_defined
SamplingConfig | object

The sampling settings.

Metric | string

The metric used for sampling. Valid values:

  • Count: the number of rows in the table.
  • Min: the minimum value of the field.
  • Max: the maximum value of the field.
  • Avg: the average value of the field.
  • DistinctCount: the number of unique values of the field after deduplication.
  • DistinctPercent: the proportion of the number of unique values of the field after deduplication to the number of rows in the table.
  • DuplicatedCount: the number of duplicated values of the field.
  • DuplicatedPercent: the proportion of the number of duplicated values of the field to the number of rows in the table.
  • TableSize: the table size.
  • NullValueCount: the number of rows in which the field value is null.
  • NullValuePercent: the proportion of the number of rows in which the field value is null to the number of rows in the table.
  • GroupCount: the field value and the number of rows for each field value.
  • CountNotIn: the number of rows in which the field values are different from the referenced values that you specified in the rule.
  • CountDistinctNotIn: the number of unique values that are different from the referenced values that you specified in the rule after deduplication.
  • UserDefinedSql: indicates that data is sampled by executing custom SQL statements.
Max
MetricParameters | string

The parameters required for sampling.

{ "Columns": [ "id", "name" ] , "SQL": "select count(1) from table;"}
SettingConfig | string

The statements that set the parameters required for sampling. They run before the sampling statements and can be up to 1,000 characters in length. Only MaxCompute is supported.

SET odps.sql.udf.timeout=600s; SET odps.sql.python.version=cp27;
SamplingFilter | string

The statements that are used to filter unnecessary data during sampling. The statements can be up to 16,777,215 characters in length.

id IS NULL
CheckingConfig | object

The check settings for sample data.

Type | string

The threshold calculation method. Valid values:

  • Fixed
  • Fluctation
  • FluctationDiscreate
  • Auto
  • Average
  • Variance
Fixed
ReferencedSamplesFilter | string

The method that is used to query the referenced samples. Some threshold types require reference values. In this example, an expression specifies how the referenced samples are queried.

{ "bizdate": [ "-1", "-7", "-1m" ] }
Thresholds | object

The threshold settings.

Expected | object

The expected threshold setting.

Operator | string

The comparison operator. Valid values:

  • >
  • >=
  • <
  • <=
  • !=
  • =
>
Value | string

The threshold value.

100.0
Expression | string

The threshold expression.

$checkValue <= 0.01
Warned | object

The threshold settings for normal alerts.

Operator | string

The comparison operator. Valid values:

  • >
  • >=
  • <
  • <=
  • !=
  • =
>
Value | string

The threshold value.

100.0
Expression | string

The threshold expression.

$checkValue > 0.01
Critical | object

The threshold settings for critical alerts.

Operator | string

The comparison operator. Valid values:

  • >
  • >=
  • <
  • <=
  • !=
  • =
>
Value | string

The threshold value.

100.0
Expression | string

The threshold expression.

$checkValue > 0.05
ErrorHandlers | array<object>

The operations that you can perform after the rule-based check fails.

ErrorHandler | object

The operation that you can perform after the rule-based check fails.

Type | string

The type of the error handler. Valid values:

  • SaveErrorData
SaveErrorData
ErrorDataFilter | string

The SQL statement that is used to filter the error data. If you define the rule by using custom SQL statements, you must specify an SQL statement to filter the error data.

SELECT * FROM tb_api_log WHERE id IS NULL
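
The Operator and Value fields under Thresholds can be applied to a sampled value on the client side. The following is a rough sketch, not the service's own evaluation logic. The function names `matches` and `classify`, the precedence order (Critical, then Warned, then Expected), and the threshold values in the usage example are all assumptions made for illustration. The Expression field is not handled here.

```python
import operator

# Comparison operators listed as valid values for Operator in the table above.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "!=": operator.ne,
    "=": operator.eq,
}


def matches(threshold: dict, check_value: float) -> bool:
    """Return True if check_value satisfies one threshold's Operator and Value."""
    compare = OPERATORS[threshold["Operator"]]
    # Value is returned as a string (for example "100.0"), so convert it first.
    return compare(check_value, float(threshold["Value"]))


def classify(thresholds: dict, check_value: float) -> str:
    """Return the first matching level. The precedence order is an assumption."""
    for level in ("Critical", "Warned", "Expected"):
        threshold = thresholds.get(level)
        if threshold and matches(threshold, check_value):
            return level
    return "Unmatched"


# Hypothetical thresholds, chosen so that each level covers a different range.
thresholds = {
    "Expected": {"Operator": "<=", "Value": "100.0"},
    "Warned": {"Operator": ">", "Value": "100.0"},
    "Critical": {"Operator": ">", "Value": "500.0"},
}
print(classify(thresholds, 250.0))  # prints "Warned"
```

Checking the most severe level first means a value that exceeds both alert thresholds is reported as Critical rather than Warned.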

Examples

Sample success responses

JSON format

{
  "RequestId": "691CA452-D37A-4ED0-9441",
  "DataQualityRule": {
    "Id": 16033,
    "Name": "The table cannot be empty.",
    "ProjectId": 1948,
    "Enabled": true,
    "Severity": "High",
    "Description": "this is a odps _sql task",
    "Target": {
      "Type": "Table",
      "DatabaseType": "maxcompute",
      "TableGuid": "odps.unit_test.tb_unit_test",
      "PartitionSpec": "ds=$[yyyymmdd-1]"
    },
    "TemplateCode": "system::user_defined",
    "SamplingConfig": {
      "Metric": "Max",
      "MetricParameters": "{ \"Columns\": [ \"id\", \"name\" ] , \"SQL\": \"select count(1) from table;\"}",
      "SettingConfig": "SET odps.sql.udf.timeout=600s; SET odps.sql.python.version=cp27;",
      "SamplingFilter": "id IS NULL"
    },
    "CheckingConfig": {
      "Type": "Fixed",
      "ReferencedSamplesFilter": "{ \"bizdate\": [ \"-1\", \"-7\", \"-1m\" ] }",
      "Thresholds": {
        "Expected": {
          "Operator": ">",
          "Value": "100.0",
          "Expression": "$checkValue <= 0.01"
        },
        "Warned": {
          "Operator": ">",
          "Value": "100.0",
          "Expression": "$checkValue > 0.01"
        },
        "Critical": {
          "Operator": ">",
          "Value": "100.0",
          "Expression": "$checkValue > 0.05"
        }
      }
    },
    "ErrorHandlers": [
      {
        "Type": "SaveErrorData",
        "ErrorDataFilter": "SELECT * FROM tb_api_log WHERE id IS NULL"
      }
    ]
  }
}
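
In the sample response, MetricParameters and ReferencedSamplesFilter are strings whose content is itself JSON, so reading them takes a second decoding pass. The following sketch shows one way to do that with the Python standard library. The `raw` variable holds a trimmed copy of the sample response, and treating these two fields as JSON is an assumption based on the sample values only.

```python
import json

# Trimmed copy of the sample response; only a few fields are kept.
raw = r'''
{
  "RequestId": "691CA452-D37A-4ED0-9441",
  "DataQualityRule": {
    "Id": 16033,
    "Name": "The table cannot be empty.",
    "Enabled": true,
    "SamplingConfig": {
      "Metric": "Max",
      "MetricParameters": "{ \"Columns\": [ \"id\", \"name\" ] , \"SQL\": \"select count(1) from table;\"}"
    },
    "CheckingConfig": {
      "Type": "Fixed",
      "ReferencedSamplesFilter": "{ \"bizdate\": [ \"-1\", \"-7\", \"-1m\" ] }"
    }
  }
}
'''

# First pass: decode the response body itself.
rule = json.loads(raw)["DataQualityRule"]

# Second pass: these two fields are strings that contain JSON in the sample.
metric_params = json.loads(rule["SamplingConfig"]["MetricParameters"])
sample_filter = json.loads(rule["CheckingConfig"]["ReferencedSamplesFilter"])

print(rule["Id"], rule["Name"], rule["Enabled"])
print(metric_params["Columns"])
print(sample_filter["bizdate"])
```

If a rule stores a non-JSON value in either field, `json.loads` raises `json.JSONDecodeError`, so production code would likely wrap the second pass in error handling.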

Error codes

For a list of error codes, see Service error codes.

Change history

Change time | Summary of changes | Operation
2024-12-19 | The response structure of the API has changed. | View Change Details
2024-12-19 | The internal configuration of the API has changed, but calls are not affected. | View Change Details