Dataphin: View data profiling reports and records

Last Updated: Mar 05, 2025

After a data profiling task completes, you can examine the generated reports, which show the statistical distribution of fields for each field data type. This topic describes how to view data profiling reports and records.

Prerequisites

The Data Quality feature module must be enabled to utilize the data profiling feature.

Permission description

  • Super administrators and operation administrators can access the profiling reports and records of all data tables.

  • The current owner of a data table can access the profiling reports and records of only the tables they manage.

  • Project administrators can access the profiling reports and records of physical tables within their projects.

  • Section administrators can access the profiling reports and records of logical tables within their sections.

  • Ordinary members can view profiling reports and records for which they have viewing permissions.

View data profiling results

  1. On the Dataphin home page, select Administration > Asset Inventory from the top menu bar.

  2. Click the Table tab. Here, you can filter by table type, including physical tables, logical tables, physical views, logical views, and materialized views.

  3. In the table list, click the name of the target table or the icon in the Actions column to go to the Object Details page.

  4. On the Object Details page, click the Data Profile tab to view profiling results that have been successfully run and for which you have viewing permissions.

    • Profiling Record: Shows the profiling records that have run successfully and for which you have viewing permissions, including the profiled partitions, the profiling scope, the number of fields, and the number of records profiled.

    • View Profiling Configuration: Allows you to review the configured profiling task details.

    • View Logs: Enables you to inspect the run logs for the profiling task associated with the selected record.

View data profiling reports

Note

To enhance the security of sensitive data, if a field has a desensitization rule applied, the original value before desensitization is used for statistics, while the desensitized value is shown in the profiling report.

For profiling records that have been successfully executed, you can view the corresponding reports, which present the profiling outcomes for each field under various profiling scenarios.

Field value distribution

Statistics on field value distribution are compiled, and a corresponding graph is created to provide a quick overview of the value distribution, aiding in data development and application within the data pipeline. All data types are supported, with specific statistical indicators for different field data types.

The details of field value histograms and bar charts are as follows:

  • For numeric fields, an approximate histogram is displayed, dividing the record values into 20 intervals. A line chart illustrates the record count and average value for each interval.

  • For text, datetime, or Boolean fields, a bar chart is displayed. Its Other Values category groups the values outside the Top 20 duplicates, together with the count of Null value records.

The statistical indicators for each field data type are as follows:

  • For numeric fields: Statistics include maximum (Max), minimum (Min), average (Avg), Null value count, unique value count, standard deviation, 25% quantile, median, and 75% quantile.

  • For text fields: Statistics include maximum and minimum character length, average character length, Null value count, and unique value count.

  • For datetime fields: Statistics include maximum (Max), minimum (Min), Null value count, and unique value count.

  • For Boolean fields: Statistics include the Null value count.
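To make the numeric indicators concrete, the following Python sketch computes the same kinds of statistics locally for a list of values. It is an illustration only, not how Dataphin computes its reports: the function name profile_numeric is hypothetical, None stands in for Null, the histogram uses 20 equal-width intervals, the quantiles use linear interpolation, and the standard deviation uses the population form. All of these choices are assumptions.

```python
import statistics
from typing import Optional, Sequence


def profile_numeric(values: Sequence[Optional[float]], bins: int = 20) -> dict:
    """Summarize a numeric column; None stands in for Null (hypothetical helper)."""
    present = [v for v in values if v is not None]
    if len(present) < 2:
        raise ValueError("need at least two non-Null values")

    # Quartiles via linear interpolation; the platform's method may differ.
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")

    low, high = min(present), max(present)
    # Equal-width intervals; fall back to width 1.0 when all values are equal.
    width = (high - low) / bins or 1.0
    counts = [0] * bins
    sums = [0.0] * bins
    for v in present:
        # The maximum value would land one past the last interval, so clamp it.
        idx = min(int((v - low) / width), bins - 1)
        counts[idx] += 1
        sums[idx] += v

    # Per interval: start of the interval, record count, and average value.
    histogram = [
        {"start": low + i * width, "count": c, "avg": s / c if c else None}
        for i, (c, s) in enumerate(zip(counts, sums))
    ]
    return {
        "max": high,
        "min": low,
        "avg": statistics.fmean(present),
        "null_count": len(values) - len(present),
        "unique_count": len(set(present)),  # distinct non-Null values (assumption)
        "std_dev": statistics.pstdev(present),  # population form (assumption)
        "q25": q1,
        "median": median,
        "q75": q3,
        "histogram": histogram,
    }


report = profile_numeric([3, 7, None, 7, 12, 0])
print(report["median"], report["null_count"])
```

Comparing such a local summary against the report for a small sample can help you interpret the chart, but the exact interval boundaries and quantile values in Dataphin may differ because the report uses an approximate histogram.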

Null value statistics

This analysis helps identify Null values or other abnormal records in fields, which helps prevent task execution errors and inaccurate downstream data calculations. It is particularly recommended for primary key fields or fields that should not contain Null values. All data types are supported, with additional statistics for zero value records in numeric fields and empty string records in text fields.

The details of the Null value statistics donut chart are as follows:

  • For numeric fields: Statistics include the total number of records profiled, Null value count, Null value rate, zero value count, zero value rate, and other values. The donut chart provides an overview of these indicators.

  • For text fields: Statistics include the total number of records profiled, Null value count, Null value rate, empty string count, empty string rate, and other values. The donut chart provides an overview of these indicators.

  • For datetime and Boolean fields: Statistics include the total number of records profiled, Null value count, Null value rate, and other values. The donut chart provides an overview of these indicators.
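The relationship between these indicators can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Dataphin's implementation: the function name null_statistics is hypothetical, None stands in for Null, each rate is assumed to be the count divided by the total number of records profiled, and "other values" is assumed to be whatever remains after Null, zero, and empty string records are subtracted.

```python
from typing import Sequence, Union

Value = Union[int, float, str, None]


def null_statistics(values: Sequence[Value]) -> dict:
    """Count Null, zero, and empty-string records; None stands in for Null."""
    total = len(values)
    null_count = sum(1 for v in values if v is None)
    # bool is a subclass of int in Python, so exclude it from the zero check.
    zero_count = sum(
        1
        for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool) and v == 0
    )
    empty_count = sum(1 for v in values if isinstance(v, str) and v == "")

    def rate(count: int) -> float:
        # Assumed definition: share of all profiled records.
        return count / total if total else 0.0

    return {
        "total": total,
        "null_count": null_count,
        "null_rate": rate(null_count),
        "zero_count": zero_count,
        "zero_rate": rate(zero_count),
        "empty_string_count": empty_count,
        "empty_string_rate": rate(empty_count),
        "other_count": total - null_count - zero_count - empty_count,
    }


print(null_statistics([0, 5, None, 0, 7]))
```

For a numeric field only the zero indicators are relevant, and for a text field only the empty string indicators are relevant, which matches the per-type breakdown in the list above.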

If Null values or empty strings are present in a field, the following administration suggestions apply:

  • If the field is a primary key (or should not contain Null values) and is numeric or text, and Null values are detected, it is advisable to set up a field Null value check quality monitoring rule to prevent disruption to downstream business processes.

  • If the field is a primary key (or should not contain Null values) and is text, and Null or empty string values are detected, it is advisable to set up a field Null value check or field empty string check quality monitoring rule to prevent disruption to downstream business processes.

Unique value statistics

Statistics include the count of unique value records and the top 5 field values with the highest frequency of duplicates. This profiling scenario is recommended for primary key fields, whose values should be unique, and for fields whose values repeat frequently. Unique value statistics are not supported for Boolean fields.
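As a rough illustration, the following Python sketch derives the same two indicators from a list of values. It is not Dataphin's implementation: the function name unique_value_statistics is hypothetical, None stands in for Null and is excluded, and the unique value count is assumed to mean the number of distinct non-Null values.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def unique_value_statistics(
    values: Sequence[Optional[Hashable]], top_n: int = 5
) -> dict:
    """Count distinct values and list the most frequently duplicated ones."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    # Keep only values that occur more than once, most frequent first.
    duplicates = [(v, c) for v, c in counts.most_common() if c > 1][:top_n]
    return {
        "unique_count": len(counts),  # distinct non-Null values (assumption)
        "top_duplicates": duplicates,
    }


stats = unique_value_statistics(["a", "b", "a", "c", "a", "b", None])
print(stats["unique_count"], stats["top_duplicates"])
```

For a primary key field, an empty top_duplicates list is the expected result; any entry in it points to a value that occurs more than once.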

If duplicate values are found in a field, the following administration suggestion applies:

If the field serves as a primary key and profiling indicates duplicate values, it is advisable to set up a field value uniqueness quality monitoring rule to ensure the smooth operation of downstream business processes.

View data profiling records

  1. Click the View Profiling Records button to access the View Profiling Records panel.

  2. In the View Profiling Records panel, you'll find the names, profiling types, statuses, and execution durations of the profiling records.

  3. You can search for a specific profiling record by name or filter by profiling status and type.

  4. In the operation column of the target profiling record, you can perform the following operations.

    • View Profiling Results: After the profiling task runs successfully, you can view the profiling report.

    • View Profiling Configuration: You can view the configuration information of the profiling task. If it is a manual profiling task, you can click the Initiate Profiling Based On Current Configuration button at the bottom to quickly modify some information and initiate a new profiling.

    • View Run Logs: You can view the run logs of the profiling task corresponding to the selected profiling record.

    • Stop: For manual or automatic profiling tasks that are running or waiting, you can Stop the task.

    • Initiate Profiling Based On Current Configuration: For manual profiling tasks, you can quickly fill in the configuration based on this profiling task and initiate a new profiling. If the task is in progress, it cannot be initiated again.