Dataphin: Data exploration global configuration

Last Updated: Jan 21, 2025

This topic outlines how to globally configure data exploration to manage the scope of automatically explored data tables, oversee exploration records, regulate concurrent tasks, set task durations, and optimize resource usage.

Prerequisites

Before you can use the data exploration feature, the Data Quality feature must be activated within the domain.

Limits

Data exploration is exclusively available for Dataphin data tables.

The feature is not supported when the compute engine is set to AnalyticDB for PostgreSQL, ArgoDB, or StarRocks.

Permission description

Global configuration for data exploration is accessible to super administrators, operation administrators, and custom global roles with the requisite permissions.

Data exploration configuration

  1. On the Dataphin home page, click Administration > Metadata in the top menu bar.

  2. In the left-side navigation pane, click Data Profile to open the Data Exploration Configuration page, and then click Edit at the bottom of the page.

  3. On the Data Exploration Configuration edit page, set the parameters as follows:

    Parameter

    Description

    Automatic exploration configuration: Define the scope of data tables eligible for automatic data exploration.

    Important

    Data exploration consumes compute resources from the project or section where the data table resides. Configure the exploration scope based on actual business needs.

    Physical Table Range

    Choose the range of physical tables and views for automatic exploration by project. Options include all projects, all production projects (Basic and Prod), or specific projects.

    • All Projects: Includes all physical tables and views under every project, both existing and future, for automatic exploration.

    • All Production Projects (Basic and Prod): Includes all physical tables and views under production projects, both existing and future, for automatic exploration.

    • Specified Projects: Allows selection of specific projects for automatic exploration, with support for multiple selections.

    Logical Table Range

    Select the range of logical tables and views for automatic exploration by data section. Options include all sections, all production sections (Basic and Prod), or specific sections.

    • All Sections: Covers all logical tables and views under every section, both existing and future, for automatic exploration.

    • All Production Sections (Basic and Prod): Includes all logical tables and views under production sections, both existing and future, for automatic exploration.

    • Specified Sections: Allows selection of specific sections for automatic exploration, with support for multiple selections.

    System configuration

    Profiling Record

    Choose one of two methods for retaining exploration records:

    • Only Retain The Latest Exploration Record And Report:

      • If the latest run is successful and generates a report, all previous records, both successful and failed, will be deleted.

      • If the latest run fails, only the failed record and the most recent successful report will be kept, while other failed records are deleted. If no successful records exist, only the current failed record is retained.

    • Retain The Latest N Days Of Exploration Records: Keeps all records and reports from the past N days, both successful and failed. The default is 15 days. You can set any integer from 1 to 90.

    Concurrent Rate Limiting

    Set the maximum number of concurrent exploration tasks. Enter an integer from 1 to 5.

    Exploration Timeout

    Limit the maximum duration of exploration tasks to prevent extended resource use. Tasks that exceed the limit are marked as failed. Set a value from 1 to 24 hours, with at most one decimal place.

    Advanced Parameter Configuration

    When this option is enabled, you can define custom SET parameters for global exploration tasks, for example to improve performance or to accommodate a specific compute engine.

    • Click the Reference Example box to view and copy the example statement.

    • Click Typical Scenario Description to learn about common errors that occur during exploration tasks and how to resolve them by adjusting parameters. For more information, see the referenced document.

  4. Click Confirm to complete the global configuration for data exploration.

    Note

    If the scope of automatically explored data tables changes within certain projects or sections, or if the tables are deleted before configuration, automatic exploration is disabled for those tables. Ongoing tasks are not affected.
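
The system configuration rules in step 3 can be easier to follow as code. The following Python sketch models the two record retention methods and the documented value ranges. It is illustrative only: ExplorationRecord, validate_system_config, keep_latest_only, and keep_last_n_days are hypothetical names invented for this example and are not part of Dataphin or any Dataphin API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class ExplorationRecord:
    """Hypothetical record shape used only for this sketch."""
    finished_at: datetime
    succeeded: bool


def validate_system_config(retain_days: int, max_concurrency: int,
                           timeout_hours: float) -> None:
    """Raise ValueError if a value falls outside the documented ranges."""
    if not (isinstance(retain_days, int) and 1 <= retain_days <= 90):
        raise ValueError("retention must be an integer from 1 to 90 days")
    if not (isinstance(max_concurrency, int) and 1 <= max_concurrency <= 5):
        raise ValueError("concurrency must be an integer from 1 to 5")
    # Timeout allows at most one decimal place, for example 1.5 hours.
    if not 1 <= timeout_hours <= 24 or round(timeout_hours, 1) != timeout_hours:
        raise ValueError("timeout must be 1 to 24 hours, one decimal place")


def keep_latest_only(records: List[ExplorationRecord]) -> List[ExplorationRecord]:
    """Model of 'Only Retain The Latest Exploration Record And Report'."""
    if not records:
        return []
    ordered = sorted(records, key=lambda r: r.finished_at)
    latest = ordered[-1]
    if latest.succeeded:
        # A successful latest run replaces every earlier record.
        return [latest]
    # Latest run failed: keep it plus the most recent success, if one exists.
    successes = [r for r in ordered if r.succeeded]
    return ([successes[-1]] if successes else []) + [latest]


def keep_last_n_days(records: List[ExplorationRecord], now: datetime,
                     days: int = 15) -> List[ExplorationRecord]:
    """Model of 'Retain The Latest N Days Of Exploration Records'."""
    cutoff = now - timedelta(days=days)
    return [r for r in records if r.finished_at >= cutoff]
```

For example, given one successful record followed by two failed ones, keep_latest_only returns the successful record and the most recent failed record, which matches the second rule under the first retention method.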

What to do next

After you complete the data exploration configuration, you can start automatic exploration of Dataphin data tables. For more information, see the referenced document.