Dataphin: Metadata collection overview

Last Updated: Nov 18, 2025

Metadata collection supports various source types, including traditional databases such as MySQL and Oracle, big data storage systems such as Hive and Hologres, and application systems. On the collection overview page, you can view the number of collection tasks created and the supported collection object types for each data source type or application system.

Prerequisites

Before you can use an application system as a collection source, create the application system in Management Hub > Datasource Management > Application System.

Limits

By default, metadata collection for relational databases is supported. To collect metadata from other data source types, you need to purchase the corresponding features.

In versions earlier than 5.3, collecting metadata from some data sources, such as AnalyticDB for MySQL 3.0, PolarDB-X (formerly DRDS), SAP HANA, and Hologres, required you to initialize the Metadata Center in the metadata warehouse tenant. In version 5.3 and later, this initialization is not required, and you can configure collection tasks directly.
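The version rule above can be sketched as a small check. This is only an illustration: `needs_metadata_center_init` and `SOURCES_NEEDING_INIT_BEFORE_5_3` are hypothetical names, not part of any Dataphin API, and the set holds only the sources named as examples above, so the actual list may be longer.

```python
# Hypothetical helper, not a Dataphin API. It only encodes the rule stated above.

# Illustrative subset: the text introduces these with "such as",
# so the real list of affected data sources may be longer.
SOURCES_NEEDING_INIT_BEFORE_5_3 = {
    "AnalyticDB for MySQL 3.0",
    "PolarDB-X",
    "SAP HANA",
    "Hologres",
}


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a string such as '5.2.1' into (5, 2, 1) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def needs_metadata_center_init(dataphin_version: str, source_type: str) -> bool:
    """Return True if the Metadata Center must be initialized before collection."""
    # In version 5.3 and later, initialization is never required.
    if parse_version(dataphin_version) >= (5, 3):
        return False
    # In earlier versions, only some data sources required it.
    return source_type in SOURCES_NEEDING_INIT_BEFORE_5_3
```

Comparing versions as integer tuples instead of strings keeps a version such as 5.10 ordered after 5.3.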

Metadata collection workflow description

If the network of the data source is not connected to the network of the Dataphin cluster, you need to use the registered scheduling cluster feature. In this case, the collected data is first written to the object storage system (such as OSS) that the Dataphin deployment depends on, which serves as transit storage, and is then written to Dataphin. This incurs additional storage costs.
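The routing described above can be pictured with a minimal sketch. The function names and hop labels are hypothetical and do not correspond to Dataphin components or APIs. The direct path for a connected network is an assumption, because the text only describes the case without connectivity.

```python
def collection_path(network_connected: bool) -> list[str]:
    """Return the hops that collected metadata passes through (illustrative labels)."""
    if network_connected:
        # Assumption: with direct connectivity, no transit storage is involved.
        return ["data source", "Dataphin"]
    # Without connectivity, collection relies on a registered scheduling cluster,
    # and the data is staged in object storage before it reaches Dataphin.
    return [
        "data source",
        "registered scheduling cluster",
        "object storage (such as OSS)",
        "Dataphin",
    ]


def incurs_transit_storage_cost(network_connected: bool) -> bool:
    """Transit storage cost applies only when object storage is on the path."""
    path = collection_path(network_connected)
    return any(hop.startswith("object storage") for hop in path)
```

For example, `incurs_transit_storage_cost(False)` is true because the staged data occupies object storage, while the assumed direct path involves no staging.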

Procedure

  1. In the top menu bar of the Dataphin homepage, select Administration > Metadata.

  2. In the navigation pane on the left, select Metadata Collection > Collection Overview.

  3. On the Welcome To Metadata Collection And Management page, Dataphin displays a card for each data source type or application system. Each card shows information such as the number of configured collection tasks and the supported collection object types.

    • Data Source: Supports various data source types, such as relational databases and big data storage systems. For more information, see Data sources supported by Dataphin.

      The supported versions of MySQL, Oracle, and Hive (MySQL metadatabase, HMS metadata) are as follows:

      • MySQL: MySQL 5.1.43, MySQL 5.6/5.7, MySQL 8, and RDS MySQL.

      • Oracle: Oracle 11g, Oracle 12c, Oracle 18c, Oracle 19c, Oracle 21c, and Oracle 23c.

      • Hive (MySQL metadatabase, HMS metadata): CDH 5.x Hive 1.1.0, EMR 3.x Hive 2.3.5, EMR 5.x Hive 3.1.x, CDH 6.x Hive 2.1.1, FusionInsight 8.x Hive 3.1.0, CDP 7.x Hive 3.1.3, and AsiaInfo DP 5.x Hive 3.1.0.

    • Application System: Supports Quick BI.

  4. You can quickly create collection tasks for the target data source or application system.

    Create Collection Task: Hover over a card to quickly create a collection task. For more information, see Create and manage metadata collection tasks.

    Note

    Each data source can have only one collection task. However, if a data source has separate development and production environment sources, you can configure a separate collection task for each environment.
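The constraint in the note above amounts to a uniqueness rule keyed by data source and environment. The following is a minimal sketch of that rule only. `CollectionTaskRegistry` and its method are hypothetical and do not reflect how Dataphin stores or validates tasks.

```python
class CollectionTaskRegistry:
    """Toy registry that allows one collection task per (data source, environment)."""

    def __init__(self) -> None:
        # Maps (data_source, environment) to the task name configured for it.
        self._tasks: dict[tuple[str, str], str] = {}

    def create(self, data_source: str, environment: str, task_name: str) -> None:
        key = (data_source, environment)
        if key in self._tasks:
            # A second task for the same source and environment is rejected.
            raise ValueError(
                f"{data_source} ({environment}) already has task {self._tasks[key]!r}"
            )
        self._tasks[key] = task_name


registry = CollectionTaskRegistry()
# The development and production sources of one data source get separate tasks.
registry.create("orders_mysql", "dev", "collect_orders_dev")
registry.create("orders_mysql", "prod", "collect_orders_prod")
```

A further call such as `registry.create("orders_mysql", "prod", "another_task")` raises `ValueError`, because that environment source already has a task.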