DataWorks: Concepts related to metadata entities

Last Updated: Oct 27, 2025

DataWorks API operations (2024-05-18) support queries of various metadata entities. This topic describes the concepts related to the metadata entities.

Metadata entity objects

Data Map collects and manages metadata entity objects of different types and levels (subtypes) by using metadata crawlers. For more information about the supported crawler types, see Supported crawler types.

Data Map supports the following metadata entity levels based on the metadata level structure:

  • Catalog

  • Database

  • Schema

  • Table

  • Column

Entity levels vary based on the crawler types.

Supported crawler types

Identifier

Display name

Supported metadata entity levels

Remarks

Catalog

Database

Schema

Table

Column

maxcompute

MaxCompute

  • A default crawler is provided to identify all metadata entities within your Alibaba Cloud account.

  • In MaxCompute, objects at the database level are projects. You cannot call API operations to query projects.

  • Whether the schema level is available depends on whether the three-layer model is enabled for your MaxCompute project.

dlf

Data Lake Formation

A default crawler is provided to identify all metadata entities within your Alibaba Cloud account.

hms

HMS

  • This type of crawler uses Hive Metastore Service (HMS) to manage metadata.

  • This type of crawler can be used to collect metadata from E-MapReduce (EMR) and CDH_HIVE clusters.

holo

Hologres

-

mysql

MySQL

-

oracle

Oracle

-

postgresql

PostgreSQL

-

sqlserver

SQL Server

-

analyticdb_for_mysql

AnalyticDB MySQL

This type of crawler can be used to collect metadata from analyticdb_for_mysql and analyticdb_for_spark data sources.

ads

AnalyticDB MySQL 2.0

-

hybriddb_for_postgresql

AnalyticDB PostgreSQL

-

ots

OTS

-

clickhouse

ClickHouse

-

starrocks

StarRocks

Catalogs are supported. This type of crawler can be used to query metadata entities only in internal catalogs.

lindorm_for_engine

Lindorm

-

Entity type (EntityType)

EntityType is the identifier of a metadata entity type. The value of EntityType is in the ${CrawlerType}-${SubType} format.

  • CrawlerType is the identifier of a crawler type. For example, the value of CrawlerType can be mysql, maxcompute, dlf, or holo.

  • SubType is the identifier of a metadata entity subtype. For example, the value of SubType can be catalog, database, schema, table, or column.

For example, the EntityType of a MaxCompute table is maxcompute-table.
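As a minimal sketch of the format above, the following Python helpers compose and split an EntityType string. The function names are hypothetical and are not part of any DataWorks SDK. The split assumes that subtype identifiers never contain a hyphen, which holds for the subtypes listed in this topic.

```python
from typing import Tuple


def make_entity_type(crawler_type: str, sub_type: str) -> str:
    """Compose an EntityType in the ${CrawlerType}-${SubType} format."""
    return f"{crawler_type}-{sub_type}"


def split_entity_type(entity_type: str) -> Tuple[str, str]:
    """Split an EntityType into (crawler_type, sub_type)."""
    # Split at the last hyphen. Assumption: the subtype has no hyphen,
    # while crawler types such as analyticdb_for_mysql use underscores.
    crawler_type, sep, sub_type = entity_type.rpartition("-")
    if not sep or not crawler_type or not sub_type:
        raise ValueError(f"not a valid EntityType: {entity_type!r}")
    return crawler_type, sub_type


print(make_entity_type("maxcompute", "table"))
print(split_entity_type("analyticdb_for_mysql-column"))
```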

Metadata entity ID (MetaEntityId)

MetaEntityId is the identifier of a metadata entity object. The identifier is designed to be readable, unique, and extensible.

IDs are defined for crawler metadata instances and for entity objects at the catalog, database, schema, table, and column levels.

A metadata entity ID uniquely identifies an entity. The identifiers at each level are separated by colons (:). Empty strings are used as placeholders for unsupported levels.

Crawler metadata instances

Crawler metadata entity ID: the unique identifier of a crawler metadata instance.

  • For MaxCompute and DLF crawler types, a default crawler is provided for all metadata entities within the tenant or Alibaba Cloud account. The crawler metadata entity ID is in the ${CrawlerType} format.

  • For other types of crawlers that you must manually create, the crawler metadata entity ID is in the ${CrawlerType}:${MetaSourceId} format.

    • CrawlerType: the identifier of a crawler type. For example, the value of CrawlerType can be holo or mysql.

    • MetaSourceId: the identifier of a metadata source.

      • Instance mode: corresponds to an instance ID or a cluster ID.

      • URL mode: corresponds to the URL-encoded URL (JDBC URL or endpoint).

Examples:

  • For the MaxCompute type, the crawler metadata entity ID is maxcompute.

  • For the Hologres type in instance mode, if the instance ID is i-z6j3kxxx7, the crawler metadata entity ID is holo:i-z6j3kxxx7.

  • For the MySQL type in URL mode, if the URL is jdbc:mysql://47.0.X.X:3306/test_db, the crawler metadata entity ID is mysql:jdbc%3Amysql%3A%2F%2F47.0.X.X%3A3306%2Ftest_db.
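The rules above can be sketched in Python as follows. This is an illustration only: crawler_entity_id is a hypothetical helper, not a DataWorks API, and it assumes that the URL encoding used by the service matches urllib.parse.quote with no characters marked as safe, which reproduces the MySQL example above.

```python
from urllib.parse import quote

# Crawler types for which the text describes a default crawler
# (their crawler metadata entity ID has no MetaSourceId part).
DEFAULT_CRAWLER_TYPES = {"maxcompute", "dlf"}


def crawler_entity_id(crawler_type: str, meta_source: str = "") -> str:
    """Build a crawler metadata entity ID from a crawler type and a
    metadata source (instance ID, cluster ID, JDBC URL, or endpoint)."""
    if crawler_type in DEFAULT_CRAWLER_TYPES:
        return crawler_type
    if not meta_source:
        raise ValueError(f"{crawler_type} requires a metadata source")
    # safe="" also percent-encodes ":" and "/". Instance and cluster IDs
    # made of letters, digits, and hyphens pass through unchanged.
    return f"{crawler_type}:{quote(meta_source, safe='')}"


print(crawler_entity_id("maxcompute"))
print(crawler_entity_id("holo", "i-z6j3kxxx7"))
print(crawler_entity_id("mysql", "jdbc:mysql://47.0.X.X:3306/test_db"))
```

Encoding the whole URL keeps the metadata source free of colons, so it can later be embedded as a single colon-separated field.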

Data table related metadata entities

The metadata entity ID format is ${EntityType}:${MetaSourceId}:${Catalog}:${Database}:${Schema}:${Table}:${Column}. It includes the following elements:

| Level | Property | Description |
| --- | --- | --- |
| - | EntityType | The identifier of the entity type. |
| - | MetaSourceId | In instance mode, the instance ID or cluster ID. In URL mode, the URL-encoded URL (JDBC URL or endpoint). For the MaxCompute and DLF types, an empty string is used as a placeholder. |
| Catalog | Catalog | The catalog identifier. For the StarRocks type, this is the catalog name. For the DLF type, this is the catalog ID. For other types, an empty string is used as a placeholder. |
| Database | Database | The database name. |
| Schema | Schema | The schema name. For types that do not support schemas, an empty string is used as a placeholder. For the MaxCompute type, the schema name is required when the schema model is enabled; otherwise, an empty string is used as a placeholder. |
| Table | Table | The data table name. |
| Column | Column | The field name. |
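The following Python sketch assembles an ID from these elements. It is an illustration under stated assumptions: build_meta_entity_id and LEVEL_DEPTH are hypothetical names, meta_source_id is expected to be URL-encoded already, and the ID is assumed to stop at the level of the entity itself, as the example IDs in this topic do (a database ID has no schema, table, or column fields).

```python
# Number of level fields (catalog .. column) written for each subtype.
# "project" is the subtype that MaxCompute uses at the database level.
LEVEL_DEPTH = {
    "catalog": 1, "database": 2, "project": 2,
    "schema": 3, "table": 4, "column": 5,
}


def build_meta_entity_id(crawler_type: str, sub_type: str,
                         meta_source_id: str = "", catalog: str = "",
                         database: str = "", schema: str = "",
                         table: str = "", column: str = "") -> str:
    """Join EntityType, MetaSourceId, and the level names with colons.

    Unsupported levels are left as empty strings, which become the
    empty placeholders between consecutive colons.
    """
    depth = LEVEL_DEPTH[sub_type]
    levels = [catalog, database, schema, table, column][:depth]
    return ":".join([f"{crawler_type}-{sub_type}", meta_source_id, *levels])


# MaxCompute table in a project without the schema model enabled.
print(build_meta_entity_id("maxcompute", "table",
                           database="project_name", table="table_name"))
# DLF catalog.
print(build_meta_entity_id("dlf", "catalog", catalog="catalog_id"))
```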

Metadata entity examples

The following examples show metadata entity IDs at each level for several crawler types: MaxCompute, DLF, HMS, Hologres, and MySQL.

Note

In the following example IDs, the identifiers at each level are separated by colons (:). Empty strings are used as placeholders for unsupported levels.

MaxCompute

Note
  • Only MaxCompute projects with the schema model enabled support the schema level, and require the schema name to be provided in the corresponding position in the data table and field IDs.

  • MaxCompute projects without the schema model enabled do not support the schema level, and an empty string is used as a placeholder in the corresponding position in the data table and field IDs.

For a project project_name (with the schema model enabled), schema schema_name, table table_name, and field column_name, the entity IDs at each level are as follows:

| Level | ID |
| --- | --- |
| Crawler metadata instance | maxcompute |
| Project | maxcompute-project:::project_name |
| Schema | maxcompute-schema:::project_name:schema_name |
| Data table | maxcompute-table:::project_name:schema_name:table_name |
| Column | maxcompute-column:::project_name:schema_name:table_name:column_name |

For a project project_name (without the schema model enabled), table table_name, and field column_name, the entity IDs at each level are as follows:

| Level | ID |
| --- | --- |
| Crawler metadata instance | maxcompute |
| Project | maxcompute-project:::project_name |
| Data table | maxcompute-table:::project_name::table_name |
| Column | maxcompute-column:::project_name::table_name:column_name |

DLF

For a catalog catalog_id, database database_name, table table_name, and field column_name, the entity IDs at each level are as follows:

| Level | ID |
| --- | --- |
| Crawler metadata instance | dlf |
| Catalog | dlf-catalog::catalog_id |
| Database | dlf-database::catalog_id:database_name |
| Data table | dlf-table::catalog_id:database_name::table_name |
| Column | dlf-column::catalog_id:database_name::table_name:column_name |

HMS

For an EMR cluster instance c-a1b2c3xxx, database test_db, table test_tbl, and field test_col, the entity IDs at each level are as follows:

| Level | ID |
| --- | --- |
| Crawler metadata instance | hms:c-a1b2c3xxx |
| Database | hms-database:c-a1b2c3xxx::test_db |
| Data table | hms-table:c-a1b2c3xxx::test_db::test_tbl |
| Column | hms-column:c-a1b2c3xxx::test_db::test_tbl:test_col |

Hologres

In this example, the Hologres instance hgpostcn-cn-a1b2c3xxx, database test_db, schema test_schema, data table test_tbl, and column test_col are used. The following table describes the entity IDs at each level.

| Level | ID |
| --- | --- |
| Crawler metadata instance | holo:hgpostcn-cn-a1b2c3xxx |
| Database | holo-database:hgpostcn-cn-a1b2c3xxx::test_db |
| Schema | holo-schema:hgpostcn-cn-a1b2c3xxx::test_db:test_schema |
| Data table | holo-table:hgpostcn-cn-a1b2c3xxx::test_db:test_schema:test_tbl |
| Column | holo-column:hgpostcn-cn-a1b2c3xxx::test_db:test_schema:test_tbl:test_col |

MySQL

For a MySQL data source connection string jdbc:mysql://47.0.X.X:3306/test_db, database test_db, table test_tbl, and field test_col, the entity IDs at each level are as follows (MetaSourceId is generated by URL-encoding the JDBC connection string):

| Level | ID |
| --- | --- |
| Crawler metadata instance | mysql:jdbc%3Amysql%3A%2F%2F47.0.X.X%3A3306%2Ftest_db |
| Database | mysql-database:jdbc%3Amysql%3A%2F%2F47.0.X.X%3A3306%2Ftest_db::test_db |
| Data table | mysql-table:jdbc%3Amysql%3A%2F%2F47.0.X.X%3A3306%2Ftest_db::test_db::test_tbl |
| Column | mysql-column:jdbc%3Amysql%3A%2F%2F47.0.X.X%3A3306%2Ftest_db::test_db::test_tbl:test_col |
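Going the other way, an entity ID can be split back into its elements. The sketch below is a hypothetical helper, not a DataWorks API. It assumes that catalog, database, schema, table, and column names contain no colon, and that the MetaSourceId was encoded so that it contains no colon either, as in the MySQL example.

```python
from urllib.parse import unquote

# Field order after EntityType in a metadata entity ID.
FIELDS = ("meta_source_id", "catalog", "database", "schema", "table", "column")


def parse_meta_entity_id(entity_id: str) -> dict:
    """Split a data-table-related metadata entity ID into named parts."""
    entity_type, *parts = entity_id.split(":")
    if "-" not in entity_type:
        # IDs such as "maxcompute" or "hms:c-a1b2c3xxx" identify a crawler
        # metadata instance and carry no subtype.
        raise ValueError("crawler metadata instance IDs are not handled here")
    if len(parts) > len(FIELDS):
        raise ValueError(f"too many fields in {entity_id!r}")
    crawler_type, _, sub_type = entity_type.rpartition("-")
    parsed = {"crawler_type": crawler_type, "sub_type": sub_type}
    # Levels below the entity's own level are absent; default them to "".
    parsed.update({name: "" for name in FIELDS})
    parsed.update(zip(FIELDS, parts))
    # Decode the URL-encoded metadata source for readability.
    parsed["meta_source_id"] = unquote(parsed["meta_source_id"])
    return parsed


info = parse_meta_entity_id(
    "mysql-table:jdbc%3Amysql%3A%2F%2F47.0.X.X%3A3306%2Ftest_db::test_db::test_tbl"
)
print(info["meta_source_id"], info["database"], info["table"])
```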