This topic describes when new versions are generated during operations on metadata from different data sources and what information is obtained from different types of metadata objects.
Metadata versions
General description
Modifying the following information does not generate a new metadata version:
Partition definition changes: Changes to the number of partitions, partition field types, and partition filter expressions.
Property definition changes: Changes to custom properties.
Collection policy configuration: When the collection task is configured to retain deleted metadata, and two consecutive collections only have field deletion changes, a new metadata version is not generated.
Modifying the following information generates a new metadata version:
Table information changes: Including changes to table descriptions and metadata types.
Table structure changes: Including modifications to partition fields, changes from non-partitioned tables to partitioned tables (whether it is a partitioned table), and view DDL changes.
Field structure changes: Including adding/updating/deleting fields, changing field order, field types, and field key types.
Field detail changes: Including changes to field default values, nullable status, and field descriptions.
Additionally, changes to specific properties of some data sources also affect whether a new version is generated, as shown in the table below.
Data source type | Change method |
Hologres | Changes to the partition table type, such as converting a parent table to a child table or vice versa, the name of the parent table, or the conditions of a child table. |
StarRocks | Changes to the table type, materialized view refresh type, invalid status, sort fields, bucketing method, or bucketing fields. However, a change to the number of buckets does not generate a new version. |
OceanBase, Greenplum | Partition field type changes. |
Amazon Redshift | Changes to the table storage class, distribution key fields, sorting method, or sort fields. |
SelectDB, Doris | Changes to the table storage class, bucketing method, bucketing fields, sort fields, or distribution fields. |
Dameng (DM) | Table storage class changes. |
DolphinDB | Changes to sort fields. |
openGauss | Changes to the table storage class or partition field type. |
Metadata properties
The following information is displayed for all data sources:
Table technical properties: Including description, number of table rows, storage class, and storage volume information.
Field list: Including the name (primary key fields are identified with a key symbol), data type, description, nullable status, and default value information of fields in the specified version.
Partition information: Includes partition fields and partition details. For partition fields, the field name, data type, and description are provided. For partition details, the partition name, record count, storage size, and creation time are provided.
When the data source type is a relational database and the metadata type is a view, only description information can be viewed in technical properties.
Amazon Redshift collects data using system tables. Super administrators can access all row data (all metadata) by default. Regular users can only view their own data. If you do not want to configure a super administrator account on the data source but still want to collect all data, you can grant usage permissions on pg_catalog to a specific account. The authorization statement is:
GRANT USAGE ON SCHEMA ${schema, such as pg_catalog} TO ${username}.Users configured in Dameng (DM) and Oracle data sources need to have select permissions on the relevant tables to collect metadata from those tables. To grant permissions to users, use the statement:
grant SOI to your_username;.DolphinDB collects relevant metadata by calling functions. The scope of the collected data depends on the role of the user account in DolphinDB, as specified in the data source configuration:
Cluster administrator: Can collect tables from all databases.
DB_OWNER, and accounts with DB_MANAGE or DB_READ permissions for specific databases: Can collect all tables from the corresponding databases.
Other accounts: Can collect tables for which they have TABLE_READ permission.
SelectDB and Doris data sources can collect asynchronous materialized view metadata types, which the system will detect as tables. Synchronous materialized view metadata types cannot be collected at this time.
The table below shows the differences in information displayed for specific data sources:
Data source type | Table technical properties | Field list | Partition information |
MySQL, Amazon RDS for MySQL, TDSQL for MySQL | Cannot collect storage class. | The default value of view fields is displayed as the default value of source table fields. |
|
Microsoft SQL Server, Amazon RDS for SQL Server | Cannot collect storage class. | / |
|
AnalyticDB for MySQL 3.0 | / | The default value of view fields is displayed as the default value of source table fields. |
|
OceanBase | Collects information about whether the table is partitioned, the creation time, the last update time (MySQL tenants only), and the last DDL update time (Oracle tenants only). The collected information varies by table type:
| Regular fields display unique key, foreign key, and CHECK constraint information. Unique keys are identified with UNI (only displayed for Oracle tenant). Foreign keys are identified with FOR. CHECK constraints are identified with CHK (only displayed for Oracle tenant). |
Note Only OceanBase Oracle tenant mode supports displaying partition information. |
PolarDB-X (formerly DRDS) | / | The default value of view fields is displayed as the default value of source table fields. |
|
IBM DB2, Amazon RDS for DB2 | / | Cannot collect default value information |
|
ClickHouse | Collects information about whether the table is partitioned, the table engine, and the last DDL update time. The collected information varies by table type:
| Displays sampling key and sort key information. Sampling keys are identified with SMP. Sort keys are identified with SOR. |
|
Hive (MySQL metadatabase, HMS metadata) | Collects information about whether the table is partitioned, whether it is a lake table, the lake table format, the table storage mode, the storage format, the creation time, and the last DDL update time. The collected information varies by table type:
| Cannot collect nullable status and default value information. |
|
Hologres | Collects information about whether the table is partitioned, the parent table name (partition child tables only), partition conditions (partition child tables only), the number of child tables (partition parent tables only), the storage format, the creation time, the last update time, the last DDL update time, and the lifecycle. The collected information varies by table type:
| / |
|
StarRocks | Collects information about the table type (internal tables only), materialized view type (materialized views only), invalid status (materialized views only), whether the table is partitioned, whether it is a lake table, the lake table format, the storage format, the creation time, and the last update time. The collected information varies by table type:
| Displays bucketing key and sort key information. |
|
Greenplum | Collects information about whether the table is partitioned and the Location (foreign tables only). The collected information varies by table type:
| Regular fields display unique key, foreign key, and CHECK constraint information. Unique keys are identified with UNI (only displayed for Oracle tenant). Foreign keys are identified with FOR. CHECK constraints are identified with CHK (only displayed for Oracle tenant). |
|
Amazon Redshift | Collects information about whether the table is partitioned and the Location (foreign tables only). The number of rows and storage size cannot be collected. The collected information varies by table type:
| Displays bucketing key and sort key information. If all returned sort key values are positive, it is a regular sort. If both positive and negative values are returned, it is an interlaced sort. | Only foreign tables display partition information.
|
SelectDB, Doris | Collects information about whether the table is partitioned, whether it is a lake table (only if the storage format is Hudi or Iceberg), the lake table format (lake tables only), the storage format, the table type, the creation time, and the last data update time. The differences in information displayed for different table types are as follows:
|
|
|
Dameng (DM) | Collects information about whether the table is partitioned, the storage class, the storage format, the Location (foreign tables only), the creation time, and the last DDL update time. The differences in information displayed for different table types are as follows:
| Regular fields display unique key, reference key, and CHECK constraint information. Unique keys are identified with UNI. Reference keys are identified with REF. CHECK constraints are identified with CHK. |
|
Elasticsearch | Only table objects (indices) support collecting document count and storage volume information. | Only table objects (indices) support collecting field name and data type information. |
|
DolphinDB | Collects information about whether the table is partitioned, the storage engine, the storage format, and the creator. The storage class cannot be collected. |
|
|
openGauss | Collects information about whether the table is partitioned, the storage format, the Location (foreign tables only), and the view DDL (views and materialized views only). The collected information varies by table type:
| Regular fields display primary key, CHECK constraint, unique key, and trigger information. CHECK constraints are identified by CHK. Unique keys are identified by UNI. |
|