The data opening feature of DataWorks provides tables and views that expose metadata in several dimensions. This topic lists these tables and views and describes the structure of each.
- Metadata
  - RPT metrics
  - Metrics that are related to metadata details
    - Metrics in the raw_v_meta_database table
    - Metrics in the raw_v_meta_table table
    - Metrics in the raw_v_meta_view table
    - Metrics in the raw_v_meta_column table
    - Metrics in the raw_v_meta_partition table
    - Metrics in the raw_v_meta_table_lineage table
    - Metrics in the raw_v_meta_table_output table
    - Metrics in the raw_v_meta_table_usage table
    - Metrics in the raw_v_meta_column_usage table
    - Metrics in the raw_v_meta_biz_table_wiki table
    - Metrics in the raw_v_meta_table_join_map table
    - Metrics in the raw_v_meta_table_detail_log table
    - Metrics in the raw_v_meta_category table
- Scheduling metadata
- Tenant metadata
Core metrics in the rpt_v_meta_ind_table_core table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
project_id | bigint | The ID of the DataWorks workspace. |
catalog_name | string | The catalog to which the table belongs. This metric is set to odps for MaxCompute projects. |
database_name | string | The name of the database or MaxCompute project. |
table_name | string | The name of the table. |
table_uuid | string | The unique ID of the table. |
owner_yun_acct | string | The Alibaba Cloud account of the table owner. |
dim_life_cycle | bigint | The time to live (TTL). Unit: days. |
is_partition_table | boolean | Specifies whether the table is a partitioned table. |
entity_type | bigint | The entity type. |
categories | string | The detailed information about the categories. |
last_access_time | bigint | The last time when the table was accessed. The metric value is a 10-digit UNIX timestamp. |
size | bigint | The size of the table, which indicates the logical storage space that is occupied by data in the table. Unit: bytes. This metric is set to NULL for a view. |
column_count | bigint | The number of fields in the table. Partition key columns are included. |
partition_count | bigint | The number of partitions in the table. This metric is set to NULL for a non-partitioned table. |
detail_view_count | bigint | The number of times that table details are viewed on the page. |
favorite_count | bigint | The number of times that the table is added to favorites. |
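
The tables in this topic store time values in two formats: last_access_time is a 10-digit UNIX timestamp (seconds), while the columns whose names end in _ts are 13-digit timestamps (milliseconds). The following is a minimal sketch of a converter for both formats. The helper name `to_utc` and its `unit` parameter are hypothetical and are not part of DataWorks.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_utc(ts, unit="ms"):
    """Convert a UNIX timestamp to a timezone-aware UTC datetime.

    unit="s"  -> 10-digit timestamps such as last_access_time
    unit="ms" -> 13-digit timestamps such as created_time_ts
    """
    if ts is None:
        return None
    if unit == "s":
        return EPOCH + timedelta(seconds=ts)
    return EPOCH + timedelta(milliseconds=ts)

# Hypothetical sample values
last_access = to_utc(1700000000, unit="s")
created = to_utc(1700000000123)
```

Integer arithmetic on `timedelta` avoids the rounding that can occur when a millisecond value is divided by 1000 as a float.
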
Additional metrics in the rpt_v_meta_ind_table_extra table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
table_uuid | string | The unique ID of the table. |
read_count | bigint | The number of times that data is read by using SQL statements. The data includes that of non-scheduled nodes. |
read_count_30d | bigint | The number of times that data is read within 30 days by using SQL statements. The data includes that of non-scheduled nodes. |
write_count | bigint | The number of times that data is written by using SQL statements. The data includes that of non-scheduled nodes. |
join_count | bigint | The number of times that the table is joined. |
direct_upstream_count | bigint | The number of parent tables in the lineage. |
direct_downstream_count | bigint | The number of child tables in the lineage. |
output_task_count | bigint | The number of nodes that generate the data in the table. |
Metrics in the raw_v_meta_database table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
project_id | bigint | The ID of the DataWorks workspace. |
env_type | bigint | The environment type. |
catalog_name | string | The catalog to which the table belongs. This metric is set to odps for MaxCompute projects. |
database_name | string | The name of the database or MaxCompute project. |
database_comment | string | The description of the database or MaxCompute project. |
owner_name | string | The name of the owner. |
created_time_ts | bigint | The creation time. The metric value is a 13-digit timestamp. |
last_modified_time_ts | bigint | The last modification time. The metric value is a 13-digit timestamp. |
location | string | The storage path of the table in the database. |
extras | string | The additional information about the database, which is a JSON string. If the table preview and table visibility range attributes are configured for a MaxCompute project, you can use the allowDataPreview and projectVisibility keys to obtain the values of the attributes. |
biz_date | string | The data timestamp. |
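
Because extras is a JSON string, the allowDataPreview and projectVisibility attributes can be read after the string is parsed. The following sketch shows one way to do this. The function name and the sample value are hypothetical, and the value types of the two keys are assumptions because they are not described in this topic.

```python
import json

def database_flags(extras):
    """Return allowDataPreview and projectVisibility from an extras string.

    Keys that are missing (attribute not configured) map to None.
    """
    data = json.loads(extras) if extras else {}
    return {key: data.get(key) for key in ("allowDataPreview", "projectVisibility")}

# Hypothetical extras value; the real value types may differ.
sample_extras = '{"allowDataPreview": true, "projectVisibility": 1}'
flags = database_flags(sample_extras)
```
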
Metrics in the raw_v_meta_table table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
project_id | string | The ID of the DataWorks workspace. |
table_uuid | string | The unique ID of the table. |
table_name | string | The name of the table. |
table_type | string | The type of the table. |
catalog_name | string | The catalog to which the table belongs. This metric is set to odps for MaxCompute projects. |
database_name | string | The name of the database or MaxCompute project. |
partition_keys | string | The partition keys in the table. Multi-level partitions are separated by commas (,). This metric is set to an empty string for a non-partitioned table. |
table_comment | string | The description of the table. |
table_biz_comment | string | The business description of the table. |
visibility_scope | bigint | The visibility range of the table. |
owner_name | string | The name of the owner. |
created_time_ts | bigint | The creation time. The metric value is a 13-digit timestamp. |
last_modified_time_ts | bigint | The last time when data was modified. The metric value is a 13-digit timestamp. |
last_meta_modified_time_ts | bigint | The last time when table metadata was modified. The metric value is a 13-digit timestamp. |
location | string | The storage path of the table. |
life_cycle | bigint | The TTL of the table. Unit: days. |
data_size | bigint | The logical storage volume of the table. Unit: bytes. This metric is set to NULL for a partitioned table. To obtain the storage volume of a partitioned table, sum the sizes of its partitions. |
biz_date | string | The data timestamp. |
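
The partition_keys value is a comma-separated string, and it is empty for a non-partitioned table. The following sketch splits it into a list. The function name is hypothetical, and the assumption that the order of the keys matches the partition levels is not confirmed by this topic.

```python
def partition_levels(partition_keys):
    """Split a partition_keys value into a list of partition key names.

    Returns an empty list for a non-partitioned table (empty string or NULL).
    """
    if not partition_keys:
        return []
    return [key.strip() for key in partition_keys.split(",") if key.strip()]

# Hypothetical sample value
levels = partition_levels("ds,hh")
```
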
Metrics in the raw_v_meta_view table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
project_id | string | The ID of the DataWorks workspace. |
table_uuid | string | The unique ID of the table. |
table_name | string | The name of the table. |
catalog_name | string | The catalog to which the table belongs. This metric is set to odps for MaxCompute projects. |
database_name | string | The name of the database or MaxCompute project. |
table_comment | string | The description of the table. |
table_biz_comment | string | The business description of the table. |
visibility_scope | bigint | The visibility range of the table. |
owner_name | string | The name of the owner. |
created_time_ts | bigint | The creation time. The metric value is a 13-digit timestamp. |
last_ddl_time_ts | bigint | The last time when the view was modified by using data definition language (DDL) statements. The metric value is a 13-digit timestamp. |
view_text | string | The SQL statement that is used to create a view. |
biz_date | string | The data timestamp. |
Metrics in the raw_v_meta_column table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
project_id | bigint | The ID of the DataWorks workspace. |
catalog_name | string | The catalog to which the table belongs. This metric is set to odps for MaxCompute projects. |
database_name | string | The name of the database or MaxCompute project. |
table_name | string | The name of the table. |
column_name | string | The name of the field. |
column_comment | string | The description of the field. |
column_biz_comment | string | The business description of the field. |
column_type | string | The data type of the field. |
column_sequence | bigint | The sequence number of the field, which starts from 1. |
is_partition_key | boolean | Specifies whether the field is a partition key. |
is_primary_key | boolean | Specifies whether the field is a primary key. |
biz_date | string | The data timestamp. |
Metrics in the raw_v_meta_partition table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
project_id | bigint | The ID of the DataWorks workspace. |
catalog_name | string | The catalog to which the table belongs. This metric is set to odps for MaxCompute projects. |
database_name | string | The name of the database or MaxCompute project. |
table_name | string | The name of the table. |
partition_name | string | The name of the partition. |
size | bigint | The logical size of the partition. Unit: bytes. |
record_number | bigint | The number of records in the partition. |
created_time_ts | bigint | The creation time. The metric value is a 13-digit timestamp. |
last_modified_time_ts | bigint | The last modification time. The metric value is a 13-digit timestamp. |
biz_date | string | The data timestamp. |
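
Because data_size in the raw_v_meta_table table is NULL for a partitioned table, the storage volume of such a table has to be derived from its partitions. The following sketch sums the size column per table. It assumes that the rows of raw_v_meta_partition for one biz_date have already been loaded as dictionaries; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict

def table_sizes(partition_rows):
    """Sum partition sizes (bytes) per (catalog_name, database_name, table_name)."""
    totals = defaultdict(int)
    for row in partition_rows:
        key = (row["catalog_name"], row["database_name"], row["table_name"])
        # Treat a NULL size as 0 so that the table still appears in the result.
        totals[key] += row["size"] or 0
    return dict(totals)

# Hypothetical sample rows
sample_partitions = [
    {"catalog_name": "odps", "database_name": "proj", "table_name": "t1", "size": 100},
    {"catalog_name": "odps", "database_name": "proj", "table_name": "t1", "size": 250},
    {"catalog_name": "odps", "database_name": "proj", "table_name": "t2", "size": None},
]
sizes = table_sizes(sample_partitions)
```
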
Metrics in the raw_v_meta_table_lineage table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
project_id | bigint | The ID of the DataWorks workspace. |
src_type | string | The type of the data source. |
src_data_source_id | string | The ID of the data source. |
src_database | string | The source database. |
src_table | string | The source table. |
dest_type | string | The type of the data destination. |
dest_data_source_id | string | The ID of the data destination. |
dest_database | string | The destination database. |
dest_table | string | The destination table. |
schedule_task_id | string | The ID of the scheduled node. |
schedule_instance_id | string | The instance ID of the scheduled node. |
schedule_task_owner | string | The owner of the scheduled node. |
job_start_time_ts | bigint | The start time of the node, which is a 13-digit timestamp. |
job_end_time_ts | bigint | The end time of the node, which is a 13-digit timestamp. |
execute_time | bigint | The time that is required to run the node. Unit: seconds. |
input_record_number | bigint | The number of records that were read from the source table. |
biz_date | string | The data timestamp. |
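
Each lineage row describes one edge from a source table to a destination table, so the direct parent and child tables of a table can be collected by grouping the rows. The following is a minimal sketch under that reading. The function name and sample rows are hypothetical, and whether direct_upstream_count and direct_downstream_count in the rpt_v_meta_ind_table_extra table are computed in the same way is an assumption.

```python
from collections import defaultdict

def direct_neighbors(lineage_rows):
    """Return (upstream, downstream) maps of distinct neighbor tables.

    Tables are identified by (database, table). Repeated edges, for example
    from several instances of the same node, are counted once.
    """
    upstream = defaultdict(set)
    downstream = defaultdict(set)
    for row in lineage_rows:
        src = (row["src_database"], row["src_table"])
        dest = (row["dest_database"], row["dest_table"])
        upstream[dest].add(src)
        downstream[src].add(dest)
    return dict(upstream), dict(downstream)

# Hypothetical sample rows
sample_lineage = [
    {"src_database": "p", "src_table": "a", "dest_database": "p", "dest_table": "c"},
    {"src_database": "p", "src_table": "b", "dest_database": "p", "dest_table": "c"},
    {"src_database": "p", "src_table": "a", "dest_database": "p", "dest_table": "c"},
    {"src_database": "p", "src_table": "c", "dest_database": "p", "dest_table": "d"},
]
up, down = direct_neighbors(sample_lineage)
```
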
Metrics in the raw_v_meta_table_output table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
project_id | bigint | The ID of the DataWorks workspace in which scheduled nodes are run. |
type | string | The type of the data source. |
data_source_id | string | The ID of the data source. |
database | string | The database. |
table | string | The name of the table. |
schedule_task_id | string | The ID of the scheduled node. |
schedule_instance_id | string | The instance ID of the scheduled node. |
schedule_task_owner | string | The owner of the scheduled node. |
job_start_time_ts | bigint | The start time of the node, which is a 13-digit timestamp. |
job_end_time_ts | bigint | The end time of the node, which is a 13-digit timestamp. |
execute_time | bigint | The time that is required to run the node. Unit: seconds. |
biz_date | string | The data timestamp. |
Metrics in the raw_v_meta_table_usage table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
project_id | bigint | The ID of the DataWorks workspace in which scheduled nodes are run. |
catalog_name | string | The catalog to which the table belongs. This metric is set to odps for MaxCompute projects. |
database_name | string | The name of the database or MaxCompute project. |
table_name | string | The name of the table. |
schedule_task_id | string | The ID of the scheduled node. |
schedule_task_owner | string | The owner of the scheduled node. If the node is not scheduled in DataWorks, this metric is set to NULL. |
job_id | string | The node ID, which may not be the instance ID of the node that is scheduled in DataWorks. You can use this metric to count the number of times that data is read from the table and the number of times that data is written to the table. |
op_type | string | The operation type, which can be READ, WRITE, or UNKNOWN. |
extras | string | The additional information, which is a JSON string. If a MaxCompute node is run to perform operations on a table, you can use the task_name key to obtain the name of the MaxCompute node. If the ID of a node that is scheduled in DataWorks is not empty, you can use the schedule_task_name key to obtain the name of the scheduled node. |
biz_date | string | The data timestamp. |
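
The job_id column can be used to count how often a table is read or written. The following sketch counts distinct job_id values per table and operation type. It assumes that one job_id corresponds to one read or write operation, which this topic suggests but does not state explicitly; the function name and sample rows are hypothetical.

```python
from collections import defaultdict

def read_write_counts(usage_rows):
    """Count distinct job_id values per ((database_name, table_name), op_type).

    Rows whose op_type is UNKNOWN are ignored.
    """
    jobs = defaultdict(set)
    for row in usage_rows:
        if row["op_type"] in ("READ", "WRITE"):
            table = (row["database_name"], row["table_name"])
            jobs[(table, row["op_type"])].add(row["job_id"])
    return {key: len(ids) for key, ids in jobs.items()}

# Hypothetical sample rows
sample_usage = [
    {"database_name": "proj", "table_name": "t1", "job_id": "j1", "op_type": "READ"},
    {"database_name": "proj", "table_name": "t1", "job_id": "j1", "op_type": "READ"},
    {"database_name": "proj", "table_name": "t1", "job_id": "j2", "op_type": "READ"},
    {"database_name": "proj", "table_name": "t1", "job_id": "j3", "op_type": "WRITE"},
    {"database_name": "proj", "table_name": "t1", "job_id": "j4", "op_type": "UNKNOWN"},
]
counts = read_write_counts(sample_usage)
```
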
Metrics in the raw_v_meta_column_usage table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
project_id | bigint | The ID of the DataWorks workspace in which scheduled nodes are run. |
catalog_name | string | The catalog to which the table belongs. This metric is set to odps for MaxCompute projects. |
database_name | string | The name of the database or MaxCompute project. |
table_name | string | The name of the table. |
column_name | string | The name of the field. |
schedule_task_id | string | The ID of the scheduled node. |
schedule_task_owner | string | The owner of the scheduled node. If the node is not scheduled in DataWorks, this metric is set to NULL. |
inst_id | string | The node ID, which may not be the instance ID of the node that is scheduled in DataWorks. |
op_type | string | The operation type, which can be SELECT, JOIN, GROUP BY, or WHERE. |
extras | string | The additional information, which is a JSON string. If a MaxCompute node is run to perform operations on a table, you can use the task_name key to obtain the name of the MaxCompute node. If the ID of a node that is scheduled in DataWorks is not empty, you can use the schedule_task_name key to obtain the name of the scheduled node. |
biz_date | string | The data timestamp. |
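
The extras column carries the node name under one of two keys, depending on whether the node is scheduled in DataWorks. The following sketch resolves a display name from a usage row. The function name and the sample rows are hypothetical, and falling back to task_name when schedule_task_name is absent is a design choice of this sketch, not documented behavior.

```python
import json

def node_name(row):
    """Return the node name stored in the extras JSON string of a usage row.

    schedule_task_name is preferred when schedule_task_id is not empty;
    otherwise the MaxCompute node name (task_name) is returned.
    """
    extras = json.loads(row["extras"]) if row.get("extras") else {}
    if row.get("schedule_task_id"):
        return extras.get("schedule_task_name") or extras.get("task_name")
    return extras.get("task_name")

# Hypothetical sample rows
scheduled_row = {
    "schedule_task_id": "1001",
    "extras": '{"task_name": "console_query_task", "schedule_task_name": "daily_orders"}',
}
adhoc_row = {"schedule_task_id": None, "extras": '{"task_name": "console_query_task"}'}
```
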
Metrics in the raw_v_meta_biz_table_wiki table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
project_id | bigint | The ID of the DataWorks workspace in which scheduled nodes are run. |
catalog_name | string | The catalog to which the table belongs. This metric is set to odps for MaxCompute projects. |
database_name | string | The name of the database or MaxCompute project. |
table_name | string | The name of the table. |
version | string | The version number of the Wiki. |
operator | string | The user who last edited the Wiki, who may be the table owner. |
content | string | The content of the Wiki, which is written in Markdown. |
update_time_ts | bigint | The modification time. The metric value is a 13-digit timestamp. |
biz_date | string | The data timestamp. |
Metrics in the raw_v_meta_table_join_map table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
catalog_name | string | The catalog to which the table belongs. This metric is set to odps for MaxCompute projects. |
database_name | string | The name of the database or MaxCompute project. |
table_name | string | The name of the table. |
column_name | string | The name of the field. |
join_database_name | string | The name of the associated database or MaxCompute project. |
join_table_name | string | The name of the associated table. |
join_column_name | string | The name of the associated field. |
join_type | string | The type of the JOIN operation, which can be left, right, or inner. |
schedule_task_id | string | The ID of the scheduled node. |
schedule_task_owner | string | The owner of the scheduled node. |
job_id | string | The ID of the node at the engine layer. |
extras | string | The additional information, which is a JSON string. If a MaxCompute node is run to perform operations on a table, you can use the task_name key to obtain the name of the MaxCompute node. |
biz_date | string | The data timestamp. |
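
Each row of this table records one joined field pair, so frequently joined pairs can be found by counting rows. The following sketch treats a pair as undirected, which means that A joined with B and B joined with A are counted together. That treatment, the function name, and the sample rows are assumptions of this sketch.

```python
from collections import Counter

def join_pair_counts(join_rows):
    """Count how often each unordered pair of fields appears in a JOIN."""
    counts = Counter()
    for row in join_rows:
        left = (row["database_name"], row["table_name"], row["column_name"])
        right = (row["join_database_name"], row["join_table_name"], row["join_column_name"])
        # Sort the two sides so that (A, B) and (B, A) share one key.
        counts[tuple(sorted((left, right)))] += 1
    return counts

# Hypothetical sample rows
sample_joins = [
    {"database_name": "p", "table_name": "t1", "column_name": "id",
     "join_database_name": "p", "join_table_name": "t2", "join_column_name": "t1_id"},
    {"database_name": "p", "table_name": "t2", "column_name": "t1_id",
     "join_database_name": "p", "join_table_name": "t1", "join_column_name": "id"},
    {"database_name": "p", "table_name": "t1", "column_name": "id",
     "join_database_name": "p", "join_table_name": "t3", "join_column_name": "t1_id"},
]
pair_counts = join_pair_counts(sample_joins)
```
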
Metrics in the raw_v_meta_table_detail_log table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
catalog_name | string | The catalog to which the table belongs. This metric is set to odps for MaxCompute projects. |
database_name | string | The name of the database or MaxCompute project. |
table_name | string | The name of the table. |
operator | string | The user who views table details. |
view_time_ts | bigint | The time when table details are viewed. The metric value is a 13-digit timestamp. |
biz_date | string | The data timestamp. |
Metrics in the raw_v_meta_category table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
category_id | bigint | The ID of the category. |
category_name | string | The name of the category. |
category_pid | bigint | The ID of the parent category. This metric is set to 0 or NULL for a level 1 category. |
depth | bigint | The level of the category. This metric is set to 1 for a level 1 category. |
sort_field | double | The field based on which the categories are sorted. |
creator_account | string | The account that creates the category. |
created_time_ts | bigint | The creation time. The metric value is a 13-digit timestamp. |
last_modified_time_ts | bigint | The last modification time. The metric value is a 13-digit timestamp. |
biz_date | string | The data timestamp. |
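
The category_pid column links each category to its parent, and a level 1 category has a category_pid of 0 or NULL. The following sketch follows these links to build the full path of each category. The function name, the "/" separator, and the sample rows are hypothetical, and the sketch assumes that no category uses 0 as its own ID.

```python
def category_paths(category_rows, separator="/"):
    """Map each category_id to its full path, for example "Sales/Orders"."""
    by_id = {row["category_id"]: row for row in category_rows}

    def path(category_id):
        names = []
        seen = set()
        # Walk up until the parent is 0, NULL, or unknown; `seen` guards against cycles.
        while category_id in by_id and category_id not in seen:
            seen.add(category_id)
            row = by_id[category_id]
            names.append(row["category_name"])
            category_id = row["category_pid"]
        return separator.join(reversed(names))

    return {category_id: path(category_id) for category_id in by_id}

# Hypothetical sample rows
sample_categories = [
    {"category_id": 1, "category_name": "Sales", "category_pid": 0},
    {"category_id": 2, "category_name": "Orders", "category_pid": 1},
    {"category_id": 3, "category_name": "Finance", "category_pid": None},
]
paths = category_paths(sample_categories)
```
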
Metrics in the raw_v_schedule_node table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the tenant. |
project_id | bigint | The ID of the DataWorks workspace. |
node_id | bigint | The ID of the node. |
node_name | string | The name of the node. |
node_type | bigint | The scheduling type of the node. |
prg_type | bigint | The type of the node. |
flow_id | bigint | The ID of the workflow. |
project_env | string | The environment type. |
create_time | bigint | The creation time. The metric value is a 13-digit timestamp. |
create_user | string | The creator. |
modify_time | bigint | The last modification time. The metric value is a 13-digit timestamp. |
modify_user | string | The user who modifies the node. |
prg_name | string | The name of the node type. |
para_value | string | The execution parameter. |
file_id | bigint | The ID of the file. |
file_version | bigint | The file version. |
owner | string | The owner of the node. |
resgroup_id | bigint | The ID of the resource group. |
baseline_id | bigint | The ID of the baseline. |
cycle_type | bigint | The recurrence. |
repeatable | bigint | The rerun identifier. |
connection | string | The connection string of the data source. |
dqc_type | bigint | Specifies whether the node uses the Data Quality service. |
dqc_description | string | The Data Quality rule. |
task_rerun_time | bigint | The number of times that the task can be rerun. |
task_rerun_interval | bigint | The rerun interval. Unit: milliseconds. |
cron_express | string | The CRON expression that specifies the scheduling frequency of the node. |
priority | bigint | The priority of the task. Valid values: 1, 3, 5, 7, and 8. A greater value indicates a higher priority. |
start_effect_date | bigint | The time when the node takes effect. The metric value is a 13-digit timestamp. |
end_effect_date | bigint | The time when the node loses effect. The metric value is a 13-digit timestamp. |
biz_date | string | The data timestamp. |
Metrics in the raw_v_schedule_task table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the tenant. |
project_id | bigint | The ID of the DataWorks workspace. |
node_id | bigint | The ID of the node. |
node_name | string | The name of the node. |
task_id | bigint | The ID of the task. |
dag_id | bigint | The directed acyclic graph (DAG) ID of the workflow. |
task_type | bigint | The scheduling type of the task. |
dag_type | bigint | The DAG type. |
prg_type | bigint | The type of the node. |
flow_id | bigint | The ID of the workflow. |
create_time | bigint | The creation time. The metric value is a 13-digit timestamp. |
modify_time | bigint | The last modification time. The metric value is a 13-digit timestamp. |
cycle_time | bigint | The scheduling time, which is a 13-digit timestamp. |
in_group_id | bigint | The serial number of the task. |
prg_name | string | The name of the node type. |
para_value | string | The execution parameter. |
file_id | bigint | The ID of the file. |
file_version | bigint | The file version. |
owner | string | The owner of the node. |
resgroup_id | bigint | The ID of the resource group. |
baseline_id | bigint | The ID of the baseline. |
cycle_type | bigint | The recurrence. |
repeatable | bigint | The rerun identifier. |
connection | string | The connection string of the data source. |
dqc_type | bigint | Specifies whether the node uses the Data Quality service. |
dqc_description | string | The Data Quality rule. |
task_rerun_time | bigint | The number of times that the task can be rerun. |
task_rerun_interval | bigint | The rerun interval. Unit: milliseconds. |
begin_waittime_time | bigint | The time when the node starts to wait for scheduling. The metric value is a 13-digit timestamp. |
finish_time | bigint | The time when the node finishes running. The metric value is a 13-digit timestamp. |
begin_waitres_time | bigint | The time when the node starts to wait for resource allocation. The metric value is a 13-digit timestamp. |
begin_run_time | bigint | The time when the node starts to run. The metric value is a 13-digit timestamp. |
rerun_times | bigint | The number of times that the task is rerun. |
priority | bigint | The priority of the task. Valid values: 1, 3, 5, 7, and 8. A greater value indicates a higher priority. |
task_key | string | The unique identifier of the task. |
error_msg | string | The reason why the task failed. |
status | bigint | The status of the task. |
biz_date | string | The data timestamp. |
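
The four 13-digit timestamps of a task (begin_waittime_time, begin_waitres_time, begin_run_time, and finish_time) can be subtracted to estimate how long the task spent in each phase. The following sketch assumes that the phases occur in that order, which is a common reading of the column descriptions but is not stated in this topic. The function name, the result keys, and the sample row are hypothetical.

```python
def task_durations(task):
    """Return the duration of each phase of a task in seconds.

    A phase is None when either of its two timestamps is NULL,
    for example when the task has not started to run yet.
    """
    def seconds_between(start_key, end_key):
        start, end = task.get(start_key), task.get(end_key)
        if start is None or end is None:
            return None
        return (end - start) / 1000  # milliseconds to seconds

    return {
        "wait_for_schedule_s": seconds_between("begin_waittime_time", "begin_waitres_time"),
        "wait_for_resources_s": seconds_between("begin_waitres_time", "begin_run_time"),
        "run_s": seconds_between("begin_run_time", "finish_time"),
    }

# Hypothetical sample row
sample_task = {
    "begin_waittime_time": 1700000000000,
    "begin_waitres_time": 1700000005000,
    "begin_run_time": 1700000007000,
    "finish_time": None,
}
durations = task_durations(sample_task)
```
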
Metrics in the raw_v_schedule_node_relation table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the tenant. |
child_node_id | bigint | The ID of the descendant node. |
parent_node_id | bigint | The ID of the ancestor node. |
step_type | bigint | The dependency type. |
child_flow_id | bigint | The ID of the workflow. |
project_env | string | The environment type. |
create_time | bigint | The creation time. The metric value is a 13-digit timestamp. |
create_user | string | The creator. |
modify_time | bigint | The last modification time. The metric value is a 13-digit timestamp. |
modify_user | string | The user who modifies the node. |
biz_date | string | The data timestamp. |
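
Each row of this table is one dependency edge from parent_node_id to child_node_id, so the rows can be loaded into an adjacency map and traversed. The following sketch collects all direct and indirect descendants of a node with a breadth-first search. The function name and the sample rows are hypothetical, and the sketch ignores step_type and project_env, which a real analysis would probably filter on.

```python
from collections import defaultdict, deque

def descendants(relation_rows, node_id):
    """Return the IDs of all nodes that directly or indirectly depend on node_id."""
    children = defaultdict(list)
    for row in relation_rows:
        children[row["parent_node_id"]].append(row["child_node_id"])

    seen = set()
    queue = deque([node_id])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

# Hypothetical sample rows
sample_relations = [
    {"parent_node_id": 1, "child_node_id": 2},
    {"parent_node_id": 2, "child_node_id": 3},
    {"parent_node_id": 1, "child_node_id": 4},
    {"parent_node_id": 5, "child_node_id": 6},
]
affected = descendants(sample_relations, 1)
```
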
Metrics in the raw_v_schedule_di_resgroup table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the tenant. |
project_id | bigint | The ID of the DataWorks workspace. |
node_id | bigint | The ID of the node. |
project_env | string | The environment of the workspace. |
res_group_identifier | string | The ID of the resource group for Data Integration. |
src_type | string | The type of the data source. |
dst_type | string | The type of the data destination. |
src_datasource | string | The data source. |
dst_datasource | string | The data destination. |
config_concurrent | bigint | The number of concurrent nodes. |
biz_date | string | The data timestamp. |
Metrics in the raw_v_tenant_res_group table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the tenant. |
res_group_id | bigint | The ID of the resource group. |
res_group_identifier | string | The identifier of the resource group. |
res_group_type | bigint | The type of the resource group. |
res_group_mode | bigint | The billing method of the resource group. |
status | bigint | The status of the resource group. |
biz_ext_key | string | The extension field of the resource group. A value of single indicates an exclusive resource group. |
biz_date | string | The data timestamp. |
Metrics in the raw_v_tenant_user table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the tenant. |
yun_account | string | The Alibaba Cloud account. |
account_name | string | The name of the account. |
nick | string | The display name of the account. |
full_yun_account | string | The Alibaba Cloud account that contains the account provider information. |
biz_date | string | The data timestamp. |
Metrics in the raw_v_tenant_workspace table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the tenant. |
project_id | bigint | The ID of the workspace. |
project_name | string | The name of the workspace. |
project_identifier | string | The identifier of the workspace. |
project_desc | string | The description of the workspace. |
project_owner | string | The owner of the workspace. |
status | bigint | The status of the workspace. |
biz_date | string | The data timestamp. |
Metrics in the raw_v_tenant_workspace_user table
Metric name | Data type | Description |
---|---|---|
tenant_id | bigint | The ID of the DataWorks tenant. |
project_id | bigint | The ID of the DataWorks workspace. |
base_id | string | The base ID of the user. |
status | bigint | The status of the user. |
gmt_create_ts | bigint | The creation time. The metric value is a 13-digit timestamp. |
gmt_modified_ts | bigint | The last modification time. The metric value is a 13-digit timestamp. |
biz_date | string | The data timestamp. |