After the data opening package of DataWorks is installed, you can use the data opening feature to collect metadata in DataWorks and use the metadata for subsequent data governance and O&M. This topic describes the scenarios of the data opening feature and commands required to use the feature.

Prerequisites

The data opening package is installed. For more information, see Install and remove the data opening package.

Instructions

The following sections describe the commands that are used to obtain various types of metadata from a MaxCompute node of DataWorks. Before you use these commands, go to the code editing page of the MaxCompute node.

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. On the Workspaces page, find the desired workspace and click Data Analytics in the Actions column.
  2. Go to the code editing page of the MaxCompute node.
    In the left-side navigation pane, click Business Flow, find your workflow, and then click the workflow name. Then, right-click MaxCompute and choose Create > ODPS SQL. In the Create Node dialog box, configure the parameters required to create a MaxCompute SQL node.
After the MaxCompute SQL node is configured, the code editing page of the node appears. On this page, you can use the data opening feature to query the following types of metadata:

View the created databases in a MaxCompute project

The raw_v_meta_database view provided by the data opening feature can be used to query the created databases in a MaxCompute project. You can run the following command to perform the operation:
SELECT * FROM u_meta_hangzhou.raw_v_meta_database_Version('Business date');
In the preceding command:
  • Version: specifies the version of the data opening package that you install and use. The format of the version is similar to v1_1. You can run the DESCRIBE PACKAGE command to query the actual version of the data opening package that you use. For more information, see View the tables or views provided by the data opening package.
  • Business date: represents the specific business date on which you want to view the metadata information. The date is presented in the yyyymmdd format. If you do not specify a specific business date, the metadata information on all business dates is queried.
The following code provides a command example:
-- Query the databases created on January 9, 2021.
SELECT * FROM u_meta_hangzhou.raw_v_meta_database_v1_1('20210109');
The following figure shows the query result.Query result 1For more information about the fields in the query result, see Metrics in the raw_v_meta_database table.

View the tables in a MaxCompute project

The raw_v_meta_table view provided by the data opening feature can be used to query the tables in a MaxCompute project. You can run the following command to perform the operation:
SELECT *
  FROM u_meta_hangzhou.raw_v_meta_table_Version('Business date')
  WHERE catalog_name = 'your_catalog_name'
   AND database_name = 'your_database_name'
   AND table_name = 'your_table_name'
;
In the preceding command:
  • Version: specifies the version of the data opening package that you install and use. The format of the version is similar to v1_1. You can run the DESCRIBE PACKAGE command to query the actual version of the data opening package that you use. For more information, see View the tables or views provided by the data opening package.
  • Business date: represents the specific business date on which you want to view the metadata information. The date is presented in the yyyymmdd format. If you do not specify a specific business date, the metadata information on all business dates is queried.
  • your_catalog_name: specifies the computing engine whose metadata you want to view. Set the value to odps.
  • your_database_name: specifies the name of the project in which you want to view metadata information. You can specify this parameter based on your business requirements.
  • your_table_name: specifies the name of the table whose metadata information you want to view. You can specify this parameter based on your business requirements.
The following code provides a command example:
-- Query the data of the ods_user_info_d table under the MaxCompute project isv2 on January 9, 2021.
SELECT *
  FROM u_meta_hangzhou.raw_v_meta_table_v1_1('20210109')
  WHERE catalog_name = 'odps'
   AND database_name = 'isv2'
   AND table_name = 'ods_user_info_d'
;
The following figure shows the query result.Query result 2For more information about the fields in the query result, see Metrics in the raw_v_meta_table table.

View the statistical information of a table

The rpt_v_meta_ind_table_core and rpt_v_meta_ind_table_extra views provided by the data opening feature can be used to query the statistical information of a table, such as the tenant ID and lifecycle. You can run the following command to perform the operation:
SELECT c.tenant_id, c.table_uuid, c.dim_life_cycle, c.is_partition_table, c.entity_type, c.categories, c.last_access_time, c.partition_count, c.favorite_count, e.output_task_count
FROM u_meta_hangzhou.rpt_v_meta_ind_table_core_Version('Business date') c 
LEFT OUTER JOIN u_meta_hangzhou.rpt_v_meta_ind_table_extra_Version('Business date') e 
ON c.table_uuid = e.table_uuid AND c.tenant_id = e.tenant_id 
WHERE c.catalog_name = 'your_catalog_name'
  AND c.database_name = 'your_database_name'
  AND c.table_name = 'your_table_name'
;
In the preceding command:
  • Version: specifies the version of the data opening package that you install and use. The format of the version is similar to v1_1. You can run the DESCRIBE PACKAGE command to query the actual version of the data opening package that you use. For more information, see View the tables or views provided by the data opening package.
  • Business date: represents the specific business date on which you want to view the metadata information. The date is presented in the yyyymmdd format. If you do not specify a specific business date, the metadata information on all business dates is queried.
  • your_catalog_name: specifies the computing engine whose metadata you want to view. Set the value to odps.
  • your_database_name: specifies the name of the project in which you want to view metadata information. You can specify this parameter based on your business requirements.
  • your_table_name: specifies the name of the table whose metadata information you want to view. You can specify this parameter based on your business requirements.
The following code provides a command example:
-- Query the statistical information of the ods_user_info_d table under the MaxCompute project isv2 on January 9, 2021.
SELECT c.tenant_id, c.table_uuid, c.dim_life_cycle, c.is_partition_table, c.entity_type, c.categories, c.last_access_time, c.partition_count, c.favorite_count, e.output_task_count
FROM u_meta_hangzhou.rpt_v_meta_ind_table_core_v1_1('20210109') c 
LEFT OUTER JOIN u_meta_hangzhou.rpt_v_meta_ind_table_extra_v1_1('20210109') e 
ON c.table_uuid = e.table_uuid AND c.tenant_id = e.tenant_id 
WHERE c.catalog_name = 'odps'
  AND c.database_name = 'isv2'
  AND c.table_name = 'ods_user_info_d'
;
The following figure shows the query result.Query result 3For more information about the fields in the query result, see Core metrics in the rpt_v_meta_ind_table_core table and Additional metrics in the rpt_v_meta_ind_table_extra table.

View the node details of an output table

The raw_v_meta_table_output and raw_v_schedule_node views provided by the data opening feature can be used to query the node details of an output table. You can run the following command to perform the operation:
SELECT s.*, o.schedule_instance_id, execute_time
  FROM u_meta_hangzhou.raw_v_meta_table_output_Version('Business date') o 
  LEFT OUTER JOIN u_meta_hangzhou.raw_v_schedule_node_Version('Business date') s 
  ON o.schedule_task_id = s.node_id
WHERE o.type = 'your_table_type'
  AND o.database = 'your_database_name'
  AND o.table = 'your_table_name'
  AND s.project_env = 'your_project_environment'
;
In the preceding command:
  • Version: specifies the version of the data opening package that you install and use. The format of the version is similar to v1_1. You can run the DESCRIBE PACKAGE command to query the actual version of the data opening package that you use. For more information, see View the tables or views provided by the data opening package.
  • Business date: represents the specific business date on which you want to view the metadata information. The date is presented in the yyyymmdd format. If you do not specify a specific business date, the metadata information on all business dates is queried.
  • your_table_type: specifies the type of the output table whose metadata you want to view. You can specify only the MaxCompute type. Set the value to odps.
  • your_database_name: specifies the name of the project in which you want to view metadata information. You can specify this parameter based on your business requirements.
  • your_table_name: specifies the name of the table whose metadata information you want to view. You can specify this parameter based on your business requirements.
  • your_project_environment: specifies the environment of the MaxCompute project whose metadata you want to view. If you want to view the metadata of a MaxCompute project in a development environment, set the value to DEV. If you want to view the metadata of a MaxCompute project in a production environment, set the value to PROD.
The following code provides a command example:
-- Query the details of the MaxCompute table ods_user_info_d under the xc_simple_e1 project on January 9, 2021.
SELECT s.*, o.schedule_instance_id, execute_time
  FROM u_meta_hangzhou.raw_v_meta_table_output_v1_1('20210109') o 
  LEFT OUTER JOIN u_meta_hangzhou.raw_v_schedule_node_v1_1('20210109') s 
  ON o.schedule_task_id = s.node_id
WHERE o.type = 'odps'
  AND o.database = 'xc_simple_e1'
  AND o.table = 'ods_user_info_d'
  AND s.project_env = 'PROD'
;
The following figure shows the query result.Query result 4For more information about the fields in the query result, see Metrics in the raw_v_meta_table_output table and Metrics in the raw_v_schedule_node table.

View the ancestor and descendant nodes of a node

The raw_v_schedule_node and raw_v_schedule_node_relation views provided by the data opening feature can be used to query the ancestor and descendant nodes of a node. You can run the following command to perform the operation:
-- Query the ancestor nodes of a node.
SELECT * 
  FROM u_meta_hangzhou.raw_v_schedule_node_Version('Business date') t
 WHERE t.project_env = 'your_project_environment'
   AND t.node_id IN (
     SELECT parent_node_id 
       FROM u_meta_hangzhou.raw_v_schedule_node_relation_Version('Business date') r
      WHERE r.child_node_id = your_child_node_id
        AND r.project_env = 'your_project_environment'
 )
;
-- Query the descendant nodes of a node.
SELECT * 
  FROM u_meta_hangzhou.raw_v_schedule_node_Version('Business date') t
 WHERE t.project_env = 'your_project_environment'
   AND t.node_id IN (
     SELECT child_node_id 
       FROM u_meta_hangzhou.raw_v_schedule_node_relation_Version('Business date') r
      WHERE r.child_node_id = your_child_node_id
        AND r.project_env = 'your_project_environment'
 )
;
In the preceding command:
  • Version: specifies the version of the data opening package that you install and use. The format of the version is similar to v1_1. You can run the DESCRIBE PACKAGE command to query the actual version of the data opening package that you use. For more information, see View the tables or views provided by the data opening package.
  • Business date: represents the specific business date on which you want to view the metadata information. The date is presented in the yyyymmdd format. If you do not specify a specific business date, the metadata information on all business dates is queried.
  • your_project_environment: specifies the environment of the MaxCompute project whose metadata you want to view. If you want to view the metadata of a MaxCompute project in a development environment, set the value to DEV. If you want to view the metadata of a MaxCompute project in a production environment, set the value to PROD.
  • your_child_node_id: specifies the ID of the node whose metadata you want to view.
The following code provides command examples:
  1. Query the ancestor nodes of a node in a project in a production environment.
    -- Query the ancestor nodes of the 1000550985 node under a project in a production environment on January 9, 2021.
    SELECT * 
      FROM u_meta_hangzhou.raw_v_schedule_node_v1_1('20210109') t
     WHERE t.project_env = 'PROD'
       AND t.node_id IN (
         SELECT parent_node_id 
           FROM u_meta_hangzhou.raw_v_schedule_node_relation_v1_1('20210109') r
          WHERE r.child_node_id = 1000550985
            AND r.project_env = 'PROD'
     )
    ;
    The following figure shows the query result.5.1
  2. Query the descendant nodes of a node in a project in a production environment.
    -- Query the descendant nodes of the 1000550985 node under a project in a production environment on January 9, 2021.
    SELECT * 
      FROM u_meta_hangzhou.raw_v_schedule_node_v1_1('20210109') t
     WHERE t.project_env = 'PROD'
       AND t.node_id IN (
         SELECT child_node_id
           FROM u_meta_hangzhou.raw_v_schedule_node_relation_v1_1('20210109') r
          WHERE r.parent_node_id = 1000550985
            AND r.project_env = 'PROD'
     )
    ;
    The following figure shows the query result.5.2
For more information about the fields in the query result, see Metrics in the raw_v_schedule_node table and Metrics in the raw_v_schedule_node_relation table.

Query the owner details of a table or node

The raw_v_meta_table and raw_v_tenant_user views provided by the data opening feature can be used to query the owner details of a table or node. You can run the following command to perform the operation:
  • Query the owner details of a table.
    SELECT
        c.catalog_name,
        c.database_name,
        c.table_name,
        c.owner_name,
        u.account_name,
        u.nick
    FROM u_meta_hangzhou.raw_v_meta_table_Version('Business date') c 
     LEFT OUTER JOIN u_meta_hangzhou.raw_v_tenant_user_Version('Business date') u 
       ON c.owner_name = TOLOWER(u.yun_account)
     WHERE c.catalog_name = 'your_catalog_name'
       AND c.database_name = 'your_database_name'
       AND c.table_name = 'your_table_name'
    ;
    In the preceding command:
    • Version: specifies the version of the data opening package that you install and use. The format of the version is similar to v1_1. You can run the DESCRIBE PACKAGE command to query the actual version of the data opening package that you use. For more information, see View the tables or views provided by the data opening package.
    • Business date: represents the specific business date on which you want to view the metadata information. The date is presented in the yyyymmdd format. If you do not specify a specific business date, the metadata information on all business dates is queried.
    • your_catalog_name: specifies the computing engine whose metadata you want to view. Set the value to odps.
    • your_database_name: specifies the name of the project in which you want to view metadata information. You can specify this parameter based on your business requirements.
    • your_table_name: specifies the name of the table whose metadata information you want to view. You can specify this parameter based on your business requirements.
  • Query the owner details of a node.
    SELECT t.project_id, t.node_id, t.node_name, 
        t.create_user, u.account_name AS create_user_name, u.nick as create_user_nick,
        t.modify_user, m.account_name AS modify_user_name, m.nick as modify_user_nick
      FROM u_meta_hangzhou.raw_v_schedule_node_Version('Business date') t 
    LEFT OUTER JOIN u_meta_hangzhou.raw_v_tenant_user_Version('Business date') u ON t.create_user = u.yun_account
    LEFT OUTER JOIN u_meta_hangzhou.raw_v_tenant_user_Version('Business date') m ON t.modify_user = m.yun_account
     WHERE t.node_id = your_node_id
       AND t.project_env = 'your_project_environment'
    ;
    In the preceding command:
    • Version: specifies the version of the data opening package that you install and use. The format of the version is similar to v1_1. You can run the DESCRIBE PACKAGE command to query the actual version of the data opening package that you use. For more information, see View the tables or views provided by the data opening package.
    • Business date: represents the specific business date on which you want to view the metadata information. The date is presented in the yyyymmdd format. If you do not specify a specific business date, the metadata information on all business dates is queried.
    • your_project_environment: specifies the environment of the MaxCompute project whose metadata you want to view. If you want to view the metadata of a MaxCompute project in a development environment, set the value to DEV. If you want to view the metadata of a MaxCompute project in a production environment, set the value to PROD.
    • your_node_id: specifies the ID of the node whose metadata information you want to view.

The following code provides command examples:

  1. Query the owner details of a table on January 9, 2021.
    SELECT
        c.catalog_name,
        c.database_name,
        c.table_name,
        c.owner_name,
        u.account_name,
        u.nick
    FROM u_meta_hangzhou.raw_v_meta_table_v1_1('20210109') c 
     LEFT OUTER JOIN u_meta_hangzhou.raw_v_tenant_user_v1_1('20210109') u 
       ON c.owner_name = TOLOWER(u.yun_account)
     WHERE c.catalog_name = 'odps'
       AND c.database_name = 'isv2'
       AND c.table_name = 'ods_user_info_d'
    ;
    The following figure shows the query result.6.1
  2. Query the owner details of an auto triggered node and the details of the user who modifies the auto triggered node on January 9, 2021.
    SELECT t.project_id, t.node_id, t.node_name, 
        t.create_user, u.account_name AS create_user_name, u.nick as create_user_nick,
        t.modify_user, m.account_name AS modify_user_name, m.nick as modify_user_nick
      FROM u_meta_hangzhou.raw_v_schedule_node_v1_1('20210109') t 
    LEFT OUTER JOIN u_meta_hangzhou.raw_v_tenant_user_v1_1('20210109') u ON t.create_user = u.yun_account
    LEFT OUTER JOIN u_meta_hangzhou.raw_v_tenant_user_v1_1('20210109') m ON t.modify_user = m.yun_account
     WHERE t.node_id = 1000454827
       AND t.project_env = 'PROD'
    ;
    The following figure shows the query result.6.2
For more information about the fields in the query result, see Metrics in the raw_v_schedule_node table and Metrics in the raw_v_schedule_node_relation table.

What to do next

The views provided by the data opening feature support date parameters in the yyyyMMdd format. You can create partitions based on dates and query historical data over the last 30 days. If you have requirements such as metric trend analysis, you can query data by business date. Then, you can save the data to a project and use the data as the historical data required to perform metric analysis.