This topic describes how to go to the details page of a table and view the details of the table, such as the basic information, output information, and lineage information.

Go to the details page of a table

You can use one of the following methods to go to the details page of a table:
  • Go to the details page of a table from the Workspaces page
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where your workspace resides. On the Workspaces page, find the desired workspace and click Data Map in the Actions column. The homepage of DataMap appears.
    4. On the Table tab, select a data source type from the drop-down list and enter a keyword in the search box to search for the table whose details you want to view. In this topic, MaxCompute is selected. Find the table in the results that are displayed and click the table name to go to the details page of the table.
  • Go to the details page of a table from the All Data tab
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where your workspace resides. On the Workspaces page, find the desired workspace and click Data Map in the Actions column. The homepage of DataMap appears.
    4. In the top navigation bar of the DataMap page, click All Data.
    5. On the Table tab, select a data source type from the drop-down list and enter a keyword in the search box to search for the table whose details you want to view. In this topic, MaxCompute is selected. Find the table in the results that are displayed and click the table name to go to the details page of the table.

View the details of a table

On the table details page, the following sections and tabs display the table information: Table Basic Information, Business Information, Permission Information, Technical Information, Details, Output, Lineage, Usage Notes, Data Quality, Records, Data Preview, and Data Profiling.
Area Description
1 In this area, you can perform the following operations on a table:
  • Apply for Permissions: You can apply for permissions on the table in Security Center and view the application records in DataMap.
  • Add to Favorites or Remove from Favorites: You can add the table to favorites or remove the table from favorites.
  • Generate API: You can use the table to generate or register an API in DataService Studio.
  • Analyze Data: You can query and analyze data in the table by executing SQL statements on the SQL Query tab of the DataAnalysis page.
  • Refresh: You can refresh the table details that are displayed on this page.
2 In this area, you can search for a table by keyword that is contained in a table name, field name, or project name.
3 In this area, you can view the information of a table in the following sections:
  • Table Basic Information: In this section, you can view information such as Reads, Favorites, and Views. You can click View Code next to Output Node to view the node code.
    Note
    • Reads: This item displays the number of times over the last 30 days that the table was read in the production environment by SQL statements, Tunnel Download commands, or data synchronization. The number of times is collected in offline mode and is updated one day later.
    • Favorites: This item displays the number of times that the table is added to favorites. The number of times is collected in real time.
    • Views: This item displays the number of times that the table is viewed in DataMap over the last 30 days. The number of times is collected in offline mode and is updated one day later.
    • Storage: This item displays the logical storage space occupied by data in the table. The logical storage space is calculated in offline mode and is updated one day later.
    • Output Node: This item displays the ID of the auto triggered node that generates data in the table. If the table is periodically updated but no node ID is displayed, the data in the table may not be generated by an auto triggered node. You can contact the table owner for details. The ID is determined in offline mode and is updated one day later.
  • Business Information: In this section, you can view information such as Workspace, Environment, and Category.
  • Permission Information: In this section, you can view the permissions that you are granted on the table. You can also click View More to go to the Permission application tab of the Data access control page and apply for more permissions on the table.
  • Technical Information: In this section, you can view information such as Compute Engine Information. You can click View next to Compute Engine Information to view or copy the compute engine information.
    Note Description of the Data Viewed At parameter:
    • The value of this item indicates the time when the table is last accessed by using a command or in a node scheduling scenario.
    • The time is for reference only and may not be the same as the actual time when the table is last accessed.
    • The time is collected in offline mode and is updated one day later.
4 In this area, you can view the information of a table on the following tabs:
  • Details: You can view the following information of the table on this tab: Field Information, Partition Information, and Change Records. For more information, see View information on the Details tab.
  • Output: If the table data periodically changes with the node that generates the table, you can view the change status and the data that is continuously updated on this tab. The output information is collected in offline mode and is updated one day later.
  • Lineage: You can view the inner lineages of the node that generates the table. If the current table is used as the data source of an API, you can also view the lineages between the table and the API. MaxCompute allows you to view the complete lineages of a batch synchronization node that is used to synchronize data to MaxCompute. For more information, see View lineage information on the Lineage tab. The lineage information is collected in offline mode and is updated one day later.
    Note For more information about how to view the complete lineage of a DataService Studio API, see View the details of an API.
  • Usage Notes: You can click Edit, View Versions, or View Markdown Syntax to view the related information.
  • Data Quality: You can view the monitoring rules that are configured for the table and the alerts that are generated based on the monitoring rules. You can click Configure Rules to go to the Data Quality page and configure monitoring rules for the table. For more information, see Configure monitoring rules by table.
  • Records: You can view the reference and access records of the table on the following subtabs:
    • Frequently Associated: On this subtab, you can view the number of times that the table data is referenced.
      Note The Frequently Associated subtab displays the number of times that the table data is referenced over the last 30 days. The number of times is collected in offline mode and is updated one day later.
    • Access Statistics: You can view the reference records of the table in the following sections on this subtab:
      • Trend for Reads: Each date in the line chart corresponds to the number of times that the table is read from the development or production environment on that date. The number of times that a field in the table is read depends on how many times a node that references the field is run and on how many times the field is referenced in the code of the node. The data displayed in the line chart is collected in offline mode and is updated one day later.

        If a field in the table is referenced by a node once and the node is run twice, the number of times that the field is read is recorded as two. If the field is referenced in the code of the node twice, the number of times that the field is read is recorded as two after the node is run once.

      • Field References in Clauses: displays the number of times that the fields in the table are specified in WHERE, SELECT, JOIN, and GROUP BY clauses. The number of times is collected in offline mode and is updated one day later.
      • Top 10 Readers: displays the users who read the table by executing SQL statements over the last 30 days and other details about the read operations. The users include scheduling users in the production environment and users who commit nodes in the development environment. The SQL clauses that are counted include WHERE, SELECT, JOIN, and GROUP BY. The user information is collected in offline mode and is updated one day later.
  • Data Preview: You can preview the data in the table on this tab.
    Notice
    • You can preview tables that are in the production environment only if you are granted the required permissions. For more information about how to apply for permissions on tables, see Request permissions on tables.
    • If the preview feature is enabled for both development and production tables on the Configuration Management tab of the workspace to which the table belongs, you can preview the table data on the Data Preview tab without applying for permissions on tables in Security Center.
    • If you configure data masking rules and the data masking rules are in effect, the Data Preview tab displays data based on the data masking rules. For more information about how to configure data masking rules, see Create de-identification rules.
    • The Data Preview tab cannot display data in external tables.
  • Data Profiling: You can view the data profiling results of the table on this tab. DataWorks detects the data of a table based on the schema and a partition key value. The data profiling results include basic statistical information and data distribution. For more information, see Detect data on the Data Profiling tab.
    Note If you perform data profiling on the table, you are charged for the data profiling task, and the fees are generated in Data Quality. You can go to the Node Query page of Data Quality to view the data profiling logs of the table.
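The read-count rule described under Trend for Reads can be illustrated with a minimal sketch. It assumes that the count is simply the product of the number of references in the node code and the number of node runs, which matches the two examples given in this topic but is not confirmed as the exact formula that DataMap uses. The function name `field_read_count` is hypothetical.

```python
def field_read_count(references_in_code: int, node_runs: int) -> int:
    """Estimate field reads as (references in node code) x (node runs).

    Assumption: a plain product, inferred from the two examples in this
    topic. The real collection logic in DataMap may differ.
    """
    if references_in_code < 0 or node_runs < 0:
        raise ValueError("counts must be non-negative")
    return references_in_code * node_runs


# Field referenced once in the node code, node run twice.
once_referenced_twice_run = field_read_count(references_in_code=1, node_runs=2)

# Field referenced twice in the node code, node run once.
twice_referenced_once_run = field_read_count(references_in_code=2, node_runs=1)

print(once_referenced_twice_run, twice_referenced_once_run)
```

Both calls yield two reads, which is the behavior that the examples for Trend for Reads describe.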

View information on the Details tab

Click the Details tab and view information of a table on the following subtabs of the Details tab: Field Information, Partition Information, and Change Records.
  • Field Information: On this subtab, you can view the field information of a table. If the table is a partitioned table, you can also view partition fields in the table that are displayed in the Partition Fields section.
    You can perform the following operations on this subtab:
    • Edit: You can click this button to modify Description, Business Description, Security Level, and Primary Key. You can also save the modified information or cancel the modification. You can specify a security level for multiple fields at a time.
      Note
      • Only the table owner or a workspace member that is assigned the Workspace Manager role can modify settings for table fields. If you are not the table owner and want to modify settings for table fields, you must obtain the permissions of the Workspace Manager role. For more information, see Manage global roles and members.
      • The Security Level column is displayed on the Field Information subtab only for MaxCompute tables for which you specify field security levels.
      • The security level feature is exclusive to MaxCompute.
      • You can specify security levels for fields in a MaxCompute table on the Field Information subtab only after you enable the security level feature in the MaxCompute compute engine instance associated with the current workspace. For more information about how to enable the security level feature, see Label-based access control.
    • Batch Edit Security Level: You can click this button to specify or modify security levels for multiple table fields at a time. This improves data security.
      Note
      • The security level feature is exclusive to MaxCompute.
    • Upload: You can click this button and drag the file that you want to upload from your on-premises machine to the Batch Upload Field Information dialog box.
      Note
      • Only the table owner or a workspace member that is assigned the Workspace Manager role can upload data to a table whose details are displayed on this page. If you are not the table owner and want to upload data to the table, you must obtain the permissions of the Workspace Manager role. For more information, see Manage global roles and members.
      • Only .xlsx files created in Excel 2007 are supported. You can also click Download Template File to download the template file.
    • Download: You can click this button to download the field information of the table.
    • Generate SELECT Statement: You can click this button to view or copy the SELECT statement in the Generate SELECT Statement dialog box. The statement can be used to query the table data.
    • Generate DDL Statement: You can click this button to view or copy the data definition language (DDL) statement in the Generate DDL Statement dialog box. The statement can be used to create the table.
    Note The Number of Reads parameter represents the number of times that a field is specified in JOIN clauses on the previous day. The value is presented as a star rating that is based on the ratio of the number of times the field is specified in JOIN clauses to the total number of times all fields in the table are specified in JOIN clauses. The highest star rating is 5, and the lowest is 0.
  • Partition Information: On this subtab, you can view information such as Partition Name, Records, and Storage. This subtab displays only the information of partitioned MaxCompute tables.
    Note The data in the Records and Storage columns are for reference only. The data displayed on the Partition Information subtab may not be updated in real time. The data in the compute engine instance prevails.
  • Change Records: On this subtab, you can view information such as Description, Change Type, and Object.

    You can select a change type from the Change Type drop-down list on the Change Records subtab to view the related table changes.

    Change types include Create Table, Modify Table, Drop Table, Create Partition, Drop Partition, Modify Owner, and Modify TTL.
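The star rating for Number of Reads on the Field Information subtab is described only as being based on a field's share of all JOIN references in the table. The following sketch shows one way such a share could be mapped to a rating from 0 to 5. The mapping (rounding the share times five up to the next whole star) is an assumption made for illustration, as are the function name `join_star_ratings` and the sample field names. The actual mapping that DataMap uses is not stated in this topic.

```python
def join_star_ratings(join_counts: dict[str, int], max_stars: int = 5) -> dict[str, int]:
    """Map each field's share of JOIN references to a 0..max_stars rating.

    Assumption: stars = ceil(share * max_stars), computed with integer
    arithmetic to avoid floating-point rounding surprises.
    """
    total = sum(join_counts.values())
    if total == 0:
        # No field was used in a JOIN clause, so every rating is 0.
        return {field: 0 for field in join_counts}
    return {
        field: (max_stars * count + total - 1) // total  # integer ceiling
        for field, count in join_counts.items()
    }


# Hypothetical JOIN reference counts for the previous day.
sample_counts = {"user_id": 6, "order_id": 3, "region": 1, "remark": 0}
ratings = join_star_ratings(sample_counts)
print(ratings)
```

With this assumed mapping, a field that is never used in a JOIN clause gets 0 stars, and a field that accounts for every JOIN reference gets 5 stars.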

View lineage information on the Lineage tab

The Lineage tab displays the inner lineages of the node that generates the table. MaxCompute also allows you to view the complete lineages of a batch synchronization node that is used to synchronize data to MaxCompute. You can view the ancestor and descendant tables of a MaxCompute table. You can also expand the lineage levels of the MaxCompute table to view the sources and destinations of the table.
Note
  • The lineage feature is supported only in DataWorks Standard Edition or a more advanced edition. This requirement applies, for example, when the compute engine is MaxCompute or E-MapReduce (EMR).
  • The lineage information displayed on this tab for a MaxCompute table includes the lineages between tables and the lineages between fields. The lineages are obtained by parsing the data of ODPS SQL scheduling jobs. The lineages that are generated by manual operations, such as ad hoc queries, are not included. The lineage information displayed on the Lineage tab is collected in offline mode and is updated one day later.
On the Lineage tab, you can view Table Lineage, Field Lineage, and Impact Analysis.
  • Table Lineage: This tab consists of the Graph Analysis and View by Level subtabs.
    • Graph Analysis: On this subtab, you can view the number of ancestor and descendant tables at all levels of the table. You can also view the total number of ancestor and descendant tables for each table.
    • View by Level: On this subtab, you can view the ancestor and descendant tables at the nearest level of the table by default. You can search for ancestor and descendant tables based on their globally unique identifiers (GUIDs).
  • Field Lineage: On this tab, you can select a field in the table from the Field Name drop-down list to view the lineage information of the field.
  • Impact Analysis: On this tab, you can specify one or more of the following conditions to view Scheduling Output or Full Link of the lineage: Lineage Level, Node Type, Table Name, Project Name, and Table Owner.

    You can click Start Analysis to perform an impact analysis. After the analysis is complete, you can download the impact analysis result. You can also enable the system to send the impact analysis result to the owners of descendant tables of the current table by email.
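Conceptually, viewing descendant tables by lineage level is a level-by-level walk over a table dependency graph. The sketch below illustrates that idea with a breadth-first traversal over an in-memory dictionary. It is not the DataMap implementation and does not call any DataWorks API. The function name `descendants_by_level` and the sample table names are hypothetical.

```python
from collections import deque


def descendants_by_level(lineage: dict[str, list[str]], table: str, max_level: int) -> dict[int, list[str]]:
    """Group the descendant tables of `table` by lineage level.

    `lineage` maps a table name to its direct descendant tables.
    Level 1 holds direct descendants, level 2 their descendants, and so on,
    down to `max_level`. Each table is reported once, at its nearest level.
    """
    seen = {table}
    levels: dict[int, list[str]] = {}
    queue = deque([(table, 0)])
    while queue:
        current, level = queue.popleft()
        if level == max_level:
            continue  # do not expand beyond the requested lineage level
        for child in lineage.get(current, []):
            if child not in seen:
                seen.add(child)
                levels.setdefault(level + 1, []).append(child)
                queue.append((child, level + 1))
    return levels


# Hypothetical lineage: each key writes data to the tables in its list.
sample_lineage = {
    "ods_orders": ["dwd_orders"],
    "dwd_orders": ["dws_sales", "dws_users"],
    "dws_sales": ["ads_report"],
    "dws_users": ["ads_report"],
}
impacted = descendants_by_level(sample_lineage, "ods_orders", max_level=3)
print(impacted)
```

Limiting `max_level` plays the same role as the Lineage Level condition: a smaller value narrows the set of descendant tables that a change to the current table is assumed to affect.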

Detect data on the Data Profiling tab

DataWorks detects the data of a table based on the schema and a partition key value. The data profiling results include basic statistical information and data distribution.

Limits:
  • Only data in partitioned tables can be detected.
  • Only tables in the production environment can be detected.
  • Only the table owner can enable the Auto Profiling feature.
On the Data Profiling tab, you can specify a profiling mode and view data profiling records.
The following data profiling modes are supported:
  • Manual Profiling
    Note Data profiling tasks run in the MaxCompute project to which a detected table belongs. The system can detect a maximum of 10 columns in a table at a time. To save resources, select only the columns that you want to detect.
    To perform manual profiling, perform the following steps:
    1. On the Data Profiling tab, click Manual Profiling.
    2. In the Manual Profiling dialog box, configure the parameters.
      • Table Name: The name of the table that you want to detect. By default, the table name is displayed in the Workspace name.Table name format and cannot be changed.
      • Partition Value: The partition that you want to detect. You can select the desired partition from the Partition Value drop-down list.
      • Detailed Configuration: The columns that you want to detect. You can select the desired columns.
      • Estimated Cost: The estimated cost for running the data profiling task. The cost is estimated based on the settings of the preceding parameters.
      Notice
      • To detect data in the MaxCompute table, you must execute MaxCompute SQL statements. In this case, you will be charged for using the MaxCompute service. The estimated cost is for reference only. The actual cost may vary based on the amount of data that is detected. You can check the bills for MaxCompute to view the actual cost.
      • Features provided by Data Quality are used during data profiling. When you run a data profiling task, fees are generated in Data Quality and charged by DataWorks. For more information, see Overview.
    3. Select I understand that using this service will be charged.
    4. Click Commit.
    5. After the data profiling task is complete, view the data profiling results on the Data Profiling tab.

      You can select an option from the Profiling Records drop-down list to view the desired data profiling result. You can choose Data Distribution > Value range to view the distribution of data values in a field.

  • Auto Profiling
    To enable auto profiling, perform the following steps:
    1. Turn on Auto Profiling.
    2. In the Auto Profiling (When Partition Information Changes) dialog box, configure the parameters.
      • Table Name: The name of the table that you want to detect. By default, the table name is displayed in the Workspace name.Table name format and cannot be changed.
      • Partition Value: By default, the partition value is the latest partition value when data profiling is triggered. You cannot change the value.
      • Detailed Configuration: The columns that you want to detect. You can select the desired columns.
      • Bind Trigger: The auto triggered node that triggers auto profiling. You must select an auto triggered node from the Bind Trigger drop-down list. You can view the IDs of auto triggered nodes in Operation Center. We recommend that you select the node that generates the current table.

        After you select the metrics that you want to detect and commit the auto profiling task, the system runs the task to detect the latest partition in the table after the auto triggered node is successfully run.

      • Estimated Cost: The estimated cost for running the data profiling task. The cost is estimated based on the settings of the preceding parameters.
      Notice
      • To detect data in the MaxCompute table, you must execute MaxCompute SQL statements. In this case, you will be charged for using the MaxCompute service. The estimated cost is for reference only. The actual cost may vary based on the amount of data that is detected. You can check the bills for MaxCompute to view the actual cost.
      • Features provided by Data Quality are used during data profiling. When you run a data profiling task, fees are generated in Data Quality and charged by DataWorks. For more information, see Overview.
    3. Select I understand that using this service will be charged.
    4. Click Commit.
    5. After the data profiling task is complete, view the data profiling results on the Data Profiling tab.

      You can select an option from the Profiling Records drop-down list to view the desired data profiling result.
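Because a manual profiling task can detect at most 10 columns of a table at a time, profiling a wider table means planning several runs. The following sketch splits a column list into groups of at most 10 and builds the Workspace name.Table name string that the Table Name parameter displays. The helper names, the workspace name, the table name, and the column names are all hypothetical, and the code only plans the runs. It does not submit anything to DataWorks.

```python
MAX_COLUMNS_PER_RUN = 10  # limit stated in the Manual Profiling note


def batch_columns(columns: list[str], batch_size: int = MAX_COLUMNS_PER_RUN) -> list[list[str]]:
    """Split the columns into consecutive groups of at most `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [columns[i:i + batch_size] for i in range(0, len(columns), batch_size)]


def qualified_table_name(workspace: str, table: str) -> str:
    """Build the 'Workspace name.Table name' form shown in the dialog box."""
    return f"{workspace}.{table}"


# Hypothetical table with 23 columns named col_01 .. col_23.
all_columns = [f"col_{n:02d}" for n in range(1, 24)]
planned_runs = batch_columns(all_columns)
target = qualified_table_name("demo_workspace", "dwd_orders")

for run_number, column_group in enumerate(planned_runs, start=1):
    print(f"run {run_number} on {target}: {len(column_group)} columns")
```

Each planned run is charged separately, so selecting only the columns that you actually need keeps both the number of runs and the cost down.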