DataWorks provides the resource analysis feature for data developers and administrators
to view and analyze their own resources or all resources in a workspace. You can view
and analyze the resource usage of tables and nodes and the status of nodes. This helps
you optimize the overall resource usage.
Prerequisites
DataWorks Professional Edition or a more advanced edition is activated.
Usage notes
- Different accounts may have different roles or permissions. Therefore, the tables
and nodes that you can view on the Resource Analysis page may vary with the account
that you use to access the page. Only the administrator of a workspace can view the
details of all resources in the workspace.
- You can view the resource usage of MaxCompute tables, MaxCompute nodes, and Data Integration
nodes, and the status of MaxCompute nodes and Data Integration nodes.
Procedure
- Log on to the DataWorks console.
- In the left-side navigation pane, click Workspaces.
- On the Workspaces page that appears, find the target workspace and click Data Analytics in the Actions column.
- Click the icon in the upper-left corner and choose .
The Resource Optimization page appears.
- In the left-side navigation pane, click Resource Analysis. On the Resource Analysis page, you can view the resource usage of your own tables
and nodes or all tables and nodes in the workspace. You can also check the status
of your own nodes or all nodes in the workspace.
You can choose to view personal resources or workspace resources based on your optimization
requirements.
- The Personal Resources tab displays the tables and nodes of the current account.
- The Workspace Resources tab displays all tables and nodes in the current workspace. Only the administrator
of a workspace can view the details of all resources in the workspace. The Workspace
Resources tab is displayed only when you log on as a workspace administrator.
The following section describes the perspectives from which tables and nodes are analyzed.
You can view the details in the resource list.
- Resource Type: Tables
Tables are analyzed from the following perspectives:
- Occupied Storage: the total amount of storage space that is occupied by the table.
- Daily Increased Storage: the increase in the storage space occupied by the table on the day before the current date,
compared with the storage space that the table occupied two days before the current
date.
- Number of Descendant Tables: the number of descendant tables of the table.
- Output Node: the ID of the node that generates the table. This information indicates whether
a node continuously generates data for the table.
- If the Output Node column is empty, the table is not an output table of a DataWorks
node and may be a temporary table or a dimension table that is seldom updated. Generally,
you can manually maintain the table.
- If the Output Node column has data, the table is an output table of a node. The table
may be a table that requires regular updates.
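The two less obvious perspectives above, Daily Increased Storage and Output Node, can be illustrated with a minimal Python sketch. The function names and inputs are hypothetical and are not part of any DataWorks API. The sketch assumes that the storage values are available in bytes and that an empty Output Node column is represented as None or an empty string.

```python
from typing import Optional


def daily_increased_storage(bytes_yesterday: int, bytes_two_days_ago: int) -> int:
    """Growth in occupied storage for the day before the current date.

    Assumption: the value is the difference between the storage occupied
    yesterday and the storage occupied two days ago. The result is negative
    if the table shrank.
    """
    return bytes_yesterday - bytes_two_days_ago


def describe_output_node(output_node_id: Optional[str]) -> str:
    """Interpret the Output Node column for a single table."""
    if not output_node_id:
        # Empty column: the table is not generated by a DataWorks node.
        return "no output node: possibly a temporary or seldom-updated dimension table"
    # Non-empty column: a node writes to the table.
    return f"generated by node {output_node_id}: may require regular updates"
```

How the Resource Analysis page displays a table whose storage decreased is not stated here, so the negative result is a property of this sketch only.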
You can plan the optimization operations to be performed on a table based on the analysis
results that are displayed on the page and your business requirements. For example,
if a table has a long lifecycle, occupies a large amount of storage space, does not
have descendant tables, has not been accessed for a long period of time, and does not have
a node that generates data for it, you can check the details of the table. If the
table is an unnecessary table, you can shorten its lifecycle or delete it.
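If you copy the values from the resource list into your own records, the example criteria above can be sketched as a simple filter. The following Python sketch is illustrative only: the TableUsage fields, the thresholds, and the treatment of a missing lifecycle as a long lifecycle are all assumptions, not values defined by DataWorks. The function only lists tables that are worth reviewing. It does not change or delete anything.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableUsage:
    """One row of the resource list (hypothetical field names)."""
    name: str
    lifecycle_days: Optional[int]  # None = no lifecycle configured (assumption)
    occupied_bytes: int
    descendant_tables: int
    days_since_last_access: int
    output_node: Optional[str]     # None or "" = Output Node column is empty


def table_review_candidates(
    tables: List[TableUsage],
    min_lifecycle_days: int = 365,     # assumed threshold for "long lifecycle"
    min_bytes: int = 100 * 1024**3,    # assumed threshold for "large" (100 GiB)
    min_idle_days: int = 90,           # assumed threshold for "long time"
) -> List[str]:
    """Return names of tables that match every criterion in the example."""
    candidates = []
    for t in tables:
        long_lifecycle = (
            t.lifecycle_days is None or t.lifecycle_days >= min_lifecycle_days
        )
        if (
            long_lifecycle
            and t.occupied_bytes >= min_bytes
            and t.descendant_tables == 0
            and t.days_since_last_access >= min_idle_days
            and not t.output_node
        ):
            candidates.append(t.name)
    return candidates
```

A table returned by this filter is only a candidate. Check its details and your business requirements before you shorten its lifecycle or delete it.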
On the Resource Analysis page, you can perform the following optimization operations
on a table: Details, Change Lifecycle, and Delete.
Note: If you change the lifecycle of a table or delete a table, the operation takes effect
immediately and cannot be undone. Exercise caution when you perform
these operations.
- Resource Type: Nodes
Note: The resource analysis results of nodes show the status and resource usage of the nodes
on the day before the current date.
You can view MaxCompute nodes and Data Integration nodes. In the following example,
MaxCompute nodes are used.
Nodes are analyzed from the following perspectives:
- Number of Child Nodes: the number of child nodes of the node. This information is important and helps you
determine the dependencies of the node. If the value is not 0, the node has child
nodes. In this case, exercise caution when you optimize the node because the optimization
may affect the child nodes.
- Output Table Name: If the node writes data to MaxCompute tables, the names of these tables are displayed
in this column. If the Output Table Name column is empty, the node does not write
data to MaxCompute tables.
You can plan the optimization operations to be performed on a node based on the analysis
results that are displayed on the page and your business requirements. For example,
if a node failed to run, does not have child nodes or output tables, and consumes
a large amount of resources, you can check the details of the node. If the node is
an unnecessary node, you can pause the node.
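The node criteria in the example above can be sketched in the same way. In the following Python sketch, the NodeUsage fields, the resource_cost unit, and the min_cost threshold are hypothetical. They stand in for whatever values you read from the resource list. The function reports only the nodes that failed, have no child nodes, write to no output tables, and consume at least the given amount of resources.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodeUsage:
    """One row of the node resource list (hypothetical field names)."""
    name: str
    last_run_failed: bool
    child_nodes: int
    output_tables: List[str] = field(default_factory=list)
    resource_cost: float = 0.0  # arbitrary unit, assumed for illustration


def node_pause_candidates(nodes: List[NodeUsage], min_cost: float) -> List[str]:
    """Return names of nodes that match every criterion in the example."""
    return [
        n.name
        for n in nodes
        if n.last_run_failed
        and n.child_nodes == 0        # pausing cannot affect child nodes
        and not n.output_tables       # node does not write to MaxCompute tables
        and n.resource_cost >= min_cost
    ]
```

Nodes with a nonzero number of child nodes are excluded on purpose, because optimizing them may affect the child nodes.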
On the Resource Analysis page, you can perform the following optimization operations
on a node: Details and Pause node.
Note: If you pause a node, the node instances that have been generated are not affected,
whereas newly generated node instances are paused. After you pause a node, you may
also need to optimize the output tables of the node.