Track which RAM users drive the most compute spend and which jobs take the longest to run. This tutorial shows you how to back up job history from Information Schema, run cost and duration analysis queries, and push daily top-N reports to a DingTalk group.
Background
When you use MaxCompute in DataWorks standard mode, most jobs run under the root account. Information Schema records this root account as the executor, so only a small number of jobs show a RAM user as the owner. This makes per-user cost attribution and duration analysis difficult out of the box.
Data source options
| Source | Retention | Use when |
|---|---|---|
| Project-level Information Schema (TASKS_HISTORY view) | 14 days | You need job-level detail for a single project |
| Tenant-level Information Schema | Longer, more complete | You need cross-project or longer-horizon analysis |
This tutorial uses the TASKS_HISTORY view. To back up beyond 14 days, the workflow writes daily snapshots to a persistent table. To use tenant-level Information Schema instead, see Tenant-level Information Schema.
How it works
1. Install the Information Schema permission package for each project to query.
2. Back up TASKS_HISTORY daily into a persistent table (information_history).
3. Parse the settings field to extract the actual RAM user from the SKYNET_ONDUTY key. This corrects the root-account attribution problem.
4. Run aggregation queries to produce cost_topn (billing) and time_topn (duration) tables.
5. Push the ranked results to a DingTalk group using a Java program packaged as a JAR file.
6. Wire the three DataWorks nodes into a dependency chain and schedule them to run daily.
Prerequisites
Before you begin, make sure you have:
- A MaxCompute project with the project owner account or a RAM user that has the Super_Administrator role
- Access to the DataWorks console with permission to create and schedule nodes
- A DingTalk group where you have permission to add a robot
- IntelliJ IDEA (or another Maven-compatible IDE) to compile the notification JAR
Step 1: Install the Information Schema package
Starting March 1, 2024, MaxCompute no longer automatically installs the project-level Information Schema package for new projects. If you are working with a new project, query the tenant-level Information Schema instead. For existing projects, follow the steps below.
The installation must be performed by the project owner or a RAM user with the Super_Administrator role. Install the package using either of the following methods:
- Log on to the MaxCompute client and run:
  install package Information_Schema.systables;
- Log on to the DataWorks console, go to Ad-hoc Query, and run:
  install package Information_Schema.systables;
To analyze metadata from multiple MaxCompute projects, install the package for each project separately. Then insert the backups from all projects into a single table for unified analysis.
For more information about the features and limits of Information Schema, see Project-level Information Schema.
(Optional) Step 2: Grant permissions to other users
By default, only the project owner can view Information Schema data. To let other users or roles query it, grant them read access to the package.
Run the following commands in the MaxCompute client:
grant <actions> on package Information_Schema.systables to user <user_name>;
grant <actions> on package Information_Schema.systables to role <role_name>;
| Parameter | Description |
|---|---|
| <actions> | The permission to grant. Set this to Read. |
| <user_name> | An Alibaba Cloud account or RAM user added to the project. Run list users; to see available accounts. |
| <role_name> | A role added to the project. Run list roles; to see available roles. |
Example:
grant read on package Information_Schema.systables to user RAM$Bob@aliyun.com:user01;
For more information, see Access resources across projects based on packages.
Step 3: Back up job metadata
Create a persistent backup table and schedule a daily DataWorks node to populate it from TASKS_HISTORY.
3.1 Create the backup table
Log on to the MaxCompute client and run:
-- Replace <project_name> with your MaxCompute project name.
CREATE TABLE IF NOT EXISTS <project_name>.information_history
(
task_catalog STRING
,task_schema STRING
,task_name STRING
,task_type STRING -- Job type: SQL, SQLRT (query acceleration), or CUPID (Spark)
,inst_id STRING -- Instance ID, used to identify time-consuming jobs
,`status` STRING
,owner_id STRING
,owner_name STRING
,result STRING
,start_time DATETIME
,end_time DATETIME
,input_records BIGINT
,output_records BIGINT
,input_bytes BIGINT -- Input data size in bytes; used in cost calculation
,output_bytes BIGINT
,input_tables STRING
,output_tables STRING
,operation_text STRING
,signature STRING
,complexity DOUBLE -- Job complexity coefficient; multiplied with input_bytes to estimate cost
,cost_cpu DOUBLE -- CPU-hours consumed; used for Spark job cost calculation
,cost_mem DOUBLE
,settings STRING -- JSON blob containing scheduler metadata; parse SKYNET_ONDUTY to get the true RAM user
,ds STRING
);
3.2 Create the ODPS SQL node for daily backup
In DataWorks Data Studio, create an ODPS SQL node named information_history. See Create an ODPS SQL node for setup details.
Add the following SQL to the node:
-- Replace <project_name> with your MaxCompute project name.
use <project_name>;
insert into table <project_name>.information_history
select * from information_schema.tasks_history where ds ='${datetime1}';
${datetime1} is a DataWorks scheduling parameter. In the Scheduling Configuration panel on the right, go to Basic Properties and set Parameter to datetime1=${yyyymmdd}.
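With this configuration, ${datetime1} resolves to the node's data timestamp as an eight-digit date string, which matches the ds partition values in TASKS_HISTORY. The expected format can be sketched in Java (illustrative only; the helper name toDs is hypothetical):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DsDemo {
    /** Format a date the way the yyyymmdd scheduling expression resolves, e.g. 20240301. */
    static String toDs(LocalDate date) {
        return date.format(DateTimeFormatter.ofPattern("yyyyMMdd"));
    }

    public static void main(String[] args) {
        System.out.println(toDs(LocalDate.of(2024, 3, 1))); // 20240301
    }
}
```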
Save the node by clicking the save icon in the upper-left corner.
To aggregate metadata from multiple MaxCompute projects, create one ODPS SQL node per project, each inserting into the same information_history table.
Step 4: Create analysis jobs for top-N billing accounts and time-consuming jobs
The settings field in TASKS_HISTORY contains a JSON object with scheduler metadata including SKYNET_ONDUTY, which identifies the RAM user who submitted the job—not just the root account. The analysis jobs use GET_JSON_OBJECT(settings, "$.SKYNET_ONDUTY") to attribute costs and durations to the correct user.
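The fallback logic can be sketched in plain Java (a minimal illustration of the CASE expression the node runs, not production code; the helper name resolveTaskOwner is hypothetical, and JSON parsing is simplified to string matching for brevity):

```java
public class OnDutyDemo {
    /**
     * Return the SKYNET_ONDUTY value from the settings JSON blob,
     * or fall back to the recorded owner name when the key is absent.
     */
    static String resolveTaskOwner(String settings, String ownerName) {
        String key = "\"SKYNET_ONDUTY\":\"";
        int start = settings == null ? -1 : settings.indexOf(key);
        if (start < 0) {
            return ownerName; // no scheduler metadata: keep the recorded owner
        }
        int valueStart = start + key.length();
        int valueEnd = settings.indexOf('"', valueStart);
        return settings.substring(valueStart, valueEnd);
    }

    public static void main(String[] args) {
        String withDuty = "{\"SKYNET_ONDUTY\":\"ram_user_01\",\"SKYNET_ID\":\"123\"}";
        System.out.println(resolveTaskOwner(withDuty, "root_account")); // ram_user_01
        System.out.println(resolveTaskOwner("{}", "root_account"));     // root_account
    }
}
```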
4.1 Create the supporting tables
Log on to the MaxCompute client and run the following commands.
RAM user lookup table (user_ram): maps account IDs to display names.
CREATE TABLE IF NOT EXISTS <project_name>.user_ram
(
user_id STRING
,user_name STRING
);
Top-N billing accounts table (cost_topn):
CREATE TABLE IF NOT EXISTS <project_name>.cost_topn
(
cost_sum DECIMAL(38,5)
,task_owner STRING
)
PARTITIONED BY
(
ds STRING
);
Top-N time-consuming jobs table (time_topn):
CREATE TABLE IF NOT EXISTS <project_name>.time_topn
(
inst_id STRING
,cost_time BIGINT
,task_owner STRING
)
PARTITIONED BY
(
ds STRING
);
4.2 Create the analysis ODPS SQL node
In DataWorks Data Studio, create an ODPS SQL node named topn. Add the following SQL:
-- Enable data type 2.0. For more information, see Data type editions.
SET odps.sql.decimal.odps2=true;
-- Populate cost_topn.
-- Cost formulas by job type:
-- SQL / SQLRT: input_bytes (GB) x complexity x 0.3 (CNY)
-- CUPID (Spark, Terminated only): cost_cpu / 100 / 3600 x 0.66 (CNY)
-- Replace <user_id> with the account IDs you want to map to display names (from user_ram).
INSERT INTO TABLE <project_name>.cost_topn PARTITION (ds = '${datetime1}')
SELECT
NVL(cost_sum, 0) cost_sum
,CASE WHEN a.task_owner = '<user_id>' OR a.task_owner = '<user_id>' OR a.task_owner = '<user_id>'
THEN b.user_name
ELSE a.task_owner
END task_owner
FROM (
SELECT
inst_id
,owner_name
,task_type
,a.input_bytes
,a.cost_cpu
,a.status
,CASE WHEN a.task_type = 'SQL' THEN CAST(a.input_bytes/1024/1024/1024 * a.complexity * 0.3 AS DECIMAL(18,5))
WHEN a.task_type = 'SQLRT' THEN CAST(a.input_bytes/1024/1024/1024 * a.complexity * 0.3 AS DECIMAL(18,5))
WHEN a.task_type = 'CUPID' AND a.status = 'Terminated'
THEN CAST(a.cost_cpu/100/3600 * 0.66 AS DECIMAL(18,5))
ELSE 0
END cost_sum
,a.settings
,GET_JSON_OBJECT(settings, "$.SKYNET_ONDUTY") owner
,CASE WHEN GET_JSON_OBJECT(a.settings, "$.SKYNET_ONDUTY") IS NULL THEN owner_name
ELSE GET_JSON_OBJECT(a.settings, "$.SKYNET_ONDUTY")
END task_owner
FROM information_history
WHERE ds = '${datetime1}'
) a
LEFT JOIN <project_name>.user_ram b
ON a.task_owner = b.user_id;
-- Populate time_topn.
-- cost_time is measured in seconds using DATEDIFF(end_time, start_time, 'ss').
INSERT INTO TABLE <project_name>.time_topn PARTITION (ds = '${datetime1}')
SELECT
inst_id
,cost_time
,CASE WHEN a.task_owner = '<user_id>' OR a.task_owner = '<user_id>' OR a.task_owner = '<user_id>'
THEN b.user_name
ELSE a.task_owner
END task_owner
FROM (
SELECT
inst_id
,task_type
,status
,DATEDIFF(a.end_time, a.start_time, 'ss') AS cost_time
,CASE WHEN GET_JSON_OBJECT(a.settings, "$.SKYNET_ONDUTY") IS NULL THEN owner_name
ELSE GET_JSON_OBJECT(a.settings, "$.SKYNET_ONDUTY")
END task_owner
FROM <project_name>.information_history a
WHERE ds = '${datetime1}'
) a
LEFT JOIN <project_name>.user_ram b
ON a.task_owner = b.user_id;
In the example, task_type = 'SQL' represents an SQL job, task_type = 'SQLRT' represents a query acceleration job, and task_type = 'CUPID' represents a Spark job. To include other billable job types such as MapReduce or Mars, add CASE branches using their respective billing formulas. For more information, see Compute unit-based billing (pay-as-you-go).
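The per-job cost arithmetic used by the cost_topn query can be checked with a small Java helper (a sketch mirroring the SQL CASE expression, assuming the same 0.3 CNY/GB and 0.66 CNY rates; the method name estimateCost is hypothetical):

```java
public class CostDemo {
    /** Estimate a job's cost in CNY using the same formulas as the cost_topn SQL. */
    static double estimateCost(String taskType, String status,
                               long inputBytes, double complexity, double costCpu) {
        switch (taskType) {
            case "SQL":
            case "SQLRT":
                // input size in GB x complexity x 0.3 CNY
                return inputBytes / 1024.0 / 1024.0 / 1024.0 * complexity * 0.3;
            case "CUPID":
                // Spark jobs are counted only when Terminated; mirrors cost_cpu/100/3600 x 0.66
                return "Terminated".equals(status) ? costCpu / 100.0 / 3600.0 * 0.66 : 0.0;
            default:
                return 0.0; // other job types would need their own billing formulas
        }
    }

    public static void main(String[] args) {
        // A 10 GB SQL job with complexity 1.0 costs 10 x 1.0 x 0.3 = 3 CNY.
        System.out.println(estimateCost("SQL", "Terminated", 10L * 1024 * 1024 * 1024, 1.0, 0.0));
    }
}
```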
In the Scheduling Configuration panel, set Parameter to datetime1=${yyyymmdd}. Save the node.
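The cost_time column produced by the time_topn query counts whole elapsed seconds between start_time and end_time. An illustrative Java equivalent of DATEDIFF(end_time, start_time, 'ss') (the helper name costTimeSeconds is hypothetical):

```java
import java.time.Duration;
import java.time.LocalDateTime;

public class DurationDemo {
    /** Elapsed whole seconds between two timestamps, like DATEDIFF(end, start, 'ss'). */
    static long costTimeSeconds(LocalDateTime start, LocalDateTime end) {
        return Duration.between(start, end).getSeconds();
    }

    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(2024, 3, 1, 10, 0, 0);
        LocalDateTime end = LocalDateTime.of(2024, 3, 1, 10, 2, 30);
        System.out.println(costTimeSeconds(start, end)); // 150
    }
}
```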
Step 5: Push top-N results to DingTalk
5.1 Create a DingTalk group chatbot
The following steps use the DingTalk PC client as an example.
1. Open the target DingTalk group and click the settings icon in the upper-right corner.
2. On the Group Settings panel, click Smart Group Assistant.
3. On the Smart Group Assistant panel, click Add Robot.
4. In the Add Robot section of the Group Robot dialog box, click the add icon.
5. In the Group Robot dialog box, click Custom.
6. In the Robot Details dialog box, click Add.
7. In the Add Robot dialog box, configure the robot:

   | Property | Description |
   |---|---|
   | Profile picture | Click the icon in the lower-right corner of the profile picture to edit it. |
   | Robot Name | Enter a name for the robot. |
   | Security Settings | Configure at least one security setting. Three options are available: Custom Keywords (up to 10 keywords), Add Signature (select to get the robot's signing key), and IP Address (Range) (restricts requests to a specified IP range). |

   Select I have read and agree to the "Custom Robot Service and Disclaimer", then click Finish.
8. Copy the generated Webhook URL, then click Finish.
Important: Keep the Webhook URL confidential. Exposing it publicly allows anyone to post messages to your group.
5.2 Build the notification program
Use IntelliJ IDEA to create a Maven project that queries cost_topn and time_topn and sends the results to DingTalk via the Webhook URL. After compilation, the build produces a JAR file named topn_new.jar.
Configure POM dependencies:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>DingTalk_Information</groupId>
<artifactId>DingTalk_Information</artifactId>
<version>1.0-SNAPSHOT</version>
<dependencies>
<dependency>
<groupId>com.aliyun.odps</groupId>
<artifactId>odps-sdk-core</artifactId>
<version>0.35.5-public</version>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.15</version>
<exclusions>
<exclusion>
<groupId>com.sun.jmx</groupId>
<artifactId>jmxri</artifactId>
</exclusion>
<exclusion>
<groupId>com.sun.jdmk</groupId>
<artifactId>jmxtools</artifactId>
</exclusion>
<exclusion>
<groupId>javax.jms</groupId>
<artifactId>jms</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.aliyun</groupId>
<artifactId>alibaba-dingtalk-service-sdk</artifactId>
<version>1.0.1</version>
</dependency>
<dependency>
<groupId>com.aliyun.odps</groupId>
<artifactId>odps-jdbc</artifactId>
<version>3.0.1</version>
<classifier>jar-with-dependencies</classifier>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.4.1</version>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
<archive>
<manifest>
<mainClass>com.alibaba.sgri.message.test</mainClass>
</manifest>
</archive>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
Java program example:
package com.alibaba.sgri.message;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import com.aliyun.odps.Instance;
import com.aliyun.odps.Odps;
import com.aliyun.odps.OdpsException;
import com.aliyun.odps.account.Account;
import com.aliyun.odps.account.AliyunAccount;
import com.aliyun.odps.data.ResultSet;
import com.aliyun.odps.task.SQLTask;
import com.dingtalk.api.DefaultDingTalkClient;
import com.dingtalk.api.DingTalkClient;
import com.dingtalk.api.request.OapiRobotSendRequest;
import com.dingtalk.api.response.OapiRobotSendResponse;
import com.taobao.api.ApiException;
public class test {
public static void main(String[] args) throws ApiException {
if (args.length < 1) {
System.out.println("Please enter the date parameter.");
System.exit(1); // Exit with a non-zero status when the required date argument is missing.
}
System.out.println("Start reading data.");
DingTalkClient client = new DefaultDingTalkClient(
"<Your chatbot's Webhook URL>");
OapiRobotSendRequest request = new OapiRobotSendRequest();
request.setMsgtype("markdown");
OapiRobotSendRequest.Markdown markdown = new OapiRobotSendRequest.Markdown();
// The date here is used as a parameter.
markdown.setText(getContent(args[0]));
markdown.setTitle("Top N jobs by consumption");
request.setMarkdown(markdown);
OapiRobotSendResponse response = client.execute(request);
System.out.println("Message sent successfully.");
}
/**
* Read from ODPS to get the data to send.
*/
public static String getContent(String day) {
Odps odps = createOdps();
StringBuilder sb = new StringBuilder();
try {
//==================Billing accounts=====================
String costTopnSql = "select sum(cost_sum)cost_sum,task_owner from cost_topn where ds='" + day + "' " + "group by task_owner order by cost_sum desc limit 5;";
Instance costInstance = SQLTask.run(odps, costTopnSql);
costInstance.waitForSuccess();
ResultSet costTopnRecords = SQLTask.getResultSet(costInstance);
sb.append("<font color=#FF0000 size=4>").append("Top N Billing Accounts (").append(day).append(
")[Calculated based on Alibaba Cloud pay-as-you-go billing]").append("</font>").append("\n\n");
AtomicInteger costIndex = new AtomicInteger(1);
costTopnRecords.forEach(item -> {
sb.append(costIndex.getAndIncrement()).append(".").append("Account:");
sb.append("<font color=#2E64FE>").append(item.getString("task_owner")).append("\n\n").append("</font>");
sb.append(" ").append(" ").append("Cost:").append("<font color=#2E64FE>").append(item.get("cost_sum"))
.append(" CNY").append(
"</font>").append("\n\n")
.append("</font>");
});
//==================Time-consuming jobs=====================
String timeTopnSql = "select * from time_topn where ds='" + day + "' ORDER BY cost_time DESC limit 5;";
Instance timeInstance = SQLTask.run(odps, timeTopnSql);
timeInstance.waitForSuccess();
ResultSet timeTopnRecords = SQLTask.getResultSet(timeInstance);
sb.append("<font color=#FF8C00 size=4>").append("Top N Time-consuming Jobs (").append(day).append(")")
.append("\n\n").append("</font>");
AtomicInteger timeIndex = new AtomicInteger(1);
timeTopnRecords.forEach(item -> {
sb.append(timeIndex.getAndIncrement()).append(".").append("Job:");
sb.append("<font color=#2E64FE>").append(item.getString("inst_id")).append("\n\n").append("</font>");
sb.append(" ").append("Account:").append("<font color=#2E64FE>").append(item.getString("task_owner")).append("\n\n").append("</font>");
sb.append(" ").append("Duration:").append("<font color=#2E64FE>").append(item.get("cost_time"))
.append("s").append(
"</font>").append("\n\n");
});
} catch (OdpsException | IOException e) {
e.printStackTrace();
}
return sb.toString();
}
/**
* Create an ODPS object.
*/
public static Odps createOdps() {
String project = "<project_name>";
// Store AccessKey credentials in environment variables rather than hardcoding them.
// An Alibaba Cloud account AccessKey has permissions to access all APIs.
// For least-privilege access, use a RAM user's AccessKey instead.
String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID");
String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET");
String endPoint = "http://service.odps.aliyun.com/api";
Account account = new AliyunAccount(accessId, accessKey);
Odps odps = new Odps(account);
odps.setEndpoint(endPoint);
odps.setDefaultProject(project);
return odps;
}
}
For more information about using IntelliJ IDEA, see the Help menu in the IDE.
5.3 Upload the JAR and create the Shell node
1. Upload topn_new.jar as a MaxCompute resource. See Create and use MaxCompute resources.
2. In DataWorks Data Studio, create a Shell node named dingsend. See Shell node for setup details. Add the following command to the node:

   java -jar topn_new.jar $1

   In the Scheduling Configuration panel, set Parameter to ${yyyymmdd}. Save the node.
Step 6: Configure node dependencies and run
On the business flow canvas, connect the three nodes in sequence: information_history → topn → dingsend. This ensures each stage runs only after the previous one completes.
Configure the rerun properties and ancestor node dependencies for each node. When the configuration is complete, right-click a node and select Run Node.
For dependency configuration details, see Configure same-cycle scheduling dependencies and Configure node context.