
MaxCompute:Troubleshoot sudden spikes in MaxCompute pay-as-you-go costs

Last Updated: Mar 26, 2026

When your daily MaxCompute spend suddenly doubles, this guide walks you through pinpointing which project and job caused the spike — and why.

The investigation follows a three-stage funnel:

  1. Locate — use the Fee Hub to identify the date, project, and billing item with abnormal spend.

  2. Diagnose — query job history in Information Schema to find the specific job or operation responsible.

  3. Resolve — optimize or reschedule the job to bring costs back in line.

Prerequisites

Before you begin, ensure that you have:

  • A MaxCompute project with pay-as-you-go billing enabled

  • Project owner access, or a Resource Access Management (RAM) user with the Super_Administrator role (required to access project metadata)

  • The Information Schema service enabled (see Step 1)

Step 1: Enable Information Schema

Starting March 1, 2024, MaxCompute no longer automatically installs the project-level Information Schema package for new projects. For new projects, use tenant-level Information Schema instead — it covers all your projects without per-project installation.

For existing projects, install the project-level Information Schema package in each project.

To analyze metadata across multiple projects, install the package for each project and insert backup metadata from each project into a shared table for centralized analysis. For feature details and limits, see Information Schema overview.

(Optional) Step 2: Grant access to other users

By default, only the project owner can view Information Schema data. To grant access to other users or roles, run the following in the MaxCompute client:

grant <actions> on package Information_Schema.systables to user <user_name>;
grant <actions> on package Information_Schema.systables to role <role_name>;
Parameter descriptions:

  • actions: Must be Read.

  • user_name: An Alibaba Cloud account or RAM user added to the project. Run list users; in the MaxCompute client to list usernames.

  • role_name: A role added to the project. Run list roles; in the MaxCompute client to list role names.

Example:

grant read on package Information_Schema.systables to user RAM$Bob@aliyun.com:user01;

For more information, see Access resources across projects based on packages.

(Optional) Step 3: Back up metadata

MaxCompute retains job history for the last 14 days. Skip this step if you only need to investigate costs within that window.

If you need to query data older than 14 days, set up a scheduled backup job to preserve metadata.

  1. Log on to the MaxCompute client and create the backup tables:

    -- Replace <project_name> with your MaxCompute project name.
    -- Create the tasks_history backup table.
    CREATE TABLE IF NOT EXISTS <project_name>.tasks_history
    (
        task_catalog STRING
        ,task_schema STRING
        ,task_name STRING
        ,task_type STRING
        ,inst_id STRING
        ,status STRING
        ,owner_id STRING
        ,owner_name STRING
        ,result STRING
        ,start_time DATETIME
        ,end_time DATETIME
        ,input_records BIGINT
        ,output_records BIGINT
        ,input_bytes BIGINT
        ,output_bytes BIGINT
        ,input_tables STRING
        ,output_tables STRING
        ,operation_text STRING
        ,signature STRING
        ,complexity DOUBLE
        ,cost_cpu DOUBLE
        ,cost_mem DOUBLE
        ,settings STRING
        ,ds STRING
    );
    -- Create the tunnels_history backup table.
    CREATE TABLE IF NOT EXISTS <project_name>.tunnels_history
    (
    tunnel_catalog  STRING
    ,tunnel_schema   STRING
    ,session_id      STRING
    ,operate_type    STRING
    ,tunnel_type     STRING
    ,request_id      STRING
    ,object_name     STRING
    ,partition_spec  STRING
    ,data_size       BIGINT
    ,block_id        BIGINT
    ,offset          BIGINT
    ,length          BIGINT
    ,owner_id        STRING
    ,owner_name      STRING
    ,start_time      DATETIME
    ,end_time        DATETIME
    ,client_ip       STRING
    ,user_agent      STRING
    ,object_type     STRING
    ,columns         STRING
    ,ds  STRING
    );
  2. In DataWorks Data Studio, create an ODPS SQL node named history_backup with the following SQL as its content. Configure timed scheduling to write data to the backup tables daily, then click the Save icon in the upper-left corner to save the node.

    To back up metadata from multiple projects, create one ODPS SQL node per project and write all output to the same shared backup table.

    -- Replace <project_name> with your MaxCompute project name.
    USE <project_name>;
    -- Back up tasks_history.
    INSERT INTO TABLE <project_name>.tasks_history
    SELECT task_catalog,task_schema,task_name,task_type,inst_id,`status`,owner_id,owner_name,result
      ,start_time,end_time,input_records,output_records,input_bytes,output_bytes
      ,input_tables,output_tables,operation_text,signature,complexity,cost_cpu,cost_mem,settings,ds
      FROM information_schema.tasks_history WHERE ds ='${datetime1}';
    -- Back up tunnels_history.
    INSERT INTO TABLE <project_name>.tunnels_history
    SELECT tunnel_catalog,tunnel_schema,session_id,operate_type,tunnel_type,request_id,object_name
      ,partition_spec,data_size,block_id,offset,length,owner_id,owner_name,start_time,end_time
      ,client_ip,user_agent,object_type,columns,ds
      FROM information_schema.tunnels_history WHERE ds ='${datetime1}';

    The ${datetime1} variable is a DataWorks scheduling parameter. In the ODPS SQL node, click Scheduling Configuration on the right. Under Basic Properties, set Parameters to datetime1=${yyyymmdd}.
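To illustrate what the node receives at run time, the following Python sketch mimics the substitution. The resolution rule shown (data timestamp = run date minus one day for daily schedules) is an assumption based on typical DataWorks scheduling behavior, not something this guide specifies:

```python
from datetime import date, timedelta

def resolve_datetime1(run_date: date) -> str:
    # Assumption for illustration: for a daily instance scheduled on
    # run_date, ${yyyymmdd} resolves to the data timestamp (the day
    # before the run) formatted as yyyymmdd.
    return (run_date - timedelta(days=1)).strftime("%Y%m%d")

# An instance running on 2024-03-02 would write the 20240301 partition:
print(resolve_datetime1(date(2024, 3, 2)))  # 20240301
```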

Step 4: Identify the high-cost project and billing item

Log on to the Fee Hub and use the following methods to narrow down where the cost spike occurred. For more information, see View billing details.

Find the project with abnormal spend:

In the navigation pane, choose Billing > Billing Details. Set Statistic Item to Instance and Statistical Period to Day. Scan the list to find the project (instance ID) where costs clearly exceeded expectations.

Find the specific billing item within a project:

On the same page, search by Resource Instance Name/ID to filter results to a single project, then identify which billing item drove the increase.

Spot rising costs by category:

In the navigation pane, choose Cost Analysis. Use this view to compare billing items across time periods and surface categories with growing spend.

Step 5: Identify the root cause

Once you know which project and billing item spiked in Step 4, use the queries below to find the specific job or operation responsible.

SQL computing costs

SQL job costs are determined by: Input data volume × SQL complexity × Unit price (USD 0.0438 per GB).

A sudden increase is typically caused by a single abnormal job, a scheduling misconfiguration, or duplicate jobs running repeatedly.
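As a quick sanity check, the cost formula can be reproduced outside MaxCompute. A minimal Python sketch, where the job's input size and complexity are hypothetical sample values, not figures from this guide:

```python
def sql_job_cost_usd(input_bytes: int, complexity: float) -> float:
    """Estimated SQL job cost: input GB x complexity x USD 0.0438 per GB."""
    input_gb = input_bytes / 1024 / 1024 / 1024
    return input_gb * complexity * 0.0438

# Hypothetical job: 100 GB of input at complexity 1.5.
cost = sql_job_cost_usd(100 * 1024**3, 1.5)
print(round(cost, 4))  # 6.57
```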

Which SQL jobs consumed the most in a given day?

Log on to the MaxCompute client and switch to the high-cost project using the use command. Query TASKS_HISTORY to rank jobs by estimated cost:

-- Enable the ODPS 2.0 data type.
SET odps.sql.decimal.odps2=true;
SELECT
    inst_id                                                                   -- instance ID
    ,input_bytes                                                              -- input data volume
    ,complexity
    ,CAST(input_bytes/1024/1024/1024 * complexity * 0.0438 AS DECIMAL(18,5)) cost_sum
    ,GET_JSON_OBJECT(settings, "$.SKYNET_ID") SKYNET_ID                      -- DataWorks scheduling job ID
FROM information_schema.tasks_history
-- For data older than 14 days, query the backup table: <project_name>.tasks_history
WHERE (task_type = 'SQL' OR task_type = 'SQLRT')
AND ds = '<date partition to query>'
ORDER BY cost_sum DESC
LIMIT 10000;
SQL computing cost for the day = Input data volume × SQL complexity × Unit price (USD 0.0438 per GB).

  • task_type = 'SQL': a SQL job.

  • task_type = 'SQLRT': a SQL query acceleration job.

Who triggered the expensive jobs?

Check the SKYNET_ID field in the results:

  • If a SKYNET_ID is present, the job was triggered by a DataWorks scheduling node. Check that node's execution history in DataWorks.

  • If SKYNET_ID is empty, the job was triggered outside DataWorks. Use the inst_id to retrieve the full SQL text:

    SELECT operation_text
      FROM information_schema.tasks_history
     WHERE ds = '<date partition of job execution>' AND inst_id = '<inst_id>';

Which SQL jobs ran repeatedly?

Use the signature field (a hash of the SQL text) to detect repeated execution of the same query:

-- Identify job growth trends by signature.
SELECT  signature
        ,ds
        ,COUNT(*) AS tasknum
FROM    information_schema.tasks_history
-- For data older than 14 days, query the backup table: <project_name>.tasks_history
WHERE (task_type = 'SQL' OR task_type = 'SQLRT')
AND ds >= '<date partition to query>'
GROUP BY ds
         ,signature
ORDER BY tasknum DESC
LIMIT   10000;

-- After identifying an abnormal signature, check its recent execution history.
SELECT  *
FROM    information_schema.tasks_history
-- For data older than 14 days, query the backup table: <project_name>.tasks_history
WHERE   signature = '<abnormal signature>'
AND ds >= '<date partition to query>';

Spark computing costs

Spark job costs are determined by: Total billable CPU hours × Unit price (USD 0.1041 per hour per task).

Which Spark jobs consumed the most?

Log on to the MaxCompute client and switch to the high-cost project. Query TASKS_HISTORY to rank Spark jobs by cost:

-- Enable the ODPS 2.0 data type.
SET odps.sql.decimal.odps2=true;
SELECT
    inst_id                                                           -- instance ID
    ,cost_cpu                                                         -- CPU consumption (100 = 1 core x second)
    ,CAST(cost_cpu/100/3600 * 0.1041 AS DECIMAL(18,5)) cost_sum
FROM information_schema.tasks_history
-- For data older than 14 days, query the backup table: <project_name>.tasks_history
WHERE task_type = 'CUPID' AND status = 'Terminated'
AND ds = '<date partition to query>'
ORDER BY cost_sum DESC
LIMIT 10000;
Spark job computing cost for the day = Total billable CPU hours × Unit price (USD 0.1041 per hour per task).

The cost_cpu field uses a unit where 100 = 1 core × 1 second. For example, 10 cores running for 5 seconds gives cost_cpu = 10 × 100 × 5 = 5000.
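The cost_cpu-to-cost conversion in the query can be checked by hand. A Python sketch using the 10-cores-for-5-seconds example from above:

```python
def spark_job_cost_usd(cost_cpu: float) -> float:
    """Convert cost_cpu (100 = 1 core-second) to core-hours, then price
    at USD 0.1041 per hour, mirroring the query's CAST expression."""
    core_seconds = cost_cpu / 100
    core_hours = core_seconds / 3600
    return core_hours * 0.1041

# 10 cores for 5 seconds -> cost_cpu = 10 * 100 * 5 = 5000
print(round(spark_job_cost_usd(5000), 6))  # 0.001446
```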

What did an abnormal Spark job execute?

SELECT operation_text
  FROM information_schema.tasks_history
 WHERE ds = '<date partition of job execution>' AND inst_id = '<inst_id>';

MapReduce computing costs

MapReduce job costs follow the same CPU-hour formula as Spark.

Which MapReduce jobs consumed the most?

Log on to the MaxCompute client and switch to the high-cost project. Query TASKS_HISTORY:

-- Enable the ODPS 2.0 data type.
SET odps.sql.decimal.odps2=true;
SELECT
    inst_id                                                           -- instance ID
    ,cost_cpu                                                         -- CPU consumption (100 = 1 core x second)
    ,CAST(cost_cpu/100/3600 * 0.1041 AS DECIMAL(18,5)) cost_sum
FROM information_schema.tasks_history
-- For data older than 14 days, query the backup table: <project_name>.tasks_history
WHERE task_type = 'LOT' AND status = 'Terminated'
AND ds = '<date partition to query>'
ORDER BY cost_sum DESC
LIMIT 10000;

What did an abnormal MapReduce job execute?

SELECT operation_text
  FROM information_schema.tasks_history
 WHERE ds = '<date partition of job execution>' AND inst_id = '<inst_id>';

Storage fees

Use Usage Details to investigate storage fee changes:

  1. Download usage details — see Download usage details.

  2. Upload usage details to MaxCompute — see Upload usage details data to MaxCompute.

  3. Analyze with SQL — see Analyze MaxCompute billing usage details.

Public network downstream traffic fees

Download fees are charged at USD 0.1166 per GB of downloaded data.
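The per-session fee is straightforward to verify against the data_size values returned below. A Python sketch; the 5 GB session size is a hypothetical example:

```python
def download_fee_usd(data_size_bytes: int) -> float:
    """Downloaded GB x USD 0.1166 per GB."""
    gb = data_size_bytes / 1024 / 1024 / 1024
    return gb * 0.1166

# Hypothetical session that downloaded 5 GB of data:
print(round(download_fee_usd(5 * 1024**3), 4))  # 0.583
```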

Which time periods and download types had the highest fees?

Log on to the MaxCompute client and switch to the high-cost project. Query TUNNELS_HISTORY:

-- Enable the ODPS 2.0 data type.
SET odps.sql.decimal.odps2=true;
SELECT  ds
        ,operate_type
        ,SUM(CAST(data_size / 1024 / 1024 / 1024 * 0.1166 AS DECIMAL(18,5))) download_fee
FROM    information_schema.tunnels_history
-- Parenthesize the OR so the date filter applies to both download types.
WHERE   (operate_type = 'DOWNLOADLOG'
OR      operate_type = 'DOWNLOADINSTANCELOG')
AND     ds >= '<date partition to query>'
GROUP BY ds
         ,operate_type
ORDER BY download_fee DESC;
Download fee = Downloaded data volume × Unit price (USD 0.1166 per GB).

Run the tunnel show history command to view per-request download details. For the full command reference, see Tunnel commands.

More information

For more articles about cost optimization, see Cost optimization overview.