MaxCompute: TPC-DS data

Last Updated: Mar 05, 2026

MaxCompute provides public TPC-DS datasets in four sizes (10 GB, 100 GB, 1 TB, and 10 TB) for product testing. This topic describes the datasets and how to query them.

Dataset and tables

TPC-DS is a standard benchmark from the Transaction Processing Performance Council (TPC) for evaluating data management systems. MaxCompute generates the datasets with the official TPC-DS tool and stores each dataset size in its own schema in the BIGDATA_PUBLIC_DATASET project.

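+--------------+------------------------+-------------+
| Dataset size | Project name           | Schema name |
+--------------+------------------------+-------------+
| 10 GB        | BIGDATA_PUBLIC_DATASET | TPCDS_10G   |
| 100 GB       | BIGDATA_PUBLIC_DATASET | TPCDS_100G  |
| 1 TB         | BIGDATA_PUBLIC_DATASET | TPCDS_1T    |
| 10 TB        | BIGDATA_PUBLIC_DATASET | TPCDS_10T   |
+--------------+------------------------+-------------+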

Each schema contains the following tables:

call_center, catalog_page, catalog_returns, catalog_sales, customer, customer_address, customer_demographics, date_dim, household_demographics, income_band, inventory, item, promotion, reason, ship_mode, store, store_returns, store_sales, tab_reducenum, tab_reducenum_100, time_dim, warehouse, web_page, web_returns, web_sales, web_site

Note

For details about table schemas and content, see TPC Benchmark DS (TPC-DS) v2.5.0.

Available regions

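+------------------------------------------------------+-----------------------+
| Region                                               | Region ID             |
+------------------------------------------------------+-----------------------+
| China (Hangzhou)                                     | cn-hangzhou           |
| China (Shanghai)                                     | cn-shanghai           |
| China (Beijing)                                      | cn-beijing            |
| China (Zhangjiakou)                                  | cn-zhangjiakou        |
| China (Ulanqab)                                      | cn-wulanchabu         |
| China (Shenzhen)                                     | cn-shenzhen           |
| China (Chengdu)                                      | cn-chengdu            |
| China (Hong Kong)                                    | cn-hongkong           |
| Singapore                                            | ap-southeast-1        |
| Japan (Tokyo)                                        | ap-northeast-1        |
| Malaysia (Kuala Lumpur)                              | ap-southeast-3        |
| Indonesia (Jakarta)                                  | ap-southeast-5        |
| US (Silicon Valley)                                  | us-west-1             |
| US (Virginia)                                        | us-east-1             |
| UK (London)                                          | eu-west-1             |
| Germany (Frankfurt)                                  | eu-central-1          |
| UAE (Dubai)                                          | me-east-1             |
| China (Shanghai) Finance Cloud                       | cn-shanghai-finance-1 |
| China (Beijing) Finance Cloud (Invitational Preview) | cn-beijing-finance-1  |
| South China 1 Finance Cloud                          | cn-shenzhen-finance-1 |
| China (Beijing) Alibaba Gov Cloud 1                  | cn-north-2-gov-1      |
+------------------------------------------------------+-----------------------+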

Prerequisites

Before you begin, make sure that you have:

Query the data

Because you are not a member of the BIGDATA_PUBLIC_DATASET project, you query the TPC-DS tables through cross-project access. Specify the fully qualified table name in project.schema.table format, for example bigdata_public_dataset.tpcds_10g.store_sales.

Supported tools

Note

The Data Map feature in DataWorks cannot discover tables in this public dataset because the data requires cross-project access.

Required session flags

The TPC-DS dataset uses schemas for storage and data types such as DECIMAL and INT. Set the following flags before running queries:

-- Enable session-level schema syntax.
SET odps.namespace.schema=true;

-- Enable data type compatibility.
SET odps.sql.hive.compatible=true;
SET odps.sql.type.system.odps2=true;
SET odps.sql.decimal.odps2=true;

-- Allow ORDER BY without LIMIT clause.
-- New projects use this setting by default. Existing projects may need it set explicitly
-- to avoid errors or suboptimal join order for the Q72 query.
SET odps.sql.validate.orderby.limit=false;

-- Allow Cartesian products (required for Q77).
SET odps.sql.allow.cartesian=true;

Note

If tenant-level schema syntax is not enabled, the public dataset does not appear in DataWorks Data Analysis. SQL queries still work.
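
As a sketch of how these flags combine with fully qualified table names, the following query totals store sales per calendar year in the 10 GB dataset. The column names d_date_sk and d_year come from the TPC-DS specification for date_dim and are assumed to be unchanged in this dataset. The query is illustrative and is not taken from the sample query files. It scans the whole store_sales table, so it incurs computing charges.

```sql
-- Session flags from this section: schema syntax and data type compatibility.
SET odps.namespace.schema=true;
SET odps.sql.hive.compatible=true;
SET odps.sql.type.system.odps2=true;
SET odps.sql.decimal.odps2=true;

-- Illustrative aggregation: total extended sales price per calendar year.
-- d_date_sk and d_year are assumed from the TPC-DS specification.
SELECT d.d_year,
       SUM(ss.ss_ext_sales_price) AS total_sales
FROM bigdata_public_dataset.tpcds_10g.store_sales ss
JOIN bigdata_public_dataset.tpcds_10g.date_dim d
  ON ss.ss_sold_date_sk = d.d_date_sk
GROUP BY d.d_year
ORDER BY d.d_year
LIMIT 100;
```

Because the ORDER BY clause is paired with LIMIT, this query should not depend on the odps.sql.validate.orderby.limit setting.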

Query example

The following query retrieves 100 rows from the store_sales table in the 10 GB dataset. To query other datasets, replace the schema name (for example, tpcds_100g).

-- Enable session-level schema syntax.
SET odps.namespace.schema=true;

-- Query the tpcds_10g dataset. Replace the schema name to query other datasets.
SELECT * FROM bigdata_public_dataset.tpcds_10g.store_sales LIMIT 100;

Sample output:

+-----------------+-----------------+------------+----------------+-------------+-------------+------------+-------------+-------------+------------------+-------------+-------------------+---------------+----------------+---------------------+--------------------+-----------------------+-------------------+------------+---------------+-------------+---------------------+---------------+
| ss_sold_date_sk | ss_sold_time_sk | ss_item_sk | ss_customer_sk | ss_cdemo_sk | ss_hdemo_sk | ss_addr_sk | ss_store_sk | ss_promo_sk | ss_ticket_number | ss_quantity | ss_wholesale_cost | ss_list_price | ss_sales_price | ss_ext_discount_amt | ss_ext_sales_price | ss_ext_wholesale_cost | ss_ext_list_price | ss_ext_tax | ss_coupon_amt | ss_net_paid | ss_net_paid_inc_tax | ss_net_profit |
+-----------------+-----------------+------------+----------------+-------------+-------------+------------+-------------+-------------+------------------+-------------+-------------------+---------------+----------------+---------------------+--------------------+-----------------------+-------------------+------------+---------------+-------------+---------------------+---------------+
| NULL            | NULL            | 39073      | NULL           | 1420876     | 1738        | 56600      | NULL        | NULL        | 41171            | 90          | 53.3              | NULL          | 72.87          | 0                   | NULL               | 4797                  | 7626.6            | 459.08     | 0             | NULL        | NULL                | NULL          |
| NULL            | NULL            | 22434      | 98163          | NULL        | NULL        | NULL       | 1           | NULL        | 8909             | NULL        | 15.22             | NULL          | 9.2            | NULL                | 690                | NULL                  | 1380.75           | NULL       | NULL          | NULL        | NULL                | -451.5        |
| NULL            | NULL            | 82219      | NULL           | NULL        | 1572        | 209531     | 38          | 285         | 14907            | 48          | 84.64             | 132.03        | NULL           | 0                   | NULL               | NULL                  | NULL              | 51.96      | 0             | NULL        | 2650.2              | -1464.48      |
| NULL            | NULL            | 97573      | 214533         | 1298744     | NULL        | NULL       | NULL        | 77          | 26167            | NULL        | 92.55             | 143.45        | 91.8           | 0                   | 8353.8             | NULL                  | NULL              | NULL       | 0             | NULL        | NULL                | -68.25        |
| NULL            | NULL            | 60120      | 376494         | NULL        | 1678        | 13917      | NULL        | NULL        | 35953            | 9           | 46.97             | NULL          | NULL           | NULL                | NULL               | NULL                  | 714.33            | NULL       | NULL          | NULL        | NULL                | 34.38         |
+-----------------+-----------------+------------+----------------+-------------+-------------+------------+-------------+-------------+------------------+-------------+-------------------+---------------+----------------+---------------------+--------------------+-----------------------+-------------------+------------+---------------+-------------+---------------------+---------------+

... (100 rows returned)
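
The sample output above is wide and contains many NULL values. A narrower variant, shown here as an illustrative sketch, selects a few of the columns listed in the output header and skips rows where the sale date or quantity is missing. The choice of columns and filters is an assumption made for readability, not a recommendation from MaxCompute. Depending on your project settings, the data type flags from the Required session flags section may also be needed.

```sql
-- Enable session-level schema syntax.
SET odps.namespace.schema=true;

-- Illustrative projection: keep five columns and drop rows
-- whose sale date key or quantity is NULL.
SELECT ss_sold_date_sk,
       ss_item_sk,
       ss_quantity,
       ss_sales_price,
       ss_net_profit
FROM bigdata_public_dataset.tpcds_10g.store_sales
WHERE ss_sold_date_sk IS NOT NULL
  AND ss_quantity IS NOT NULL
LIMIT 100;
```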

Sample query files

MaxCompute provides sample query files for each dataset size. Each file contains 99 queries that vary in complexity and data scan volume.

Note

Select queries carefully to avoid high computing costs, especially for larger datasets.

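+--------------+--------------------------------+
| Dataset size | Query file                     |
+--------------+--------------------------------+
| 10 GB        | MaxCompute-TPCDS_10G-99-query  |
| 100 GB       | MaxCompute-TPCDS_100G-99-query |
| 1 TB         | MaxCompute-TPCDS_1T-99-query   |
| 10 TB        | MaxCompute-TPCDS_10T-99-query  |
+--------------+--------------------------------+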

To generate other variants of these queries, use the query generation tools in the TPC-DS benchmark suite. For more information, see the official TPC-DS documentation.

Billing

Storage of this public dataset is free. However, running queries incurs computing charges. For more information, see Pay-as-you-go computing pricing.
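
Before you run an expensive query, it may help to estimate it first. The sketch below uses the MaxCompute COST SQL command, which reports an estimated input size and complexity without executing the statement. Whether your tool supports COST SQL, and how closely the estimate matches the final charge for this public dataset, are assumptions to verify in your environment. The aggregation is only an example.

```sql
-- Enable session-level schema syntax.
SET odps.namespace.schema=true;

-- Estimate the cost of an illustrative aggregation without running it.
COST SQL
SELECT ss_item_sk,
       SUM(ss_net_profit) AS total_profit
FROM bigdata_public_dataset.tpcds_10g.store_sales
GROUP BY ss_item_sk;
```

Trying a query on the 10 GB schema first, and switching to a larger schema only after the estimate looks acceptable, keeps early experiments cheap.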

Disclaimer

  • The TPC-DS data generation and analysis are based on the TPC-DS benchmark. Results cannot be compared with any officially published TPC-DS benchmark results because the test environment does not meet all TPC-DS benchmark requirements.

  • This TPC-DS dataset is for product testing and evaluation only. The data is not updated regularly and must not be used in a production environment.

  • The TPC-DS data originates from TPC. You can also generate TPC-DS data independently. For more information, see the official TPC-DS documentation.