All Products
Search
Document Center

MaxCompute:Public Dimension Layer (DIM)

Last Updated:Mar 31, 2026

The dimension data (DIM) layer stores conformed dimension tables — wide tables that represent the people, places, and things your business measures (such as products, locations, and customers). It is built on the concepts of dimensional modeling.

Dimensions must be conformed across your enterprise: the same dimension defined in one business process carries the same meaning in every other. For example, a province code dimension delivers the same information regardless of which business process queries it.

Define dimensions

When you build a bus matrix and divide data domains, define dimensions by analyzing your business processes.

This tutorial uses the e-commerce marketing business unit of Company A. In the transaction data domain, it focuses on the receipt confirmation business process (where a transaction is successful). Each product is assumed to be purchased and received in the same place.

From the product perspective, the following dimensions apply:

  • Product ID

  • Product name

  • Product price

  • Product status: 0: new; 1: idle; 2: second-hand

  • Product category ID

  • Product category name

  • Commodity ID

  • Commodity name

  • Product sales status: 0: normal; 1: deleted; 2: taken off the shelf; 3: never put on the shelf

  • City of the product

  • Province of the product

From the place of receipt perspective, the following dimensions apply:

  • City code

  • City name

  • Province code

  • Province name

Design dimension tables

After defining dimensions, supplement data to generate dimension tables. Apply these constraints to every dimension table:

Constraint Value Reason
Maximum table size 10 million rows Larger tables degrade join performance
Join method MAPJOIN Optimized for dimension-to-fact joins in MaxCompute
Update frequency Infrequent Frequent updates increase pipeline complexity

Choose a dimension table structure

When designing a dimension table, evaluate the four factors below to decide how to structure and split it:

Factor When it applies Recommended approach
Data stability Dimension data rarely expires but may update at any time (e.g., member records) Create a partition to store full snapshots. If some records never change, separate them into a historical table (partitioned by expiration time) and a daily table (currently valid records).
Vertical splitting The table has many unused attribute columns, or too many attributes slow queries Split into multiple dimension tables by attribute group
Horizontal splitting Some rows are structurally distinct from the rest Split into separate tables or add multi-level partitions
Query time limits Core dimension tables have strict query time SLAs Keep the table lean; enforce row count and column count limits

Design steps

Follow these four steps to design a dimension table:

  1. Define and conform dimensions. Establish dimensions and make sure they are conformed — consistent and reusable across all business processes in your enterprise.

  2. Identify the primary dimension table. The primary dimension table is an ODS table that synchronizes directly from a source business system. For example, the s_auction table records product information synchronized from the front-end product center system. This is the primary dimension table for the product dimension in this tutorial.

  3. Identify related dimension tables. The data warehouse aggregates data from multiple source systems. Analyze the business logic to find which tables associate with your primary dimension table. For the product dimension, related dimensions include category, seller, and shop.

  4. Specify dimension attributes in two phases:

    • Phase 1: Select or derive dimension attributes from the primary dimension table.

    • Phase 2: Select or derive dimension attributes from related dimension tables — for example, from the category, seller, and shop dimension tables.

    When specifying attributes, apply these principles:

    • Generate as many meaningful dimension attributes as possible.

    • Write descriptive labels that make attribute values self-explanatory.

    • Distinguish numeric attributes (e.g., price) from facts (e.g., revenue). Numeric attributes describe the dimension; facts belong in fact tables.

    • Identify common attributes shared across multiple dimensions and promote them to conformed dimensions.

Follow DIM layer naming conventions

Name dimension tables using this pattern:

dim_{Abbreviated business unit|pub}_{Abbreviated dimension}[_{Abbreviated custom tag}]

Use pub when a dimension is not specific to any business unit and is shared across the enterprise (e.g., the time dimension or regional information).

The following table maps business concepts to their DIM layer table names:

Business concept Dimension type Table name
Regional information (shared) Geographic, pub dim_pub_area
Full product catalog — Company A e-commerce marketing Product dim_asale_itm

Create dimension tables

The following DDL statements create the two dimension tables used in this tutorial.

CREATE TABLE IF NOT EXISTS dim_asale_itm
(
    item_id           BIGINT  COMMENT 'Product ID',
    item_title        STRING  COMMENT 'Product name',
    item_price        DOUBLE  COMMENT 'Product price',
    item_stuff_status BIGINT  COMMENT 'Product status: 0: new; 1: idle; 2: second-hand',
    cate_id           BIGINT  COMMENT 'Product category ID',
    cate_name         STRING  COMMENT 'Product category name',
    commodity_id      BIGINT  COMMENT 'Commodity ID',
    commodity_name    STRING  COMMENT 'Commodity name',
    item_status       BIGINT  COMMENT 'Product sales status: 0: normal; 1: deleted; 2: taken off the shelf; 3: never put on the shelf',
    city              STRING  COMMENT 'City of the product',
    prov              STRING  COMMENT 'Province of the product'
)
COMMENT 'Table of full products'
PARTITIONED BY (ds STRING COMMENT 'Date: yyyymmdd');

CREATE TABLE IF NOT EXISTS dim_pub_area
(
    city_code  STRING COMMENT 'City code',
    city_name  STRING COMMENT 'City name',
    prov_code  STRING COMMENT 'Province code',
    prov_name  STRING COMMENT 'Province name'
)
COMMENT 'Table of common regional information'
PARTITIONED BY (ds STRING COMMENT 'Date: yyyymmdd')
LIFECYCLE 3600;
dim_pub_area is a shared (pub) dimension table. The LIFECYCLE 3600 setting retains data for 3,600 days. Adjust this value based on your data retention requirements.