All Products
Search
Document Center

MaxCompute:CREATE TABLE

Last Updated:Mar 25, 2026

Use CREATE TABLE to define a new table in MaxCompute. The statement supports seven variants:

  • Internal table — stores data inside MaxCompute (default).

  • External table — references data stored in OSS without loading it into MaxCompute.

  • Partitioned table — divides data into partitions by one or more key columns to skip full-table scans.

  • Clustered table — co-locates rows within hash or range buckets to accelerate joins and aggregations.

  • Transactional table — enables row-level UPDATE and DELETE on internal tables.

  • Delta table — adds a primary key, ACID (atomicity, consistency, isolation, and durability) transactions, upserts, time travel, and incremental queries on top of a transactional table.

  • Create from existing — copies the schema (and optionally the data) of another table into a new table.

Limits

  • A partitioned table supports up to 6 levels of partitions (for example, year/month/week/day/hour/minute).

  • By default, a table can have up to 60,000 partitions. This limit is configurable per project.

For all table-related limits, see MaxCompute SQL limits.

Syntax

Internal table

CREATE [OR REPLACE] TABLE [IF NOT EXISTS] <table_name>
  (<col_name> <data_type> [NOT NULL] [DEFAULT <default_value>] [comment <col_comment>], ...)
  [comment <table_comment>]
  [PARTITIONED BY (<col_name> <data_type> [comment <col_comment>], ...)]
  [LIFECYCLE <days>];

Clustered table

CREATE TABLE [IF NOT EXISTS] <table_name>
  (<col_name> <data_type>, ...)
  [CLUSTERED BY | RANGE CLUSTERED BY (<col_name> [, <col_name>, ...])
    [SORTED BY (<col_name> [ASC | DESC] [, <col_name> [ASC | DESC] ...])]
    INTO <number_of_buckets> BUCKETS];

External table

CREATE EXTERNAL TABLE [IF NOT EXISTS] <table_name>
  (<col_name> <data_type>, ...)
  STORED AS '<file_format>'
  [WITH SERDEPROPERTIES (options)]
  LOCATION '<oss_location>';

Transactional table

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] <table_name>
  (<col_name> <data_type> [NOT NULL] [DEFAULT <default_value>] [comment <col_comment>], ...)
  [comment <table_comment>]
  [TBLPROPERTIES ("transactional"="true")];

Delta table

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] <table_name>
  (<col_name> <data_type> [NOT NULL] [DEFAULT <default_value>] [comment <col_comment>], ...
  [PRIMARY KEY (<pk_col_name> [, <pk_col_name>, ...])])
  [comment <table_comment>]
  [TBLPROPERTIES ("transactional"="true"
    [, "write.bucket.num" = "N"
    , "acid.data.retain.hours" = "hours"
    ...])]
  [LIFECYCLE <days>];

Create from existing table

Copy schema and data (partitions not carried over):

CREATE TABLE [IF NOT EXISTS] <table_name> [LIFECYCLE <days>] AS <select_statement>;

Copy schema only (no data replicated):

CREATE TABLE [IF NOT EXISTS] <table_name> [LIFECYCLE <days>] LIKE <existing_table_name>;

Parameters

Common parameters

ParameterRequiredDescription
OR REPLACENoDrops the existing table with the same name and recreates it. Cannot be combined with IF NOT EXISTS, AS SELECT, or LIKE.
EXTERNALNoCreates an external table.
IF NOT EXISTSNoSkips the statement without error if a table with the same name already exists. The existing table's metadata and data are unchanged.
table_nameYesThe name of the table. Case-insensitive. Letters, digits, and underscores only; starting with a letter is recommended. Maximum 128 bytes.
col_nameYesThe name of a column. Same naming rules as table_name. Maximum 128 bytes.
col_commentNoThe column description. Maximum 1,024 bytes.
data_typeYesThe data type of the column: BIGINT, DOUBLE, BOOLEAN, DATETIME, DECIMAL, STRING, or others. See Data type editions.
NOT NULLNoPrevents NULL values in the column.
DEFAULT <default_value>NoThe value written to the column when INSERT omits it. Functions such as GETDATE() and NOW() are not supported as default values.
table_commentNoThe table description. Maximum 1,024 bytes.
LIFECYCLENoThe number of days before MaxCompute automatically drops the table (non-partitioned) or reclaims each partition (partitioned). Must be a positive integer.
PRIMARY KEYNoOne or more columns whose combined values must be unique. Applicable only to Delta tables. Primary key columns must be NOT NULL and cannot be modified after creation.

Lifecycle behavior:

  • Non-partitioned tables: The clock resets each time table data is modified. After the specified number of idle days, MaxCompute drops the table.

  • Partitioned tables: MaxCompute evaluates each partition's LastModifiedTime independently. The table itself is not dropped even when all partitions have been reclaimed.

Parameters for partitioned tables

MaxCompute supports two partitioning modes:

Regular partitioned tables (PARTITIONED BY)

ParameterRequiredDescription
PARTITIONED BYYesDeclares partition key columns. Cannot be combined with AUTO PARTITIONED BY on the same table.
col_nameYesThe name of the partition key column. Same naming rules as table columns. Maximum 128 bytes.
data_typeYesThe data type of the partition key column. MaxCompute V1.0 supports STRING only. MaxCompute V2.0 supports TINYINT, SMALLINT, INT, BIGINT, VARCHAR, and STRING.
col_commentNoThe partition key column description. Maximum 1,024 bytes.

Partition key column values:

  • No double-byte characters (for example, Chinese characters).

  • Must start with a letter.

  • Maximum 255 bytes.

  • Supported special characters: space, :, _, $, #, ., !, @.

  • Escape characters \t, \n, and / have undefined behavior.

Auto-partitioned tables (AUTO PARTITIONED BY)

Auto-partitioned tables generate partition key columns automatically based on a partition expression.

ParameterRequiredDescription
AUTO PARTITIONED BYYesDeclares the auto-partition expression. Cannot be combined with PARTITIONED BY.
auto_partition_expressionYesDefines how to derive partition key values. Only the TRUNC_TIME function is supported, and only one partition key column is allowed per table.
auto_partition_column_nameNoThe name of the generated partition key column. Defaults to _pt_col_0_. If that name already exists in the table, the system increments the suffix (_pt_col_1_, _pt_col_2_, and so on). The generated column is always of STRING type.
TBLPROPERTIES('ingestion_time_partition'='true')NoGenerates partition key columns based on the write time of each record. See Auto-partitioned tables based on the data write time.

Parameters for clustered tables

Hash-clustered tables (CLUSTERED BY)

ParameterRequiredDescription
CLUSTERED BYYesThe hash key columns. MaxCompute distributes rows across buckets by hash value. Choose columns with high cardinality and few duplicates to avoid data skew and hot spots. Columns used in frequent joins or aggregations are good candidates.
SORTED BYYesThe sort order within each bucket. Setting SORTED BY to the same columns as CLUSTERED BY enables index-accelerated queries.
number_of_bucketsYesThe number of hash buckets. Default maximum: 1,111. Override with set odps.stage.reducer.num=<concurrency>; (hard limit: 4,000).

Bucket sizing guidelines:

  • Target ~500 MB per bucket. For a 500 GB partition, use 1,000 buckets.

  • For very large tables, increase to 2–3 GB per bucket.

  • Use powers of 2 (512, 1,024, 2,048, 4,096) to allow automatic bucket splitting and merging during join optimization.

Range-clustered tables (RANGE CLUSTERED BY)

ParameterRequiredDescription
RANGE CLUSTERED BYYesThe range-clustered key columns. MaxCompute assigns rows to buckets by range.
SORTED BYYesThe sort order within each bucket. Same usage as hash-clustered tables.
number_of_bucketsNoThe number of buckets. Unlike hash-clustered tables, this does not need to follow the 2^n rule. If omitted, MaxCompute selects the optimal count automatically.

To disable shuffling when the join key or group key matches the range-clustered key (or its prefix), set odps.optimizer.enable.range.partial.repartitioning=true. This setting is disabled by default.

Parameters for external tables

For a complete guide, see External tables.

ParameterRequiredDescription
STORED AS '<file_format>'YesThe file format of the external data source.
WITH SERDEPROPERTIES(options)NoAuthorization, compression, and character parsing options.
LOCATION '<oss_location>'YesThe OSS path where the external table data resides. See OSS external tables.

Parameters for Delta tables

Delta tables support near real-time reads and writes, incremental queries, time travel, and ACID transactions using Multi-Version Concurrency Control (MVCC) with snapshot isolation. A primary key is required.

ParameterRequiredDefaultDescription
PRIMARY KEYYesOne or more NOT NULL columns whose combined values uniquely identify each row within a partition (or within the full table for non-partitioned tables). Duplicate rows are deduplicated by primary key. The primary key schema cannot be changed after creation.
"transactional"="true"YesEnables ACID semantics and MVCC. Must be true.
"write.bucket.num"No16The number of buckets per partition (or for non-partitioned tables). Valid range: (0, 4096]. Target ~500 MB per bucket (or 2–3 GB for large tables). Changeable for partitioned tables (applies to new partitions); not changeable for non-partitioned tables.
"acid.data.retain.hours"No24The number of hours of historical data to retain for time travel queries. Valid range: [24, 168]. For retention beyond 168 hours (7 days), contact MaxCompute technical support. Set to 0 to disable time travel.
"acid.incremental.query.out.of.time.range.enabled"NofalseWhen true, the endTimestamp of an incremental query can exceed the latest committed time in the table. Multiple queries may return different results if new data arrives between queries. Changeable after creation.
"acid.write.precombine.field"NoThe name of a single column. When set, MaxCompute deduplicates rows by primary key within the file committed in the same SQL statement. Not effective across multiple files (commits larger than 128 MB generate multiple files).
"acid.partial.fields.update.enable"NoWhen true, enables partial column updates via SQL or Tunnel. Cannot be changed after table creation.

Delta table lifecycle constraint: Set LIFECYCLE to satisfy lifecycle >= acid.data.retain.hours / 24. MaxCompute validates this at creation time and returns an error if the constraint is not met.

Parameters for transactional tables

ParameterRequiredDescription
TBLPROPERTIES("transactional"="true")YesMarks the table as transactional, enabling row-level UPDATE and DELETE. See Update or delete data.

Parameters for creating tables from existing tables

StatementWhat is copiedWhat is not copied
CREATE TABLE ... AS <select_statement>Column data (the SELECT result)Partition properties, lifecycle
CREATE TABLE ... LIKE <existing_table_name>Column names, column comments, table commentData, lifecycle

Both statements support the LIFECYCLE parameter to set the lifecycle of the new table independently. Both can create an internal table from an existing external table or from a table in an external project used for the data lakehouse solution.

When using CREATE TABLE ... AS SELECT, the partition key columns of the source table become regular columns in the new table. The resulting table is not partitioned.

Usage notes

Clustered tables

Benefits:

  • Bucket pruning during queries

  • Aggregation and join optimization

  • Storage optimization

Limits:

  • INSERT INTO is not supported. Use INSERT OVERWRITE to load data.

  • Data imported via Tunnel commands is not sorted, so Tunnel-based imports to range-clustered tables are not supported.

  • Data backup and restoration are not supported.

Transactional tables

  • Set a table to transactional only at creation time. ALTER TABLE SET TBLPROPERTIES("transactional"="true") is not supported and returns an error:

    FAILED: Catalog Service Failed, ErrorCode: 151, Error Message: Set transactional is not supported
  • Cannot create clustered or external tables as transactional.

  • Transactional tables cannot be converted to or from internal, external, or clustered tables.

  • Files in transactional tables are not merged automatically. Run the merge operation manually. See Merge transactional table files.

  • merge partition is not supported.

  • Graph jobs cannot read from or write to transactional tables.

  • Spark and Platform for AI (PAI) jobs can read from transactional tables but cannot write to them.

  • Before running UPDATE, DELETE, or INSERT OVERWRITE on critical data, back up the data to another table using SELECT + INSERT.

Delta tables

  • Only MaxCompute SQL can directly operate on Delta tables.

  • Existing common tables cannot be converted to Delta tables.

  • The primary key column schema cannot be changed after the table is created.

  • CLUSTER BY and CREATE TABLE AS SELECT are not supported for Delta tables.

  • Delta tables cannot be used as external tables.

OR REPLACE

CREATE OR REPLACE TABLE cannot be combined with:

  • IF NOT EXISTS

  • AS SELECT

  • LIKE

Examples

Create a non-partitioned table

CREATE TABLE test1 (key STRING);

Create a partitioned table

CREATE TABLE IF NOT EXISTS sale_detail (
  shop_name    STRING,
  customer_id  STRING,
  total_price  DOUBLE
)
PARTITIONED BY (sale_date STRING, region STRING);

Replace an existing table

  1. Create the original table and insert data:

    CREATE OR REPLACE TABLE my_table (a BIGINT);
    
    INSERT INTO my_table (a) VALUES (1), (2), (3);
  2. Replace it with a new schema:

    CREATE OR REPLACE TABLE my_table (b STRING);
  3. Query the table. The previous data is dropped and the new schema has no rows:

    +------------+
    | b          |
    +------------+
    +------------+

The following statements are invalid with OR REPLACE:

CREATE OR REPLACE TABLE IF NOT EXISTS my_table (b STRING);
CREATE OR REPLACE TABLE my_table AS SELECT;
CREATE OR REPLACE TABLE my_table LIKE newtable;

Copy data from an existing table (CTAS)

Create sale_detail_ctas1 by copying data from sale_detail and set its lifecycle to 10 days:

SET odps.sql.allow.fullscan=true;
CREATE TABLE sale_detail_ctas1 LIFECYCLE 10 AS SELECT * FROM sale_detail;
sale_detail is a partitioned table, but sale_detail_ctas1 is not. The partition key columns of sale_detail (sale_date and region) become regular columns in sale_detail_ctas1, giving it five columns total.

Run DESC EXTENDED sale_detail_ctas1; to inspect the schema and lifecycle.

Copy data with explicit column aliases

When using constants as column values, specify aliases to control column names. Without aliases, MaxCompute assigns generated names such as _c4 and _c5.

With aliases:

SET odps.sql.allow.fullscan=true;
CREATE TABLE sale_detail_ctas2 AS
SELECT shop_name, customer_id, total_price, '2013' AS sale_date, 'China' AS region
FROM sale_detail;

Without aliases (column names are auto-generated):

SET odps.sql.allow.fullscan=true;
CREATE TABLE sale_detail_ctas3 AS
SELECT shop_name, customer_id, total_price, '2013', 'China'
FROM sale_detail;

Copy the schema of an existing table (LIKE)

CREATE TABLE sale_detail_like LIKE sale_detail LIFECYCLE 10;
sale_detail_like has the same column names, column comments, and table comment as sale_detail, but its lifecycle is set independently and no data is copied.

Run DESC EXTENDED sale_detail_like; to verify the schema.

Copy the schema of an external table

-- Create an internal table with the same schema as the external table mc_oss_extable_orc.
CREATE TABLE mc_oss_extable_orc_like LIKE mc_oss_extable_orc;

Run DESC mc_oss_extable_orc_like; to verify. The result shows InternalTable: YES and the columns from the external table's schema.

Use new data types

Enable the MaxCompute V2.0 data type edition before creating the table:

SET odps.sql.type.system.odps2=true;
CREATE TABLE test_newtype (
  c1  TINYINT,
  c2  SMALLINT,
  c3  INT,
  c4  BIGINT,
  c5  FLOAT,
  c6  DOUBLE,
  c7  DECIMAL,
  c8  BINARY,
  c9  TIMESTAMP,
  c10 ARRAY<MAP<BIGINT, BIGINT>>,
  c11 MAP<STRING, ARRAY<BIGINT>>,
  c12 STRUCT<s1:STRING, s2:BIGINT>,
  c13 VARCHAR(20)
) LIFECYCLE 1;

Create a hash-clustered table

Non-partitioned:

CREATE TABLE t1 (a STRING, b STRING, c BIGINT)
CLUSTERED BY (c) SORTED BY (c) INTO 1024 BUCKETS;

Partitioned:

CREATE TABLE t2 (a STRING, b STRING, c BIGINT)
PARTITIONED BY (dt STRING)
CLUSTERED BY (c) SORTED BY (c) INTO 1024 BUCKETS;

Create a range-clustered table

Non-partitioned (explicit bucket count):

CREATE TABLE t3 (a STRING, b STRING, c BIGINT)
RANGE CLUSTERED BY (c) SORTED BY (c) INTO 1024 BUCKETS;

Partitioned (MaxCompute selects bucket count automatically):

CREATE TABLE t4 (a STRING, b STRING, c BIGINT)
PARTITIONED BY (dt STRING)
RANGE CLUSTERED BY (c) SORTED BY (c);

Create a transactional table

Non-partitioned:

CREATE TABLE t5 (id BIGINT) TBLPROPERTIES ("transactional"="true");

Partitioned:

CREATE TABLE IF NOT EXISTS t6 (id BIGINT)
PARTITIONED BY (ds STRING)
TBLPROPERTIES ("transactional"="true");

Create a Delta table

Minimal Delta table (single-column primary key):

CREATE TABLE mf_tt (pk BIGINT NOT NULL PRIMARY KEY, val BIGINT)
TBLPROPERTIES ("transactional"="true");

Delta table with composite primary key and custom properties:

CREATE TABLE mf_tt2 (
  pk   BIGINT NOT NULL,
  pk2  BIGINT NOT NULL,
  val  BIGINT,
  val2 BIGINT,
  PRIMARY KEY (pk, pk2)
)
TBLPROPERTIES (
  "transactional"="true",
  "write.bucket.num" = "64",
  "acid.data.retain.hours" = "120"
) LIFECYCLE 7;

Create a table with default column values

CREATE TABLE test_default (
  tinyint_name  TINYINT  NOT NULL DEFAULT 1Y,
  smallint_name SMALLINT NOT NULL DEFAULT 1S,
  int_name      INT      NOT NULL DEFAULT 1,
  bigint_name   BIGINT   NOT NULL DEFAULT 1,
  binary_name   BINARY,
  float_name    FLOAT,
  double_name   DOUBLE   NOT NULL DEFAULT 0.1,
  decimal_name  DECIMAL(2, 1) NOT NULL DEFAULT 0.0BD,
  varchar_name  VARCHAR(10),
  char_name     CHAR(2),
  string_name   STRING   NOT NULL DEFAULT 'N',
  boolean_name  BOOLEAN  NOT NULL DEFAULT TRUE
);

Copy data from an OSS external table into an internal table

Without partition properties:

  1. Create the OSS external table and insert data:

    CREATE EXTERNAL TABLE max_oss_test (a INT, b INT, c INT)
    STORED AS TEXTFILE
    LOCATION "oss://oss-cn-hangzhou-internal.aliyuncs.com/<bucket_name>";
    
    INSERT INTO max_oss_test VALUES
      (101, 1, 20241108),
      (102, 2, 20241109),
      (103, 3, 20241110);
  2. Create an internal table from the external table using CREATE TABLE AS:

    CREATE TABLE from_exetbl_oss AS SELECT * FROM max_oss_test;
  3. Query the internal table to confirm all rows are present:

    SELECT * FROM from_exetbl_oss;
    -- a      b    c
    -- 101    1    20241108
    -- 102    2    20241109
    -- 103    3    20241110

With partition properties (using LIKE):

  1. Create an internal table that inherits the schema (including partition structure) from the external table:

    CREATE TABLE from_exetbl_like LIKE max_oss_test;
  2. Verify the schema:

    DESC from_exetbl_like;

    The result shows InternalTable: YES with a and b as native columns and c as a partition column.

Related commands

  • ALTER TABLE — modify table schema, properties, or lifecycle.

  • TRUNCATE — clear all data from a table.

  • DROP TABLE — delete a table.

  • DESC TABLE/VIEW — inspect table schema, metadata, and properties.

  • SHOW — list tables, views, or partitions in a project.