All Products
Search
Document Center

MaxCompute:CREATE TABLE

Last Updated:Mar 28, 2026

CREATE TABLE creates non-partitioned, partitioned, external, or clustered tables.

Limits

  • A partitioned table can have up to 6 partition levels. For example, a table partitioned by date might use the levels year/month/week/day/hour/minute.

  • By default, a table can have up to 60,000 partitions. This limit is configurable per project.

For other table limits, see MaxCompute SQL limits.

Syntax

Create an internal table

Create an internal table (including non-partitioned and partitioned tables)

CREATE [OR REPLACE] TABLE [IF NOT EXISTS] <table_name> (
<col_name> <data_type>, ... )
[comment <table_comment>]
[PARTITIONED BY (<col_name> <data_type> [comment <col_comment>], ...)]
[AUTO PARTITIONED 
   BY (<auto_partition_expression> [AS <auto_partition_column_name>])
  [TBLPROPERTIES('ingestion_time_partition'='true')]
];

Create a clustered table

Create a clustered table

CREATE TABLE [IF NOT EXISTS] <table_name> (
<col_name> <data_type>, ... )
[CLUSTERED BY | RANGE CLUSTERED BY (<col_name> [, <col_name>, ...]) 
[SORTED BY (<col_name> [ASC | DESC] [, <col_name> [ASC | DESC] ...])] 
INTO <number_of_buckets> buckets];

Create an external table

Create an external table

This example shows how to create an OSS external table using the built-in text data parser. For more information, see ORC external tables.

CREATE EXTERNAL TABLE [IF NOT EXISTS] <mc_oss_extable_name> ( 
<col_name> <data_type>, ... ) 
STORED AS '<file_format>' 
[WITH SERDEPROPERTIES (options)]  
LOCATION '<oss_location>';

Specify table type

  • Specifying a table as transactional allows you to update or delete its data. However, these tables have limitations, so consider your requirements carefully before creating one.

    CREATE [EXTERNAL] TABLE [IF NOT EXISTS] <table_name> (
    <col_name <data_type> [NOT NULL] [DEFAULT <default_value>] [comment <col_comment>], ...   
    [comment <table_comment>]
    [TBLPROPERTIES ("transactional"="true")];
  • Specify the table as a Delta table. In combination with a primary key, this enables operations such as upsert, incremental queries, and time travel.

    CREATE [EXTERNAL] TABLE [IF NOT EXISTS] <table_name> (
    <col_name <data_type> [NOT NULL] [DEFAULT <default_value>] [comment <col_comment>], ...   
    [PRIMARY KEY (<pk_col_name>[, <pk_col_name2>, ...] )]) 
    [comment <table_comment>]
    [CLUSTERED BY (<pk_col_name>[, <pk_col_name2>, ...] )]
    [TBLPROPERTIES ("transactional"="true" 
    [, "write.bucket.num" = "N", "acid.data.retain.hours"="hours"...])] [LIFECYCLE <days>];
    • A Delta table with a primary key supports using a subset of the primary key as the hash cluster key.

      By default, a Delta table with a primary key distributes data using hash clustering, with the full set of primary key columns serving as the default cluster key. If your queries frequently filter by a subset of the primary key, you can specify that subset as the cluster key. This alignment of data distribution with query patterns improves filtering performance.

    • When you explicitly specify a cluster key, the system distributes table data by hash bucketing based on the specified columns. The cluster key must be a subset of the primary key columns.

      If you do not specify a cluster key, the system uses the full set of primary key columns as the default cluster key.

      The cluster key is immutable and must be specified when the table is created.

Create from existing table

  • This statement creates a new table from an existing one and copies its data. Partition properties are not copied. This operation is supported for external tables and tables in Data Lakehouse external projects.

    CREATE TABLE [IF NOT EXISTS] <table_name> [LIFECYCLE <days>] AS <select_statement>;
  • This statement creates a table with the same schema as an existing one but does not copy any data. This operation is supported for external tables and tables in Data Lakehouse external projects.

    CREATE TABLE [IF NOT EXISTS] <table_name> [LIFECYCLE <days>] LIKE <existing_table_name>;

Parameters

Common parameters

Common parameters

Parameter

Required

Description

Notes

OR REPLACE

No

  • If a table named <table_name> already exists, this statement drops the existing table and then creates a new one with the same name to replace it.

  • The CREATE OR REPLACE TABLE statement cannot be used with the following syntax.

    • CREATE TABLE ... IF NOT EXISTS.

    • CREATE TABLE ... AS SELECT.

    • CREATE TABLE ... LIKE.

Equivalent to running the following two statements:

DROP TABLE IF EXISTS <tableName>;  -- If the target table exists, drop it first.
CREATE TABLE <tableName> ...;      -- Create the table.

EXTERNAL

No

Creates an external table.

None

IF NOT EXISTS

No

Checks if a table with the specified name exists.

Without the IF NOT EXISTS clause, the statement fails if a table with the same name exists. With IF NOT EXISTS, the statement succeeds even if a table with the same name exists and its schema differs from the one you are creating. The existing table's metadata is not changed.

table_name

Yes

The name of the table.

The table name is case-insensitive. It can contain only letters (a-z, A-Z), digits, and underscores (_). We recommend that the name starts with a letter. The name can be up to 128 bytes long. If the name exceeds this limit, the statement fails.

PRIMARY KEY(pk)

No

The primary key of the table.

You can define one or more columns as the primary key. This indicates that the combination of values in these columns must be unique within the table. The syntax follows the standard SQL primary key syntax. The primary key columns must be NOT NULL and cannot be modified.

Important

This parameter applies only to Delta tables.

col_name

Yes

The name of a column.

  • The column name is case-insensitive. It can contain only letters (a-z, A-Z), digits, underscores (_), or Chinese characters. We recommend that the name starts with a letter. The name can be up to 128 bytes long. If the name exceeds this limit, the statement fails.

  • Run the SET odps.sql.bigquery.compatible=true; command to enable BigQuery compatible mode. In this mode, the column name can be up to 256 bytes long. If the name exceeds this limit, the statement fails.

col_comment

No

The comment on the column.

The comment must be a valid string up to 1,024 bytes long. If the comment exceeds this limit, an error occurs.

data_type

Yes

The data type of the column.

Supported data types include BIGINT, DOUBLE, BOOLEAN, DATETIME, DECIMAL, and STRING. For more information, see Data type editions.

NOT NULL

No

Specifies that the column cannot contain NULL values.

For more information about how to modify the NOT NULL constraint, see Partition operations.

default_value

No

The default value of the column.

If an INSERT statement omits a value for this column, the default value is used.

Note

Functions such as GETDATE() and NOW() are not supported as default values.

table_comment

No

The comment on the table.

The comment must be a valid string up to 1,024 bytes long. If the comment exceeds this limit, an error occurs.

LIFECYCLE

No

The lifecycle of the table.

Specifies a positive integer. Unit: days.

  • For a non-partitioned table, the lifecycle begins at the last data modification time. If the table remains unmodified for the specified number of days, MaxCompute automatically reclaims it, which is similar to a DROP TABLE operation.

  • For a partitioned table, the system evaluates each partition for reclamation based on its LastModifiedTime. Unlike a non-partitioned table, the table itself is not deleted even after its last partition is reclaimed. You can set the lifecycle only at the table level, not for individual partitions.

Partitioned tables

Partitioned tables

Partitioned tables

MaxCompute supports two types of partitioned tables: regular partitioned tables and Auto-partitioned tables. Choose a table type based on how you need to generate partition key columns in different scenarios. For more information, see Partitioned table overview.

Regular partitioned tables

Parameter

Required

Description

Notes

PARTITIONED BY

Yes

Specifies the partitions of a regular partitioned table.

Use either the PARTITIONED BY or AUTO PARTITIONED BY clause, but not both.

col_name

Yes

The name of the partition key column.

  • The column name is case-insensitive. It can contain only letters (a-z, A-Z), digits, underscores (_), or Chinese characters. We recommend that the name starts with a letter. The name can be up to 128 bytes long. If the name exceeds this limit, the statement fails.

  • Run the SET odps.sql.bigquery.compatible=true; command to enable BigQuery compatible mode. In this mode, the column name can be up to 256 bytes long. If the name exceeds this limit, the statement fails.

data_type

Yes

The data type of the partition key column.

MaxCompute 1.0 supports only the STRING type for partition key columns. MaxCompute 2.0 expands this to include TINYINT, SMALLINT, INT, BIGINT, VARCHAR, and STRING. For more information, see Data type editions. Using partition key columns avoids full table scans when adding, updating, or reading partitions, which improves processing efficiency.

col_comment

No

The comment on the partition key column.

The comment must be a valid string up to 1,024 bytes long. If the comment exceeds this limit, an error occurs.

Note

Partition values cannot contain double-byte characters, such as Chinese characters. A partition value must start with a letter and can contain letters, digits, and allowed special characters. The value can be up to 255 bytes long. The allowed special characters are spaces, colons (:), underscores (_), dollar signs ($), number signs (#), periods (.), exclamation points (!), and at signs (@). The behavior of other characters, such as the escape characters \t, \n, and /, is undefined.

Auto-partitioned tables

In an Auto-partitioned table, partition key columns are automatically generated based on a specified method. For more information about Auto-partitioned tables, see Partitioned table types.

Parameter

Required

Description

Notes

AUTO PARTITIONED BY

Yes

Specifies the partitions of an Auto-partitioned table.

Use either the PARTITIONED BY or AUTO PARTITIONED BY clause, but not both.

auto_partition_expression

Yes

An expression that defines how partition key columns are calculated.

Currently, you can use only the TRUNC_TIME function to generate a partition key column, and only one partition key column is supported.

The TRUNC_TIME function truncates data in a time or date column of a table based on a specified time unit and generates a partition key column.

auto_partition_column_name

No

The name of the generated partition key column.

If a name for the partition column is not specified, the system uses _pt_col_0_ as the default column name. The system then sequentially checks if this name exists in the table. If it does, the system automatically increments the suffix (for example, _pt_col_1_, _pt_col_2_, and so on) until an unused column name is found.

After the partition expression is calculated, the system generates a STRING partition key column from the result. You can name this column, but you cannot directly modify its data type or value.

TBLPROPERTIES('ingestion_time_partition'='true')

No

Specifies whether to generate partition key columns based on the data write time.

For more information about how to generate partitions based on the data write time, see Auto-partitioned tables based on data write time.

Clustered tables

Clustered tables

Clustered tables are classified into Hash-clustered tables and Range-clustered tables.

Hash-clustered tables

Parameter

Required

Description

Notes

CLUSTERED BY

Yes

Specifies the hash key columns.

MaxCompute performs a hash operation on the specified columns and distributes the data to buckets based on the hash values. To prevent data skew, avoid hot spots, and improve parallel execution, we recommend that you select columns with a wide range of values and few duplicate keys for the CLUSTERED BY clause. To optimize JOIN operations, you can also select commonly used join or aggregation keys, which are similar to primary keys in traditional databases.

SORTED BY

Yes

Specifies the sort order of columns within a bucket.

We recommend that you specify the same columns for the SORTED BY and CLUSTERED BY clauses to achieve better performance. After you specify the SORTED BY clause, MaxCompute automatically generates an index and uses the index to accelerate queries.

number_of_buckets

Yes

Specifies the number of hash buckets.

This parameter is required and its value depends on the data volume. MaxCompute supports up to 1,111 reducers by default. Therefore, up to 1,111 hash buckets are supported. You can run the set odps.stage.reducer.num =<concurrency>; command to increase this limit, but the value cannot exceed 4,000. Otherwise, performance may be affected.

Note

When you select the number of hash buckets, follow these principles:

  • Ensure that the size of each hash bucket is appropriate. We recommend that you set the size of each bucket to about 500 MB. For example, if a partition is estimated to be 500 GB in size, you can set the number of buckets to 1,000. In this case, the average size of each bucket is about 500 MB. For very large tables, the size of a bucket can be 2 GB to 3 GB. You can also run the set odps.stage.reducer.num=<concurrency>; command to use more than 1,111 buckets.

  • For JOIN optimization scenarios, you can remove the shuffle and sort steps to significantly improve performance. This requires that the numbers of hash buckets in the two tables to be joined are multiples of each other, for example, 256 and 512. We recommend that you set the number of hash buckets to a power of 2, such as 512, 1,024, 2,048, or 4,096 (2n). This way, the system can automatically split and merge hash buckets, and remove the shuffle and sort steps to improve execution efficiency.

Range-clustered tables

Parameter

Required

Description

Notes

RANGE CLUSTERED BY

Yes

Specifies the range clustering key columns.

MaxCompute performs a bucketing operation on the specified columns and distributes the data to buckets based on bucket numbers.

SORTED BY

Yes

Specifies the sort order of columns within a bucket.

This parameter is used in the same way as for a Hash-clustered table.

number_of_buckets

Yes

Specifies the number of hash buckets.

The number of buckets in a Range-clustered table does not need to follow the 2n rule for Hash-clustered tables. If data is evenly distributed, you can specify any number of buckets. This parameter is optional for Range-clustered tables. If you omit it, the system automatically determines the optimal number of buckets based on the data volume.

If the object of a JOIN or AGGREGATION operation is a Range-clustered table and the join key or group key is the range clustering key or its prefix, you can remove data redistribution (Shuffle Remove) to improve execution efficiency. You can run the set odps.optimizer.enable.range.partial.repartitioning=true/false; command to configure this feature. This feature is disabled by default.

Note
  • Benefits of clustered tables:

    • Optimized bucket pruning.

    • Optimized aggregation.

    • Optimized storage.

  • Limitations of clustered tables:

    • The INSERT INTO statement is not supported. You can only use the INSERT OVERWRITE statement to add data.

    • You cannot use Tunnel to directly upload data to Range-clustered tables because Tunnel uploads are unordered.

    • The backup and restoration feature is not supported.

External tables

External tables

The following table describes the parameters for creating an OSS external table as an example. For more information about the parameters for creating other types of external tables, see External tables.

Parameter

Required

Description

STORED AS '<file_format>'

Yes

Specifies the file format of the data in the external table.

WITH SERDEPROPERTIES(options)

No

Specifies the parameters related to authorization, compression, and character parsing for the external table.

oss_location

Yes

The OSS path where the data of the external table is stored. For more information, see OSS external tables.

Transactional and Delta tables

Transactional and Delta tables

Delta tables

A Delta table supports near-real-time reads and writes, incremental I/O, and real-time updates. Currently, you can only create Delta tables with a primary key.

Parameter

Required

Description

Notes

PRIMARY KEY(PK)

Yes

Required when you create a Delta table with a primary key. The primary key can contain multiple columns.

The syntax follows the standard SQL primary key syntax. The primary key columns must be set to NOT NULL and cannot be modified. Uniqueness is enforced per partition, or across the entire table if it is a non-partitioned table.

transactional

Yes

Required when you create a Delta table. You must set this parameter to true.

This parameter indicates that the table has the transactional features of a MaxCompute ACID table and uses the Multi-Version Concurrency Control (MVCC) transaction model to ensure snapshot isolation.

write.bucket.num

No

The default value is 16. The value must be in the range of (0, 4096].

Specifies the number of buckets in each partition or non-partitioned table. It also specifies the number of concurrent nodes for data writes. You can modify this parameter for a partitioned table. The new value takes effect for new partitions. You cannot modify this parameter for a non-partitioned table. We recommend that you follow these guidelines:

  • If you import data by using Tunnel, this parameter specifies the number of concurrent Tunnel nodes. The value affects the import traffic and is limited by the maximum number of concurrent Tunnel nodes.

  • If you write data by using SQL, this parameter specifies the concurrency of reducers for data writes and is limited by the maximum number of concurrent reducer nodes.

  • We recommend that the amount of data written to each bucket is about 500 MB. For example, if a partition is estimated to be 500 GB in size, you can set the number of buckets to 1,000. In this case, the average size of each bucket is about 500 MB. For very large tables, the size of a bucket can be 2 GB to 3 GB.

acid.data.retain.hours

No

The default value is 24. The value must be in the range of [24, 168].

Specifies the time range, in hours, for which historical data states are retained for time travel queries. If you need to query data that is more than 168 hours (7 days) old, contact MaxCompute technical support.

  • If you set this parameter to 0, historical data states are not retained, and time travel queries are not supported.

  • If historical data states exist for a period longer than the value of this parameter, the data can be deleted or compacted.

  • If a time travel query specifies a time that is earlier than allowed by this parameter, the query fails. For example, if this parameter is set to 72 (hours) and a query attempts to access a data state from more than 72 hours ago, the query fails.

acid.incremental.query.out.of.time.range.enabled

No

The default value is false.

If you set this parameter to true, the endTimestamp specified in an incremental query can be later than the latest data commit time of the table. In scenarios where endTimestamp is later than the current time, you may obtain different results from multiple queries because new data may be inserted. You can modify the value of this parameter for a table.

acid.write.precombine.field

No

You can specify the name of only one column.

If you specify a column name, the system combines this column with the primary key (PK) columns to deduplicate data within a single file commit. This ensures data uniqueness and consistency.

Note

If the amount of data in a single commit exceeds 128 MB, multiple files are generated. This parameter does not apply across multiple files.

acid.partial.fields.update.enable

No

If you set this parameter to true, you can use SQL or Tunnel to perform partial column updates on a Delta table.

You can set this parameter when you create a table. You cannot modify this parameter after the table is created.

  • Other common parameter requirements for Delta tables:

    • LIFECYCLE: The table lifecycle must be greater than or equal to the time travel retention period. The following condition must be met: lifecycle >= acid.data.retain.hours / 24. MaxCompute checks this condition when you create a table. If the condition is not met, an error is returned.

    • Other unsupported features: You cannot set the CLUSTER BY clause, create external tables, or use CREATE TABLE AS.

  • Other limitations:

    • Currently, Delta tables can be operated on only by using MaxCompute SQL. Other computing engines cannot operate on them directly.

    • You cannot convert an existing regular table to a Delta table.

    • You cannot change the schema of the primary key columns in a Delta table.

Transactional tables

Parameter

Required

Description

TBLPROPERTIES("transactional"="true")

Yes

Sets the table as a transactional table. You can then run UPDATE and DELETE statements to update or delete data at the row level. For more information, see Update or delete data (UPDATE | DELETE).

The following limitations apply to transactional tables:

  • MaxCompute allows you to set the transactional property only when you create a table. You cannot run the ALTER TABLE statement to modify the transactional property of an existing table. The following statement returns an error:

    ALTER TABLE not_txn_tbl SET TBLPROPERTIES("transactional"="true");
    --An error is returned.
    FAILED: Catalog Service Failed, ErrorCode: 151, Error Message: Set transactional is not supported
  • You cannot set a clustered table or an external table as a transactional table when you create the table.

  • You cannot convert existing internal tables, external tables, or clustered tables to transactional tables, or vice versa.

  • Transactional table files cannot be automatically merged. You must manually merge the files. For more information, see Merge transactional table files.

  • The MERGE PARTITION operation is not supported.

  • Jobs from other systems have some limitations when they access transactional tables. For example, Graph does not support read and write operations. Spark and PAI support only read operations, not write operations.

  • Before you run the UPDATE, DELETE, or INSERT OVERWRITE statement on important data in a transactional table, you must manually back up the data to another table by running SELECT and INSERT statements.

Create tables from existing tables

Create tables from existing tables

  • You can run the CREATE TABLE [IF NOT EXISTS] <table_name> [LIFECYCLE <days>] AS <select_statement>; statement to create a table and copy data to the new table at the same time.

    • A table created this way does not inherit the source table's partition properties. The source table's partition key columns become regular columns in the new table. The lifecycle property of the source table is not copied.

    • You can also use the lifecycle parameter to reclaim the table. This statement also allows you to create an internal table and copy the data of an external table to the internal table.

  • You can run the CREATE TABLE [IF NOT EXISTS] <table_name> [LIFECYCLE <days>] LIKE <existing_table_name>; statement to create a table that has the same schema as the source table.

    • A table created this way copies the schema but not the data or lifecycle property of the source table.

    • You can also use the lifecycle parameter to reclaim the table. This statement also allows you to create an internal table and copy the schema of an external table to the internal table.

Examples

Non-partitioned table

  • Create a non-partitioned table.

    CREATE TABLE test1 (key STRING);
  • Create a non-partitioned table and specify default values for columns.

    CREATE TABLE test_default( 
    tinyint_name tinyint NOT NULL default 1Y,
    smallint_name SMALLINT NOT NULL DEFAULT 1S,
    int_name INT NOT NULL DEFAULT 1,
    bigint_name BIGINT NOT NULL DEFAULT 1,
    binary_name BINARY ,
    float_name FLOAT ,
    double_name DOUBLE NOT NULL DEFAULT 0.1,
    decimal_name DECIMAL(2, 1) NOT NULL DEFAULT 0.0BD,
    varchar_name VARCHAR(10) ,
    char_name CHAR(2) ,
    string_name STRING NOT NULL DEFAULT 'N',
    boolean_name BOOLEAN NOT NULL DEFAULT TRUE
    );

Partitioned table

  • Create a regular partitioned table.

    CREATE TABLE IF NOT EXISTS sale_detail(
     shop_name     STRING,
     customer_id   STRING,
     total_price   DOUBLE)
    PARTITIONED BY (sale_date STRING, region STRING);
  • Create an auto-partitioned table that partitions data by a time-based column using a time function.

    -- Truncates the sale_date column to a monthly granularity to generate a partition column named sale_month, which is used to partition the table.
    CREATE TABLE IF NOT EXISTS auto_sale_detail(
     shop_name     STRING,
     customer_id   STRING,
     total_price   DOUBLE,
     sale_date    DATE )
    AUTO PARTITIONED BY (TRUNC_TIME(sale_date, 'month') AS sale_month);
  • Create an auto-partitioned table that generates partitions based on the data ingestion time. MaxCompute automatically captures the time when data is written and uses a time function to generate partitions.

    -- After the table is created, when data is written, MaxCompute captures the ingestion time in the _partitiontime column, truncates it to a daily granularity, and generates a partition column named sale_date, which is used to partition the table.
    CREATE TABLE IF NOT EXISTS auto_sale_detail2(
     shop_name     STRING,
     customer_id   STRING,
     total_price   DOUBLE,
     _partitiontime TIMESTAMP_NTZ)
    AUTO PARTITIONED BY (TRUNC_TIME(_partitiontime, 'day') AS sale_date)
    TBLPROPERTIES('ingestion_time_partition'='true');

Hash and range cluster

  • Create a non-partitioned hash-clustered table.

    CREATE TABLE t1 (a STRING, b STRING, c BIGINT) CLUSTERED BY (c) SORTED BY (c) INTO 1024 buckets;
  • Create a partitioned hash-clustered table.

    CREATE TABLE t2 (a STRING, b STRING, c BIGINT) 
    PARTITIONED BY (dt STRING) CLUSTERED BY (c) SORTED BY (c) INTO 1024 buckets;
  • Create a non-partitioned range-clustered table.

    CREATE TABLE t3 (a STRING, b STRING, c BIGINT) RANGE CLUSTERED BY (c) SORTED BY (c) INTO 1024 buckets;
  • Create a partitioned range-clustered table.

    CREATE TABLE t4 (a STRING, b STRING, c BIGINT) 
    PARTITIONED BY (dt STRING) RANGE CLUSTERED BY (c) SORTED BY (c); 

Transactional table

  • Create a non-partitioned transactional table.

    CREATE TABLE t5(id BIGINT) TBLPROPERTIES ("transactional"="true");
  • Create a partitioned transactional table.

    CREATE TABLE IF NOT EXISTS t6(id BIGINT) 
    PARTITIONED BY (ds STRING) TBLPROPERTIES ("transactional"="true");

Internal table

  • Create an internal table by copying data from an external partitioned table. The internal table does not inherit the partition properties.

    1. Create an OSS external table and a MaxCompute internal table.

      -- Create an OSS external table and insert data.
      CREATE EXTERNAL TABLE max_oss_test(a INT, b INT, c INT) 
      STORED AS TEXTFILE
      LOCATION "oss://oss-cn-hangzhou-internal.aliyuncs.com/<bucket_name>";
      
      INSERT INTO max_oss_test VALUES 
      (101, 1, 20241108),
      (102, 2, 20241109),
      (103, 3, 20241110);
      
      SELECT * FROM max_oss_test;
      
      -- Result
      a    b    c
      101    1    20241108
      102    2    20241109
      103    3    20241110
      
      
      -- Create an internal table by using CREATE TABLE AS.
      CREATE TABLE from_exetbl_oss AS SELECT * FROM max_oss_test;
      
      -- Query the new internal table.
      SELECT * FROM from_exetbl_oss;
      
      -- All data is returned.
      a    b    c
      101    1    20241108
      102    2    20241109
      103    3    20241110
    2. Run DESC from_exetbl_oss; to view the internal table's schema. The following output is returned.

      +------------------------------------------------------------------------------------+
      | Owner:                    ALIYUN$***********                                       |
      | Project:                  ***_*****_***                                            |
      | TableComment:                                                                      |
      +------------------------------------------------------------------------------------+
      | CreateTime:               2023-01-10 15:16:33                                      |
      | LastDDLTime:              2023-01-10 15:16:33                                      |
      | LastModifiedTime:         2023-01-10 15:16:33                                      |
      +------------------------------------------------------------------------------------+
      | InternalTable: YES      | Size: 919                                                |
      +------------------------------------------------------------------------------------+
      | Native Columns:                                                                    |
      +------------------------------------------------------------------------------------+
      | Field           | Type       | Label | Comment                                     |
      +------------------------------------------------------------------------------------+
      | a               | string     |       |                                             |
      | b               | string     |       |                                             |
      | c               | string     |       |                                             |
      +------------------------------------------------------------------------------------+
  • Create an internal table by copying the schema from an external partitioned table. The internal table inherits the partition properties.

    1. Create the internal table from_exetbl_like.

      -- Query the external table in MaxCompute.
      SELECT * FROM max_oss_test;
      -- Result
      a    b    c
      101    1    20241108
      102    2    20241109
      103    3    20241110
      
      -- Create an internal table by using CREATE TABLE LIKE.
      CREATE TABLE from_exetbl_like LIKE max_oss_test;
      
      -- Query the new internal table.
      SELECT * FROM from_exetbl_like;
      -- Only the schema is returned because no data is copied.
      a    b    c
    2. Run DESC from_exetbl_like; to view the internal table's schema. The following output is returned.

      +------------------------------------------------------------------------------------+
      | Owner:                    ALIYUN$************                                      |
      | Project:                  ***_*****_***                                            |
      | TableComment:                                                                      |
      +------------------------------------------------------------------------------------+
      | CreateTime:               2023-01-10 15:09:47                                      |
      | LastDDLTime:              2023-01-10 15:09:47                                      |
      | LastModifiedTime:         2023-01-10 15:09:47                                      |
      +------------------------------------------------------------------------------------+
      | InternalTable: YES      | Size: 0                                                  |
      +------------------------------------------------------------------------------------+
      | Native Columns:                                                                    |
      +------------------------------------------------------------------------------------+
      | Field           | Type       | Label | Comment                                     |
      +------------------------------------------------------------------------------------+
      | a               | string     |       |                                             |
      | b               | string     |       |                                             |
      +------------------------------------------------------------------------------------+
      | Partition Columns:                                                                 |
      +------------------------------------------------------------------------------------+
      | c               | string     |                                                     |
      +------------------------------------------------------------------------------------+

Delta table

  • Create a Delta table.

    CREATE TABLE mf_tt (pk BIGINT NOT NULL PRIMARY KEY, val BIGINT) 
      TBLPROPERTIES ("transactional"="true");
  • Create a Delta table with a composite primary key and custom table properties.

    CREATE TABLE mf_tt2 ( 
      pk BIGINT NOT NULL, 
      pk2 BIGINT NOT NULL, 
      val BIGINT, 
      val2 BIGINT, 
      PRIMARY KEY (pk, pk2)
    ) 
    TBLPROPERTIES (
      "transactional"="true", 
      "write.bucket.num" = "64", 
      "acid.data.retain.hours"="120"
    ) LIFECYCLE 7;

Other creation options

Replace an existing table

  1. Create the original table my_table and insert data.

    CREATE OR REPLACE TABLE my_table(a BIGINT);
    
    INSERT INTO my_table(a) VALUES (1),(2),(3);
  2. Use OR REPLACE to create a new table with the same name and modify the columns.

    CREATE OR REPLACE TABLE my_table(b STRING);
  3. Query the my_table table. The original data is dropped, and the new table is empty. The following output is returned.

    +------------+
    | b          | 
    +------------+
    +------------+

    The following SQL statements are invalid:

    CREATE OR REPLACE TABLE IF NOT EXISTS my_table(b STRING);
    CREATE OR REPLACE TABLE my_table AS SELECT;
    CREATE OR REPLACE TABLE my_table LIKE newtable;
Copy data and set a lifecycle
-- Create a new table named sale_detail_ctas1, copy data from sale_detail into it, and set a lifecycle.
SET odps.sql.allow.fullscan=true;
CREATE TABLE sale_detail_ctas1 LIFECYCLE 10 AS SELECT * FROM sale_detail;

You can run the DESC EXTENDED sale_detail_ctas1; command to view detailed information about the table, including its schema and lifecycle.

Note

In this example, sale_detail is a partitioned table. However, the table sale_detail_ctas1, which is created using the CREATE TABLE ... AS select_statement ... statement, does not inherit partition properties. The partition columns of the source table become regular columns in the destination table. As a result, sale_detail_ctas1 is a non-partitioned table with five columns.

Use constants for columns
Note

If you use constants as column values in the SELECT clause, specify column aliases. Otherwise, MaxCompute assigns generated names to those columns, such as _c4 and _c5.

  • Specify column aliases.

    SET odps.sql.allow.fullscan=true;
    CREATE TABLE sale_detail_ctas2
    AS
    SELECT shop_name, customer_id, total_price, '2013' AS sale_date, 'China' AS region
    FROM sale_detail;
  • Do not specify column aliases.

    SET odps.sql.allow.fullscan=true;
    
    CREATE TABLE sale_detail_ctas3
    AS
    SELECT shop_name, customer_id, total_price, '2013', 'China' 
    FROM sale_detail;
Copy a schema and set a lifecycle
CREATE TABLE sale_detail_like LIKE sale_detail LIFECYCLE 10;

You can run the DESC EXTENDED sale_detail_like; command to view detailed information about the table, including its schema and lifecycle.

Note

The schema of sale_detail_like is identical to that of sale_detail. All properties, such as column names, column comments, and the table comment, are inherited, except for the lifecycle. However, the data from sale_detail is not copied to sale_detail_like.

Copy schema from an external table
-- Create a new table named mc_oss_extable_orc_like with the same schema as the external table mc_oss_extable_orc.
CREATE TABLE mc_oss_extable_orc_like LIKE mc_oss_extable_orc;

You can run the DESC mc_oss_extable_orc_like; command to view detailed information, such as the table schema.

+------------------------------------------------------------------------------------+
| Owner: ALIYUN$****@***.aliyunid.com | Project: max_compute_7u************yoq              |
| TableComment:                                                                      |
+------------------------------------------------------------------------------------+
| CreateTime:               2022-08-11 11:10:47                                      |
| LastDDLTime:              2022-08-11 11:10:47                                      |
| LastModifiedTime:         2022-08-11 11:10:47                                      |
+------------------------------------------------------------------------------------+
| InternalTable: YES      | Size: 0                                                  |
+------------------------------------------------------------------------------------+
| Native Columns:                                                                    |
+------------------------------------------------------------------------------------+
| Field           | Type       | Label | Comment                                     |
+------------------------------------------------------------------------------------+
| id              | string     |       |                                             |
| name            | string     |       |                                             |
+------------------------------------------------------------------------------------+

New data types

SET odps.sql.type.system.odps2=true;
CREATE TABLE test_newtype (
    c1 TINYINT,
    c2 SMALLINT,
    c3 INT,
    c4 BIGINT,
    c5 FLOAT,
    c6 DOUBLE,
    c7 DECIMAL,
    c8 BINARY,
    c9 TIMESTAMP,
    c10 ARRAY<MAP<BIGINT,BIGINT>>,
    c11 MAP<STRING,ARRAY<BIGINT>>,
    c12 STRUCT<s1:STRING,s2:BIGINT>,
    c13 VARCHAR(20))
LIFECYCLE 1;

Related commands

  • ALTER TABLE: Modifies a table's structure or properties.

  • TRUNCATE: Removes all data from a table.

  • DROP TABLE: Deletes a table.

  • DESC TABLE/VIEW: Displays information about a MaxCompute internal table, view, materialized view, external table, clustered table, or Transactional table.

  • SHOW: Displays the SQL DDL statement for a table, all tables and views in a project, or all partitions in a table.