This topic explains how to use the CREATE TABLE syntax in AnalyticDB for MySQL to create partitioned and replicated tables and define their distribution keys, partition keys, indexes, lifecycle, and tiered storage.
Syntax
CREATE TABLE [IF NOT EXISTS] table_name
({column_name column_type [column_attributes] [ column_constraints ] [COMMENT 'column_comment']
| table_constraints}
[, ... ])
[table_attribute]
[partition_options]
[index_all]
[storage_policy]
[block_size]
[engine]
[table_properties]
[AS query_expr]
[COMMENT 'table_comment']
column_attributes:
[DEFAULT {constant | CURRENT_TIMESTAMP}]
[AUTO_INCREMENT]
column_constraints:
[{NOT NULL|NULL} ]
[PRIMARY KEY]
table_constraints:
[{INDEX|KEY} [index_name] (column_name|column_name->'$.json_path'|column_name->'$[*]')][,...]
[FULLTEXT [INDEX|KEY] [index_name] (column_name) [index_option]] [,...]
[PRIMARY KEY [index_name] (column_name,...)]
[CLUSTERED KEY [index_name] (column_name[ASC|DESC],...) ]
[[CONSTRAINT [symbol]] FOREIGN KEY (fk_column_name) REFERENCES pk_table_name (pk_column_name)][,...]
[ANN INDEX [index_name] (column_name,...) [index_option]] [,...]
table_attribute:
DISTRIBUTED BY HASH(column_name,...) | DISTRIBUTED BY BROADCAST
partition_options:
PARTITION BY
{VALUE(column_name) | VALUE(DATE_FORMAT(column_name, 'format')) | VALUE(FROM_UNIXTIME(column_name, 'format'))}
LIFECYCLE N
index_all:
INDEX_ALL= 'Y|N'
storage_policy:
STORAGE_POLICY= {'HOT'|'COLD'|'MIXED' {hot_partition_count=N}}
block_size:
BLOCK_SIZE= VALUE
engine:
ENGINE= 'XUANWU|XUANWU_V2'
Internal tables in AnalyticDB for MySQL use the zstd compression algorithm by default.
Parameters
Parameters
Examples
Automatic partitioning by date
Create a partitioned table named sales that is automatically partitioned by the date in the sale_time column.
CREATE TABLE sales (
sale_id BIGINT NOT NULL COMMENT 'Order ID',
customer_id VARCHAR NOT NULL COMMENT 'Customer ID',
phone_num BIGINT NOT NULL COMMENT 'Customer phone number',
revenue DECIMAL(15, 2) COMMENT 'Order amount',
sale_time TIMESTAMP NOT NULL COMMENT 'Order time',
PRIMARY KEY (sale_time,sale_id)
)
DISTRIBUTED BY HASH(sale_id)
PARTITION BY VALUE(DATE_FORMAT(sale_time, '%Y%m%d'));
Partitioned table with a lifecycle
Create a partitioned table named customer. The columns login_time, customer_id, and phone_num form a composite primary key. The distribution key is customer_id, the partition key is login_time, and the lifecycle is set to 30.
Partitions are sorted in descending order by the partition key login_time (e.g., 20231202, 20231201). The system retains only the 30 partitions with the largest key values. When data is written to a 31st partition, the system automatically deletes the partition with the smallest key value.
For example, assume data is written with a login_time of 20231201 on day 1, 20231202 on day 2, and so on. When data with a login_time of 20231231 is written on day 31, the system automatically deletes the partition with the oldest data ('20231201'). This retains only data from the most recent 30 days.
CREATE TABLE customer (
customer_id BIGINT NOT NULL COMMENT 'Customer ID',
customer_name VARCHAR NOT NULL COMMENT 'Customer name',
phone_num BIGINT NOT NULL COMMENT 'Phone number',
city_name VARCHAR NOT NULL COMMENT 'City',
sex INT NOT NULL COMMENT 'Gender',
id_number VARCHAR NOT NULL COMMENT 'ID number',
home_address VARCHAR NOT NULL COMMENT 'Home address',
office_address VARCHAR NOT NULL COMMENT 'Office address',
age INT NOT NULL COMMENT 'Age',
login_time TIMESTAMP NOT NULL COMMENT 'Login time',
PRIMARY KEY (login_time,customer_id,phone_num)
)
DISTRIBUTED BY HASH(customer_id)
PARTITION BY VALUE(DATE_FORMAT(login_time, '%Y%m%d')) LIFECYCLE 30
COMMENT 'Customer information table';
Non-partitioned table
Default behavior: Primary key as distribution key
If you define a primary key for a table but do not specify a distribution key, AnalyticDB for MySQL uses the primary key as the distribution key by default.
CREATE TABLE orders (
order_id BIGINT NOT NULL COMMENT 'Order ID',
customer_id INT NOT NULL COMMENT 'Customer ID',
order_status VARCHAR(1) NOT NULL COMMENT 'Order status',
total_price DECIMAL(15, 2) NOT NULL COMMENT 'Order amount',
order_date DATE NOT NULL COMMENT 'Order date',
PRIMARY KEY(order_id,order_date)
);
Run the SHOW CREATE TABLE statement to verify the result. The output shows that the primary key columns order_id and order_date also serve as the distribution key.
SHOW CREATE TABLE orders;+---------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| orders | CREATE TABLE `orders` ( |
| | `order_id` bigint NOT NULL COMMENT 'Order ID', |
| | `customer_id` int NOT NULL COMMENT 'Customer ID', |
| | `order_status` varchar(1) NOT NULL COMMENT 'Order status', |
| | `total_price` decimal(15, 2) NOT NULL COMMENT 'Order amount', |
| | `order_date` date NOT NULL COMMENT 'Order date', |
| | PRIMARY KEY (`order_id`,`order_date`) |
| | ) DISTRIBUTED BY HASH(`order_id`,`order_date`) INDEX_ALL='Y' STORAGE_POLICY='HOT' ENGINE='XUANWU' TABLE_PROPERTIES='{"format":"columnstore"}' |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.04 sec)
Default behavior: Auto-added primary and distribution keys
If you create a table without specifying a primary key or a distribution key, AnalyticDB for MySQL automatically adds a column named __adb_auto_id__ and uses it as both the primary key and the distribution key.
CREATE TABLE orders_new (
order_id BIGINT NOT NULL COMMENT 'Order ID',
customer_id INT NOT NULL COMMENT 'Customer ID',
order_status VARCHAR(1) NOT NULL COMMENT 'Order status',
total_price DECIMAL(15, 2) NOT NULL COMMENT 'Order amount',
order_date DATE NOT NULL COMMENT 'Order date'
);
Run the SHOW CREATE TABLE statement to verify the result. The output shows that AnalyticDB for MySQL added an auto-increment column named __adb_auto_id__ and set it as both the primary key and distribution key.
SHOW CREATE TABLE orders_new;+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| orders_new | CREATE TABLE `orders_new` ( |
| | `__adb_auto_id__` bigint AUTO_INCREMENT, |
| | `order_id` bigint NOT NULL COMMENT 'Order ID', |
| | `customer_id` int NOT NULL COMMENT 'Customer ID', |
| | `order_status` varchar(1) NOT NULL COMMENT 'Order status', |
| | `total_price` decimal(15, 2) NOT NULL COMMENT 'Order amount', |
| | `order_date` date NOT NULL COMMENT 'Order date', |
| | PRIMARY KEY (`__adb_auto_id__`) |
| | ) DISTRIBUTED BY HASH(`__adb_auto_id__`) INDEX_ALL='Y' STORAGE_POLICY='HOT' ENGINE='XUANWU' TABLE_PROPERTIES='{"format":"columnstore"}' |
+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.04 sec)
Specify primary and distribution keys
Create a table named supplier. The column supplier_id is an auto-increment column. The distribution key is supplier_id, and data is sharded by hashing the supplier_id values.
CREATE TABLE supplier (
supplier_id BIGINT AUTO_INCREMENT PRIMARY KEY,
supplier_name VARCHAR,
address INT,
phone VARCHAR
)
DISTRIBUTED BY HASH(supplier_id);
Tiered storage policies
Cold storage policy
CREATE TABLE item (
order_id BIGINT NOT NULL,
item_id INT NOT NULL,
quantity DECIMAL(15, 2) NOT NULL,
discount DECIMAL(15, 2) NOT NULL,
shipdate DATE NOT NULL,
PRIMARY KEY (order_id,item_id,shipdate)
)
DISTRIBUTED BY HASH(item_id)
PARTITION BY VALUE(date_format(shipdate, '%Y%m')) LIFECYCLE 200
STORAGE_POLICY='COLD';
Hot storage policy
CREATE TABLE item (
order_id BIGINT NOT NULL,
item_id INT NOT NULL,
quantity DECIMAL(15, 2) NOT NULL,
discount DECIMAL(15, 2) NOT NULL,
shipdate DATE NOT NULL,
PRIMARY KEY (order_id,item_id,shipdate)
)
DISTRIBUTED BY HASH(item_id)
PARTITION BY VALUE(date_format(shipdate, '%Y%m')) LIFECYCLE 200
STORAGE_POLICY='HOT';
Mixed policy with 16 hot partitions
CREATE TABLE item (
order_id BIGINT NOT NULL,
item_id INT NOT NULL,
quantity DECIMAL(15, 2) NOT NULL,
discount DECIMAL(15, 2) NOT NULL,
shipdate DATE NOT NULL,
PRIMARY KEY (order_id,item_id,shipdate)
)
DISTRIBUTED BY HASH(item_id)
PARTITION BY VALUE(date_format(shipdate, '%Y%m')) LIFECYCLE 200
STORAGE_POLICY='MIXED' HOT_PARTITION_COUNT=16;
Create regular indexes on specific columns
This example creates regular indexes on the id and date columns.
CREATE TABLE index_tb (
id INT,
sales DECIMAL(15, 2),
date DATE,
INDEX (id),
INDEX (date),
PRIMARY KEY (id)
)
DISTRIBUTED BY HASH(id);
Define a clustered index
Create a clustered index named clustered_index on the quantity column.
CREATE TABLE clustered (
product_id INT,
product_name VARCHAR,
quantity INT,
price DECIMAL(10, 2),
CLUSTERED KEY INDEX clustered_index(quantity)
)
DISTRIBUTED BY HASH(product_id);
Full-text index
Create a full-text index named fidx_c on the content column.
CREATE TABLE fulltext_tb (
id INT,
content VARCHAR,
keyword VARCHAR,
FULLTEXT INDEX fidx_c(content),
PRIMARY KEY (id)
)
DISTRIBUTED BY HASH(id);
For more information about creating and modifying a full-text index, see Create a full-text index.
For more information, see Full-text search.
Define a vector index
Define short_feature (type array<smallint>) and float_feature (type array<float>) as vector columns with a dimension of 4.
Create the vector index short_feature_index on the short_feature column and float_feature_index on the float_feature column.
CREATE TABLE fact_tb (
xid BIGINT NOT NULL,
cid BIGINT NOT NULL,
uid VARCHAR NOT NULL,
vid VARCHAR NOT NULL,
wid VARCHAR NOT NULL,
short_feature array<smallint>(4),
float_feature array<float>(4),
ann index short_feature_index(short_feature),
ann index float_feature_index(float_feature),
PRIMARY KEY (xid, cid, vid)
)
DISTRIBUTED BY HASH(xid) PARTITION BY VALUE(cid) LIFECYCLE 4;
For details on vector indexing and vector search, see Vector search.
Define a foreign key
Create a table named store_returns. Use the FOREIGN KEY clause to associate the sr_item_sk column with the primary key customer_id of the customer table.
CREATE TABLE store_returns (
sr_sale_id BIGINT NOT NULL PRIMARY KEY,
sr_store_sk BIGINT,
sr_item_sk BIGINT NOT NULL,
FOREIGN KEY (sr_item_sk) REFERENCES customer (customer_id)
);
JSON array index
Create a JSON array index named idx_vj on the vj column.
CREATE TABLE json(
id INT,
vj JSON,
INDEX idx_vj(vj->'$[*]')
)
DISTRIBUTED BY HASH(id);
For details on creating and modifying JSON array indexes, see Create a JSON array index and JSON array indexes.
FAQ
Column attributes and constraints
Distribution key, partition key, and lifecycle
Indexes
Column-oriented storage
Other
Common errors
Related topics
-
To write data to a table, see INSERT INTO.
-
To insert or overwrite data in a table with query results, see INSERT SELECT FROM or INSERT OVERWRITE SELECT.
-
To import data from data sources such as ApsaraDB RDS, MaxCompute, and OSS, see data import.