In MaxCompute, tables are used to store data. When you develop, analyze, and maintain a data warehouse, you must process table data. This topic describes the operations that you can perform on tables.
The following table describes the supported operations on tables.
Operation | Description | Authorized user | Platform |
---|---|---|---|
Create a table | Creates a non-partitioned table, a partitioned table, an external table, or a clustered table. | Users who have the CreateTable permission on a project | You can execute the statements on the following platforms: |
Change the owner of a table | Changes the owner of a table. | Project owner | |
Modify the comment of a table | Modifies the comment of a table. | Users who have the Alter permission on tables | |
Change the value of LastModifiedTime | Changes the value of LastModifiedTime for a table to the current time. | ||
Modify the clustering attribute of a table | Adds or removes the clustering attribute to or from a table. | ||
Rename a table | Renames a table. | ||
Clear partition data from a non-partitioned table | Clears data from a specified non-partitioned table. | ||
Drop a table | Drops a partitioned table or a non-partitioned table. | Users who have the Drop permission on tables | |
View the information about tables or views | Views the information about MaxCompute internal tables, views, external tables, clustered tables, or transactional tables. | Users who have the Describe permission to read the metadata of a table | |
View partition information | Views the details of partitions in a table. | ||
View the CREATE TABLE statement | Views the SQL DDL statement that is used to create a specified table. | ||
Display tables and views in a project | Displays all tables and views in a project or the tables and views that meet specific rules, such as regular expressions, in a project. | Users who have the List permission on objects in a project | |
Display partitions | Displays all the partitions of a table. If the table does not exist or the table is a non-partitioned table, an error is returned. |
Create a table
Creates a non-partitioned table, a partitioned table, an external table, or a clustered table.
- Limits
- A partitioned table can have a maximum of six levels of partitions. For example, if a table uses date columns as partition key columns, the six levels of the partitions are
year/month/week/day/hour/minute
. - By default, a table can have a maximum of 60,000 partitions. You can adjust the maximum number of partitions in a table based on your business requirements.
For more information about the limits on tables, see MaxCompute SQL limits.
- A partitioned table can have a maximum of six levels of partitions. For example, if a table uses date columns as partition key columns, the six levels of the partitions are
- Syntax
-- Create a table. create [external] table [if not exists] <table_name> [(<col_name> <data_type> [not null] [default <default_value>] [comment <col_comment>], ...)] [comment <table_comment>] [partitioned by (<col_name> <data_type> [comment <col_comment>], ...)] -- Configure the shuffle and sort properties of a clustered table that you want to create. [clustered by | range clustered by (<col_name> [, <col_name>, ...]) [sorted by (<col_name> [asc | desc] [, <col_name> [asc | desc] ...])] into <number_of_buckets> buckets] -- Used only for external tables. [stored by StorageHandler] -- Used only for external tables. [with serdeproperties (options)] -- Used only for external tables. [location <osslocation>] -- Set the table to a transactional table. You can later modify or delete the data of the transactional table. Transactional tables have specific limits. Create a transactional table based on your business requirements. [tblproperties("transactional"="true")] [lifecycle <days>]; -- Create a table based on an existing table and replicate data from the existing table to the new table. Partition properties are not replicated. You can execute the CREATE TABLE statement to create a table based on an external table or an existing table of an external project that is used to implement the data lakehouse solution. create table [if not exists] <table_name> [lifecycle <days>] as <select_statement>; -- Create a table based on an existing table and replicate the schema of the existing table. You can execute the CREATE TABLE statement to create a table based on an external table or an existing table of an external project that is used to implement the data lakehouse solution. create table [if not exists] <table_name> like <existing_table_name> [lifecycle <days>];
- Parameters
- external: optional. This parameter specifies that the table you want to create is an external table.
- if not exists: optional. If you create a table by using the name of an existing table but do not specify the if not exists parameter, an error is returned. If you create a table by using the name of an existing table and specify the if not exists parameter, a success message is returned even if the schema of the existing table is different from the schema of the table that you want to create. The metadata of the existing table remains unchanged.
- table_name: required. The name of the table. The name must be 1 to 128 bytes in length, and can contain letters, digits, and underscores (_). The name must start with a letter. The name is not case-sensitive. If the value of this parameter does not meet the requirements, an error is returned.
- col_name: optional. The name of a table column. The name must be 1 to 128 bytes in length, and can contain letters, digits, and underscores (_). The name must start with a letter. The name is not case-sensitive. If the value of this parameter does not meet the requirements, an error is returned.
- col_comment: optional. The comment of a column. The comment must be a valid string that is 1 to 1,024 bytes in length. If the value of this parameter does not meet the requirements, an error is returned.
- data_type: optional. The data type of a column. The following data types are supported: BIGINT, DOUBLE, BOOLEAN, DATETIME, DECIMAL, and STRING. For more information about data types, see Data type editions.
- not null: optional. If you specify this parameter for a column, the values of the column cannot be NULL. For more information about how to modify the parameter, see Change the non-nullable property of a non-partition key column in a table.
- default_value: optional. The default value of the specified column. If a column is not specified in an
INSERT
operation, the default value is used for the column. - table_comment: optional. The comment of a table. The comment must be a valid string that is 1 to 1,024 bytes in length. If the value of this parameter does not meet the requirements, an error is returned.
- partitioned by (<col_name> <data_type> [comment <col_comment>], ...: optional. The partition fields of a partitioned table.
- col_name: the name of a partition key column. The name must be 1 to 128 bytes in length, and can contain letters, digits, and underscores (_). The name must start with a letter. The name is not case-sensitive. If the value of this parameter does not meet the requirements, an error is returned.
- data_type: the data type of a partition key column. In the MaxCompute V1.0 data type edition, partition key columns must be of the STRING type. In the MaxCompute V2.0 data type edition, partition key columns can be of the TINYINT, SMALLINT, INT, BIGINT, VARCHAR, or STRING type. For more information, see Data type editions. If you use a partition field to partition a table, a full table scan is not required when you add partitions, update partition data, or read partition data. This improves the efficiency of data processing.
- col_comment: the comment of a partition key column. The comment must be a valid string that is 1 to 1,024 bytes in length. If the value of this parameter does not meet the requirements, an error is returned.
Note The value of a partition key column cannot contain double-byte characters, such as Chinese characters. It must start with a letter and can contain letters, digits, and supported special characters. It must be 1 to 255 bytes in length. The following special characters are supported: spaces, colons (:), underscores (_), dollar signs ($), number signs (#), periods (.), exclamation points (!), and at signs (@). The behavior of other characters is not defined, such as escaped characters\t
,\n
, and/
. - clustered by | range clustered by (<col_name> [, <col_name>, ...]) [sorted by (<col_name> [asc | desc] [, <col_name> [asc | desc] ...])] into <number_of_buckets> buckets: optional. The shuffle and sort properties of the clustered table that you want to create. Clustered tables are classified into hash-clustered tables and range-clustered tables.
- Hash-clustered tables
- CLUSTERED BY: the hash key. MaxCompute performs a hash operation on specified columns and distributes data to each bucket based on the hash values. To prevent data skew and hot spots and to improve the efficiency of concurrent executions, we recommend that you specify columns with a large value range and a small number of duplicate key values in
CLUSTERED BY
. To optimize the performance ofJOIN
operations, we recommend that you select commonly used join or aggregate keys that are similar to primary keys in conventional databases. - SORTED BY: specifies how to sort fields in a bucket. To improve performance, we recommend that you keep the configuration of the SORTED BY clause consistent with that of the CLUSTERED BY clause. After you specify fields in the SORTED BY clause, MaxCompute automatically generates indexes, which can be used to accelerate data queries.
- number_of_buckets: the number of hash buckets. This parameter is required and the value of this parameter varies based on the amount of data. By default, MaxCompute supports a maximum of 1,111 reducers. This means that MaxCompute supports a maximum of 1,111 hash buckets. You can run the
set odps.stage.reducer.num =<Number of concurrent reducers
command to increase the maximum number of hash buckets. However, the maximum number of hash buckets cannot exceed 4,000. Otherwise, performance may be affected.To maintain optimal performance, we recommend that you take note of the following rules when you specify the number of hash buckets:- Keep the size of each hash bucket around 500 MB. For example, if the partition size is 500 GB, we recommend that you add 1,000 hash buckets. This way, the size of each hash bucket is 500 MB on average. If a table contains a large amount of data, you can increase the size of each hash bucket from 500 MB to a size in the range of 2 GB to 3 GB. You can also execute the
set odps.stage.reducer.num =<Number of concurrent reducers>;
statement to set the maximum number of hash buckets to a value greater than 1111. - To optimize the performance of
JOIN
operations, we recommend that you do not configure the shuffle and sort properties for hash-clustered tables. The number of hash buckets of a table must be a multiple of the number of hash buckets of the other table. For example, one table has 256 hash buckets and the other table has 512 hash buckets. We recommend that you set the number of hash buckets to 2n, such as 512, 1024, 2048, and 4096. This way, MaxCompute can automatically split and merge hash buckets. To improve execution efficiency, you can also skip the step of configuring the shuffle and sort properties for hash-clustered tables.
- Keep the size of each hash bucket around 500 MB. For example, if the partition size is 500 GB, we recommend that you add 1,000 hash buckets. This way, the size of each hash bucket is 500 MB on average. If a table contains a large amount of data, you can increase the size of each hash bucket from 500 MB to a size in the range of 2 GB to 3 GB. You can also execute the
- CLUSTERED BY: the hash key. MaxCompute performs a hash operation on specified columns and distributes data to each bucket based on the hash values. To prevent data skew and hot spots and to improve the efficiency of concurrent executions, we recommend that you specify columns with a large value range and a small number of duplicate key values in
- Range-clustered tables
- RANGE CLUSTERED BY: the range-clustered columns. MaxCompute performs the bucket operation on the specified columns and distributes data to each bucket based on the bucket ID.
- SORTED BY: the sequence of fields in a bucket. You can use this parameter in the same way as you use it for a hash-clustered table.
- number_of_buckets: the number of hash buckets. Compared with hash-clustered tables, range-clustered tables have no limits on the number of buckets when data is evenly distributed. If you do not specify the number of buckets in a range-clustered table, MaxCompute automatically determines the optimal number based on the amount of data.
- If JOIN and AGGREGATE operations are performed on range-clustered tables and the join key or group key is the range-clustered key or the prefix of the range-clustered key, you can manage flags to disable shuffling. This improves execution efficiency. You can set
odps.optimizer.enable.range.partial.repartitioning
to true to enable shuffling. By default, this parameter is set to false, which indicates that shuffling is disabled.Note- Clustered tables help optimize the following aspects:
- Bucket pruning
- Aggregation
- Storage
- Limits on clustered tables
- The
INSERT INTO
statement is not supported. You can add data only by using theINSERT OVERWRITE
statement. - The data that is imported by using Tunnel commands is not arranged in order. Therefore, you cannot import data into a range-clustered table by using Tunnel commands.
- The data backup and restoration feature is not supported.
- The
- Clustered tables help optimize the following aspects:
- Hash-clustered tables
- stored by StorageHandler: optional. The StorageHandler specified based on the data format of the external table.
- with serdeproperties (options): optional. The parameters related to the authorization, compression, and character parsing of the external table.
- osslocation: optional. The Object Storage Service (OSS) bucket where the data of the external table is stored. For more information, see Create an OSS external table.
- tblproperties("transactional"="true"): optional. Set the table to a transactional table. You can later perform the
UPDATE
orDELETE
operation on the transactional table to update or delete data by rows. For more information, see UPDATE and DELETE.A transactional table has the following limits:- MaxCompute allows you to set a table to a transactional table only when you create the table. If you execute the
ALTER TABLE
statement to change an existing table to a transactional table, an error is returned.alter table not_txn_tbl set tblproperties("transactional"="true"); -- The following error is returned: FAILED: Catalog Service Failed, ErrorCode: 151, Error Message: Set transactional is not supported
- When you create a clustered table or an external table, you cannot set the clustered table or external table to a transactional table.
- You cannot convert between transactional tables and MaxCompute internal tables, external tables, or clustered tables.
- Jobs from other systems, such as MaxCompute Spark, Machine Learning Platform for AI, and Graph, cannot access transactional tables.
CLONE TABLE
andMERGE PARTITION
operations are not supported.- Before you execute the
UPDATE
,DELETE
, orINSERT OVERWRITE
statement for important data in transactional tables, you must execute theSELECT
andINSERT
statements to back up the data to other tables.
- MaxCompute allows you to set a table to a transactional table only when you create the table. If you execute the
- lifecycle: optional. The lifecycle of the table. The value must be a positive integer. Unit: days.
- Non-partitioned tables: If data in a non-partitioned table remains unchanged for the number of days specified by days after the last data update, MaxCompute executes a statement, such as DROP TABLE, to reclaim the table.
- Partitioned tables: MaxCompute determines whether to reclaim a partition based on the value of LastModifiedTime. Unlike non-partitioned tables, a partitioned table is not deleted even if all of its partitions have been reclaimed. You can configure lifecycles for tables, but not for partitions.
- You can execute the
create table [if not exists] <table_name> [lifecycle <days>] as <select_statement>;
statement to create a table and replicate data to the table. However, partition properties and the lifecycle attribute of the source table are not replicated to the created table. The partition key columns of the source table are considered common columns in the created table. You can also execute this statement to create an internal table and replicate the schema of an external table to the internal table. You can configure the lifecycle parameter to reclaim the table. You can also execute this statement to create an internal table and replicate data of an external table to the internal table. - You can execute the
create table [if not exists] <table_name> like <existing_table_name> [lifecycle <days>];
statement to create a table that has the same schema as the source table. However, tables that are created by using this statement do not replicate data or replicate the lifecycle attribute of the source table. You can configure the lifecycle parameter to reclaim the table. You can also execute this statement to create an internal table that has the same schema as an existing external table.
- Examples:
- Example 1: Create a non-partitioned table named test1.
create table test1 (key STRING);
- Example 2: Create a partitioned table named sale_detail.
create table if not exists sale_detail( shop_name STRING, customer_id STRING, total_price DOUBLE) partitioned by (sale_date STRING, region STRING);
- Example 3: Create a table named sale_detail_ctas1, replicate data from the sale_detail table to the sale_detail_ctas1 table, and then configure the lifecycle for the sale_detail_ctas1 table.
create table sale_detail_ctas1 lifecycle 10 as select * from sale_detail;
You can run the
desc extended sale_detail_ctas1;
command to view table details, such as the schema and lifecycle of a table.The
sale_detail
table is a partitioned table, but thesale_detail_ctas1
table that is created by usingcreate table ... as select_statement ...
does not replicate partition properties. The partition key columns of the source table are considered common columns in the created table. Thesale_detail_ctas1
is a non-partitioned table that has five columns. - Example 4: Create the sale_detail_ctas2 table and use constants as column values in the
SELECT
clause.-- Column names are specified. create table sale_detail_ctas2 as select shop_name, customer_id, total_price, '2013' as sale_date, 'China' as region from sale_detail; -- Column names are not specified. create table sale_detail_ctas3 as select shop_name, customer_id, total_price, '2013', 'China' from sale_detail;
Note If you use constants as column values in theSELECT
clause, we recommend that you specify column names. In this example, the names of the fourth and fifth columns in the sale_detail_ctas3 table contain suffixes that are similar to_c4
and_c5
. - Example 5: Create a table named sale_detail_like that uses the same schema as the sale_detail table and configure the lifecycle for the sale_detail_like table.
create table sale_detail_like like sale_detail lifecycle 10;
You can run the
desc extended sale_detail_like;
command to view table details, such as the schema and lifecycle of a table.The schema of the sale_detail_like table is the same as that of the sale_detail table. The two tables have the same properties, such as column names, column comments, and table comments, aside from the lifecycle. However, data in the sale_detail table is not replicated to the sale_detail_like table.
- Example 6: Create a table named mc_oss_extable_orc_like that uses the same schema as the external table mc_oss_extable_orc.
create table mc_oss_extable_orc_like like mc_oss_extable_orc;
You can run thedesc mc_oss_extable_orc_like;
command to view table details, such as the schema of a table.+------------------------------------------------------------------------------------+ | Owner: ALIYUN$****@***.aliyunid.com | Project: max_compute_7u************yoq | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2022-08-11 11:10:47 | | LastDDLTime: 2022-08-11 11:10:47 | | LastModifiedTime: 2022-08-11 11:10:47 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | Comment | +------------------------------------------------------------------------------------+ | id | string | | | | name | string | | | +------------------------------------------------------------------------------------+
- Example 7: Create a table named test_newtype that uses new data types.
set odps.sql.type.system.odps2=true; CREATE TABLE test_newtype ( c1 TINYINT ,c2 SMALLINT ,c3 INT ,c4 BIGINT ,c5 FLOAT ,c6 DOUBLE ,c7 DECIMAL ,c8 BINARY ,c9 TIMESTAMP ,c10 ARRAY<MAP<BIGINT,BIGINT>> ,c11 MAP<STRING,ARRAY<BIGINT>> ,c12 STRUCT<s1:STRING,s2:BIGINT> ,c13 VARCHAR(20)) LIFECYCLE 1 ;
- Example 8: Create a hash-clustered table named t1. This table is a non-partitioned table.
create table t1 (a STRING, b STRING, c BIGINT) clustered by (c) sorted by (c) into 1024 buckets;
- Example 9: Create a hash-clustered table named t2. This table is a partitioned table.
create table t2 (a STRING, b STRING, c BIGINT) partitioned by (dt STRING) clustered by (c) sorted by (c) into 1024 buckets;
- Example 10: Create a range-clustered table named t3. This table is a non-partitioned table.
create table t3 (a STRING, b STRING, c BIGINT) range clustered by (c) sorted by (c) into 1024 buckets;
- Example 11: Create a range-clustered table named t4. This table is a partitioned table.
create table t4 (a STRING, b STRING, c BIGINT) partitioned by (dt STRING) range clustered by (c) sorted by (c);
- Example 12: Create a transactional table named t5. This table is a non-partitioned table.
create table t5(id bigint) tblproperties("transactional"="true");
- Example 13: Create a transactional table t6. This table is a partitioned table.
create table if not exists t6(id bigint) partitioned by(ds string) tblproperties ("transactional"="true");
- Example 14: Create a non-partitioned table named test_default and configure default values for the parameters.
create table test_default ( tinyint_name tinyint not NULL default 1Y, smallint_name SMALLINT not NULL DEFAULT 1S, int_name INT not NULL DEFAULT 1, bigint_name BIGINT not NULL DEFAULT 1, binary_name BINARY , float_name FLOAT , double_name DOUBLE not NULL DEFAULT 0.1, decimal_name DECIMAL(2, 1) not NULL DEFAULT 0.0BD, varchar_name VARCHAR(10) , char_name CHAR(2) , string_name STRING not NULL DEFAULT 'N', boolean_name BOOLEAN not NULL DEFAULT TRUE );
- Example 15: Execute the
CREATE TABLE [if not exists] <table_name> [lifecycle <days>] as <select_statement>;
statement to create an internal table and replicate data of an partitioned external table to the internal table. Partition properties are not replicated to the internal table.-- Query the external table of an external project that is used to implement the data lakehouse solution in MaxCompute. select * from hive_external2_1.myhive_0110; -- The following result is returned: a b c 101 1 20230110 102 2 20230110 103 3 20230110 -- Execute the CREATE TABLE AS statement to create an internal table. create table from_exetbl_as_par as select * from hive_external2_1.myhive_0110_par; -- Query the internal table. select * from from_exetbl_as_par; -- All data in the internal table is returned. a b c 101 1 20230110 102 2 20230110 103 3 20230110 -- Query the schema of the internal table. desc from_exetbl_as_par; -- The following result is returned: +------------------------------------------------------------------------------------+ | Owner: ALIYUN$*********** | | Project: ***_*****_*** | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2023-01-10 15:16:33 | | LastDDLTime: 2023-01-10 15:16:33 | | LastModifiedTime: 2023-01-10 15:16:33 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 919 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | Comment | +------------------------------------------------------------------------------------+ | a | string | | | | b | string | | | | c | string | | | +------------------------------------------------------------------------------------+
- Example 16: Execute the
CREATE TABLE [if not exists] <table_name> like <existing_table_name> [lifecycle <days>];
statement to create an internal table and replicate the schema of a partitioned external table to the internal table. Partition properties are replicated to the internal table.-- Query the external table of an external project that is used to implement the data lakehouse solution in MaxCompute. select * from hive_external2_1.myhive_0110_par; -- The following result is returned: a b c 101 1 20230110 102 2 20230110 103 3 20230110 -- Execute the CREATE TABLE LIKE statement to create an internal table. create table from_exetbl_like like hive_external2_1.myhive_0110_par; -- Query the internal table. select * from from_exetbl_like_par; -- Only the schema of the internal table is returned. a b c -- Query the schema of the internal table. desc from_exetbl_like; -- The following result is returned: +------------------------------------------------------------------------------------+ | Owner: ALIYUN$************ | | Project: ***_*****_*** | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2023-01-10 15:09:47 | | LastDDLTime: 2023-01-10 15:09:47 | | LastModifiedTime: 2023-01-10 15:09:47 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | Comment | +------------------------------------------------------------------------------------+ | a | string | | | | b | string | | | +------------------------------------------------------------------------------------+ | Partition Columns: | +------------------------------------------------------------------------------------+ | c | string | | +------------------------------------------------------------------------------------+
- Example 1: Create a non-partitioned table named test1.
Change the owner of a table
Changes the owner of a table.
- Syntax
alter table <table_name> changeowner to <new_owner>;
- Parameters
- table_name: required. The name of the table whose owner you want to change.
- new_owner: required. The new owner of the table. If you want to change the owner of a table to a RAM user, you must set this parameter to a value in the
RAM$<UID>:<ram_name>
format. UID indicates the ID of the Alibaba Cloud account and ram_name indicates the display name of the RAM user.Note If you want to change the owner of a table to a RAM user, make sure that the RAM user is added to the project to which the table belongs.
- Examples
- Change the owner of the test1 table to ALIYUN$xxx@aliyun.com.
alter table test1 changeowner to 'ALIYUN$xxx@aliyun.com';
- Change the owner of the test1 table to a RAM user named
ram_test
.alter table test1 changeowner to 'RAM$13xxxxxxxxxxx:ram_test';
- Change the owner of the test1 table to ALIYUN$xxx@aliyun.com.
Modify the comment of a table
Modifies the comment of a table.
- Syntax
alter table <table_name> set comment '<new_comment>';
- Parameters
- table_name: required. The name of the table whose comment you want to modify.
- new_comment: required. The new comment of the table.
- Examples
You can execute thealter table sale_detail set comment 'new coments for table sale_detail';
DESC table_name
statement of MaxCompute to view the modification result of thecomment
in the table.
Change the value of LastModifiedTime
Changes the value of LastModifiedTime
for a table to the current time. MaxCompute SQL allows you to execute the TOUCH
statement to change the value of LastModifiedTime
. After you execute this statement to change the value of LastModifiedTime
, MaxCompute determines that the table data has changed, and restarts the lifecycle of the table from the time that is specified by LastModifiedTime.
- Syntax
alter table <table_name> touch;
- Parameters
table_name: required. The name of the table whose LastModifiedTime you want to modify.
- Examples
alter table sale_detail touch;
Modify the clustering attribute of a table
Adds or removes the clustering attribute to or from a table. MaxCompute allows you to add or remove the clustering attribute to or from a table by executing the ALTER TABLE
statement.
- Syntax
- Syntax of the statement that is used to add the hash clustering attribute for a table:
alter table <table_name> [clustered by (<col_name> [, <col_name>, ...]) [sorted by (<col_name> [asc | desc] [, <col_name> [asc | desc] ...])] into <number_of_buckets> buckets];
- Syntax of the statement that is used to remove the hash clustering attribute from a table:
alter table <table_name> not clustered;
- If you do not specify the number of buckets in a range-clustered table, MaxCompute automatically determines the optimal number based on the amount of data. Syntax:
alter table <table_name> [range clustered by (<col_name> [, <col_name>, ...]) [sorted by (<col_name> [asc | desc] [, <col_name> [asc | desc] ...])] into <number_of_buckets> buckets];
- Syntax of the statement that is used to remove the range clustering attribute from a table or partition:
alter table <table_name> not clustered; alter table <table_name> <pt_spec> not clustered;
Note- The
ALTER TABLE
statement can modify the clustering attribute only for a partitioned table. The clustering attribute of a non-partitioned table cannot be modified after the table is created. TheALTER TABLE
statement is suitable for existing tables. After you specify the clustering attribute, new partitions are stored based on the clustering attribute that you specified. ALTER TABLE
takes effect only on the new partitions in a partitioned table. The new partitions include those generated by usingINSERT OVERWRITE
and are stored based on the new clustering attribute. The clustering attribute and storage method remain unchanged for the original partitions. After you specify the clustering properties for a table, you can remove the clustering properties and add clustering properties for the table again. You can specify different clustering columns, sort columns, and numbers of buckets for new partitions.ALTER TABLE
takes effect only on the new partitions. Therefore, this statement cannot be used to specify partitions.
- The
- Syntax of the statement that is used to add the hash clustering attribute for a table:
- Parameters
For more information about DataWorks, see Create a table.
Rename a table
Renames a table. After you rename a table, only the name of the table is changed. Data in the table is not changed.
- Syntax
alter table <table_name> rename to <new_table_name>;
- Parameters
- table_name: required. The name of the table that you want to rename.
- new_table_name: required. The new name of the table. If the name specified by the new_table_name parameter exists, an error is returned.
- Examples
alter table sale_detail rename to sale_detail_rename;
Clear partition data from a non-partitioned table
Clears partition data from a specified non-partitioned table. For more information about how to clear data from one or more partitions in a partitioned table, see Clear data from a partition.
- Syntax
truncate table <table_name>;
- Parameters
table_name: required. The name of the non-partitioned table whose partition data you want to clear.
Drop a table
Drops a non-partitioned table or partitioned table.
- Usage notes
- Before you drop a table, confirm that the table can be dropped. Proceed with caution. If you accidentally drop a table, you can restore the table if the backup and restoration feature is enabled for the project and the table is dropped within the backup data retention period specified for the project. For more information about the backup and restoration feature, see Backup and restoration.
- After you drop a table, the volume of stored data in a MaxCompute project decreases.
- Syntax
drop table [if exists] <table_name>;
- Parameters
- if exists: optional. If you do not specify the if exists parameter and the table that you want to drop does not exist, an error is returned. If you specify the if exists parameter, a success message is returned regardless of whether the table exists.
- table_name: required. The name of the table that you want to drop.
- Examples
-- Drop the sale_detail table. A success message is returned regardless of whether the sale_detail table exists. drop table if exists sale_detail;
View the information about tables or views
Views the information about MaxCompute internal tables, views, external tables, clustered tables, or transactional tables. For more information about how to view detailed table information, see SELECT syntax.
- Syntax
-- View the information about a table or view. desc <table_name|view_name> [partition (<pt_spec>)]; -- View the information about an external table, a clustered table, or a transactional table. You can also execute this statement to view extended information about an internal table. desc extended <table_name>;
- Parameters
- table_name: required. The name of the table that you want to view.
- view_name: required. The name of the view whose information you want to view.
- pt_spec: optional. The partition in the partitioned table that you want to view. The value of this parameter is in the
(partition_col1 = partition_col_value1, partition_col2 = partition_col_value2, ...)
format. - extended: This parameter is required if the table is an external table, a clustered table, or a transactional table. This parameter is used to query extended information about a table. You can also use this parameter to view extended information about an internal table, such as whether a column of the internal table can contain NULL values.
- Examples
- Example 1: View the information about the test1 table.
The following result is returned:desc test1;
+------------------------------------------------------------------------------------+ | Owner: ALIYUN$maoXXX@alibaba-inc.com | Project: $project_name | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2020-11-16 17:47:48 | | LastDDLTime: 2020-11-16 17:47:48 | | LastModifiedTime: 2020-11-16 17:47:48 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | Comment | +------------------------------------------------------------------------------------+ | key | string | | | +------------------------------------------------------------------------------------+
- Example 2: View the information about the sale_detail table.
desc sale_detail;
The following result is returned:+--------------------------------------------------------------------+ | Owner: ALIYUN$maoXXX@alibaba-inc.com | Project: $project_name | | TableComment: | +--------------------------------------------------------------------+ | CreateTime: 2017-06-28 15:05:17 | | LastDDLTime: 2017-06-28 15:05:17 | | LastModifiedTime: 2017-06-28 15:05:17 | +--------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +--------------------------------------------------------------------+ | Native Columns: | +--------------------------------------------------------------------+ | Field | Type | Label | Comment | +--------------------------------------------------------------------+ | shop_name | string | | | | customer_id | string | | | | total_price | double | | | +--------------------------------------------------------------------+ | Partition Columns: | +--------------------------------------------------------------------+ | sale_date | string | | | region | string | | +--------------------------------------------------------------------+
- Example 3: View the detailed information about the sale_detail_ctas1 table.
The following result is returned:desc extended sale_detail_ctas1;
The sale_date and region columns are considered as common columns. They are not partition key columns.+------------------------------------------------------------------------------------+ | Owner: ALIYUN$maoXXX@alibaba-inc.com | Project: $project_name | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2021-07-07 15:29:53 | | LastDDLTime: 2021-07-07 15:29:53 | | LastModifiedTime: 2021-07-07 15:29:53 | | Lifecycle: 10 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | ExtendedLabel | Nullable | DefaultValue | Comment | +------------------------------------------------------------------------------------+ | shop_name | string | | | true | NULL | | | customer_id | string | | | true | NULL | | | total_price | double | | | true | NULL | | | sale_date | string | | | true | NULL | | | region | string | | | true | NULL | | +------------------------------------------------------------------------------------+ | Extended Info: | +------------------------------------------------------------------------------------+ | TableID: 98cb8a38733c49eabed4735173818147 | | IsArchived: false | | PhysicalSize: 0 | | FileNum: 0 | | StoredAs: AliOrc | | CompressionStrategy: normal | +------------------------------------------------------------------------------------+
- Example 4: View the information about the sale_detail_ctas2 table.
desc sale_detail_ctas2;
The following result is returned:+--------------------------------------------------------------------+ | Owner: ALIYUN$xxxxx@alibaba-inc.com | Project: $project_name | | TableComment: | +--------------------------------------------------------------------+ | CreateTime: 2017-06-28 15:42:17 | | LastDDLTime: 2017-06-28 15:42:17 | | LastModifiedTime: 2017-06-28 15:42:17 | +--------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +--------------------------------------------------------------------+ | Native Columns: | +--------------------------------------------------------------------+ | Field | Type | Label | Comment | +--------------------------------------------------------------------+ | shop_name | string | | | | customer_id | string | | | | total_price | double | | | | sale_date | string | | | | region | string | | | +--------------------------------------------------------------------+
- Example 5: View the details about the sale_detail_like table.
desc extended sale_detail_like;
The following result is returned:
Aside from the lifecycle configuration, the properties, such as field types and partition types, of the sale_detail_like table are the same as those of the sale_detail table.+------------------------------------------------------------------------------------+ | Owner: ALIYUN$xxxxx@alibaba-inc.com | Project: $project_name | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2021-07-07 15:40:38 | | LastDDLTime: 2021-07-07 15:40:38 | | LastModifiedTime: 2021-07-07 15:40:38 | | Lifecycle: 10 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | ExtendedLabel | Nullable | DefaultValue | Comment | +------------------------------------------------------------------------------------+ | shop_name | string | | | true | NULL | | | customer_id | string | | | true | NULL | | | total_price | double | | | true | NULL | | +------------------------------------------------------------------------------------+ | Partition Columns: | +------------------------------------------------------------------------------------+ | sale_date | string | | | region | string | | +------------------------------------------------------------------------------------+ | Extended Info: | +------------------------------------------------------------------------------------+ | TableID: 61782ff7713f426e9d6f91d5deeac99a | | IsArchived: false | | PhysicalSize: 0 | | FileNum: 0 | | StoredAs: AliOrc | | CompressionStrategy: normal | +------------------------------------------------------------------------------------+
Note The data size in the output of theDESC table_name
command includes the data size of the recycle bin. If you want to clear the recycle bin, execute thePURGE TABLE table_name
statement. Then, execute theDESC table_name
statement to view the size of data that excludes the size of data in the recycle bin. You can also execute theSHOW RECYCLEBIN
statement to view the details about data in the recycle bin for the current project. - Example 6: View the information about the test_newtype table.
The following result is returned:desc test_newtype;
| Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | Comment | +------------------------------------------------------------------------------------+ | c1 | tinyint | | | | c2 | smallint | | | | c3 | int | | | | c4 | bigint | | | | c5 | float | | | | c6 | double | | | | c7 | decimal | | | | c8 | binary | | | | c9 | timestamp | | | | c10 | array<map<bigint,bigint>> | | | | c11 | map<string,array<bigint>> | | | | c12 | struct<s1:string,s2:bigint> | | | | c13 | varchar(20) | | | +------------------------------------------------------------------------------------+ OK
- Example 7: View the information of the t1 hash-clustered table. This table is a non-partitioned table. The clustering attribute is displayed in Extended Info.
The following result is returned:desc extended t1;
+------------------------------------------------------------------------------------+ | Owner: ALIYUN$xxxxx@alibaba-inc.com | Project: $project_name | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2020-11-16 18:00:56 | | LastDDLTime: 2020-11-16 18:00:56 | | LastModifiedTime: 2020-11-16 18:00:56 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | ExtendedLabel | Nullable | DefaultValue | Comment | +------------------------------------------------------------------------------------+ | a | string | | | true | NULL | | | b | string | | | true | NULL | | | c | bigint | | | true | NULL | | +------------------------------------------------------------------------------------+ | Extended Info: | +------------------------------------------------------------------------------------+ | TableID: e6b06f705dc34a36a5b72e5af486cab7 | | IsArchived: false | | PhysicalSize: 0 | | FileNum: 0 | | StoredAs: AliOrc | | CompressionStrategy: normal | | ClusterType: hash | | BucketNum: 1024 | | ClusterColumns: [c] | | SortColumns: [c ASC] | +------------------------------------------------------------------------------------+ OK
- Example 8: View the information about the t2 hash-clustered table. This table is a partitioned table. The clustering attribute is displayed in Extended Info.
The following result is returned:desc extended t2;
+------------------------------------------------------------------------------------+ | Owner: ALIYUN$xxxxx@alibaba-inc.com | Project: $project_name | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2017-12-25 11:18:26 | | LastDDLTime: 2017-12-25 11:18:26 | | LastModifiedTime: 2017-12-25 11:18:26 | | Lifecycle: 2 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | Comment | +------------------------------------------------------------------------------------+ | a | string | | | | b | string | | | | c | bigint | | | +------------------------------------------------------------------------------------+ | Partition Columns: | +------------------------------------------------------------------------------------+ | dt | string | | +------------------------------------------------------------------------------------+ | Extended Info: | +------------------------------------------------------------------------------------+ | TableID: 91a3395d3ef64b4d9ee1d2852755 | | IsArchived: false | | PhysicalSize: 0 | | FileNum: 0 | | ClusterType: hash | | BucketNum: 1024 | | ClusterColumns: [c] | | SortColumns: [c ASC] | +------------------------------------------------------------------------------------+ OK
- Example 9: View the information about the t3 range-clustered table. This table is a non-partitioned table. The clustering attribute is displayed in Extended Info.
The following result is returned:desc extended t3;
+------------------------------------------------------------------------------------+ | Owner: ALIYUN$xxxxx@alibaba-inc.com | Project: $project_name | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2020-11-16 18:01:05 | | LastDDLTime: 2020-11-16 18:01:05 | | LastModifiedTime: 2020-11-16 18:01:05 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | ExtendedLabel | Nullable | DefaultValue | Comment | +------------------------------------------------------------------------------------+ | a | string | | | true | NULL | | | b | string | | | true | NULL | | | c | bigint | | | true | NULL | | +------------------------------------------------------------------------------------+ | Extended Info: | +------------------------------------------------------------------------------------+ | TableID: 38d170aca2684f4baadbbe1931a6ae1f | | IsArchived: false | | PhysicalSize: 0 | | FileNum: 0 | | StoredAs: AliOrc | | CompressionStrategy: normal | | ClusterType: range | | BucketNum: 1024 | | ClusterColumns: [c] | | SortColumns: [c ASC] | +------------------------------------------------------------------------------------+ OK
- Example 10: View the information about the t4 range-clustered table. This table is a partitioned table. The clustering attribute is displayed in Extended Info.
The following result is returned:desc extended t4;
+------------------------------------------------------------------------------------+ | Owner: ALIYUN$xxxxx@alibaba-inc.com | Project: $project_name | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2020-11-16 19:17:48 | | LastDDLTime: 2020-11-16 19:17:48 | | LastModifiedTime: 2020-11-16 19:17:48 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | ExtendedLabel | Nullable | DefaultValue | Comment | +------------------------------------------------------------------------------------+ | a | string | | | true | NULL | | | b | string | | | true | NULL | | | c | bigint | | | true | NULL | | +------------------------------------------------------------------------------------+ | Partition Columns: | +------------------------------------------------------------------------------------+ | dt | string | | +------------------------------------------------------------------------------------+ | Extended Info: | +------------------------------------------------------------------------------------+ | TableID: 6ebc3432e283449188c861427bcd6ee4 | | IsArchived: false | | PhysicalSize: 0 | | FileNum: 0 | | StoredAs: AliOrc | | CompressionStrategy: normal | | ClusterType: range | | BucketNum: 0 | | ClusterColumns: [c] | | SortColumns: [c ASC] | +------------------------------------------------------------------------------------+ OK
- Example 11: Check whether the t5 non-partitioned table is a transactional table. Note We recommend that you use the MaxCompute client to check whether a table is a transactional table. The version of the MaxCompute client must be V0.35.4 or later. For more information about how to download and use the MaxCompute client, see MaxCompute client. Other tools may not be updated and do not display transactional information.
The following result is returned:desc extended t5;
+------------------------------------------------------------------------------------+ | Owner: ALIYUN$xxxxx@aliyun.com | Project: $project_name | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2021-02-18 10:56:27 | | LastDDLTime: 2021-02-18 10:56:27 | | LastModifiedTime: 2021-02-18 10:56:27 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | ExtendedLabel | Nullable | DefaultValue | Comment | +------------------------------------------------------------------------------------+ | id | bigint | | | true | NULL | | +------------------------------------------------------------------------------------+ | Extended Info: | +------------------------------------------------------------------------------------+ ... | Transactional: true | +------------------------------------------------------------------------------------+
- Example 12: Check whether the partitioned table t6 is a transactional table. Sample statement: Note We recommend that you use the MaxCompute client to check whether a table is a transactional table. The version of the MaxCompute client must be V0.35.4 or later. For more information about how to download and use the MaxCompute client, see MaxCompute client. Other tools may not be updated and do not display transactional information.
The following result is returned:desc extended t6;
+------------------------------------------------------------------------------------+ | Owner: ALIYUN$xxxxx@test.aliyunid.com | Project: $project_name | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2021-02-18 15:34:54 | | LastDDLTime: 2021-02-18 15:34:54 | | LastModifiedTime: 2021-02-18 15:34:54 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | Comment | +------------------------------------------------------------------------------------+ | id | bigint | | | +------------------------------------------------------------------------------------+ | Partition Columns: | +------------------------------------------------------------------------------------+ | ds | string | | +------------------------------------------------------------------------------------+ | Extended Info: | +------------------------------------------------------------------------------------+ ... | Transactional: true | +------------------------------------------------------------------------------------+
- Example 1: View the information about the test1 table.
View partition information
Views the partition information about a partitioned table.
- Syntax
desc <table_name> partition (<pt_spec>);
- Parameters
- table_name: required. The name of the partitioned table whose partition information you want to view.
- pt_spec: required. The information about the partition that you want to view. The value of this parameter is in the
partition_col1=col1_value1, partition_col2=col2_value1...
format. If a table has multi-level partitions, you must specify the values of all the partition key columns.
- Examples
The following result is returned:-- Query partition information about the partitioned table sale_detail. desc sale_detail partition (sale_date='201310',region='beijing');
+------------------------------------------------------------------------------------+ | PartitionSize: 2109112 | +------------------------------------------------------------------------------------+ | CreateTime: 2015-10-10 08:48:48 | | LastDDLTime: 2015-10-10 08:48:48 | | LastModifiedTime: 2015-10-11 01:33:35 | +------------------------------------------------------------------------------------+ OK
View the CREATE TABLE statement
Views the CREATE TABLE statement that is used to create a table. This helps you recreate a schema of the table by using SQL statements.
- Syntax
show create table <table_name>;
- Parameters
table_name: required. The name of the table for which you want to view the CREATE TABLE statement.
- Examples
The following result is returned:-- View the CREATE TABLE statement that is used to create the sale_detail table. show create table sale_detail;
CREATE TABLE IF NOT EXISTS doc_test_dev.sale_detail(shop_name STRING,customer_id STRING,total_price DOUBLE) PARTITIONED BY (sale_date STRING,region STRING) STORED AS ALIORC;
Display tables and views in a project
Displays all the tables and views or the tables and views that meet specific rules in a project.
- Syntax
-- Display all tables and views in a project. show tables; -- Display the tables or views whose names contain the
chart
keyword in a project. show tables like '<chart>'; - Examples
The following result is returned:-- Display the tables whose names contain the
sale*
keyword in a project. The asterisk (*) indicates any character. show tables like 'sale*';ALIYUN$account_name:sale_detail ...... -- ALIYUN is a system prompt, which indicates that the table is created by using an Alibaba Cloud account. If the table was created by a RAM user, the system prompt is RAM.
Display partitions
Displays all the partitions of a table. If the table does not exist or the table is a non-partitioned table, an error is returned.
- Syntax
show partitions <table_name>;
- Parameters
table_name: required. The name of the partitioned table whose partition information you want to view.
- Examples
The following result is returned:-- Display all the partitions of the sale_detail table. show partitions sale_detail;
sale_date=201310/region=beijing sale_date=201312/region=shenzhen sale_date=201312/region=xian sale_date=2014/region=shenzhen OK