All Products
Search
Document Center

MaxCompute:ALTER TABLE

Last Updated:Aug 01, 2023

Updates information about a table.

Change the owner of a table

MaxCompute allows you to change the owner of a table.

  • Syntax

    alter table <table_name> changeowner to <new_owner>;
  • Parameters

    • table_name: required. The name of the table whose owner you want to change.

    • new_owner: required. The new owner of the table.

  • Examples

    -- Change the owner of the test1 table to ALIYUN$xxx@aliyun.com. 
    alter table test1 changeowner to 'ALIYUN$xxx@aliyun.com';

Modify the comment of a table

MaxCompute allows you to modify the comment of a table.

  • Syntax

    alter table <table_name> set comment '<new_comment>';
  • Parameters

    • table_name: required. The name of the table whose comment you want to modify.

    • new_comment: required. The new comment of the table.

  • Examples

    alter table sale_detail set comment 'new coments for table sale_detail';

    You can execute the DESC table_name statement of MaxCompute to view the modification result of the comment in the table.

Change the value of LastModifiedTime

MaxCompute SQL allows you to execute the TOUCH statement to change the value of LastModifiedTime. You can change the value of LastModifiedTime to the current time. After you execute this statement to change the value of LastModifiedTime, MaxCompute determines that the table data has changed, and restarts the lifecycle of the table from the time that is specified by LastModifiedTime.

  • Syntax

    alter table <table_name> touch;
  • Parameters

    table_name: required. The name of the table whose LastModifiedTime you want to modify.

  • Examples

    alter table sale_detail touch;

Modify the clustering attribute of a table

MaxCompute allows you to add or remove the clustering attribute to or from a table by executing the ALTER TABLE statement.

  • Syntax

    • Syntax of the statement that is used to add the hash clustering attribute to a table:

      alter table <table_name> [clustered by (<col_name> [, <col_name>, ...]) [sorted by (<col_name> [asc | desc] [, <col_name> [asc | desc] ...])] into <number_of_buckets> buckets];
    • Syntax of the statement that is used to remove the hash clustering attribute from a table:

      alter table <table_name> not clustered;
    • If you do not specify the number of buckets in a range-clustered table, MaxCompute automatically determines the optimal number based on the amount of data. Syntax:

      alter table <table_name> [range clustered by (<col_name> [, <col_name>, ...]) [sorted by (<col_name> [asc | desc] [, <col_name> [asc | desc] ...])] into <number_of_buckets> buckets];
    • Syntax of the statement that is used to remove the range clustering attribute from a table or partition:

      alter table <table_name> not clustered;
      alter table <table_name> <pt_spec> not clustered;
      Note
      • The ALTER TABLE statement can modify the clustering attribute only for a partitioned table. The clustering attribute of a non-partitioned table cannot be modified after the table is created. The ALTER TABLE statement is suitable for existing tables. After you specify the clustering attribute, new partitions are stored based on the clustering attribute that you specified.

      • ALTER TABLE takes effect only on the new partitions in a partitioned table. The new partitions include those generated by using INSERT OVERWRITE and are stored based on the new clustering attribute. The clustering attribute and storage method remain unchanged for the original partitions. After you specify the clustering attribute for a table, you can remove the clustering attribute and add a clustering attribute for the table again. You can specify different clustering columns, sort columns, and numbers of buckets for new partitions.

      • ALTER TABLE takes effect only on the new partitions. Therefore, this statement cannot be used to specify partitions.

  • Parameters

    For more information about the parameters, see CREATE TABLE.

  • Examples

    -- Create a partitioned table.
    create table if not exists sale_detail(
     shop_name     STRING,
     customer_id   STRING,
     total_price   DOUBLE)
    partitioned by (sale_date STRING, region STRING);
    
    -- Modify the clustering attribute of the table.
    alter table sale_detail clustered by (customer_id) sorted by (customer_id) into 10 buckets;

    For more information about the clustering attribute, see Hash clustering and Range clustering.

Rename a table

MaxCompute allows you to rename a table. After you rename a table, only the name of the table is changed. Data in the table is not changed.

  • Syntax

    alter table <table_name> rename to <new_table_name>;
  • Parameters

    • table_name: required. The name of the table that you want to rename.

    • new_table_name: required. The new name of the table. If the name specified by the new_table_name parameter exists, an error is returned.

  • Examples

    alter table sale_detail rename to sale_detail_rename;

Change the lifecycle of a table

MaxCompute allows you to change the lifecycle configured for an existing partitioned table or a non-partitioned table.

  • Syntax

    alter table <table_name> set lifecycle <days>;
  • Parameters

    • table_name: required. The name of the table whose lifecycle you want to change.

    • days: required. The lifecycle after modification. The value must be a positive integer. Unit: days.

  • Examples

    -- Change the lifecycle configured for the test_lifecycle table to 50 days. 
    alter table test_lifecycle set lifecycle 50;

Disable or restore the lifecycle

MaxCompute allows you to disable or restore the lifecycle configured for a specified table or partition.

  • Syntax

    alter table <table_name> [<pt_spec>] {enable|disable} lifecycle;
  • Parameters

    • table_name: required. The name of the table whose lifecycle you want to disable or restore.

    • pt_spec: optional. The partition of the table whose lifecycle you want to disable or restore. The value of this parameter is in the partition_col1=col1_value1, partition_col2=col2_value1... format. If a table has multi-level partitions, you must specify the values of all the partition key columns.

    • enable: restores the lifecycle of a table or a specified partition of a table.

      • A table and its partitions can be reclaimed again based on the lifecycle. The lifecycle settings of the current table and its partitions are used by default.

      • Before you enable the lifecycle feature for a table, you can modify the lifecycle settings of the table and its partitions. This prevents data from being mistakenly reclaimed due to the use of previous settings.

    • disable: disables the lifecycle feature for a table or a specified partition.

      • Disabling the lifecycle of a table and its partitions takes precedence over restoring the lifecycle of a table and its partitions. If you specify table disable lifecycle, pt_spec enable lifecycle becomes invalid.

      • After you disable the lifecycle feature for a table, the lifecycle settings for the table, and the enable or disable flag for table partitions are retained.

      • After the lifecycle feature is disabled for a table, you can still change the lifecycle settings of the table and its partitions.

  • Examples

    • Example 1: Disable the lifecycle feature for the trans table.

      alter table trans disable lifecycle;
    • Example 2: Disable the lifecycle feature for the dt='20141111' partition in the trans table.

      alter table trans partition (dt='20141111') disable lifecycle;

Add partitions

MaxCompute allows you to add partitions to an existing partitioned table.

  • Limits

    • A MaxCompute table can have a maximum of 60,000 partitions.

    • To add the values of partition key columns to a table that has multi-level partitions, you must specify all the partitions.

    • This operation can add only the values of partition key columns. The names of partition key columns cannot be added.

  • Syntax

    alter table <table_name> add [if not exists] partition <pt_spec> [partition <pt_spec> partition <pt_spec>...];
  • Parameters

    • table_name: required. The name of the partitioned table to which you want to add partitions.

    • if not exists: optional. If you do not specify the if not exists parameter and a partition with the same name already exists, this operation fails and an error is returned.

    • pt_spec: required. The partitions that you want to add. The value of this parameter is in the (partition_col1 = partition_col_value1, partition_col2 = partition_col_value2, ...) format. partition_col indicates the name of a partition key column, and partition_col_value indicates the value of a partition key column. The names of partition key columns are not case-sensitive, but their values are case-sensitive.

  • Examples

    • Example 1: Add a partition to the sale_detail table. The partition stores the sales records in the China (Hangzhou) region in December 2013.

      alter table sale_detail add if not exists partition (sale_date='201312', region='hangzhou');
    • Example 2: Add two partitions to the sale_detail table. The partitions store the sales records in the China (Beijing) and China (Shanghai) regions in December 2013.

      alter table sale_detail add if not exists partition (sale_date='201312', region='beijing') partition (sale_date='201312', region='shanghai');
    • Example 3: Add one or more partitions to the sale_detail table and specify only the partition key column sale_date for the partitions. An error is returned because you must specify the two partition key columns sale_date and region for the partitions.

      alter table sale_detail add if not exists partition (sale_date='20111011');

Drop partitions

MaxCompute allows you to drop partitions from an existing partitioned table.

MaxCompute allows you to drop partitions based on a specified filter condition. If you want to drop multiple partitions that meet a specified filter condition at a time, you can use an expression to specify the filter condition, use the filter condition to match partitions, and drop the partitions at a time.

  • Limits

    • You can specify the information of only one partition key column in a PARTITION (<partition_filtercondition>) clause.

    • If you use an expression to specify PARTITION (<partition_filtercondition>), the function used in the expression must be a built-in scalar function.

  • Precautions

    • After you drop a partition from a table in your MaxCompute project, the volume of stored data in your MaxCompute project decreases.

    • You can specify a lifecycle for a partitioned table. This way, MaxCompute automatically reclaims partitions whose data is not updated within the time specified by the lifecycle. For more information about the lifecycle, see Lifecycle.

  • Syntax

    • The filter condition is not specified.

      -- Drop one partition at a time. 
      alter table <table_name> drop [if exists] partition <pt_spec>;
      -- Drop multiple partitions at a time. 
      alter table <table_name> drop [if exists] partition <pt_spec>,partition <pt_spec>[,partition <pt_spec>....];
    • The filter condition is specified.

      alter table <table_name> drop [if exists] partition <partition_filtercondition>;
  • Parameters

    • table_name: required. The name of the partitioned table from which you want to drop partitions.

    • if exists: optional. If you do not specify the if exists option and the partition that you want to delete does not exist, an error is returned.

    • pt_spec: required. The partitions that you want to drop. The value of this parameter is in the (partition_col1 = partition_col_value1, partition_col2 = partition_col_value2, ...) format. partition_col indicates the name of a partition key column, and partition_col_value indicates the value of a partition key column. The names of partition key columns are not case-sensitive, but their values are case-sensitive.

    • partition_filtercondition: the filter condition. This parameter is required when you specify the filter condition. This parameter is not case-sensitive. The parameter is in the following format:

      partition_filtercondition
          : partition (<partition_col> <relational_operators> <partition_col_value>)
          | partition (scalar(<partition_col>) <relational_operators> <partition_col_value>)
          | partition (<partition_filtercondition1> AND|OR <partition_filtercondition2>)
          | partition (NOT <partition_filtercondition>)
          | partition (<partition_filtercondition1>)[,partition (<partition_filtercondition2>), ...]
      • partition_col: the name of a partition key column.

      • relational_operators: the relational operator. For more information, see Operator.

      • partition_col_value: a value in the partition key column. The value may be a comparison value or regular expression. The data type of this value must be the same as the data type of the partition key column.

      • scalar(): the scalar function. The scalar function generates a scalar based on the input value, processes the values in the column specified by partition_col, and uses relational_operators to compare the processed values with partition_col_value.

      • The filter conditions that are used to drop partitions support the logical operators NOT, AND, and OR. You can use PARTITION (<NOT partition_filtercondition>) to obtain the complementary set of the filter conditions that you specified. You can use PARTITION (<partition_filtercondition1 ANDOR partition_filtercondition2>) to obtain the condition that is used to match the partitions you want to drop.

      • Multiple PARTITION (<partition_filtercondition>) clauses are supported. If these clauses are separated by commas (,), the logical relationship between the clauses is OR. The filter condition is obtained based on the OR logical relationship and used to match the partitions that you want to drop.

  • Examples

    • The filter condition is not specified.

      -- Drop a partition from the sale_detail table. The partition stores the sales record in the China (Hangzhou) region in December 2013. 
      alter table sale_detail drop if exists partition(sale_date='201312',region='hangzhou'); 
      -- Drop two partitions from the sale_detail table. The partitions store the sales records in the China (Hangzhou) and China (Shanghai) regions in December 2013. 
      alter table sale_detail drop if exists partition(sale_date='201312',region='hangzhou'),partition(sale_date='201312',region='shanghai');
    • The filter condition is specified.

      -- Create a partitioned table. 
      create table if not exists sale_detail(
      shop_name     STRING,
      customer_id   STRING,
      total_price   DOUBLE)
      partitioned by (sale_date STRING);
      -- Add partitions to the table. 
      alter table sale_detail add if not exists
      partition (sale_date= '201910')
      partition (sale_date= '201911')
      partition (sale_date= '201912')
      partition (sale_date= '202001')
      partition (sale_date= '202002')
      partition (sale_date= '202003')
      partition (sale_date= '202004')
      partition (sale_date= '202005')
      partition (sale_date= '202006')
      partition (sale_date= '202007');
      -- Drop partitions from the table at a time. 
      alter table sale_detail drop if exists partition(sale_date < '201911');
      alter table sale_detail drop if exists partition(sale_date >= '202007');
      alter table sale_detail drop if exists partition(sale_date LIKE '20191%');
      alter table sale_detail drop if exists partition(sale_date IN ('202002','202004','202006'));
      alter table sale_detail drop if exists partition(sale_date BETWEEN '202001' AND '202007');
      alter table sale_detail drop if exists partition(substr(sale_date, 1, 4) = '2020');
      alter table sale_detail drop if exists partition(sale_date < '201912' OR sale_date >= '202006');
      alter table sale_detail drop if exists partition(sale_date > '201912' AND sale_date <= '202004');
      alter table sale_detail drop if exists partition(NOT sale_date > '202004');
      -- Drop partitions by using multiple PARTITION (<partition_filtercondition>) clauses. The logical relationship between these clauses is OR. 
      alter table sale_detail drop if exists partition(sale_date < '201911'), partition(sale_date >= '202007');
      -- Add partitions in other formats. 
      alter table sale_detail add IF NOT EXISTS
      partition (sale_date= '2019-10-05') 
      partition (sale_date= '2019-10-06') 
      partition (sale_date= '2019-10-07');
      -- Use regular expressions to match the partitions that you want to drop and drop these partitions at a time. 
      alter table sale_detail drop if exists partition(sale_date RLIKE '2019-\\d+-\\d+');
      -- Create a table named region_sale_detail. The table contains multi-level partitions. 
      create table if not exists region_sale_detail(
      shop_name     STRING,
      customer_id   STRING,
      total_price   DOUBLE)
      partitioned by (sale_date STRING , region STRING );
      -- Add partitions to the table. 
      alter table region_sale_detail add IF NOT EXISTS
      partition (sale_date= '201910',region = 'shanghai')
      partition (sale_date= '201911',region = 'shanghai')
      partition (sale_date= '201912',region = 'shanghai')
      partition (sale_date= '202001',region = 'shanghai')
      partition (sale_date= '202002',region = 'shanghai')
      partition (sale_date= '201910',region = 'beijing')
      partition (sale_date= '201911',region = 'beijing')
      partition (sale_date= '201912',region = 'beijing')
      partition (sale_date= '202001',region = 'beijing')
      partition (sale_date= '202002',region = 'beijing');
      -- Execute the following statement to drop multi-level partitions at a time. The logical relationship between the two PARTITION (<partition_filtercondition>) clauses is OR. After the statement is executed, the partitions in which the value of the sale_date column is earlier than 201911 or the partitions in which the value of the region column is beijing are dropped. 
      alter table region_sale_detail drop if exists partition(sale_date < '201911'),partition(region = 'beijing');
      -- Execute the following statement to drop partitions in which the value of the sale_date column is earlier than 201911 and the value of the region column is beijing. 
      alter table region_sale_detail drop if exists partition(sale_date < '201911', region = 'beijing');

      When you drop multi-level partitions at a time, you cannot specify a filter condition that is based on multiple partition key columns in one PARTITION partition_filtercondition clause. Otherwise, the following error is returned: FAILED: ODPS-0130071:[1,82] Semantic analysis exception - invalid column reference region, partition expression must have one and only one column reference.

      -- If you specify the information of multiple partition key columns in a PARTITION (<partition_filtercondition>) clause, an error is returned. Example of incorrect usage: 
      alter table region_sale_detail drop if exists partition(sale_date < '201911' AND region = 'beijing');

Change LastModifiedTime of a partition

MaxCompute SQL allows you to execute the TOUCH statement to change LastModifiedTime of a partition in a partitioned table. This operation can change LastModifiedTime of a partition to the current time. In this case, MaxCompute considers that data is updated, and a new lifecycle of the partition starts from the time specified by LastModifiedTime.

  • Limits

    If a table contains multi-level partitions, you must specify all partition levels when you change LastModifiedTime of a partition in the table.

  • Syntax

    alter table <table_name> touch partition (<pt_spec>);
  • Parameters

    • table_name: required. The name of the partitioned table whose LastModifiedTime you want to change. If the table does not exist, an error is returned.

    • pt_spec: required. The partition whose LastModifiedTime you want to change. The value of this parameter is in the (partition_col1 = partition_col_value1, partition_col2 = partition_col_value2, ...) format. partition_col indicates the name of a partition key column, and partition_col_value indicates the value of a partition key column. If the specified column name or column value does not exist, an error is returned.

  • Examples

    -- Change LastModifiedTime of the partition in which the value of the sale_date column is 201312 and the value of the region column is shanghai in the sale_detail table. 
    alter table sale_detail touch partition (sale_date='201312', region='shanghai');

Change the value of a partition key column

MaxCompute SQL allows you to execute the RENAME statement to change the value of a partition key column.

  • Limits

    • The RENAME statement can change values of partition key columns but cannot change the column names.

    • If a table contains multi-level partitions, you must specify all partition levels when you change LastModifiedTime of a partition in the table.

  • Syntax

    alter table <table_name> partition (<pt_spec>) rename to partition (<new_pt_spec>);
  • Parameters

    • table_name: required. The name of the table in which you want to change column values in a partition.

    • pt_spec: required. The partition in which you want to change column values. The value of this parameter is in the (partition_col1 = partition_col_value1, partition_col2 = partition_col_value2, ...) format. partition_col indicates the name of a partition key column, and partition_col_value indicates the value of a partition key column. If the specified column name or column value does not exist, an error is returned.

    • new_pt_spec: required. The new partition information. The value of this parameter is in the (partition_col1 = new_partition_col_value1, partition_col2 = new_partition_col_value2, ...) format. partition_col indicates the name of a partition key column, and new_partition_col_value indicates the value of the new partition key column.

  • Examples

    -- Change column values in a partition of the sale_detail table. 
    alter table sale_detail partition (sale_date = '201312', region = 'hangzhou') rename to partition (sale_date = '201310', region = 'beijing');

Merge partitions

MaxCompute SQL allows you to execute the MERGE PARTITION statement to merge multiple partitions of a table into one partition. This operation deletes the dimension information about the merged partitions and migrates the partition data to a specified partition.

  • Limits

    • Partitions of external tables cannot be merged. After partitions of a clustered table are merged into one partition, the partition does not have the clustering attribute.

    • You can merge a maximum of 4,000 partitions at a time.

  • Syntax

    alter table <table_name> merge [if exists] partition (<predicate>) [, partition(<predicate2>) ...] overwrite partition (<fullpartitionSpec>) [purge];
  • Parameters

    • table_name: required. The name of the partitioned table whose partitions you want to merge.

    • if exists: optional. If if exists is not specified and the partitions that you want to merge do not exist, an error is returned when you merge partitions. If if exists is specified but no partitions meet the merge condition, a new partition cannot be generated after you merge partitions. If you merge partitions and modify source data by using a statement at the same time, an error is returned even if you specify if exists. The statement may be INSERT, RENAME, or DROP.

    • predicate: required. The condition that is used to match the partitions that you want to merge.

    • fullpartitionSpec: required. The partition that is generated after you merge partitions.

    • purge: optional. If you specify this parameter, logs in the session directory are deleted. By default, logs that are generated in the last three days are deleted. For more information, see Purge.

  • Examples

    • Example 1: Merge the partitions that meet a specified condition into the destination partition.

      -- View the partition that is generated in the partitioned table after you merge partitions. 
      show partitions intpstringstringstring;
      
      ds=20181101/hh=00/mm=00
      ds=20181101/hh=00/mm=10
      ds=20181101/hh=10/mm=00
      ds=20181101/hh=10/mm=10
      
      -- Merge all partitions that meet the hh='00' condition into the ds='20181101', hh='00', mm='00' partition. 
      alter table intpstringstringstring merge partition(hh='00') overwrite partition(ds='20181101', hh='00', mm='00');
      
      -- View the partition that is generated after you merge partitions. 
      show partitions intpstringstringstring;
      
      ds=20181101/hh=00/mm=00
      ds=20181101/hh=10/mm=00
      ds=20181101/hh=10/mm=10                    
    • Example 2: Merge specified partitions into the destination partition.

      -- Merge specified partitions into the destination partition. 
      alter table intpstringstringstring merge if exists partition(ds='20181101', hh='00', mm='00'), partition(ds='20181101', hh='10', mm='00'),  partition(ds='20181101', hh='10', mm='10') overwrite partition(ds='20181101', hh='00', mm='00') purge;
      -- View the partition that is generated in the partitioned table after you merge partitions. 
      show partitions intpstringstringstring;
      
      ds=20181101/hh=00/mm=00

Add columns or column comments

MaxCompute allows you to add columns or column comments to an existing non-partitioned table or partitioned table.

Note

MaxCompute allows you to add columns of the STRUCT type, such as struct<x: string, y: bigint> and map<string, struct<x: double, y: double>>.

  • Syntax

    alter table <table_name> add columns (<col_name1> <type1> comment ['<col_comment>'][, <col_name2> <type2> comment '<col_comment>'...]);
  • Parameters

    • table_name: required. The name of the table to which you want to add columns. You cannot specify the position of a new column in the table. By default, the new column is added as the last column.

    • col_name: required. The name of the column that you want to add to the table.

    • type: required. The data type of the column that you want to add to the table.

    • col_comment: optional. The comment of the column that you want to add to the table.

  • Examples

    • Example 1: Add two columns to the sale_detail table.

      alter table sale_detail add columns (customer_name STRING, education BIGINT);
    • Example 2: Add two columns and their comments to the sale_detail table.

      alter table sale_detail add columns (customer_name STRING comment 'Customer', education BIGINT comment 'Education' );
    • Example 3: Add a column of a complex data type to the sale_detail table.

      alter table sale_detail add columns (region struct<province:string, area:string>);

Drop columns

MaxCompute allows you to drop one or more specified columns from an existing non-partitioned table or partitioned table.

Note

If you drop a column from a table in one of the following scenarios, the read and write operations on the table are affected.

  • If your job is a MapReduce job and the version of MapReduce is V1.0, you cannot use Graph jobs to read data from or write data to the table whose column is dropped.

  • If your job is a Spark job that runs on the Cupid console and one of the following Spark versions is used, you can use your job to read data from the table. You cannot use your job to write data to the table.

    • Spark-2.3.0-odps0.34.0

    • Spark-3.1.1-odps0.34.0

  • If your job is a PAI job, you can use your job to read data from the table. You cannot use your job to write data to the table.

  • If your job is a Hologres job and the version of Hologres is earlier than Hologres V1.3, you cannot read data from or write data to the table whose column is dropped when Hologres references the table as an external table.

  • If you drop a column from a table, you cannot run the CLONE TABLE command on the table.

When you use MaxCompute Streaming Tunnel to write data to the table whose column order is changed, you cannot change the table schema.

  • Syntax

    alter table <table_name> drop columns <col_name1>[, <col_name2>...];
  • Parameters

    • table_name: required. The name of the table from which you want to drop columns.

    • col_name: required. The name of the column that you want to drop.

  • Examples

    -- Drop the customer_id column from the sale_detail table. Enter yes on the MaxCompute client to drop the column. 
    alter table sale_detail drop columns customer_id;
    -- Drop the shop_name and customer_id columns from the sale_detail table. Enter yes on the MaxCompute client to drop the columns. 
    alter table sale_detail drop columns shop_name, customer_id;

Change the data type of a column

MaxCompute allows you to change the data type of an existing column.

Note

If you change the data type of a column in one of the following scenarios, the read and write operations on the table are affected.

  • If your job is a MapReduce job and the version of MapReduce is V1.0, you cannot use Graph jobs to read data from or write data to the table whose column is dropped.

  • If your job is a Spark job that runs on the Cupid console and one of the following Spark versions is used, you can use your job to read data from the table. You cannot use your job to write data to the table.

    • Spark-2.3.0-odps0.34.0

    • Spark-3.1.1-odps0.34.0

  • If your job is a PAI job, you can use your job to read data from the table. You cannot use your job to write data to the table.

  • If your job is a Hologres job and the version of Hologres is earlier than Hologres V1.3, you cannot read data from or write data to the table whose column is dropped when Hologres references the table as an external table.

  • If you change the data type of a column in a table, you cannot run the CLONE TABLE command on the table.

If you use MaxCompute Streaming Tunnel to write data to a table, you cannot change the table schema.

  • Syntax

    alter table <table_name> change [column] <old_column_name> <new_column_name> <new_data_type>;
  • Parameters

    • table_name: required. The name of the table in which you want to change the data type of a column.

    • old_column_name: required. The name of the column whose data type you want to change.

    • new_column_name: required. The new name of the column whose data type you want to change. The value of new_column_name can be the same as the value of old_column_name. In this case, the name of the column is not changed. The name specified by the new_column_name parameter cannot be the same as the name of a column except the name specified by the old_column_name parameter.

    • new_data_type: required. The new data type of the column whose data type you want to change.

  • Examples

    -- Change the data type of the id field in the mf_evol_t3 table from INT to BIGINT.
    alter table mf_evol_t3 change id id bigint;
    -- Change the data type of the id field in the mf_evol_t3 table from BIGINT to STRING.
    alter table mf_evol_t3 change column id id string;
  • Supported data type conversions

    Note

    The following table describes the conversion between data types. Y indicates that the conversion between the data types is supported. N indicates that the conversion between the data types is not supported. - indicates that the conversion between the data types is not involved. Y() indicates that the conversion between the data types is supported only if the conditions in parentheses () are met.

    数据类型支持转换表

Change the order of a column

MaxCompute allows you to change the order of a column in an existing non-partitioned table or partitioned table.

Note

If you change the order of a column in a table, or add a column to a table and change the order of the new column in the table in one of the following scenarios, the read and write operations on the table are affected.

  • If your job is a MapReduce job and the version of MapReduce is V1.0, you cannot use Graph jobs to read data from or write data to the table whose column is dropped.

  • If your job is a Spark job that runs on the Cupid console and one of the following Spark versions is used, you can use your job to read data from the table. You cannot use your job to write data to the table.

    • Spark-2.3.0-odps0.34.0

    • Spark-3.1.1-odps0.34.0

  • If your job is a PAI job, you can use your job to read data from the table. You cannot use your job to write data to the table.

  • If your job is a Hologres job and the version of Hologres is earlier than Hologres V1.3, you cannot read data from or write data to the table whose column is dropped when Hologres references the table as an external table.

  • If you change the order of columns in a table, you cannot run the CLONE TABLE command on the table.

When you use MaxCompute Streaming Tunnel to write data to the table whose column order is changed, you cannot change the table schema.

  • Syntax

    alter table <table_name> change <old_column_name> <new_column_name> <column_type> after <column_name>;
  • Parameters

    • table_name: required. The name of the table in which you want to change the column order.

    • old_column_name: required. The original name of the column whose order you want to change.

    • new_col_name: required. The new name of the column. The value of new_col_name can be the same as the value of old_column_name. In this case, the name of the column is not changed. The name specified by the new_col_name parameter cannot be the same as the name of a column except the name specified by the old_column_name parameter.

    • column_type: required. The original data type of the column whose order you want to change. The value of this parameter cannot be changed.

    • column_name: required. The column after which you want to place the specified column.

  • Examples

    -- Change the name of the customer_id column to customer and place the customer column after the total_price column in the sale_detail table. 
    alter table sale_detail change customer_id customer string after total_price;
    -- Place the customer_id column after the total_price column in the sale_detail table. The name of the customer_id column is not changed. 
    alter table sale_detail change customer_id customer_id string after total_price;

Change the name of a column

MaxCompute allows you to change the name of a column in an existing non-partitioned table or partitioned table.

  • Syntax

    alter table <table_name> change column <old_col_name> rename to <new_col_name>;
  • Parameters

    • table_name: required. The name of the table in which you want to change the name of a column.

    • old_col_name: required. The name of the column that you want to modify. The column specified by old_col_name must already exist in the table.

    • new_col_name: required. The new name of the column. The table does not contain a column named new_col_name.

  • Examples

    -- Change the name of the customer_name column to customer in the sale_detail table. 
    alter table sale_detail change column customer_name rename to customer;

Modify the comment of a column

MaxCompute allows you to modify the comment of a column in an existing non-partitioned table or partitioned table.

  • Syntax

    alter table <table_name> change column <col_name> comment '<col_comment>';
  • Parameters

    • table_name: required. The name of the table in which you want to modify the column comment.

    • col_name: required. The name of the column whose comment you want to modify. The column specified by col_name must already exist in the table.

    • col_comment: required. The new comment of the column. The comment must be a valid string that is 1 to 1,024 bytes in length. If the value of this parameter does not meet the requirements, an error is returned.

  • Examples

    -- Modify the comment of the customer column in the sale_detail table. 
    alter table sale_detail change column customer comment 'customer';

Change the name and comment of a column

MaxCompute allows you to change the name or comment of a column in a non-partitioned table or partitioned table.

  • Syntax

    alter table <table_name> change column <old_col_name> <new_col_name> <column_type> comment '<col_comment>';
  • Parameters

    • table_name: required. The name of the table in which you want to change the name and comment of a column.

    • old_col_name: required. The name of the column whose name and comment you want to change. The column specified by old_col_name must already exist in the table.

    • new_col_name: required. The new name of the column. The table does not contain a column named new_col_name.

    • column_type: required. The data type of the column.

    • col_comment: optional. The new comment of the column. The comment can be 1,024 bytes in length at maximum.

  • Examples

    -- Change the name of the customer_name column in the sale_detail table to customer_newname and the comment of the column to customer. 
    alter table sale_detail change column customer_name customer_newname STRING comment 'customer';

Change the non-nullable property of a non-partition key column in a table

MaxCompute allows you to change the non-nullable property of a non-partition key column in a table. If the values of a non-partition key column in a table cannot be null, you can execute the statement to allow the column values to be null.

You can execute the DESC EXTENDED table_name; statement to view the value of the Nullable property. This property specifies whether the value of a non-partition key column can be null. If the value of the Nullable property is true, the column value can be null. If the value of the Nullable property is false, the column value must not be null.

  • Limits

    After you change the value of the Nullable property to true, you cannot restore the property setting. Proceed with caution.

  • Syntax

    alter table <table_name> change column <old_col_name> null;
  • Parameters

    • table_name: required. The name of the table in which you want to change the value of the Nullable property for a column.

    • old_col_name: required. The name of the non-partition key column whose Nullable property you want to change. The column specified by old_col_name must already exist in the table.

  • Examples

    -- Create a partitioned table. The values in the id column must not be null. 
    create table null_test(id int not null, name string) partitioned by (ds string);
    -- Allow the values in the id column to be null. 
    alter table null_test change column id null;

Merge files of a transactional table

The base files and delta files of a transactional table occupy the physical storage. You cannot directly read such files. If you execute the UPDATE or DELETE statement on a transactional table, the data in the base files is not updated, but a delta file is generated for each operation. In this case, the more delete or update operations are performed, the more storage space the table occupies. The number of delta files increases. As a result, you are charged more for storage usage and subsequent queries.

If you execute the UPDATE or DELETE statement on a table or a partition of the table multiple times, a large number of delta files are generated. When the system reads data from the table, the system loads the delta files to identify the deleted and updated rows. A large number of delta files reduce the efficiency of reading data. In this case, you can merge the base files with the delta files to reduce storage usage and improve the read efficiency.

  • Syntax

    alter table <table_name> [partition (<partition_key> = '<partition_value>' [, ...])] compact {minor|major};
  • Parameters

    • table_name: required. The name of the transactional table for which you want to merge the base files and delta files.

    • partition_key: optional. The name of a partition key column in the partitioned transactional table.

    • partition_value: optional. The value of a partition key column in the partitioned transactional table.

    • major|minor: One of them must be specified. Differences between the two statements:

      • minor: merges each base file with all delta files that are generated based on the base file and deletes the delta files.

      • major: merges each base file with all delta files that are generated based on the base file, deletes the delta files, and merges small base files. If the size of a base file is less than 32 MB or a delta file is generated, the effect of merging files is equivalent to the effect of executing the INSERT OVERWRITE statement. However, if the size of a base file is greater than or equal to 32 MB and no delta files are generated, the data of the table is not overwritten.

  • Examples

    • Example 1: Merge files of the acid_delete table. Sample statement:

      alter table acid_delete compact minor;

      The following result is returned:

      Summary:
      Nothing found to merge, set odps.merge.cross.paths=true if cross path merge is permitted.
      OK
    • Example 2: Merge files of the acid_update_pt table. Sample statement:

      alter table acid_update_pt partition (ds = '2019') compact major;

      The following result is returned:

      Summary:
      table name: acid_update_pt /ds=2019  instance count: 2  run time: 6
        before merge, file count:        8  file size: 2613  file physical size: 7839
         after merge, file count:        2  file size: 679  file physical size: 2037
      
      OK

Merge small files

The distributed file system stores data in blocks. Files whose size is smaller than the block size are called small files. Each block is 64 MB in size. The distributed file system inevitably generates small files. For example, when you execute SQL statements in MaxCompute or use Tunnel to collect data, small files are generated. In this case, you can use this feature to merge small files to improve the computing performance.

Syntax

ALTER TABLE <tablename> [PARTITION(<partition_key>=<partition_value>)] MERGE SMALLFILES;
  • Parameters

    • table_name: required. The name of the table that corresponds to the files to be merged.

    • partition_key: optional. The name of a partition key column when the table is a partitioned table.

    • partition_value: optional. The value of a partition key column when the table is a partitioned table.

  • Examples

    set odps.merge.cross.paths=true;
    set odps.merge.smallfile.filesize.threshold=128;
    set odps.merge.max.filenumber.per.instance = 2000;
    alter table tbcdm.dwd_tb_log_pv_di partition (ds='20151116') merge smallfiles;

Computing resources are consumed to merge small files. If you use a pay-as-you-go instance, fees are incurred. The billing rules are consistent with the pay-as-you-go billing rules for SQL statements. For more information, see Computing pricing.

For more information, see Merge small files.

Related statements

  • CREATE TABLE: creates a non-partitioned table, a partitioned table, an external table, or a clustered table.

  • TRUNCATE: clears data from a specified table.

  • DROP TABLE: drops a partitioned table or a non-partitioned table.

  • DESC TABLE/VIEW: views the information about MaxCompute internal tables, views, materialized views, external tables, clustered tables, or transactional tables.

  • SHOW: views SQL DDL statements that are used to create tables, or queries information of all tables and views in a project or information of all partitions in a table.