This topic provides answers to some frequently asked questions about DDL operations on tables, partitions, and columns in MaxCompute.
Does MaxCompute support virtual tables, such as DUAL tables in MySQL?
No, MaxCompute does not support virtual tables. You can manually create a DUAL table.
Does a MaxCompute table have indexes?
No, a MaxCompute table does not have indexes. You can use the Hash Clustering attribute to achieve the same effect as clustered indexes in databases. For more information, see Table operations.
How do I manage the Hash Clustering attribute of a table?
- To add the Hash Clustering attribute of a table, run the following command:
alter table table_name [clustered by (col_name [, col_name, ...]) [sorted by (col_name [asc | desc] [, col_name [asc | desc] ...])] into number_of_buckets buckets];.
- To remove the Hash Clustering attribute of a table, run the following command:
alter table table_name not clustered;.
How do I change a non-partitioned table to a partitioned table?
You cannot change a non-partitioned table to a partitioned table or add partition key columns. You can create a partitioned table. For more information, see Table operations.
Can I restore a table that is accidentally deleted?
Yes, you can restore the deleted table because MaxCompute supports the backup and restoration feature. If you accidentally delete a table, you can run the backup and restoration commands to restore the table free of charge within 24 hours. You can customize the backup data retention period. For more information about the backup and restoration feature, see Backup and restoration.
How do I query tables that are created by a specified user?
If you want to query tables that are created by a specified user, you can use the metadata view TABLES to filter tables based on the owner_name field. For more information about TABLES, see TABLES.
How do I check whether a specified table exists?
You can use the TABLE_EXISTS function to check whether the specified table exists. For more information, see TABLE_EXISTS.
How do I obtain the names of all tables in my project?
You can run the
show tables; command on the MaxCompute client or use the metadata service of MaxCompute to obtain the names of all tables in your
project. For more information, see Display tables and views in a project or Information Schema.
How do I quickly identify partitioned tables in a project?
select table_name from information_schema.columns where is_partition_key = true group by table_name;
How do I view the time when a specified MaxCompute table is most recently accessed?
How do I view the amount of data in a table?
- You can run the
desccommand to view the physical space of the full table. You can execute the SQL statement
select count() as cnt from table_name;to view the number of data records in the table.
- You can run the
desccommand with a WHERE clause to view the physical space that is occupied by a partition in a partitioned table. You can execute the SQL statement
select count() as cnt from table_name where ...;to view the number of data records in a partition.
How do I view the number of rows in a table?
You can execute the
select count(*) from table_name; statement on the MaxCompute client to view the number of rows in a partitioned table or a non-partitioned table.
What is the difference between partitions and partition key columns?
Tables in MaxCompute can be partitioned. A partitioned table contains partition key columns. You can use a partition key column to create partitions.
ds=20150101 indicates a partition key column and
ds=20150101 indicates a partition.
Is it recommended for a table to contain a large number of partitions?
A table in MaxCompute can have a maximum of 60,000 partitions, and the capacity of each partition is not limited. However, excessive partitions in a table cause inconvenience for data collection and analysis.
In MaxCompute, the number of instances is limited in a job. The number of instances in a job is closely related to the amount of input data and the number of partitions. Therefore, we recommend that you determine the number of partitions in a table based on your business requirements.
Can I add or modify partition fields in a source table if the source table does not have partition fields?
No, MaxCompute does not allow you to add or modify partition fields in a source table. After a partition field is created, it cannot be modified. You can create a partitioned table and use an SQL statement that is used for dynamic partitioning to import data from the source table to the new partitioned table. For more information, see Insert or overwrite data into dynamic partitions (DYNAMIC PARTITION).
How do I check whether a specified partition exists?
You can use the PARTITION_EXISTS function to check whether the specified partition exists. For more information about the PARTITION_EXISTS function, see PARTITION_EXISTS.
How do I obtain the number of partitions?
You can use the PARTITIONS view of Information Schema to obtain the partition names. Then, you can count the number of partitions.
Can I add columns to or delete columns from a MaxCompute table?
You can add columns to a MaxCompute table, but you cannot delete columns from a MaxCompute table.
How do I add columns to a MaxCompute table?
alter table table_name add columns (col_name1 type1, col_name2 type2…);
For more information about how to add columns, see Add columns.
How do I configure auto-increment columns?
MaxCompute does not support auto-increment columns. If you want to use auto-increment columns and the amount of data is small, we recommend that you use ROW_NUMBER.
What is the maximum number of columns that can be stored in a MaxCompute table?
- Perform dimension reduction on the data to reduce the number of columns to no more than 1,200.
- Change the data storage method. For example, use a device certificate, or use a sparse or dense matrix.
What do I do if the name of a column in the table that I want to create is the same as the keyword?
If you use a keyword to name a table, a column, or a partition, you must enclose the
keyword in a pair of grave accents (
``). Otherwise, an error is returned.
How do I change the data type of a column?
You cannot change the data types of columns. You can only add columns. You cannot delete table fields or modify table fields or partition fields in a table in the production environment. If you need to modify the data type of a column in a table, you must delete the table and recreate a table. You can also create an external table. After you delete and recreate the table, you can reload data. For more information about data types, see Data type editions.
How do I delete a column from a table?
- Run the following command to create a table:
create table new_table_name as select c1,c2,c3 from old_table_name;
- Run the following command to delete the original table and rename the new table:
drop table old_table_name; alter table new_table_name rename to old_table_name;
If I set the lifecycle of a partitioned table to 3 days and each partition of the table contains a large amount of data, how does the system delete historical data of the table?
If a partition in the table is not modified within the lifecycle, MaxCompute automatically reclaims the partition.
You can run the
desc table_name partition(pt_spec); command to check whether a specified partition is modified within the lifecycle.
You can run the
desc tablename; command to query the lifecycle of the table. MaxCompute reclaims tables and partitions
at 17:00:00 every day. In most cases, related data is displayed on DataWorks with
a delay of one day.