All Products
Search
Document Center

MaxCompute:FAQ about DDL operations

Last Updated:Sep 15, 2023

This topic provides answers to some frequently asked questions about DDL operations on tables, partitions, and columns in MaxCompute.

Category

FAQ

Table operation

Partition operation

Column operation

Lifecycle management operation

If I set the lifecycle of a partitioned table to 3 days and each partition of the table contains a large amount of data, how does the system delete historical data of the table?

Does MaxCompute support virtual tables, such as DUAL tables in MySQL?

No, MaxCompute does not support virtual tables. You can manually create a DUAL table.

Does a MaxCompute table have indexes?

No, a MaxCompute table does not have indexes. You can use the Hash Clustering attribute to achieve the same effect as clustered indexes in databases. For more information, see Table operations.

How do I manage the Hash Clustering attribute of a table?

  • To add the Hash Clustering attribute of a table, run the following command: alter table table_name [clustered by (col_name [, col_name, ...]) [sorted by (col_name [asc | desc] [, col_name [asc | desc] ...])] into number_of_buckets buckets];.

  • To remove the Hash Clustering attribute of a table, run the following command: alter table table_name not clustered;.

How do I change a non-partitioned table to a partitioned table?

You cannot change a non-partitioned table to a partitioned table or add partition key columns. You can create a partitioned table. For more information, see Table operations.

Can I restore a table that is accidentally deleted?

Yes, you can restore the deleted table because MaxCompute supports the backup and restoration feature. If you accidentally delete a table, you can run backup and restoration commands to restore the table free of charge within 24 hours. You can customize the backup data retention period. Within 24 hours of deleting a table, you can run backup and restoration commands to restore the table data. For more information about the backup and restoration feature, see Backup and restoration.

How do I query tables that are created by a specified user?

If you want to query tables that are created by a specified user, you can use the metadata view TABLES to filter tables based on the owner_name field. For more information about TABLES, see TABLES.

How do I check whether a specified table exists?

You can use the TABLE_EXISTS function to check whether the specified table exists. For more information, see TABLE_EXISTS.

How do I obtain the names of all tables in my project?

You can run the show tables; command on the MaxCompute client or use the metadata service of MaxCompute to obtain the names of all tables in your project. For more information, see Table operations or Information Schema.

How do I quickly identify partitioned tables in a project?

You can run the following command on the MaxCompute client to view the partitioned tables in your project:

select table_name from information_schema.columns where is_partition_key = true group by table_name;

How do I view the time when a specified MaxCompute table is most recently accessed?

You can go to the Data Map page of DataWorks. Then, click the table name to view the last access time of the table on the table details page. Technical Information

How do I view the amount of data in a table?

You can view the number of data records in a table and the physical space occupied by the table.

  • You can run the desc command to view the physical space of the full table. You can execute the SQL statement select count() as cnt from table_name; to view the number of data records in the table.View the physical space

  • You can run the desc command with a WHERE clause to view the physical space that is occupied by a partition in a partitioned table. You can execute the SQL statement select count() as cnt from table_name where ...; to view the number of data records in a partition.View a partitioned table

How do I view the number of rows in a table?

You can execute the select count(*) from table_name; statement on the MaxCompute client to view the number of rows in a partitioned table or a non-partitioned table.

Why I cannot view the table that I created in the DataWorks console?

If a table is created by using SQL statements instead of on the graphical user interface (GUI) of DataWorks, the table is displayed in the DataWorks console 2 or 3 hours after it is created.

What is the difference between partitions and partition key columns?

Tables in MaxCompute can be partitioned. A partitioned table contains partition key columns. You can use a partition key column to create partitions.

For example, ds in ds=20150101 indicates a partition key column and ds=20150101 indicates a partition.

Is it recommended for a table to contain a large number of partitions?

A table in MaxCompute can have a maximum of 60,000 partitions, and the capacity of each partition is not limited. However, excessive partitions in a table cause inconvenience for data collection and analysis.

In MaxCompute, the number of instances is limited in a job. The number of instances in a job is closely related to the amount of input data and the number of partitions. Therefore, we recommend that you determine the number of partitions in a table based on your business requirements.

Can I add or modify partition fields in a source table if the source table does not have partition fields?

No, MaxCompute does not allow you to add or modify partition fields in a source table. After a partition field is created, it cannot be modified. You can create a partitioned table and use an SQL statement that is used for dynamic partitioning to import data from the source table to the new partitioned table. For more information, see Insert or overwrite data into dynamic partitions (DYNAMIC PARTITION).

How do I check whether a specified partition exists?

You can use the PARTITION_EXISTS function to check whether the specified partition exists. For more information about the PARTITION_EXISTS function, see PARTITION_EXISTS.

How do I obtain the number of partitions?

You can use the PARTITIONS view of Information Schema to obtain the partition names. Then, you can count the number of partitions.

Can I add columns to or delete columns from a MaxCompute table?

Yes, you can add columns to or delete columns from a MaxCompute table. The feature of adding columns of complex data types and the feature of deleting columns are in the invitational preview. For more information about how to apply for trial use of new features, see Apply for trial use of new features.

How do I add columns to a MaxCompute table?

The following example shows how to add columns to a MaxCompute table. If data exists in the table, the values of the added column are NULL.

alter table table_name add columns (col_name1 type1, col_name2 type2…);

For more information about how to add columns, see Add columns.

How do I configure auto-increment columns?

MaxCompute does not support auto-increment columns. If you want to use auto-increment columns and the amount of data is small, we recommend that you use ROW_NUMBER.

What is the maximum number of columns that can be stored in a MaxCompute table?

A MaxCompute table can store a maximum of 1,200 columns. If the number of columns exceeds the limit, perform one of the following operations:

  • Perform dimension reduction on the data to reduce the number of columns to no more than 1,200.

  • Change the data storage method. For example, use a device certificate, or use a sparse or dense matrix.

What do I do if the name of a column in the table that I want to create is the same as the keyword?

If you use a keyword to name a table, a column, or a partition, you must enclose the keyword in a pair of grave accents (``). Otherwise, an error is returned.

How do I change the data type of a column?

You cannot change the data types of columns. You can only add columns. You cannot delete table fields or modify table fields or partition fields in a table in the production environment. If you need to change the data type of a column in a table, you must delete the table and recreate a table. You can also create an external table. After you delete and recreate the table, you can reload data. For more information about data types, see Data type editions.

If I set the lifecycle of a partitioned table to 3 days and each partition of the table contains a large amount of data, how does the system delete historical data of the table?

If a partition in the table is not modified within the lifecycle, MaxCompute automatically reclaims the partition.

You can run the desc table_name partition(pt_spec); command to check whether a specified partition is modified within the lifecycle. You can run the desc tablename; command to query the lifecycle of the table. MaxCompute reclaims tables and partitions at 17:00:00 every day. In most cases, related data is displayed on DataWorks with a delay of one day.