Create Table

Last Updated: Sep 27, 2017

After the user has been added into a project and granted with corresponding privileges, next he can oeprate MaxCompute. As the operation objects of MaxCompute (input and output) are tables, we must create tables and partitions before processing data.

Create Table

Command format:

  1. CREATE TABLE [IF NOT EXISTS] table_name
  2. [(col_name data_type [COMMENT col_comment], ...)]
  3. [COMMENT table_comment]
  4. [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
  5. [LIFECYCLE days]
  6. [AS select_statement]
  7. CREATE TABLE [IF NOT EXISTS] table_name
  8. LIKE existing_table_name

Description:

  • The table name and column name are both case insensitive.
  • Exception will be thrown if a same name table has already existed. User should specify the option [if not exists] to skip the error. If the option [if not exists] is specified, no matter whether there is a same name table, even if the source table structure and the target table structure are inconsistent, all return success. The Meta information of existing table will not change.
  • The data types will only support: bigint, double, boolean, datetime and string.
  • The table name and column name cannot have special characters. It can only begin with a letter and include a-z, A-Z, digits and underline_. The name length cannot exceed 128 bytes.
  • Use ‘Partitioned by’ to specify the partition and at present only string and bigint are supported. The partition value can not have a double byte characters (such as Chinese), must begin with a letter a-z or A-Z, followed by letter or numbe. The name length cannot exceed 128 bytes. Allowed characters including: space ‘ ’, colon ‘:’, underlined symbol ‘_’, ’$’, ’#’, point ‘.’, exclamation point ‘!’ and’ ‘@’. Other characters are taken as undefined characters, such as ‘\t’, ‘\n’, ‘/’ and so on. If using partition fields in partition table, a full table scan is no need when adding partition, updating data in partition and readiy.
  • The comment content is the effective string which length does not exceed 1024 bytes.
  • Lifecycle indicates the lifecycle of the table. Unit is ‘day’.The statement ‘create table like’ will not copy the lifecycle attribute from source table.
  • At present, the partition hierarchy cannot exceed 6 levels. The maximum partition number of a table can be configured in a certain project. The default maximum number is 60000.

    Notes:Tables in MaxCompute support partition and lifecycle. For more details of creating table, please refer to CREATE TABLE. For partition operation, please refer to Add/Remove Partition. For lifecycle operation, please refer to Modify Lifecycle for a Table.

An example to create table:

  1. create table test1 (key string); -- create a no-partition table.
  2. create table test2 (key bigint) partitioned by (pt string, ds string); --Create a partition table.
  3. create table test3 (key boolean) partitioned by (pt string, ds string) lifecycle 100; -- Create a table with lifecycle.
  4. create table test4 like test3; -- Except the lifecycle properity, other properties of test3 (field type, partition type) were completely consistent with test4.
  5. create table test5 as select * from test2;
  6. -- This operation will create test5, but the partition and lifecycle information will not be copied to the object table.
  7. -- This operation will copy the data of test2 to the table test5.

Here we introduce an instance to create a table: suppose that we need to create a table named user, which includes the following information:

  • user_id: bigint, user identifier, to identify a user.
  • gender: bigint type, sex (0, unknown; 1, male; 2, female)
  • age: bigint, the age of user

It should be partitioned by region and dt and the lifycycle is 365 days.The sentence to create this table is shown as follows:

  1. CREATE TABLE user (
  2. user_id BIGINT, gender BIGINT COMMENT '0 unknow,1 male, 2 Female', age BIGINT)
  3. PARTITIONED BY (region string, dt string) LIFECYCLE 365;

Describer Table

  1. desc <table_name>;

For example, get information from test3:

  1. desc test3;

Get information from test4:

  1. desc test4;

Except the lifecycle properity, other properties of test3 (field type, partition type) were completely consistent with test4. For more introductions of describing table, please refer to Describe Table.

If the user is to view the information of test2, the two fields ‘pt’, ‘ds’ will only exist as two common columns, rather than the table partitions.

Drop Table

  1. DROP TABLE [IF EXISTS] table_name;

For example, delete the table ‘test2’:

  1. drop table test2;

For more introductions, please refer to DROP TABLE.

Add Partition

After we create a partition table and need to import data into different partitions, we should create the partition. The statement format is shown as follows:

  1. alter table table_name add [if not exists] partition partition_spec
  2. partition_spec:
  3. : (partition_col1 = partition_col_value1, partition_col2 = partiton_col_value2, ...)

As shown in last example, we need to add the partitions (region is ‘hangzhou’ and dt is ‘20150923’) for the table user. The statement is shown as follows:

  1. Alter table user add if not exists partition(region='hangzhou',dt='20150923');

Drop Partition

Statement Format:

  1. alter table table_name drop [if exists] partition_spec;
  2. partition_spec:
  3. : (partition_col1 = partition_col_value1, partition_col2 = partiton_col_value2, ...)

For example, we must delete the partitions (region is ‘hangzhou’ and dt is ‘20150923’). The statement is shown as follows:

  1. Alter table user drop if exists partition(region='hangzhou',dt='20150923');
Thank you! We've received your feedback.