Table is the unit for storing data in MaxCompute. You must process data in MaxCompute based on tables. This topic describes how to create tables in MaxCompute.
Prerequisites
Make sure that the following requirements are met:
The MaxCompute client is started.
For more information about how to start the MaxCompute client, see Install and start the MaxCompute client.
A CSV or TXT file that contains the data you want to import is prepared.
The following sample files are used in this topic:
The file that is used to create a non-partitioned table: banking.txt.
The files that are used to create a partitioned table: banking_nocreditcard.csv, banking_uncreditcard.csv, and banking_yescreditcard.csv.
The RAM user that you want to use to create tables is granted the CREATE TABLE permission. If you want to use your Alibaba Cloud account to create tables, ignore this requirement.
For more information about how to check the permissions of a RAM user, see Query permissions by using MaxCompute SQL.
For more information about how to grant permissions to a RAM user, see MaxCompute permissions.
Usage notes
The data types of the columns in the table that you want to create must be consistent with the data types of the columns in the data files. If they are inconsistent, data cannot be imported to the table.
Step 1: Design tables
Determine the table type, table structure, and data type of each column based on the content of the data that you want to import and the expected result data.
The sample files used in this topic contain the following data of mortgagors:
-- Data that is used to create a non-partitioned table.
44,blue-collar,married,basic.4y,unknown,yes,no,cellular,aug,thu,210,1,999,0,nonexistent,1.4,93.444,-36.1,4.963,5228.1,0
53,technician,married,unknown,no,no,no,cellular,nov,fri,138,1,999,0,nonexistent,-0.1,93.2,-42,4.021,5195.8,0
28,management,single,university.degree,no,yes,no,cellular,jun,thu,339,3,6,2,success,-1.7,94.055,-39.8,0.729,4991.6,1
-- Data that is used to create a partitioned table.
53,technician,married,unknown,no,no,cellular,nov,fri,138,1,999,0,nonexistent,-0.1,93.2,-42,4.021,5195.8,0
28,management,single,university.degree,yes,no,cellular,jun,thu,339,3,6,2,success,-1.7,94.055,-39.8,0.729,4991.6,1
39,services,married,high.school,no,no,cellular,apr,fri,185,2,999,0,nonexistent,-1.8,93.075,-47.1,1.405,5099.1,0
You can count the number of columns, identify the data type of each column, and name each column based on the meaning of each column of data. The following table describes the information that you can obtain from the sample files.
Item | Description |
Number of columns | Count the number of columns in each sample file and determine the number of columns in the table that you want to create:
|
Fields of the table | Identify the fields in each sample file and determine the fields in the table that you want to create:
|
Data types | Identify the data type of each column in the sample files:
|
Step 2: Create tables
Create a non-partitioned table named bank_data and a partitioned table named bank_data_pt based on the sample files to store business data. Create non-partitioned tables named result_table1 and result_table2 to store result data. For more information about operations on tables and partitions, see Table operations and Partition and column operations. Perform the following operations to create the tables:
Optional:On the MaxCompute client, run the following command to switch to the MaxCompute project that you want to use.
If the MaxCompute project is opened when you start the MaxCompute client, ignore this step.
use doc_test_dev;
If the command is successfully run, the following information is returned:
doc_test_dev.default>
On the MaxCompute client, create the bank_data, bank_data_pt, result_table1, and result_table2 tables.
For more information about the CREATE TABLE syntax, see Create a table.
Create the non-partitioned table bank_data. Sample statement:
create table if not exists bank_data ( age BIGINT comment 'age', job STRING comment 'job type', marital STRING comment 'marital status', education STRING comment 'education level', credit STRING comment 'have a credit card', housing STRING comment 'mortgagors', loan STRING comment 'have loans', contact STRING comment 'contact information', month STRING comment 'month', day_of_week STRING comment 'day of the week', duration STRING comment 'duration', campaign BIGINT comment 'number of contacts for the current campaign', pdays DOUBLE comment 'time elapsed after the last contact', previous DOUBLE comment 'number of previous contacts', poutcome STRING comment 'results of previous marketing activities', emp_var_rate DOUBLE comment 'employment variance rate', cons_price_idx DOUBLE comment 'consumer price index', cons_conf_idx DOUBLE comment 'consumer confidence index', euribor3m DOUBLE comment '3-month Euribor', nr_employed DOUBLE comment 'number of employees', fixed_deposit BIGINT comment 'have a time deposit' );
Create the partitioned table bank_data_pt and add partitions to the table. Sample statement:
create table if not exists bank_data_pt ( age BIGINT comment 'age', job STRING comment 'job type', marital STRING comment 'marital status', education STRING comment 'education level', housing STRING comment 'mortgagors', loan STRING comment 'have loans', contact STRING comment 'contact information', month STRING comment 'month', day_of_week STRING comment 'day of the week', duration STRING comment 'duration', campaign BIGINT comment 'number of contacts for the current campaign', pdays DOUBLE comment 'time elapsed after the last contact', previous DOUBLE comment 'number of previous contacts', poutcome STRING comment 'results of previous marketing activities', emp_var_rate DOUBLE comment 'employment variance rate', cons_price_idx DOUBLE comment 'consumer price index', cons_conf_idx DOUBLE comment 'consumer confidence index', euribor3m DOUBLE comment '3-month Euribor', nr_employed DOUBLE comment 'number of employees', fixed_deposit BIGINT comment 'have a time deposit' )partitioned by (credit STRING comment 'have a credit card'); alter table bank_data_pt add if not exists partition (credit='yes') partition (credit='no') partition (credit='unknown');
Create the non-partitioned table result_table1. Sample statement:
create table if not exists result_table1 ( education STRING comment 'education level', num BIGINT comment 'number of persons' );
Create the non-partitioned table result_table2. Sample statement:
create table if not exists result_table2 ( education STRING comment 'education level', num BIGINT comment 'number of persons' credit STRING comment 'have a credit card' );
Step 3: Check the table creation results
Perform the following operations to check whether the tables are created and whether the table schemas are correct:
On the MaxCompute client, run the following command to check whether the tables that you created are in the MaxCompute project as expected:
show tables;
The names of the created tables are returned:
ALIYUN$****:bank_data ALIYUN$****:bank_data_pt ALIYUN$****:result_table1 ALIYUN$****:result_table2
On the MaxCompute client, execute the following statements to check whether the table schemas are correct:
-- View the schema of bank_data. desc bank_data; -- View the schema of bank_data_pt. desc bank_data_pt; -- View the partitions of bank_data_pt. show partitions bank_data_pt; -- View the schema of result_table1. desc result_table1; -- View the schema of result_table2. desc result_table2;
Returned result of the desc bank_data_pt; statement:
desc bank_data_pt; -- The following result is returned: +------------------------------------------------------------------------------------+ | Owner: ALIYUN$**** | Project: doc_test_dev | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2021-05-17 14:02:21 | | LastDDLTime: 2021-05-17 14:02:21 | | LastModifiedTime: 2021-05-17 14:02:21 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | Comment | +------------------------------------------------------------------------------------+ | age | bigint | | age | | job | string | | job type | | marital | string | | marital status | | education | string | | educational level | | housing | string | | mortgagors | | loan | string | | have loans | | contact | string | | contact information | | month | string | | month | | day_of_week | string | | day of the week | | duration | string | | duration | | campaign | bigint | | number of contacts for the current campaign | | pdays | double | | time elapsed after the last contact | | previous | double | | number of previous contacts | | poutcome | string | | results of previous marketing activities | | emp_var_rate | double | | employment variation rate | | cons_price_idx | double | | consumer price index | | cons_conf_idx | double | | consumer confidence index | | euribor3m | double | | 3-month Euribor | | nr_employed | double | | number of employees | | fixed_deposit | bigint | | have a time deposit | +------------------------------------------------------------------------------------+ | Partition Columns: | +------------------------------------------------------------------------------------+ | credit | string | have a credit card | +------------------------------------------------------------------------------------+
show partitions bank_data_pt; -- The following result is returned: credit=no credit=unknown credit=yes
What to do next
After you create the tables and confirm that the table schemas are correct, you can import data in the sample files to the tables. For more information about how to import data, see Import data to tables.