This topic describes how to create tables and import data in the DataWorks console. The bank_data and result_table tables are used in the example.

Prerequisites

A MaxCompute compute engine instance is associated with the workspace in which you want to create tables. The MaxCompute service is available in a workspace only after you associate a MaxCompute compute engine instance with the workspace on the Workspace Management page. For more information, see Configure a workspace.

Background information

The bank_data table stores business data and the result_table table stores data analytics results.

Create the bank_data table

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where the required workspace resides, find the workspace, and then click Data Analytics.
  2. On the DataStudio page, move the pointer over the Create icon icon and choose MaxCompute > Table.
    Alternatively, you can click a workflow in the Business Flow section, right-click MaxCompute, and then select Create Table.
  3. In the Create Table dialog box, set the Table Name parameter to bank_data and click Create.
    Notice
    • The table name can be up to 64 characters in length. The table name must start with a letter and cannot contain special characters.
    • If multiple MaxCompute compute engine instances are associated with the current workspace, you must select one from the drop-down list.
  4. On the table configuration tab, click DDL Statement.
  5. In the DDL Statement dialog box, enter the following statement and click Generate Table Schema:
    CREATE TABLE IF NOT EXISTS bank_data
    (
     age             BIGINT COMMENT 'Age',
     job             STRING COMMENT 'Job type',
     marital         STRING COMMENT 'Marital status',
     education       STRING COMMENT 'Education level',
     default         STRING COMMENT 'Credit card',
     housing         STRING COMMENT 'Mortgage',
     loan            STRING COMMENT 'Loan',
     contact         STRING COMMENT 'Contact information',
     month           STRING COMMENT 'Month',
     day_of_week     STRING COMMENT 'Day of the week',
     duration        STRING COMMENT 'Duration',
     campaign        BIGINT COMMENT 'Number of contacts during the campaign',
     pdays           DOUBLE COMMENT 'Interval from the last contact',
     previous        DOUBLE COMMENT 'Number of contacts with the customer',
     poutcome        STRING COMMENT 'Result of the previous marketing campaign',
     emp_var_rate    DOUBLE COMMENT 'Employment change rate',
     cons_price_idx  DOUBLE COMMENT 'Consumer price index',
     cons_conf_idx   DOUBLE COMMENT 'Consumer confidence index',
     euribor3m       DOUBLE COMMENT 'Euro deposit rate',
     nr_employed     DOUBLE COMMENT 'Number of employees',
     y               BIGINT COMMENT 'Time deposit available or not'
    );

    For more information about the SQL syntax for creating tables, see Create tables.

  6. In the Confirm message, click OK.
  7. Set the Display Name parameter in the General section and click Commit to Development Environment and Commit to Production Environment.
    Note This topic uses a workspace in standard mode as an example. If you are using a workspace in basic mode, you only need to click Commit to Production Environment.
  8. In the left-side navigation pane, click the Workspace Tables icon.
  9. On the Workspace Tables tab that appears, double-click the name of the created table to view the table information.

Create the result_table table

  1. On the DataStudio page, move the pointer over Create icon and choose MaxCompute > Table.
    Alternatively, you can click a workflow in the Business Flow section, right-click MaxCompute, and then select Create Table.
  2. In the Create Table dialog box, set the Table Name parameter to result_table and click Create.
  3. On the table configuration tab, click DDL Statement. In the DDL Statement dialog box, enter the following statement and click Generate Table Schema:
    CREATE TABLE IF NOT EXISTS result_table
    (  
     education   STRING COMMENT 'Education level',
     num         BIGINT COMMENT 'Number of persons'
    );
  4. In the Confirm message, click OK.
  5. Set the Display Name parameter in the General section and click Commit to Development Environment and Commit to Production Environment.
  6. In the left-side navigation pane, click the Workspace Tables icon.
  7. In the Workspace Tables pane, double-click the name of the created table to view the table information.

Upload a local file to import its data to the bank_data table

You can perform the following operations in the DataWorks console:
  • Upload a local text file to import its data to a table in a workspace.
  • Use Data Integration to import business data from different data sources to a workspace.
Note Comply with the following rules when you upload a local file:
  • File format: The file must be in the .txt, .csv, or .log format.
  • File size: The size of the file cannot exceed 30 MB.
  • Destination object: The destination object can be a partitioned table or a non-partitioned table. The partition key value cannot contain special characters such as ampersands (&) and asterisks (*).

To upload the local file banking.txt to DataWorks, perform the following steps:

  1. Click the Import icon icon on the DataStudio page.
  2. In the Data Import Wizard dialog box, enter at least three letters to search for tables, select the table to which you want to import data, and then click Next.
    Note If you cannot find the table that you create, you can manually synchronize the table in Data Map. After that, you can search for the table by keyword in the dialog box again. For more information about how to manually synchronize a table, see Manually synchronize a table.
  3. In the dialog box that appears, set the Select Data Import Method parameter to Upload Local File and click Browse next to Select File. Select the local file that you want to upload and specify other parameters.
    bank
    Parameter Description
    Select Data Import Method The method of importing data. Default value: Upload Local File.
    Select File The file to upload. To upload a file, click Browse and select the local file to upload.
    Select Delimiter The delimiter used in the file. Valid values: Comma (,), Tab, Semicolon (;), Space, |, #, and &. In this example, select Comma (,).
    Original Character Set The character set of the file. Valid values: GBK, UTF-8, CP936, and ISO-8859. In this example, select GBK.
    Import First Row The row from which data is to be imported. In this example, select 1.
    First Row as Field Names Specifies whether to use the first row as the header row. In this example, do not select First Row as Field Names.
    Preview The preview of the data to be imported.
    Note If the data amount is large, only the data in the first 100 rows and 50 columns appears.
  4. Click Next.
  5. Select a matching mode for the fields in the source file and destination table. In this example, select By Location.
  6. Click Import Data.

Subsequent steps

You have learned how to create tables and import data. You can proceed with the next tutorial. In the next tutorial, you will learn how to create, configure, and commit a workflow and then you can use the DataStudio service to further compute and analyze data in the workspace. For more information, see Create a workflow.