This topic describes how to create an E-MapReduce table.

Prerequisites

An E-MapReduce cluster is bound to the workspace where you want to create an E-MapReduce table. The E-MapReduce service is available in a workspace only after you bind an E-MapReduce cluster to the workspace on the Workspace Management page. For more information, see Configure a workspace.

Procedure

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where your workspace resides, find the workspace, and then click Data Analytics in the Actions column.
  2. On the Data Development tab, move the pointer over the Create icon and choose EMR > Table.
    Alternatively, you can click a workflow in the Business process section, right-click EMR, and then choose New > Table.
  3. In the New table dialog box, set the parameters as required.
    Parameter: Description
    Please select an Engine type: The type of the compute engine. The default value is EMR, which cannot be modified.
    Table name: The name of the E-MapReduce table to create.
    Engine instance: The E-MapReduce cluster where the table resides. Select a cluster from the drop-down list.
    Database: The database where the table resides. Select a database from the drop-down list.
  4. Click Submit. The table configuration tab appears.
    The upper part of the tab shows the table, cluster, and database that you specified in the New table dialog box. You can change the database where the table resides. To create a database, click New Library. In the New Library dialog box, set the parameters and click OK.
  5. In the Basic properties section, set the parameters as required.
    Parameter: Description
    The stair theme: The name of the level-1 folder where the table resides.
    Note: Level-1 and level-2 folders indicate where tables are located in DataWorks so that you can manage tables more conveniently.
    The secondary theme: The name of the level-2 folder where the table resides.
    New theme: Click New theme to go to the Theme management tab. On this tab, you can create level-1 and level-2 folders for tables.
    Refresh: After you create a folder, click Refresh next to New theme to synchronize the folder.
    Description: The description of the table.
  6. In the Physical model design section, set the parameters as required.
    Parameter: Description
    Hierarchy and Physical classification: The levels and categories of the table. Select the appropriate level and category from the drop-down list. To add levels and categories, click New Level to go to the Hierarchical management tab. After levels and categories are created, click Refresh.
    Partition type: Specifies whether the table is partitioned. Valid values: Partition Table and Non-partitioned table.
    Table type: The type of the table. Valid values: Internal table and External table.
  7. In the Table structure design section, set the parameters as required.
    Parameter: Description
    Add Field: The button for adding a field. To add a field, click Add Field, configure the field information, and then click Save.
    Move up and Move down: The buttons for adjusting the field sequence of the table. If you adjust the sequence of fields in an existing table and then commit the table, DataWorks prompts you to delete the table and create another table with the same name. These operations are forbidden in the production environment.
    Field name: The name of the field. The name can contain letters, digits, and underscores (_).
    Field type: The data type of the field. The E-MapReduce table supports the following data types: TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, VARCHAR, CHAR, STRING, BINARY, DATETIME, DATE, TIMESTAMP, BOOLEAN, ARRAY, MAP, and STRUCT.
    Length/ settings: The length limit of the field. You must set this parameter if the data type that you specify for the field has a length limit.
    Description: The description of the field.
    Primary key: Specifies whether the field serves as the primary key. The primary key ensures that a record is unique for your business. In DataWorks, any field can be used as the primary key.
    Edit icon: The icon for editing the field. After you save the field, you can click this icon to edit the field and then click the Save icon to save the edited field.
    Delete icon: The icon for deleting the field.
    Note: If you delete a field from an existing table and then commit the table, DataWorks prompts you to delete the table and create another table with the same name. This operation is forbidden in the production environment.
    Add partition: The button for adding a partition. If you set Partition type to Partition Table in the Physical model design section, you must add and configure a partition for the table. If you add a partition to an existing table and then commit the table, DataWorks prompts you to delete the table and create another table with the same name. This operation is forbidden in the production environment.

  8. Click the Submit icon in the toolbar to commit the E-MapReduce table to the production environment.
    If you are using a workspace in standard mode, commit the E-MapReduce table to the development environment and the production environment in sequence.
    Notice You cannot create an E-MapReduce table in data definition language (DDL) mode.
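
The field and partition rules described in the Table structure design step can be checked on a draft table design before you enter it in the console. The following Python sketch is only an illustration and is not part of DataWorks or E-MapReduce: the names Column, TableSpec, and validate are hypothetical, and the set of data types that need a length (VARCHAR and CHAR) is an assumption, because this topic does not list them.

```python
import re
from dataclasses import dataclass, field

# Data types listed for the Field type parameter in this topic.
SUPPORTED_TYPES = {
    "TINYINT", "SMALLINT", "INT", "BIGINT", "FLOAT", "DOUBLE", "DECIMAL",
    "VARCHAR", "CHAR", "STRING", "BINARY", "DATETIME", "DATE", "TIMESTAMP",
    "BOOLEAN", "ARRAY", "MAP", "STRUCT",
}

# Assumption: the types for which a length must be set. Not stated in this topic.
TYPES_WITH_LENGTH = {"VARCHAR", "CHAR"}

# Field names can contain letters, digits, and underscores.
FIELD_NAME = re.compile(r"[A-Za-z0-9_]+")


@dataclass
class Column:
    """Hypothetical description of one field or partition field."""
    name: str
    type: str
    length: str = ""


@dataclass
class TableSpec:
    """Hypothetical draft of the values entered on the table configuration tab."""
    name: str
    partitioned: bool
    fields: list = field(default_factory=list)
    partitions: list = field(default_factory=list)


def validate(spec):
    """Return a list of problems found in the draft; an empty list means none."""
    problems = []
    for col in spec.fields + spec.partitions:
        if not FIELD_NAME.fullmatch(col.name):
            problems.append(f"{col.name!r}: use only letters, digits, and underscores")
        col_type = col.type.upper()
        if col_type not in SUPPORTED_TYPES:
            problems.append(f"{col.name!r}: unsupported data type {col.type}")
        elif col_type in TYPES_WITH_LENGTH and not col.length:
            problems.append(f"{col.name!r}: {col_type} requires a length")
    # A table whose Partition type is Partition Table must have a partition.
    if spec.partitioned and not spec.partitions:
        problems.append("partitioned table has no partition configured")
    return problems


if __name__ == "__main__":
    draft = TableSpec(
        name="orders",
        partitioned=True,
        fields=[Column("order-id", "BIGINT"), Column("note", "VARCHAR")],
    )
    for problem in validate(draft):
        print(problem)
```

Checking a draft in this way is most useful for existing tables, because reordering fields, deleting a field, or adding a partition after the table is committed requires the table to be deleted and recreated.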