All Products
Search
Document Center

DataWorks:Create a Hologres internal table

Last Updated:Oct 11, 2023

You can create a Hologres internal table by executing a DDL statement. You can also create a Hologres internal table in the DataWorks console. This topic describes how to create a Hologres internal table in the DataWorks console.

Prerequisites

Background information

Hologres tables are classified into Hologres internal tables and Hologres external tables.

  • Hologres internal tables: This type of Hologres table can store source MaxCompute data. You can synchronize data from source MaxCompute tables to Hologres internal tables for quick queries and analysis. The performance of querying MaxCompute data by using Hologres internal tables is better than that by using Hologres external tables.

  • Hologres external tables: This type of Hologres table cannot store source MaxCompute data. A Hologres external table can be used to map data of a source MaxCompute table for accelerated queries and analysis. Hologres external tables prevent redundant data storage and allow you to obtain query results in an efficient manner without the need to import or export data.

DataWorks works as a data processing and development platform that allows you to create Hologres tables in the DataWorks console. You can also create Hologres tables by executing DDL statements. For more information, see Overview.

Limits

You can create a Hologres internal table in the DataWorks console only in the China (Shanghai) and China (Beijing) regions.

Procedure

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where your workspace resides. Find your workspace and click DataStudio in the Actions column.
  2. Create a workflow.
    If you have an existing workflow, skip this step.
    1. Move the pointer over the Create icon and select Create Workflow.
    2. In the Create Workflow dialog box, configure the Workflow Name parameter.
    3. Click Create.
  3. Create a Hologres internal table.

    1. Move the pointer over the Create icon and choose Create Table > Hologres > Table.

    2. In the Create Table dialog box, set Table Type to Internal Table and configure other parameters such as Engine Instance, Path, and Name.

      Configure parameters
  4. Configure the Hologres internal table.

    On the configuration tab of the Hologres internal table, configure the parameters.

    1. Configure the parameters in the Basic attribute section.

      Basic attribute sectionThe following table describes the parameters in the Basic attribute section.

      Parameter

      Description

      Storage Mode

      The storage mode of the Hologres internal table. Default value: Column storage.

      • Column storage: This storage mode is suitable for online analytical processing (OLAP) scenarios and supports complex queries, data association, scanning, filtering, and statistics collection. The efficiency of inserting and updating data in column-oriented tables is lower than that in row-oriented tables.

      • Row deposit: This storage mode is suitable for scenarios in which key-value pairs are used for queries and supports point queries and scans based on primary keys. The efficiency of inserting and updating data in row-oriented tables is higher than that in column-oriented tables.

      • Row-row coexistence: This storage mode is suitable for scenarios in which both column-oriented storage and row-oriented storage are used. It supports both point queries and OLAP. This storage mode involves more storage costs and costs of internal data status synchronization.

      Note

      For more information about storage modes, see the description of the orientation parameter in Overview.

      Life Cycle

      The time-to-live (TTL) period of the Hologres internal table. Unit: second. The default TTL period of the Hologres internal table is permanent.

      Note

      The system counts the start of the TTL period from the time when data is first written to the table. If the TTL period expires, the data of the table is deleted in an unspecified time period.

      Binlog

      Specifies whether to enable binary logging for the Hologres internal table. If binary logging is enabled, you must configure the Binlog Lifecycle parameter. The default TTL period of binary logging is permanent.

      Note

      Only Hologres V0.9 and later support subscription to the binary logs of a single table. For more information about binary logging, see Subscribe to Hologres binary logs.

    2. Configure the parameters in the Business description section.

      Note

      The parameter configurations in the Business description section are only used to facilitate table management based on your business requirements and are not used to implement the underlying logic.

      Business description section

      Parameter

      Description

      Theme

      The level-1 folder and level-2 folder to which the Hologres internal table belongs. The parameters can be used to categorize tables based on business categories. You can store tables of the same business category in the same folder.

      Note

      The level-1 folder and level-2 folder are folders in DataWorks. You can easily manage tables in folders.

      Level

      The physical data layer to which the Hologres internal table belongs. Data layering is used to define and manage data layers. In most cases, data layers are categorized into data import layers, data sharing layers, and data analysis layers. You can store a table at a specific data layer based on the business category.

      Note

      You can click the Create icon to create a data layer. For more information, see Manage settings for tables.

      Physical Classification

      The physical category of the Hologres internal table. The parameter is used to categorize tables in a finer-grained manner from the business dimension. In most cases, tables are categorized into tables at the basic service level, tables at the advanced service level, and tables at other levels.

      Note

      You can click the Create icon to create a physical category. For more information, see Manage settings for tables.

    3. Configure the parameters in the Table structure design section.

      Table structure design section

      Parameter

      Description

      Field design

      The tab on which you can add and define fields for the Hologres internal table. For information about the data types supported by Hologres, see Data types.

      Storage design

      The tab on which you can specify the storage method for the fields in the Hologres internal table.

      • Distribution Column: the distribution key for the Hologres internal table. Hologres distributes data in the Hologres internal table to each shard based on the distribution key for subsequent data computing and scanning by shard.

      • Event Time Column: a column of a time data type. If event time columns are contained in query conditions, Hologres can find the storage location of data with ease based on the event time columns. This parameter is suitable for logs, traffic, and time-related data.

      • Clustering Key: the columns for which Hologres creates clustered indexes. Hologres sorts data based on clustered indexes. Hologres allows you to use clustered indexes to accelerate RANGE and FILTER queries on indexed columns.

      • Dictionary Encoding: specifies whether to build dictionary mappings for the values of specific columns. Dictionary mappings can convert string comparisons to numeric comparisons to accelerate queries such as GROUP BY and FILTER.

      • Bitmap Column: specifies whether to build bitmap indexes for specific columns. Bitmap indexes can help filter data that equals a specified value in a stored file. Therefore, we recommend that you convert equality filter conditions to bitmap indexes.

      For more information about storage methods, see Overview.

      Partition

      The tab on which you can define a partition field for the Hologres internal table.

      Note

      If a partitioned table contains a primary key, the primary key must contain the partition field.

  5. Commit and deploy the Hologres internal table.

    After the configuration of the Hologres internal table is complete, you can commit the table to the development and production environments. After the table is committed, you can query the table in the compute engine instance in the related environment.

    Note

    If you use a workspace in basic mode, you need to only commit the table to the production environment. For information about workspaces in basic and standard modes, see Differences between workspaces in basic mode and workspaces in standard mode.

    Operation

    Description

    Load from development environment

    Load the table information from the development environment and display the table information on the current page.

    Note

    You can perform this operation only after the table is committed to the development environment. After you perform this operation, the table information in the development environment overwrites the table information on the current page.

    Submit to development environment

    Commit the table to the development environment. This way, you can create the current table in the Hologres database that you associate with your workspace in the development environment.

    After the table is committed, you can view the table schema in the Hologres folder of the related workflow in DataStudio. The folder is that you specified when you created the table.

    Load from production environment

    Load the table information from the production environment and display the table information on the current page.

    Note

    You can perform this operation only after the table is committed to the production environment. After you perform this operation, the table information in the production environment overwrites the table information on the current page.

    Submit to production environment

    Commit the table to the production environment. This way, you can create the current table in the Hologres database that you associate with your workspace in the production environment.

What to do next

After the Hologres internal table is created, you can perform the following operations: