Hologres: Import MaxCompute data with a few clicks

Last Updated: Dec 27, 2023

This topic describes how to use the visual interface of HoloWeb to import data from MaxCompute tables.

Prerequisites

You have logged on to a Hologres instance. For more information, see Log on to an instance.

Background information

HoloWeb allows you to import MaxCompute data with a few clicks. You can import data from MaxCompute tables into Hologres and query the data visually in the HoloWeb console. Queries on imported data perform better than queries that read MaxCompute data through foreign tables.

Procedure

  1. Log on to the Hologres console.

  2. In the top navigation bar, select a region from the drop-down list.

  3. In the left-side navigation pane of the Hologres console, click Go to HoloWeb to go to the HoloWeb console.

  4. In the top navigation bar of the HoloWeb console, choose Metadata Management > MaxCompute Acceleration > Import MaxCompute Data.

  5. On the Create MaxCompute Data Import Task page, configure the parameters.

    The following table describes the parameters. Each entry lists the section, then the parameter, then its description.

    Instance

    Instance Name

    The name of the instance.

    Source MaxCompute Table

    Project Name

    The name of the MaxCompute project.

    Schema Name

    The name of the schema in MaxCompute. If your MaxCompute project uses the two-layer model, this parameter is not displayed by default. If your MaxCompute project uses the three-layer model, you can select an authorized schema from the drop-down list.

    Table Name

    The name of the MaxCompute table. Prefix-based fuzzy search is supported.

    Destination Hologres Table

    Database Name

    The name of the Hologres database to which the destination table belongs.

    Schema Name

    The name of the schema in Hologres.

    The default value is public. You can also select another authorized schema.

    Table Name

    The name of the destination Hologres table.

    The name of the source MaxCompute table is automatically specified for this parameter. You can also manually rename the table.

    Destination Table Description

    A user-defined description of the destination Hologres table.

    Parameter Settings

    GUC Parameters

    The GUC parameters. For more information about GUC parameters, see GUC parameters.

    Import Task

    Field

    The fields to be synchronized from the MaxCompute table.

    You can import some or all of the fields in the MaxCompute table.

    Partition

    • Partition Field

      If you select a partition field, Hologres automatically creates a partitioned table as the destination table.

      Hologres supports one level of partitioning. MaxCompute supports multiple levels of partitioning. When you import data from a MaxCompute table that involves multiple levels of partitioning to a partitioned Hologres table, you need to set only the first-level partition field of the MaxCompute table for the destination table. Other partition fields in the MaxCompute table are mapped to regular fields in the destination table.

    • Data Timestamp

      If a MaxCompute table is partitioned by date, you can specify a date. The system automatically imports data of the specified date to the destination table.

    Property

    • Storage Format

      • Column storage: This mode is applicable to various complex queries.

      • Row storage: This mode is applicable to point queries and scans based on primary keys.

      • Row-column storage: This mode is applicable to all scenarios that support column-oriented storage and row-oriented storage. This mode is also applicable to point queries that are not based on primary keys.

      Default value: Column storage.

    • Data Lifecycle

      The lifecycle of table data. If you do not set this parameter, the table data is permanently stored.

      If the data is not updated within the specified period, the system deletes the data after the period expires.

    • Binlog

      Specifies whether to enable binary logging. For more information, see Subscribe to Hologres binary logs.

    • Lifecycle of Binlog

      The time to live (TTL) of binary logs. Unit: seconds. Default value: 2592000, which indicates 30 days.

    • Distribution Column

      The distribution key. Hologres shuffles data to each shard based on the specified column. Data entries with the same distribution key value are distributed to the same shard. If you use a distribution key as a filter condition, the execution efficiency can be improved.

    • Event Time Column

      The fields that are used to segment data. If a query condition involves these fields, Hologres can use them to quickly locate where the matching data is stored.

    • Clustering Key

      The columns that form the clustering key. Hologres sorts data by these columns in the order in which you specify them, and uses the resulting clustering index to accelerate range and filter queries on the index fields.

    • Dictionary Encoding

      The fields on whose values a dictionary mapping is built. Dictionary encoding converts string comparisons to numeric comparisons, which accelerates operations such as GROUP BY and filtering.

      By default, the system selects all the fields of the TEXT type for this parameter.

    • Bitmap Column

      The fields on which bitmap indexes are built. Hologres uses the bitmap indexes to quickly filter the data that meets query conditions on the specified fields.

      By default, the system selects all the fields of the TEXT type for this parameter.

    The SQL statements that correspond to your settings are automatically generated in the SQL editor in the SQL Statements section.

  6. Click Submit in the upper-right corner.
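
To relate the Partition and Property settings to the underlying table configuration, the following minimal Python sketch shows two things: how the partition fields of a multi-level MaxCompute table are split between the partition field and the regular fields of the destination table, and how the Property values could be expressed as Hologres `set_table_property` calls. The `ImportSettings` class, the function names, and the sample table and column names are hypothetical and only illustrate the mapping. The SQL that HoloWeb generates in the SQL Statements section may differ, so treat this as an approximation rather than the tool's actual output.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

SECONDS_PER_DAY = 24 * 60 * 60

# Storage Format options mapped to values of the `orientation` table property.
ORIENTATION = {
    "Column storage": "column",
    "Row storage": "row",
    "Row-column storage": "row,column",
}


@dataclass
class ImportSettings:
    """Hypothetical container for the Property values entered on the form."""
    table: str                                      # for example "public.orders"
    storage_format: str = "Column storage"          # default on the form
    data_lifecycle_seconds: Optional[int] = None    # None: data is stored permanently
    binlog: bool = False
    binlog_ttl_seconds: int = 30 * SECONDS_PER_DAY  # 2592000, the form default
    distribution_column: List[str] = field(default_factory=list)
    event_time_column: List[str] = field(default_factory=list)
    clustering_key: List[str] = field(default_factory=list)
    dictionary_encoding: List[str] = field(default_factory=list)
    bitmap_column: List[str] = field(default_factory=list)


def split_partition_fields(
    mc_partition_fields: List[str],
) -> Tuple[Optional[str], List[str]]:
    """Split MaxCompute partition fields for a single-level Hologres table.

    Only the first-level field becomes the partition field of the destination
    table. The remaining fields are returned as regular fields.
    """
    if not mc_partition_fields:
        return None, []
    first, *rest = mc_partition_fields
    return first, rest


def property_statements(s: ImportSettings) -> List[str]:
    """Build one `CALL set_table_property(...)` statement per configured property."""
    props = [("orientation", ORIENTATION[s.storage_format])]
    if s.data_lifecycle_seconds is not None:
        props.append(("time_to_live_in_seconds", str(s.data_lifecycle_seconds)))
    if s.binlog:
        props.append(("binlog.level", "replica"))
        props.append(("binlog.ttl", str(s.binlog_ttl_seconds)))
    column_props = (
        ("distribution_key", s.distribution_column),
        ("event_time_column", s.event_time_column),
        ("clustering_key", s.clustering_key),
        ("dictionary_encoding_columns", s.dictionary_encoding),
        ("bitmap_columns", s.bitmap_column),
    )
    for name, columns in column_props:
        if columns:  # skip properties left empty on the form
            props.append((name, ",".join(columns)))
    return [
        f"CALL set_table_property('{s.table}', '{name}', '{value}');"
        for name, value in props
    ]


if __name__ == "__main__":
    settings = ImportSettings(
        table="public.orders", binlog=True, distribution_column=["order_id"]
    )
    for statement in property_statements(settings):
        print(statement)
    # A table partitioned by ds, then hh: only ds stays a partition field.
    print(split_partition_fields(["ds", "hh"]))
```

In Hologres, calls such as these are normally issued in the same transaction as the CREATE TABLE statement for the destination table. The sketch builds only the property statements and does not connect to an instance.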