All Products
Search
Document Center

Hologres:Create a foreign table in Hologres to accelerate queries on MaxCompute data

Last Updated:May 17, 2024

Hologres allows you to create foreign tables to accelerate queries on MaxCompute data. You can manually create foreign tables or use the Auto Load feature to automatically create foreign tables. This way, you can directly access and analyze data stored in MaxCompute by using foreign tables in Hologres. This improves query efficiency and simplifies data processing.

Prerequisites

Usage notes

When you use a foreign table in Hologres to accelerate queries on MaxCompute data, take note of the following items:

  • You must make sure that the foreign table that is created in Hologres is in the same region as the MaxCompute table that the foreign table is sourced from. Otherwise, the stability of data queries cannot be ensured because of potential network errors.

  • You can accelerate queries on data only in internal tables in MaxCompute but not external tables or views in MaxCompute.

  • Partition fields in MaxCompute tables are mapped to regular fields in Hologres foreign tables. You can use values of partition fields as filter conditions to query data.

  • The amount of data to be scanned in a query cannot exceed 200 GB, and the number of partitions to be queried at a time cannot exceed 512. However, if you import MaxCompute data to Hologres internal tables before you query the data, the amount of data to be scanned and the number of partitions to be queried are not limited.

  • You cannot query data of the MAP or STRUCT type from MaxCompute tables.

  • Range-clustered tables in MaxCompute that use columns of the DATETIME, TIMESTAMP, or DECIMAL type as clustering key columns are not supported.

Data type mappings

Data types that are supported by MaxCompute and those supported by Hologres have a one-to-one mapping relationship. You can create tables based on the data type mappings. For more information, see Data type mappings between MaxCompute and Hologres in the "Data types" topic.

Description

You can use one of the following methods to accelerate queries on MaxCompute data:

  • Manually create foreign tables: This method is suitable for scenarios in which the number of foreign tables to be created is small and schemas of MaxCompute tables are not frequently changed.

  • Enable Auto Load: This method is suitable for scenarios in which the number of foreign tables to be created is large or schemas of MaxCompute tables are frequently changed. In this case, you can enable the Auto Load feature to automatically create foreign tables for queried MaxCompute tables based on your business requirements or for all MaxCompute tables.

Solution 1: Manually create foreign tables to accelerate queries on MaxCompute data

Hologres allows you to use the IMPORT FOREIGN SCHEMA statement to create multiple MaxCompute foreign tables at a time. You can also use the CREATE FOREIGN TABLE statement to create MaxCompute foreign tables. In this statement, you can customize table names, columns, and comments. This section describes how to use the CREATE FOREIGN TABLE statement in Hologres to create foreign tables to accelerate queries on MaxCompute non-partitioned tables and partitioned tables.

Note

For more information about the statement for creating multiple MaxCompute foreign tables at a time, see IMPORT FOREIGN SCHEMA. You can also create foreign tables in the HoloWeb console. For more information, see Create a foreign table.

Example 1: Query data from a MaxCompute non-partitioned table

  1. Create a non-partitioned table in MaxCompute.

    Create a non-partitioned table in MaxCompute and import data to the table. In this example, the customer table in the MaxCompute public dataset named BIGDATA_PUBLIC_DATASET.tpcds_1t is used. The following sample code shows the DDL statement that is used to create the table and the SELECT statement that is used to query data from the table:

    -- The DDL statement that is used to create the table.
    CREATE TABLE IF NOT EXISTS public_data.customer(
      c_customer_sk BIGINT,
      c_customer_id STRING,
      c_current_cdemo_sk BIGINT,
      c_current_hdemo_sk BIGINT,
      c_current_addr_sk BIGINT,
      c_first_shipto_date_sk BIGINT,
      c_first_sales_date_sk BIGINT,
      c_salutation STRING,
      c_first_name STRING,
      c_last_name STRING,
      c_preferred_cust_flag STRING,
      c_birth_day BIGINT,
      c_birth_month BIGINT,
      c_birth_year BIGINT,
      c_birth_country STRING,
      c_login STRING,
      c_email_address STRING,
      c_last_review_date_sk STRING);
    
    -- The SELECT statement that is used to query the table.
    SELECT * FROM BIGDATA_PUBLIC_DATASET.tpcds_10t.customer;

    The following figure shows part of the data in the customer table.image

  2. Create a foreign table in Hologres.

    Create a foreign table in Hologres to map the table in MaxCompute. Sample statements:

    set hg_enable_convert_type_for_foreign_table = true;
    CREATE FOREIGN TABLE customer (
        "c_customer_sk" int8,
        "c_customer_id" text,
        "c_current_cdemo_sk" int8,
        "c_current_hdemo_sk" int8,
        "c_current_addr_sk" int8,
        "c_first_shipto_date_sk" int8,
        "c_first_sales_date_sk" int8,
        "c_salutation" text,
        "c_first_name" text,
        "c_last_name" text,
        "c_preferred_cust_flag" text,
        "c_birth_day" int8,
        "c_birth_month" int8,
        "c_birth_year" int8,
        "c_birth_country" text,
        "c_login" text,
        "c_email_address" text,
        "c_last_review_date_sk" text)
    SERVER odps_server
    OPTIONS (project_name 'BIGDATA_PUBLIC_DATASET#tpcds_10t', table_name 'customer');
    

    The following table describes the parameters in the preceding statements.

    Parameter

    Description

    SERVER

    The server on which you want to create the foreign table.

    You can use the odps_server server that is created at the underlying layer of Hologres. For more information, see Postgres FDW.

    project_name

    • If the MaxCompute project uses the three-layer model, set the project_name parameter to a combination of the MaxCompute project name and schema name in the odps_project_name#odps_schema_name format.

    • If the MaxCompute project uses the two-layer model, set the project_name parameter to the MaxCompute project name.

    For more information about the three-layer model, see Schema-related operations.

    table_name

    The name of the MaxCompute table that you want to query.

  3. Accelerate queries on the MaxCompute table by using the foreign table in Hologres.

    After you create a foreign table, you can directly query data from the MaxCompute table by using the foreign table. Sample statement:

    select * from customer limit 10;

Example 2: Query data from a MaxCompute partitioned table

  1. Create a partitioned table in MaxCompute.

    Create a partitioned table in MaxCompute and import data to the table. In this example, the ods_enterprise_share_trade_h table in the MaxCompute public dataset named BIGDATA_PUBLIC_DATASET.finance is used. The following sample code shows the DDL statement that is used to create the table and the SELECT statement that is used to query data from the table:

    -- The DDL statement that is used to create the table.
    CREATE TABLE IF NOT EXISTS public_data.ods_enterprise_share_trade_h(
      code STRING COMMENT 'Code'
      name STRING COMMENT 'Name',
      industry STRING COMMENT 'Industry',
      area STRING COMMENT 'Country/region',
      pe STRING COMMENT 'Price-earnings ratio',
      outstanding STRING COMMENT 'Outstanding',
      totals STRING COMMENT 'Total shares outstanding (ten thousand)',
      totalassets STRING COMMENT 'Total assets (ten thousand)',
      liquidassets STRING COMMENT 'Liquid assets',
      fixedassets STRING COMMENT 'Fixed assets',
      reserved STRING COMMENT 'Reserved',
      reservedpershare STRING COMMENT 'Reserved per share',
      eps STRING COMMENT 'earnings per share',
      bvps STRING COMMENT 'book value per share',
      pb STRING COMMENT 'Price-to-book ratio',
      timetomarket STRING COMMENT 'Time to market',
      undp STRING COMMENT 'Undistributed profits',
      perundp STRING COMMENT 'Undistributed profits per share',
      rev STRING COMMENT 'Year-over-year revenue (%)',
      profit STRING COMMENT 'Year-over-year profit (%)',
      gpr STRING COMMENT 'Gross profit margin (%)',
      npr STRING COMMENT 'Net profit ratio (%)',
      holders_num STRING COMMENT 'Holders') 
      PARTITIONED BY (ds STRING) STORED AS ALIORC TBLPROPERTIES ('comment'='date when data is imported');
      
    -- Query data from a specific partition in the MaxCompute partitioned table.
     SELECT * FROM BIGDATA_PUBLIC_DATASET.finance.ods_enterprise_share_trade_h WHERE ds = '20170113';

    The following figure shows some data in the table.image

  2. Create a foreign table in Hologres.

    Create a foreign table in Hologres to map the table in MaxCompute. Sample statement:

    CREATE FOREIGN TABLE public.foreign_ods_enterprise_share_trade_h (
        "code" text,
        "name" text,
        "industry" text,
        "area" text,
        "pe" text,
        "outstanding" text,
        "totals" text,
        "totalassets" text,
        "liquidassets" text,
        "fixedassets" text,
        "reserved" text,
        "reservedpershare" text,
        "eps" text,
        "bvps" text,
        "pb" text,
        "timetomarket" text,
        "undp" text,
        "perundp" text,
        "rev" text,
        "profit" text,
        "gpr" text,
        "npr" text,
        "holders_num" text,
        "ds" text
    )
    SERVER odps_server
    OPTIONS (project_name 'BIGDATA_PUBLIC_DATASET#finance', table_name 'ods_enterprise_share_trade_h');
    comment on foreign table public.foreign_ods_enterprise_share_trade_h is 'Stock transaction history';
    comment on column public.foreign_ods_enterprise_share_trade_h."code" is 'Stock symbol';
    comment on column public.foreign_ods_enterprise_share_trade_h."name" is 'Name';
    comment on column public.foreign_ods_enterprise_share_trade_h."industry" is 'Industry';
    comment on column public.foreign_ods_enterprise_share_trade_h."area" is 'Country/region';
    comment on column public.foreign_ods_enterprise_share_trade_h."pe" is 'Price-earnings ratio';
    comment on column public.foreign_ods_enterprise_share_trade_h."outstanding" is 'Outstanding';
    comment on column public.foreign_ods_enterprise_share_trade_h."totals" is 'Total shares outstanding (ten thousand)';
    comment on column public.foreign_ods_enterprise_share_trade_h."totalassets" is 'Total assets (ten thousand)';
    comment on column public.foreign_ods_enterprise_share_trade_h."liquidassets" is 'Liquid assets';
    comment on column public.foreign_ods_enterprise_share_trade_h."fixedassets" is 'Fixed assets';
    comment on column public.foreign_ods_enterprise_share_trade_h."reserved" is 'Reserved';
    comment on column public.foreign_ods_enterprise_share_trade_h."reservedpershare" is 'Reserved per share';
    comment on column public.foreign_ods_enterprise_share_trade_h."eps" is 'Earnings per share';
    comment on column public.foreign_ods_enterprise_share_trade_h."bvps" is 'Book value per share';
    comment on column public.foreign_ods_enterprise_share_trade_h."pb" is 'Price-to-book ratio';
    comment on column public.foreign_ods_enterprise_share_trade_h."timetomarket" is 'Time to market';
    comment on column public.foreign_ods_enterprise_share_trade_h."undp" is 'Undistributed profits';
    comment on column public.foreign_ods_enterprise_share_trade_h."perundp" is 'Undistributed profits per share';
    comment on column public.foreign_ods_enterprise_share_trade_h."rev" is 'Year-over-year revenue (%)';
    comment on column public.foreign_ods_enterprise_share_trade_h."profit" is 'Year-over-year profit (%)';
    comment on column public.foreign_ods_enterprise_share_trade_h."gpr" is 'Gross profit margin (%)';
    comment on column public.foreign_ods_enterprise_share_trade_h."npr" is 'Net profit ratio (%)';
    comment on column public.foreign_ods_enterprise_share_trade_h."holders_num" is 'Holders';
    
  3. Query data from the MaxCompute partitioned table.

    • To query the first 10 data records, execute the following statement:

      SELECT * FROM foreign_ods_enterprise_share_trade_h limit 10;
    • To query data from a specific partition, execute the following statement:

      SELECT * FROM foreign_ods_enterprise_share_trade_h 
      WHERE ds = '20170113';

Solution 2: Use the Auto Load feature to implement accelerated queries on MaxCompute tables

If the number of foreign tables to be created is large or schemas of MaxCompute tables are frequently changed, you can enable the Auto Load feature to automatically create foreign tables for queried MaxCompute tables based on your business requirements or for all MaxCompute tables. Schema changes on MaxCompute tables include operations such as deleting columns, changing the order of columns, and changing the data types of columns. You do not need to manually change the schemas of foreign tables. This improves the query efficient. For more information, see Auto Load.