This topic describes how to access MaxCompute Delta tables from Hologres.
Introduction
In Hologres V3.0.38 and later, you can read data from MaxCompute Delta tables. A Delta table is an incremental data table that is supported by MaxCompute. It supports data import in minutes, uses the AliORC file format at the underlying layer, supports the UPSERT semantics, and allows you to read and write incremental data in the Change Data Capture (CDC) format. For more information, see Basic concepts.
Prerequisites
A Hologres instance is purchased. For more information, see Purchase a Hologres instance.
MaxCompute is activated, and a MaxCompute project is created. For more information about how to activate MaxCompute, see Activate MaxCompute.
You are granted the permissions to access the MaxCompute project and tables. For more information, see Manage user permissions by using commands.
Usage notes
Hologres only allows you to read the latest snapshot of a MaxCompute Delta table.
You cannot write data back to a MaxCompute Delta table from Hologres.
To access a MaxCompute Delta table, you must configure a GUC parameter. We recommend that you run the following command at the session level together with the SQL statement that is used to query the MaxCompute Delta table:
SET hg_experimental_external_catalog_routing = 'odps:common_table';
Use MaxCompute Delta tables
Create a Delta table on the MaxCompute side
Use one of the following methods to create a table:
Add a MaxCompute data source in the DataWorks console. Then, create a table. For more information, see DataWorks.
Use the MaxCompute client to create a table. For more information, see Use the MaxCompute client.
Sample SQL statement:
CREATE TABLE IF NOT EXISTS mf_dt (pk BIGINT NOT NULL PRIMARY KEY,
val BIGINT NOT NULL)
PARTITIONED BY (dd STRING, hh STRING)
tblproperties ("transactional"="true");
Write data to a Delta table on the MaxCompute side
Use one of the following methods to write data to a Delta table on the MaxCompute side:
Execute the INSERT INTO or INSERT OVERWRITE statement to write data in batches. For more information, see Insert data into or overwrite data in a table or a static partition (INSERT INTO and INSERT OVERWRITE).
Use Flink to write data in near real time. For more information, see Use Flink to write data to a Delta table.
Sample SQL statements for writing data in batches:
-- Execute the INSERT INTO statement to append data to a specific partition in the mf_dt table.
INSERT INTO TABLE mf_dt PARTITION(dd='01', hh='01')
VALUES (3, 30), (4, 4), (5, 5);
-- Insert test data into the partition that meets the dd='01' and hh='01' conditions in the mf_dt table.
INSERT OVERWRITE TABLE mf_dt PARTITION (dd='01', hh='01')
VALUES (1, 1), (2, 2), (3, 3);
Create an external data source mapping on the Hologres side
Use one of the following methods to create an external data source mapping on the Hologres side:
Use the CREATE EXTERNAL DATABASE statement to map all tables in a specified MaxCompute project. For more information, see CREATE EXTERNAL DATABASE.
Create foreign tables to map a specific schema or table. For more information, see IMPORT FOREIGN SCHEMA and CREATE FOREIGN TABLE.
Sample CREATE EXTERNAL DATABASE statement:
CREATE EXTERNAL DATABASE ext_database_mc WITH
metastore_type 'maxcompute'
mc_project 'mc_3_layer_project'
comment 'mc three layer project';
Query all schemas in the external database.
SELECT * FROM hologres.hg_external_schemas('ext_database_mc');
Query all tables in default mode in the external database.
SELECT * FROM hologres.hg_external_tables('ext_database_mc', 'default');
Query data from a Delta table on the Hologres side
To query data from a Delta table on the Hologres side, execute the following SQL statements:
SET hg_experimental_external_catalog_routing = 'odps:common_table';
SELECT * FROM ext_database_mc.mc_schema_name.mf_dt;