You can use the external table feature of MaxCompute to access data of Hologres data sources by using the Java Database Connectivity (JDBC) driver. This topic describes how to execute the CREATE EXTERNAL TABLE statement to create a Hologres external table. In the statement, you must specify a Hologres data source, Security Token Service (STS) authentication information, a Hologres source table, and a JDBC driver.
Background information
Hologres is a real-time interactive analytics data warehouse. It is compatible with PostgreSQL and seamlessly integrates with MaxCompute.
You can create a Hologres external table in MaxCompute to query the data of a Hologres data source based on STS authentication information and a PostgreSQL JDBC driver. This method prevents redundant data storage and allows you to obtain query results at a fast speed without the need to import or export data.
Prerequisites
- A Hologres database and a Hologres source table are created.
For more information about how to create a Hologres database, see Create a database.
For more information about how to create a Hologres table, see CREATE TABLE.
Information of the sample Hologres instance in this topic:- Name of the Hologres database:
mc_test
. - Schema of the Hologres database:
public
. - Endpoint of the Hologres instance in the classic network:
hgprecn-cn-oew210ut****-cn-hangzhou-internal.hologres.aliyuncs.com:80
. - Name of the Hologres table:
holo
. The following figure shows the data of the table.
- Name of the Hologres database:
- The MaxCompute project for which you want to create a Hologres external table is created.
For more information about how to create a MaxCompute project, see Create a MaxCompute project.
- The MaxCompute client is installed.
For more information about how to install the MaxCompute client, see Install and configure the MaxCompute client.
Limits
When you use Hologres external tables, take note of the following limits:
- MaxCompute does not support the update or delete operation on Hologres external tables.
- Partitioned tables of Hologres are not mapped to partitioned tables of MaxCompute. Hologres external tables cannot be partitioned.
- If you use multiple processes to concurrently write a large amount of data to a Hologres external table, a process rewrites data to the Hologres external table in rare cases. As a result, duplicate data exists.
- In a Hologres external table that you create in MaxCompute, data of the DECIMAL type
must have a precision of 38 and a scale of 18. If the number of decimal places of
data in a column is less than 18, you can set the data type of the column to string
when you create a Hologres external table in MaxCompute. When you use the data in
the external table, you can use the
CAST
function to forcefully convert the data of the STRING type to the DECIMAL type. - Complex data types, such as ARRAY, MAP, and STRUCT, are not supported in the Hologres external table that you create in MaxCompute.
- The JSON, JSONB, and MONEY data types that are supported in Hologres do not match data types that are supported in MaxCompute. When you create a Hologres external table in MaxCompute, you cannot configure columns that match data of these types in the Hologres source table.
Usage notes
- The names of the parent and child tables in a Hologres database are specified in Hologres external tables. SQL statements can be executed on the Hologres external tables. Parent and child tables can be mapped to Hologres external tables. However, parent tables can only be read.
- If the error
FAILED: Generating job conf failed, gen jobconf failed: External table location scheme "jdbc:postgresql" is not supported
is reported when you query data from a Hologres external table or insert data into a Hologres external table, thejobconf2
feature that is used to optimize and update SQL execution plans is not enabled. By default, the jobconf2 feature is enabled for most MaxCompute projects. If the jobconf2 feature is disabled for your project, you can add the following properties at the session level to enable the feature:set odps.sql.jobconf.odps2=true; set odps.sql.jobconf.odps2.enforce=true; set odps.sql.split.hive.bridge=true; set odps.sql.hive.compatible=true;
- You cannot use the INSERT ON CONFLICT statement to write data to Hologres external tables. For more information about the INSERT ON CONFLICT statement, see INSERT ON CONFLICT. If the Hologres source table contains a primary key, you must ensure that the primary key of the data that you want to write to the Hologres external table is not the same as the primary key of the Hologres source table.
- When you create an external table, the table name and field names are not case-sensitive. When you query external tables or fields, the table names and field names are not case-sensitive, and forcible uppercase and lowercase conversions are not supported.
Syntax
create external table [if not exists] <table_name>(
<col1_name> <data_type>,
<col2_name> <data_type>,
......
)
stored by '<com.aliyun.odps.jdbc.JdbcStorageHandler>'
with serdeproperties (
'odps.properties.rolearn'='<ram_arn>')
location '<jdbc:postgresql://<endpoint>:<port>/<database>?ApplicationName=MaxCompute&[currentSchema=<schema>&][useSSL={true|false}&]table=<holo_table_name>/>'
tblproperties (
'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver',
'odps.federation.jdbc.target.db.type'='holo',
['odps.federation.jdbc.colmapping'='<col1:column1,col2:column2,col3:column3,...>']
);
- if not exists: optional. If you create a table by using the name of an existing table but do not specify the if not exists parameter, an error is returned. If you specify the if not exists parameter, a success message is returned no matter whether a table with the same name already exists. The success message is returned even if the schema of the existing table is different from that of the table you want to create. If you create a table by using the name of an existing table, the table is not created and the metadata of the existing table is not changed.
- table_name: required. The name of the Hologres external table that you want to create.
- col_name: required. The name of a column in the Hologres external table.
- data_type: required. The data type of a column in the Hologres external table.
- stored by: required. A storage handler, which defines the method that you want to use to access
the Hologres external table. Set the value to
com.aliyun.odps.jdbc.JdbcStorageHandler
. The value indicates that the Hologres external table is accessed by using JdbcStorageHandler. - ram_arn: required. The Alibaba Cloud Resource Name (ARN) of the specified RAM role. The ARN
is used as the STS authentication information. To obtain the ARN of the specified
RAM role, you can perform the following steps: Log on to the Resource Access Management
(RAM) console. On the Roles page, click the name of the RAM role whose ARN you want to query in the Role Name
column. On the page that appears, view the ARN in the Basic Information section.
- location: required. The JDBC URL of the Hologres instance. Description of the fields in this
parameter:
- endpoint: required. The endpoint of the Hologres instance in the classic network. For more information about how to obtain endpoints, see Instance configurations.
- port: required. The port number of the Hologres instance. For more information about how
to obtain port numbers, see Instance configurations.
- database: required. The name of the Hologres database that you want to access. For more information about Hologres databases, see CREATE DATABASE.
- ApplicationName: required. The default value is MaxCompute and no modification is required.
- schema: optional. If the name of the source table is unique in the Hologres database or the source table is a table in the default schema, you do not need to specify this parameter. For more information about schemas, see CREATE SCHEMA.
- holo_table_name: required. The name of the Hologres source table. For more information about Hologres
source tables, see CREATE TABLE.
- tblproperties:
- mcfed.mapreduce.jdbc.driver.class: required. The JDBC driver that is used to access the Hologres database. Set the
value to
org.postgresql.Driver
. - odps.federation.jdbc.target.db.type: required. The type of the Hologres database that you want to access. Set the value
to
holo
. - odps.federation.jdbc.colmapping: optional. If you want to map some columns of the Hologres source table to the Hologres
external table, you must configure this parameter. This parameter specifies the mappings
between the fields of the Hologres source table and the fields of the Hologres external
table. If you do not configure this parameter, the fields in the source table are
mapped to the fields of the Hologres external table based on the sequence of the fields
in the source table. The value of this parameter is in the
Column name 1 of the Hologres external table:Column name 1 of the Hologres source table,Column name 2 of the Hologres external table:Column name 2 of the Hologres source table,...
format.
- mcfed.mapreduce.jdbc.driver.class: required. The JDBC driver that is used to access the Hologres database. Set the
value to
Procedure
To create a Hologres external table, perform the following steps:
Step 1: Create a RAM role
Step 2: Add the RAM role to a Hologres instance and grant permissions to the RAM role
Before the RAM role can use the Hologres instance, the role must obtain the required development permissions on the Hologres instance. By default, the RAM role is not granted the permissions to view or manage instances in the Hologres console. You must grant the permissions to the RAM role by using your Alibaba Cloud account. After you add the RAM role to a Hologres instance, you can use one of the following methods to grant the permissions to the RAM role:
- Use the Hologres console to grant the required permissions to the RAM role.
- Log on to the Hologres console.
- In the left-side navigation pane, click Instances. On the Hologres Instances page, click the name of the Hologres instance to which you want to add the RAM role.
- On the instance details page, click Accounts.
- On the User Management page, click Add New User to add the RAM role to the Hologres instance.
- On the Database Authorization page, grant the development permissions on the instance to the RAM role.
- Use an SQL statement to grant the required permissions to the RAM role.
For more information about how to use an SQL statement to grant the required permissions to a RAM role, see Overview.
- By default, a RAM user is not granted the permissions to perform operations in the
Hologres console. If you want a RAM user to assume the RAM role, you must attach the
AliyunRAMReadOnlyAccess policy to the RAM user by using your Alibaba Cloud account. Otherwise, the RAM user
cannot perform operations in the Hologres console. For more information, see Grant permissions on Hologres to RAM users.
Step 3: Create a Hologres external table
Log on to the MaxCompute client and create a Hologres external table based on the prepared data. For more information about the syntax, see Syntax.