You can use MaxCompute external tables to access data of Hologres data sources by using the Java Database Connectivity (JDBC) driver. This topic describes how to create a Hologres external table and specify the Hologres data source, authentication method, mapping table, and JDBC driver in the external table.

Background information

Hologres is a real-time interactive analytics engine. It is compatible with PostgreSQL and seamlessly integrates with MaxCompute.

You can create a Hologres external table in MaxCompute to query the data of Hologres data sources based on STS authentication information by using the JDBC driver provided by PostgreSQL. This method prevents redundant data storage and allows you to obtain query results at a fast speed without the need to import or export data.

Prerequisites

Before you create a Hologres external table, make sure that the following conditions are met:
  • A Hologres database and a Hologres table are created.

    For more information about how to create a Hologres database, see Create a database.

    For more information about how to create a Hologres table, see CREATE TABLE.

    Information of the Hologres instance:
    • Name of the Hologres database: mc_test.
    • Schema of the Hologres database: public.
    • Endpoint of the Hologres instance in the classic network: hgprecn-cn-oew210ut****-cn-hangzhou-internal.hologres.aliyuncs.com:80.
    • Name of the Hologres table: holo. The following figure shows the data of the table.Data of the table
  • The MaxCompute project for which you want to create a Hologres external table is created.

    For more information about how to create a MaxCompute project, see Create a MaxCompute project.

  • The MaxCompute client is installed.

    For more information about how to install the MaxCompute client, see Install and configure the MaxCompute client.

Limits

When you use Hologres external tables, take note of the following limits:

  • MaxCompute does not support the update or delete operation on the Hologres external tables.
  • Partitioned tables of Hologres do not correspond to partitioned tables of MaxCompute. Hologres external tables cannot be partitioned.

Usage notes

When you use Hologres external tables, take note of the following items:
  • The names of the parent and child tables in a Hologres database are specified in Hologres external tables. SQL statements are executed on the Hologres external tables. Both parent and child tables can be mapped to Hologres external tables. However, the parent table can only be read.
  • If the error FAILED: Generating job conf failed, gen jobconf failed: External table location scheme "jdbc:postgresql" is not supported is reported when you query the data of a Hologres external table or insert data into a Hologres external table, the jobConf2 feature that is used to optimize and update SQL execution plans is not enabled. This feature is enabled in most MaxCompute projects. If you want to enable this feature, add the following properties at the session level:
    set odps.sql.jobconf.odps2=true;
    set odps.sql.jobconf.odps2.enforce=true;
    set odps.sql.split.hive.bridge=true;
    set odps.sql.hive.compatible=true;
  • INSERT ON CONFLICT of Hologres databases cannot be used when you write data from MaxCompute to a Hologres external table. If the Hologres source table has a primary key, you must ensure that the primary key of the data that you want to write to the Hologres external table does not conflict with the primary key of the Hologres source table.

Syntax

When you create a Hologres external table, you must specify a storage handler in the CREATE TABLE statement, configure Security Token Service (STS) authentication information, and specify the JDBC URL to access Hologres data sources. Syntax for creating a Hologres external table:
create external table [if not exists] <table_name>(
  <col1_name> <data_type>,
  <col2_name> <data_type>,
  ......
)
stored by '<com.aliyun.odps.jdbc.JdbcStorageHandler>'
with serdeproperties (
  'odps.properties.rolearn'='<ram_arn>')
location '<jdbc:postgresql://<endpoint>:<port>/<database>?[currentSchema=<schema>&][useSSL={true|false}&]table=<holo_table_name>/>' 
tblproperties (
  'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver', 
  'odps.federation.jdbc.target.db.type'='holo',
  ['odps.federation.jdbc.colmapping'='<col1:column1,col2:column2,col3:column3,...>']
);
  • if not exists: optional. If you create a table by using the name of an existing table but do not specify the if not exists parameter, an error is returned. If you specify the if not exists parameter, a success message is returned when you create a table by using the name of an existing table. The success message is returned even if the schema of the existing table is different from that of the table you want to create. If you create a table by using the name of an existing table, the table is not created and the metadata of the existing table is not changed.
  • table_name: required. The name of the Hologres external table that is created in MaxCompute.
  • col_name: required. The name of the column in the Hologres external table.
  • data_type: required. The data type of the columns in the Hologres external table.
  • stored by: required. This parameter specifies a storage handler, which defines the method to access the Hologres external table. Set the value to Hologres. The value indicates that the Hologres external table is accessed by using JdbcStorageHandler
  • ram_arn: required. The Alibaba Cloud Resource Name (ARN) of the specified Resource Access Management (RAM) role. To obtain the ARN of the specified RAM role, you can perform the following operations: Log on to the RAM console. On the RAM Roles page, click the name of the RAM role whose ARN you want to query in the RAM Role Name column. On the page that appears, view the ARN in the Basic Information section.
  • location: required. The JDBC URL of your Hologres instance. Description of the fields in this parameter:
    • endpoint: required. The endpoint of the Hologres instance in the classic network. For more information about how to obtain endpoints, see Instance configurations.
    • port: required. The port number of the Hologres instance. For more information about how to obtain port numbers, see Instance configurations.
    • database: required. The name of the Hologres database that you want to access. For more information about Hologres databases, see CREATE DATABASE.
    • schema: optional. If the name of the external table is unique in the Hologres database or the source table is a table in the default schema, you do not need to specify this parameter. For more information, see CREATE SCHEMA.
    • holo_table_name: required. The name of the Hologres source table. For more information about Hologres source tables, see CREATE TABLE.
  • tblproperties:
    • mcfed.mapreduce.jdbc.driver.class: required. The driver that is used to access the Hologres database. Set the value to org.postgresql.Driver.
    • odps.federation.jdbc.target.db.type: required. The type of the Hologres database that you want to access. Set the value to holo.
    • odps.federation.jdbc.colmapping: optional. If you want to map some columns of the Hologres source table to the Hologres external table, you must configure this parameter. This parameter specifies the mapping between the fields of the Hologres source table and the fields of the Hologres external table. If you do not configure this parameter, the fields in the source table are mapped to the Hologres external table by following the sequence of the fields in the source table. The value of this parameter is in the Column name 1 of the Hologres external table:Column name 1 of the Hologres source table,Column name 2 of the Hologres external table:Column name 2 of the Hologres source table,... format.

Procedure

You can perform the following operations to create a Hologres external table.

  1. Step 1: Create a RAM role
    Create a RAM role to obtain the ARN. The ARN is used when you enter STS authentication information to create a Hologres external table.
  2. Step 2: Add the RAM role to a Hologres instance and authorize the role
    Add the new RAM role to the Hologres instance and grant the permission to access the Hologres instance to the RAM role.
  3. Step 3: Create a Hologres external table
    Create a Hologres external table that is mapped to the Hologres source table.

Step 1: Create a RAM role

Log on to the RAM console by using your Alibaba Cloud account, create a RAM role, and then record the ARN of the RAM role. In the Create RAM Role panel, select Alibaba Cloud Account or IdP for Select Trusted Entity.
The following example shows the configuration of the policy that is attached to the RAM role. You can obtain the ID of your Alibaba Cloud account on the Security Settings page of Account Center.
{
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Effect": "Allow",
      "Principal": {
        "RAM": [
          "acs:ram::<ID of the Alibaba Cloud account>:root"
        ]
      }
    },
    {
      "Action": "sts:AssumeRole",
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "odps.aliyuncs.com"
        ]
      }
    }
  ],
  "Version": "1"
}
Create a RAM role

Step 2: Add the RAM role to a Hologres instance and authorize the role

Before the RAM role can use the Hologres instance based on the granted permissions, the role must obtain the required development permissions on the Hologres instance. By default, the RAM role does not have the permissions to view or manage instances in the Hologres console. Therefore, you must first use your Alibaba Cloud account to grant the required permissions to the RAM role. After you add the RAM role to a Hologres instance, you can use one of the following methods to authorize the RAM role:

  • Use the Hologres console to authorize the RAM role
    1. Log on to the Hologres console.
    2. In the left-side navigation pane, click Instances. On the Hologres Instances page, click the name of the Hologres instance to which you want to add the RAM role. On the User Management page, click Add New User to add the RAM role to the Hologres instance.
    3. On the Database Authorization page, grant the development permissions on the instance to the RAM role.
  • Use an SQL statement to authorize the RAM role

    For more information about how to use an SQL statement to authorize a RAM role, see Overview.

  • By default, a RAM role does not have the permissions to perform operations in the Hologres console. If you want a RAM user to assume the RAM role, you must first use your Alibaba Cloud account to attach the AliyunRAMReadOnlyAccess policy to the RAM user. Otherwise, the RAM role cannot perform operations in the Hologres console. For more information, see Grant permissions on Hologres to RAM users.
Grant permissions to the RAM role

Step 3: Create a Hologres external table

Log on to the MaxCompute client and create a Hologres external table based on the prepared data. For more information, see Syntax.

  1. Start the MaxCompute client and go to the MaxCompute project for which you want to create a Hologres external table.
    For more information about the commands that are used to go to a project, see Project operations.
  2. Execute the following statement to create a Hologres external table:
    create external table if not exists my_table_holo_jdbc
    (
     id bigint,
     name string
    )
    stored by 'com.aliyun.odps.jdbc.JdbcStorageHandler' 
    with serdeproperties (
      'odps.properties.rolearn'='acs:ram::139699392458****:role/aliyunodpsholorole')
    location 'jdbc:postgresql://hgprecn-cn-oew210ut****-cn-hangzhou-internal.hologres.aliyuncs.com:80/mc_test?currentSchema=public&useSSL=false&table=holo/'
    tblproperties (
      'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver',
      'odps.federation.jdbc.target.db.type'='holo',
      'odps.federation.jdbc.colmapping'='id:id,name:name'
    );
  3. Run the following commands to query the data of the Hologres source table based on the created Hologres external table:
    -- Add the following properties to access the Hologres external table. 
    set odps.sql.split.hive.bridge=true;
    set odps.sql.hive.compatible=true;
    -- Query the data of the Hologres external table. 
    select * from my_table_holo_jdbc limit 10;
    The following result is returned:
    +------------+------------+
    | id         | name       |
    +------------+------------+
    | 1          | kate       |
    | 2          | mary       |
    | 3          | bob        |
    | 4          | tom        |
    | 5          | lulu       |
    | 6          | mark       |
    | 7          | haward     |
    | 8          | lilei      |
    | 9          | hanmeimei  |
    | 10         | lily       |
    +------------+------------+
  4. Optional:Exchange data between MaxCompute and Hologres based on the Hologres external table and perform joint data analysis.
    For example, write data processed by MaxCompute to a Hologres database by using the Hologres external table. This accelerates data analysis and implements online services. Sample commands:
    -- Add the following properties to access the Hologres external table. 
    set odps.sql.split.hive.bridge=true;
    set odps.sql.hive.compatible=true;
    -- Insert data into the Hologres external table. 
    insert into my_table_holo_jdbc values (12,'alice');
    -- Query the data of the Hologres external table. 
    select * from my_table_holo_jdbc;
    The following result is returned:
    +------------+------------+
    | id         | name       |
    +------------+------------+
    | 12          | alice      |
    | 1          | kate       |
    | 2          | mary       |
    | 3          | bob        |
    | 4          | tom        |
    | 5          | lulu       |
    | 6          | mark       |
    | 7          | haward     |
    | 8          | lilei      |
    | 9          | hanmeimei  |
    | 10         | lily       |
    | 11         | lucy       |
    +------------+------------+
    Dimension tables that are frequently updated are stored in Hologres databases. This meets the real-time data update requirements. MaxCompute uses external tables to access the dimension tables in Hologres databases. Then, association analysis is performed on the data in the dimension tables and the fact tables in MaxCompute. Sample commands:
    -- Add the following properties to access the Hologres external table. 
    set odps.sql.split.hive.bridge=true;
    set odps.sql.hive.compatible=true;
    -- Create a MaxCompute internal table. 
    create table holo_test as select * from my_table_holo_jdbc;
    -- Perform association analysis on the data in the MaxCompute internal table and Hologres external table. 
    select * from my_table_holo_jdbc t1 inner join holo_test t2 on t1.id=t2.id;
    The following result is returned:
    +------------+------------+------------+------------+
    | id         | name       | id2        | name2      |
    +------------+------------+------------+------------+
    | 1          | kate       | 1          | kate       |
    | 2          | mary       | 2          | mary       |
    | 3          | bob        | 3          | bob        |
    | 4          | tom        | 4          | tom        |
    | 5          | lulu       | 5          | lulu       |
    | 6          | mark       | 6          | mark       |
    | 7          | harward    | 7          | harward    |
    | 8          | lilei      | 8          | lilei      |
    | 9          | hanmeimei  | 9          | hanmeimei  |
    | 10         | lily       | 10         | lily       |
    | 11         | lucy       | 11         | lucy       |
    | 12         | alice      | 12         | alice      |
    +------------+------------+------------+------------+