You can use the external table feature of MaxCompute to access data of Hologres data sources by using the Java Database Connectivity (JDBC) driver. This topic describes how to execute the CREATE EXTERNAL TABLE statement to create a Hologres external table. In the statement, you must specify a Hologres data source, Security Token Service (STS) authentication information, a Hologres source table, and a JDBC driver.

Background information

Hologres is a real-time interactive analytics data warehouse. It is compatible with PostgreSQL and seamlessly integrates with MaxCompute.

You can create a Hologres external table in MaxCompute to query the data of a Hologres data source based on STS authentication information and a PostgreSQL JDBC driver. This method prevents redundant data storage and allows you to obtain query results at a fast speed without the need to import or export data.

Prerequisites

Before you create a Hologres external table, make sure that the following conditions are met:
  • A Hologres database and a Hologres source table are created.

    For more information about how to create a Hologres database, see Create a database.

    For more information about how to create a Hologres table, see CREATE TABLE.

    Information of the sample Hologres instance in this topic:
    • Name of the Hologres database: mc_test.
    • Schema of the Hologres database: public.
    • Endpoint of the Hologres instance in the classic network: hgprecn-cn-oew210ut****-cn-hangzhou-internal.hologres.aliyuncs.com:80.
    • Name of the Hologres table: holo. The following figure shows the data of the table.Table data
  • The MaxCompute project for which you want to create a Hologres external table is created.

    For more information about how to create a MaxCompute project, see Create a MaxCompute project.

  • The MaxCompute client is installed.

    For more information about how to install the MaxCompute client, see Install and configure the MaxCompute client.

Limits

When you use Hologres external tables, take note of the following limits:

  • MaxCompute does not support the update or delete operation on Hologres external tables.
  • Partitioned tables of Hologres are not mapped to partitioned tables of MaxCompute. Hologres external tables cannot be partitioned.
  • If you use multiple processes to concurrently write a large amount of data to a Hologres external table, a process rewrites data to the Hologres external table in rare cases. As a result, duplicate data exists.
  • In a Hologres external table that you create in MaxCompute, data of the DECIMAL type must have a precision of 38 and a scale of 18. If the number of decimal places of data in a column is less than 18, you can set the data type of the column to string when you create a Hologres external table in MaxCompute. When you use the data in the external table, you can use the CAST function to forcefully convert the data of the STRING type to the DECIMAL type.
  • Complex data types, such as ARRAY, MAP, and STRUCT, are not supported in the Hologres external table that you create in MaxCompute.
  • The JSON, JSONB, and MONEY data types that are supported in Hologres do not match data types that are supported in MaxCompute. When you create a Hologres external table in MaxCompute, you cannot configure columns that match data of these types in the Hologres source table.

Usage notes

When you use Hologres external tables, take note of the following items:
  • The names of the parent and child tables in a Hologres database are specified in Hologres external tables. SQL statements can be executed on the Hologres external tables. Parent and child tables can be mapped to Hologres external tables. However, parent tables can only be read.
  • If the error FAILED: Generating job conf failed, gen jobconf failed: External table location scheme "jdbc:postgresql" is not supported is reported when you query data from a Hologres external table or insert data into a Hologres external table, the jobconf2 feature that is used to optimize and update SQL execution plans is not enabled. By default, the jobconf2 feature is enabled for most MaxCompute projects. If the jobconf2 feature is disabled for your project, you can add the following properties at the session level to enable the feature:
    set odps.sql.jobconf.odps2=true;
    set odps.sql.jobconf.odps2.enforce=true;
    set odps.sql.split.hive.bridge=true;
    set odps.sql.hive.compatible=true;
  • You cannot use the INSERT ON CONFLICT statement to write data to Hologres external tables. For more information about the INSERT ON CONFLICT statement, see INSERT ON CONFLICT. If the Hologres source table contains a primary key, you must ensure that the primary key of the data that you want to write to the Hologres external table is not the same as the primary key of the Hologres source table.
  • When you create an external table, the table name and field names are not case-sensitive. When you query external tables or fields, the table names and field names are not case-sensitive, and forcible uppercase and lowercase conversions are not supported.

Syntax

When you create a Hologres external table, you must specify a storage handler, STS authentication information, and the JDBC URL of a Hologres data source in the CREATE EXTERNAL TABLE statement. Syntax for creating a Hologres external table:
create external table [if not exists] <table_name>(
  <col1_name> <data_type>,
  <col2_name> <data_type>,
  ......
)
stored by '<com.aliyun.odps.jdbc.JdbcStorageHandler>'
with serdeproperties (
  'odps.properties.rolearn'='<ram_arn>')
location '<jdbc:postgresql://<endpoint>:<port>/<database>?ApplicationName=MaxCompute&[currentSchema=<schema>&][useSSL={true|false}&]table=<holo_table_name>/>' 
tblproperties (
  'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver', 
  'odps.federation.jdbc.target.db.type'='holo',
  ['odps.federation.jdbc.colmapping'='<col1:column1,col2:column2,col3:column3,...>']
);
  • if not exists: optional. If you create a table by using the name of an existing table but do not specify the if not exists parameter, an error is returned. If you specify the if not exists parameter, a success message is returned no matter whether a table with the same name already exists. The success message is returned even if the schema of the existing table is different from that of the table you want to create. If you create a table by using the name of an existing table, the table is not created and the metadata of the existing table is not changed.
  • table_name: required. The name of the Hologres external table that you want to create.
  • col_name: required. The name of a column in the Hologres external table.
  • data_type: required. The data type of a column in the Hologres external table.
  • stored by: required. A storage handler, which defines the method that you want to use to access the Hologres external table. Set the value to com.aliyun.odps.jdbc.JdbcStorageHandler. The value indicates that the Hologres external table is accessed by using JdbcStorageHandler.
  • ram_arn: required. The Alibaba Cloud Resource Name (ARN) of the specified RAM role. The ARN is used as the STS authentication information. To obtain the ARN of the specified RAM role, you can perform the following steps: Log on to the Resource Access Management (RAM) console. On the Roles page, click the name of the RAM role whose ARN you want to query in the Role Name column. On the page that appears, view the ARN in the Basic Information section. Obtain the ARN
  • location: required. The JDBC URL of the Hologres instance. Description of the fields in this parameter:
    • endpoint: required. The endpoint of the Hologres instance in the classic network. For more information about how to obtain endpoints, see Instance configurations.
    • port: required. The port number of the Hologres instance. For more information about how to obtain port numbers, see Instance configurations. Obtain the endpoint and port number
    • database: required. The name of the Hologres database that you want to access. For more information about Hologres databases, see CREATE DATABASE.
    • ApplicationName: required. The default value is MaxCompute and no modification is required.
    • schema: optional. If the name of the source table is unique in the Hologres database or the source table is a table in the default schema, you do not need to specify this parameter. For more information about schemas, see CREATE SCHEMA.
    • holo_table_name: required. The name of the Hologres source table. For more information about Hologres source tables, see CREATE TABLE. View a table
  • tblproperties:
    • mcfed.mapreduce.jdbc.driver.class: required. The JDBC driver that is used to access the Hologres database. Set the value to org.postgresql.Driver.
    • odps.federation.jdbc.target.db.type: required. The type of the Hologres database that you want to access. Set the value to holo.
    • odps.federation.jdbc.colmapping: optional. If you want to map some columns of the Hologres source table to the Hologres external table, you must configure this parameter. This parameter specifies the mappings between the fields of the Hologres source table and the fields of the Hologres external table. If you do not configure this parameter, the fields in the source table are mapped to the fields of the Hologres external table based on the sequence of the fields in the source table. The value of this parameter is in the Column name 1 of the Hologres external table:Column name 1 of the Hologres source table,Column name 2 of the Hologres external table:Column name 2 of the Hologres source table,... format.

Procedure

To create a Hologres external table, perform the following steps:

  1. Step 1: Create a RAM role
    Create a RAM role and obtain the ARN of the RAM role. The ARN is used to specify the STS authentication information when you create an external table.
  2. Step 2: Add the RAM role to a Hologres instance and grant permissions to the RAM role
    Add the RAM role that you created to the Hologres instance and grant the RAM role the permissions to access the Hologres instance.
  3. Step 3: Create a Hologres external table
    Create a Hologres external table that is mapped to the Hologres source table.

Step 1: Create a RAM role

  1. Log on to the RAM console and create a RAM role.
    Create a RAM roleIn the Create Role panel, select Alibaba Cloud Account or IdP for Select Trusted Entity.
    • Alibaba Cloud Account:

      A RAM user of your Alibaba Cloud account can access cloud resources by assuming a RAM role. For more information, see Create a RAM role for a trusted Alibaba Cloud account.

    • IdP:

      You can log on to the Alibaba Cloud Management Console by using role-based single sign-on (SSO). You can log on without the need to provide a username and password. For more information about DataWorks, see Create a RAM role for a trusted IdP.

  2. Edit the trust policy.
    Edit the trust policy
    1. On the Roles page, click the name of the RAM role that you create.
    2. Click the Trust Policy Management tab.
    3. On the Trust Policy Management tab, click Edit Trust Policy.
    4. In the Edit Trust Policy panel, modify the trust policy configuration based on the following content.
      The configuration of the trust policy varies based on the type of the trusted entity that you selected.
      • Alibaba Cloud Account is selected for Select Trusted Entity:
        {
          "Statement": [
            {
              "Action": "sts:AssumeRole",
              "Effect": "Allow",
              "Principal": {
                "RAM": [
                  "acs:ram::<ID of your Alibaba Cloud account>:root"
                ]
              }
            },
            {
              "Action": "sts:AssumeRole",
              "Effect": "Allow",
              "Principal": {
                "Service": [
                  "odps.aliyuncs.com"
                ]
              }
            }
          ],
          "Version": "1"
        }
      • IdP is selected for Select Trusted Entity:
        "Statement": [
                {
                    "Action": "sts:AssumeRole",
                    "Condition": {
                        "StringEquals": {
                            "saml:recipient": "https://signin.aliyun.com/saml-role/sso"
                        }
                    },
                    "Effect": "Allow",
                    "Principal": {
                        "Federated": [
                            "acs:ram::<ID of your Alibaba Cloud account>:saml-provider/IDP"
                        ]
                    }
                },
                {
                    "Action": "sts:AssumeRole",
                    "Effect": "Allow",
                    "Principal": {
                        "Service": [
                            "odps.aliyuncs.com"
                        ]
                    }
                }
            ],
            "Version": "1"
        }
        }
        Note You can obtain the ID of your Alibaba Cloud account on the Security Settings page of Account Center.
    5. Click OK.

Step 2: Add the RAM role to a Hologres instance and grant permissions to the RAM role

Before the RAM role can use the Hologres instance, the role must obtain the required development permissions on the Hologres instance. By default, the RAM role is not granted the permissions to view or manage instances in the Hologres console. You must grant the permissions to the RAM role by using your Alibaba Cloud account. After you add the RAM role to a Hologres instance, you can use one of the following methods to grant the permissions to the RAM role:

  • Use the Hologres console to grant the required permissions to the RAM role.
    1. Log on to the Hologres console.
    2. In the left-side navigation pane, click Instances. On the Hologres Instances page, click the name of the Hologres instance to which you want to add the RAM role.
    3. On the instance details page, click Accounts.
    4. On the User Management page, click Add New User to add the RAM role to the Hologres instance. Add New User
    5. On the Database Authorization page, grant the development permissions on the instance to the RAM role. Database Authorization
  • Use an SQL statement to grant the required permissions to the RAM role.

    For more information about how to use an SQL statement to grant the required permissions to a RAM role, see Overview.

  • By default, a RAM user is not granted the permissions to perform operations in the Hologres console. If you want a RAM user to assume the RAM role, you must attach the AliyunRAMReadOnlyAccess policy to the RAM user by using your Alibaba Cloud account. Otherwise, the RAM user cannot perform operations in the Hologres console. For more information, see Grant permissions on Hologres to RAM users. Add permissions

Step 3: Create a Hologres external table

Log on to the MaxCompute client and create a Hologres external table based on the prepared data. For more information about the syntax, see Syntax.

  1. Start the MaxCompute client and go to the MaxCompute project for which you want to create a Hologres external table.
    For more information about the commands that are used to go to a project, see Project operations.
  2. Execute the following statement to create a Hologres external table:
    create external table if not exists my_table_holo_jdbc
    (
     id bigint,
     name string
    )
    stored by 'com.aliyun.odps.jdbc.JdbcStorageHandler' 
    with serdeproperties (
      'odps.properties.rolearn'='acs:ram::139699392458****:role/aliyunodpsholorole')
    location 'jdbc:postgresql://hgprecn-cn-oew210ut****-cn-hangzhou-internal.hologres.aliyuncs.com:80/mc_test?ApplicationName=MaxCompute&currentSchema=public&useSSL=false&table=holo/'
    tblproperties (
      'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver',
      'odps.federation.jdbc.target.db.type'='holo',
      'odps.federation.jdbc.colmapping'='id:id,name:name'
    );
  3. Execute the following statements to query the data of the Hologres source table by using the created Hologres external table:
    -- Add the following properties to access the Hologres external table. 
    set odps.sql.split.hive.bridge=true;
    set odps.sql.hive.compatible=true;
    -- Query the data of the Hologres external table. 
    select * from my_table_holo_jdbc limit 10;
    The following results are returned:
    +------------+------------+
    | id         | name       |
    +------------+------------+
    | 1          | kate       |
    | 2          | mary       |
    | 3          | bob        |
    | 4          | tom        |
    | 5          | lulu       |
    | 6          | mark       |
    | 7          | haward     |
    | 8          | lilei      |
    | 9          | hanmeimei  |
    | 10         | lily       |
    +------------+------------+
  4. Optional:Exchange data between MaxCompute and Hologres based on the Hologres external table and perform joint data analysis.
    For example, you can write data that is processed by MaxCompute to a Hologres database by using the Hologres external table. This accelerates data analysis and implements online services. Sample statement:
    -- Add the following properties to access the Hologres external table. 
    set odps.sql.split.hive.bridge=true;
    set odps.sql.hive.compatible=true;
    -- Insert data into the Hologres external table. 
    insert into my_table_holo_jdbc values (12,'alice');
    -- Query the data of the Hologres external table. 
    select * from my_table_holo_jdbc;
    The following result is returned:
    +------------+------------+
    | id         | name       |
    +------------+------------+
    | 12          | alice      |
    | 1          | kate       |
    | 2          | mary       |
    | 3          | bob        |
    | 4          | tom        |
    | 5          | lulu       |
    | 6          | mark       |
    | 7          | haward     |
    | 8          | lilei      |
    | 9          | hanmeimei  |
    | 10         | lily       |
    | 11         | lucy       |
    +------------+------------+
    Dimension tables that are frequently updated are stored in Hologres databases. This meets real-time data update requirements. MaxCompute uses external tables to access the dimension tables in Hologres databases. Then, association analysis is performed on the data in the dimension tables and the fact tables in MaxCompute. Sample statements:
    -- Add the following properties to access the Hologres external table. 
    set odps.sql.split.hive.bridge=true;
    set odps.sql.hive.compatible=true;
    -- Create a MaxCompute internal table. 
    create table holo_test as select * from my_table_holo_jdbc;
    --- Perform association analysis on the data in the MaxCompute internal table and Hologres external table. 
    select * from my_table_holo_jdbc t1 inner join holo_test t2 on t1.id=t2.id;
    The following result is returned:
    +------------+------------+------------+------------+
    | id         | name       | id2        | name2      |
    +------------+------------+------------+------------+
    | 1          | kate       | 1          | kate       |
    | 2          | mary       | 2          | mary       |
    | 3          | bob        | 3          | bob        |
    | 4          | tom        | 4          | tom        |
    | 5          | lulu       | 5          | lulu       |
    | 6          | mark       | 6          | mark       |
    | 7          | harward    | 7          | harward    |
    | 8          | lilei      | 8          | lilei      |
    | 9          | hanmeimei  | 9          | hanmeimei  |
    | 10         | lily       | 10         | lily       |
    | 11         | lucy       | 11         | lucy       |
    | 12         | alice      | 12         | alice      |
    +------------+------------+------------+------------+