You can use MaxCompute external tables to access data of MC-Hologres data sources by using the Java Database Connectivity (JDBC) driver. This topic describes how to create an MC-Hologres external table and specify the MC-Hologres data source, authentication method, mapping table, and JDBC driver in the external table.

Prerequisites

  • An MC-Hologres database and an MC-Hologres table are created.

    For more information about how to create an MC-Hologres database, see Create a database.

    For more information about how to create an MC-Hologres table, see CREATE TABLE.

  • A MaxCompute project is created.

    For more information about how to create a MaxCompute project, see Create a project.

Background information

MC-Hologres is a real-time interactive analytics engine. It is compatible with PostgreSQL and seamlessly integrates with MaxCompute.

You can create an MC-Hologres external table in MaxCompute to query the data of MC-Hologres data sources by using the JDBC driver provided by PostgreSQL. This method prevents redundant data storage and allows you to obtain query results at a fast speed without the need to import or export data.

Limits

When you use MC-Hologres external tables, take note of the following limits:

  • MaxCompute does not support the update or delete operation on the MC-Hologres external tables.
  • Partitioned tables of MC-Hologres do not correspond to partitioned tables of MaxCompute. MaxCompute external tables cannot be partitioned.

Usage notes

When you use MC-Hologres external tables, take note of the following items:
  • The names of the parent and child tables in an MC-Hologres database are specified in the MaxCompute external table that is read by the JDBC driver. SQL statements are executed on the MaxCompute external table. Both parent and child tables can be mapped to MC-Hologres external tables. However, only the parent table can be read.
  • If the error FAILED: Generating job conf failed, gen jobconf failed: External table location scheme "jdbc:postgresql" is not supported is reported when you query the data of an MC-Hologres external table or insert data into an MC-Hologres external table, the jobconf2 feature that is used to optimize and update SQL execution plans is not enabled. This feature is enabled in most MaxCompute projects. If you want to enable this feature, add the following properties at the session level:
    set odps.sql.jobconf.odps2=true;
    set odps.sql.jobconf.odps2.enforce=true;
  • INSERT ON CONFLICT of MC-Hologres databases cannot be used when you write data from MaxCompute to an MC-Hologres external table. If the MC-Hologres source table has a primary key, you must ensure that the primary key of the data that you want to write to the MC-Hologres external table does not conflict with the primary key of the MC-Hologres source table.

Define an MC-Hologres external table

When you create an external table, you must specify a storage handler in the CREATE TABLE statement and configure JDBC driver parameters to access MC-Hologres data sources. Syntax for creating an MC-Hologres external table:
create external table <table_name>(
  <col1_name> <data_type>,
  <col2_name> <data_type>,
  ......
)
stored by '<com.aliyun.odps.jdbc.JdbcStorageHandler>'
location '<jdbc:postgresql://<accessid>:<accesskey>@<endpoint>:<port>/<database>?currentSchema=<schema>&table=<holo_table_name>/>' 
tblproperties (
  'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver', 
  'odps.federation.jdbc.target.db.type'='holo',
  ['odps.federation.jdbc.colmapping'='<col1:column1,col2:column2,col3:column3,...>']
);
  • table_name: required. The name of the MC-Hologres external table that is created in MaxCompute.
  • col_name: required. The name of the column in the MC-Hologres external table.
  • data_type: required. The data type of the columns in the MC-Hologres external table.
  • stored by: required. This parameter specifies a storage handler, which defines the method to access the MC-Hologres external table. Set the value to MC-Hologres. The value indicates that the MC-Hologres external table is accessed by using JdbcStorageHandler
  • location: required. The JDBC endpoint of your MC-Hologres instance. Description of the fields in this parameter:
    • accessid: required. The AccessKey ID of the Alibaba Cloud account that has the permissions to access the MC-Hologres instance.
    • accesskey: required. The AccessKey secret that corresponds to the AccessKey ID.
    • endpoint: required. The endpoint of the MC-Hologres instance in the classic network. For more information about how to obtain endpoints, see Instance configurations.
    • pat: required. The port number of the MC-Hologres instance. For more information about how to obtain port numbers, see Instance configurations.
    • database: required. The name of the MC-Hologres database that you want to access. For more information about MC-Hologres databases, see CREATE DATABASE.
    • schema: optional. If the name of the external table is unique in the MC-Hologres database or the source table is a table in the default schema, you do not need to specify this parameter. For more information, see CREATE SCHEMA.
    • holo_table_name: required. The name of the MC-Hologres source table. For more information about MC-Hologres source tables, see CREATE TABLE.
    Note In the JDBC connection string, you must use the plaintext AccessKey pair to create a table. Keep your AccessKey pair confidential.
  • tblproperties: required. You must configure the following parameters:
    • mcfed.mapreduce.jdbc.driver.class: required. The driver that is used to access the MC-Hologres database. Set the value to org.postgresql.Driver.
    • odps.federation.jdbc.target.db.type: required. The type of the MC-Hologres database that you want to access. Set the value to holo.
    • odps.federation.jdbc.colmapping: optional. If you want to map some columns of the MC-Hologres source table to the MC-Hologres external table, you must configure this parameter. This parameter specifies the mapping between the fields of the MC-Hologres source table and the fields of the MC-Hologres external table. If you do not configure this parameter, the fields in the source table are mapped to the MC-Hologres external table by following the sequence of the fields in the source table.

Examples

Information of the MC-Hologres instance:
  • Name of the MC-Hologres database: mc_test.
  • Schema of the MC-Hologres database: public.
  • Endpoint of the MC-Hologres instance in the classic network: hgprecn-cn-oew210ut****-cn-hangzhou-internal.hologres.aliyuncs.com:80.
  • Name of the MC-Hologres table: holo. The following figure shows the data of the table.Data of the table
  1. Log on to the MaxCompute client and go to the specified MaxCompute project.
    For more information about the commands that can be used to access a project, see Project operations.
  2. Execute the following statement to create an MC-Hologres external table.
    Sample statement:
    create external table if not exists my_table_holo_jdbc
    (
     id bigint,
     name string
    )
    stored by 'com.aliyun.odps.jdbc.JdbcStorageHandler' 
    LOCATION 'jdbc:postgresql://LTAI4FzxmCwzb4BJqFce****:hKZMEFjdLe8ngRT5qp75UYufvT****@hgprecn-cn-oew210utf003-cn-hangzhou-internal.hologres.aliyuncs.com:80/mc_test?currentSchema=public&useSSL=false&table=holo/'
    TBLPROPERTIES (
      'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver',
      'odps.federation.jdbc.target.db.type'='holo',
      'odps.federation.jdbc.colmapping'='id:id,name:name'
    );
  3. Run the following commands to query the data of the MC-Hologres source table based on the created MC-Hologres external table.
    Sample commands:
    -- Add the following properties to access the MC-Hologres external table. 
    set odps.sql.split.hive.bridge=true;
    set odps.sql.hive.compatible=true;
    -- Query the data of the MC-Hologres external table. 
    select * from my_table_holo_jdbc limit 10;
    The following result is returned:
    +------------+------------+
    | id         | name       |
    +------------+------------+
    | 1          | kate       |
    | 2          | mary       |
    | 3          | bob        |
    | 4          | tom        |
    | 5          | lulu       |
    | 6          | mark       |
    | 7          | haward     |
    | 8          | lilei      |
    | 9          | hanmeimei  |
    | 10         | lily       |
    +------------+------------+
  4. Optional:Exchange data between MaxCompute and MC-Hologres based on the MC-Hologres external table and perform joint data analysis.
    For example, write data processed by MaxCompute to an MC-Hologres database by using the MC-Hologres external table. This accelerates data analysis and implements online services. Sample commands:
    -- Add the following properties to access the MC-Hologres external table. 
    set odps.sql.split.hive.bridge=true;
    set odps.sql.hive.compatible=true;
    -- Insert data into the MC-Hologres external table. 
    insert into my_table_holo_jdbc values (12,'alice');
    -- Query the data of the MC-Hologres external table. 
    select * from my_table_holo_jdbc;
    The following result is returned:
    +------------+------------+
    | id         | name       |
    +------------+------------+
    | 12          | alice      |
    | 1          | kate       |
    | 2          | mary       |
    | 3          | bob        |
    | 4          | tom        |
    | 5          | lulu       |
    | 6          | mark       |
    | 7          | haward     |
    | 8          | lilei      |
    | 9          | hanmeimei  |
    | 10         | lily       |
    | 11         | lucy       |
    +------------+------------+
    Dimension tables that are frequently updated are stored in MC-Hologres databases. This meets the real-time data update requirements. MaxCompute uses external tables to access the dimension tables in MC-Hologres databases. Then, an association analysis is performed on the data in the dimension tables and the fact tables in MaxCompute. Sample commands:
    -- Add the following properties to access the MC-Hologres external table. 
    set odps.sql.split.hive.bridge=true;
    set odps.sql.hive.compatible=true;
    -- Create a MaxCompute internal table. 
    create table holo_test as select * from my_table_holo_jdbc;
    -- Perform association analysis on the data in the MaxCompute internal table and MC-Hologres external table. 
    select * from my_table_holo_jdbc t1 inner join holo_test t2 on t1.id=t2.id;
    The following result is returned:
    +------------+------------+------------+------------+
    | id         | name       | id2        | name2      |
    +------------+------------+------------+------------+
    | 1          | kate       | 1          | kate       |
    | 2          | mary       | 2          | alice      |
    | 2          | mary       | 2          | mary       |
    | 3          | bob        | 3          | bob        |
    | 4          | tom        | 4          | tom        |
    | 5          | lulu       | 5          | lulu       |
    | 6          | mark       | 6          | mark       |
    | 7          | haward     | 7          | haward     |
    | 8          | lilei      | 8          | lilei      |
    | 9          | hanmeimei  | 9          | hanmeimei  |
    | 10         | lily       | 10         | lily       |
    | 11         | lucy       | 11         | lucy       |
    | 12         | alice      | 12         | alice      |
    +------------+------------+------------+------------+