This topic describes how to create a Hologres source table in Flink exclusive mode.
Usage notes
By default, a Hologres source table reads data in batch mode by scanning the entire table once. After the scan is complete, data consumption stops. The source table does not read any data that is written after the scan. To consume data from Hologres in real time, see Subscribe to Hologres binary logs.
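As a rough illustration of the one-time batch scan, the sketch below reads a Hologres source table and writes each row to a debugging sink. The table names `holo_source` and `print_sink`, the columns, and the use of a `print` result table are assumptions made for this example and do not come from this topic. The WITH parameters mirror those described in the DDL definition section.

```sql
-- Hypothetical source table; the column names are placeholders.
CREATE TABLE holo_source (
  id   BIGINT,
  name VARCHAR
) WITH (
  type = 'hologres',
  `endpoint` = '<yourEndpoint>',
  `username` = '<yourUsername>',
  `password` = '<yourPassword>',
  `dbname` = '<yourDbname>',
  `tablename` = '<yourTablename>',
  `bulkread` = 'true'
);

-- Assumed debugging sink; replace it with the result table your job actually uses.
CREATE TABLE print_sink (
  id   BIGINT,
  name VARCHAR
) WITH (
  type = 'print'
);

-- The source is scanned once. Rows written to Hologres after the scan are not read.
INSERT INTO print_sink
SELECT id, name
FROM holo_source;
```

Because the scan is bounded, a job like this stops consuming data once the existing rows have been read.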
Limits
You can use the Holo-blink connector to read data from a Hologres source table and join it with a dimension table. The following limits apply:
- By default, you can read data only from tables in row-store format. To read data from tables in column-store format, set the bulkread='true' parameter.
- When you create a row-store table with a primary key, you must also configure the primary key as a clustering key. The following example shows the statement to create a Hologres source table.
    begin;
    create table test(a int primary key, b text, c text[], d float8, e int8);
    call set_table_property('test', 'orientation', 'row');
    call set_table_property('test', 'clustering_key', 'a');
    commit;
- Versions of Flink exclusive mode earlier than 3.6 do not support Hologres as a built-in data source. To run a job on these versions, you must use the Hologres connector and reference the related JAR file. To obtain the JAR file, join the Hologres exchange DingTalk group and contact technical support. For more information, see How do I get more online support?. Then, use the following Data Definition Language (DDL) statement in your job to create the Hologres source table.
DDL definition
Flink exclusive mode 3.6 and later supports Hologres as a data source. The DDL statement to create a Hologres source table is as follows:
    CREATE TABLE holo_dim_table (
      pk VARCHAR,
      seller_id VARCHAR,
      seller_bc_type VARCHAR,
      seller_tag VARCHAR,
      PRIMARY KEY (pk)
    ) WITH (
      type = 'hologres',
      `endpoint` = '<yourEndpoint>',   -- The endpoint of the VPC network for the Hologres instance.
      `username` = '<yourUsername>',   -- The AccessKey ID of your Alibaba Cloud account.
      `password` = '<yourPassword>',   -- The AccessKey secret of your Alibaba Cloud account.
      `dbname` = '<yourDbname>',       -- The name of the Hologres database.
      `tablename` = '<yourTablename>', -- The name of the Hologres table from which data is read.
      `bulkread` = 'true'
    );
The following table describes the parameters in the WITH clause.
| Parameter | Description | Required |
| --- | --- | --- |
| type | The type of the data source. Set the value to hologres. | Yes |
| dbname | The name of the Hologres database. | Yes |
| tablename | The name of the Hologres table from which data is read. | Yes |
| username | The AccessKey ID of your Alibaba Cloud account. Log on to the AccessKey Management console to obtain the AccessKey ID. | Yes |
| password | The AccessKey secret of your Alibaba Cloud account. Log on to the AccessKey Management console to obtain the AccessKey secret. | Yes |
| endpoint | The VPC network address of the Hologres instance. Log on to the Hologres console, go to the product page of the instance, and find the endpoint in the Network Information section. The endpoint must include the port number, in the ip:port format. | Yes |
| bulkread | Specifies whether column-store tables can be read. Note: If you want to use a Hologres column-store table as the source table for a job, set this parameter to true. | Yes |
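The bulkread note above concerns column-store tables. As a minimal sketch, a column-store table could be created on the Hologres side as shown below. The table name `test_column` and its columns are hypothetical; only the `orientation` property differs in kind from the row-store example in the Limits section.

```sql
-- Hypothetical column-store table in Hologres (run in Hologres, not in the Flink job).
begin;
create table test_column (
  a int primary key,
  b text,
  d float8
);
-- Store the table in column-oriented format.
call set_table_property('test_column', 'orientation', 'column');
commit;
```

A Flink source table that points to such a table would then set `bulkread` to 'true' in its WITH clause, as in the DDL statement in this topic.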
Data type mapping
For more information about the data type mapping between Flink exclusive mode and Hologres, see Data types.
Real-time consumption of Hologres binary logs with Flink
Real-time Computing for Flink version 3.7 and later supports the real-time consumption of binary logs using the Hologres connector. For more information, see Consume Hologres binary logs in real time using Flink/Blink.