Hologres result table - Hologres - Alibaba Cloud Documentation Center

Hologres is seamlessly integrated with Blink in exclusive mode, which is a service type in the original product line of Realtime Compute for Apache Flink. You can write data to a Hologres result table by using Holo-blink connectors and query the data in real time. This topic describes how to write data to a Hologres result table by using Blink in exclusive mode.

Limits

Different versions of Blink in exclusive mode support different development syntaxes. You need to check the version of Blink in exclusive mode and refer to the sample code that is provided for the version of Blink in exclusive mode before you use Realtime Compute for Apache Flink.
Make sure that your Realtime Compute service and Hologres service reside in the same region. Otherwise, the services cannot connect to each other.
Blink in exclusive mode whose version is earlier than V3.6 does not provide a built-in Holo-blink connector. You must reference relevant JAR files to create a Hologres result table. You can manually upgrade your Hologres instance in the Hologres console or join a Hologres DingTalk group to apply for an instance upgrade. For more information about how to manually upgrade a Hologres instance, see Instance upgrades. For more information about how to join a DingTalk group, see Obtain online support for Hologres.
Note
We recommend that you upgrade Blink in exclusive mode to V3.6 or later.
Blink V3.7 in exclusive mode can automatically create Hologres partitioned tables only after you enter createparttable='true' in the code of your job. When you use a partitioned table, take note of the following items:
- Hologres tables support only list partitioning.
- If you create a partitioned table, you must specify partition key columns. The data of the partition key columns must be of the TEXT or INT4 type. A partition key value cannot contain hyphens (-). For example, the partition key value 2020-09-12 is invalid.
- If you configure a primary key for a partitioned table, you must include the partition key columns in the primary key.
- If you create a child table, you must specify fixed values as partition key values.
- If you write data to a child table, make sure that the specified columns in the source data have values that fall within the partition key values that you specify for the child table. Otherwise, an error message is returned.
- Hologres tables do not support default partitions.
If you have configured a primary key for a Hologres table to which data is to be written, the default syntax for writing data in real time is not updated based on the primary key. If the new data that you want to write shares the same primary key value as the existing data, the new data is discarded.
Hologres asynchronously writes data. Therefore, you must enter blink.checkpoint.fail_on_checkpoint_error=true in the code. This way, a failover is triggered only when a job exception occurs. You do not need to add this configuration to the code when you use Blink V3.7.6 or later in exclusive mode.

DDL syntax

In this example, the following DDL statement is executed to create a Hologres result table:

create table Hologres_sink(
  name varchar,
  age BIGINT,
  birthday BIGINT
) with (
  type='hologres',
  dbname='<yourDbname>', -- The name of the Hologres database to which you want to connect. 
  tablename='<yourTablename>', -- The name of the Hologres table to which you want to write data. 
  username='<yourUsername>', -- The AccessKey ID of your Alibaba Cloud account. 
  password='<yourPassword>', -- The AccessKey secret of your Alibaba Cloud account. 
  endpoint='<yourEndpoint>'); -- The virtual private cloud (VPC) endpoint of your Hologres instance.

Parameters in the WITH clause

Parameter	Description	Example
type	The type of the result table. Set the parameter to hologres.	hologres
endpoint	The VPC endpoint of your Hologres instance. You can log on to the Hologres console, go to the details page of the instance, and then check the endpoint from the Network Information section. The endpoint must contain a port number and follow the format of the IP address:Port number.	demo-cn-hangzhou-vpc.hologres.aliyuncs.com:80
username	The AccessKey ID of your Alibaba Cloud account. You can obtain the AccessKey ID on the Security Management page.	xxxxm3FMWaxxxx
password	The AccessKey secret of your Alibaba Cloud account. You can obtain the AccessKey secret on the Security Management page.	xxxxm355fffaxxxx
dbname	The name of the Hologres database to which you want to connect.	Holodb
tablename	The name of the Hologres table to which you want to write data.	blink_test
arraydelimiter	The delimiter based on which a Hologres streaming sink splits data in a column of the STRING type into elements in an array. Then, the Hologres streaming sink imports the array to Hologres. Default value: \u0002.	\u0002
mutatetype	The mode in which data is to be written. For more information, see Create a Hologres result table. Default value: insertorignore.	insertorignore
ignoredelete	Specifies whether to ignore retraction messages. Valid values: true: ignores retraction messages. false: does not ignore retraction messages. Note This parameter takes effect only when the streaming semantics is used. Default value: false. In most cases, retraction messages are generated by the Groupby operation in Realtime Compute for Apache Flink. When the retraction messages are transferred to Holo-blink connectors, Delete requests are generated.	false
partitionrouter	Specifies whether to write the data to a partitioned table. Valid values: true: writes the data to a partitioned table. false: does not write the data to a partitioned table. Default value: false.	false
createparttable	Specifies whether to automatically create a partitioned table based on partition key values if you set the partitionrouter parameter to true. This feature is available only in Blink V3.7 and later in exclusive mode. Default value: false. Important Exercise caution when you use this parameter. Make sure that partition key values do not contain dirty data. Otherwise, an invalid partitioned table is created.	false

Note

The following parameters are not provided in the sample DDL statement: arraydelimiter, mutatetype, ignoredelete, partitionrouter, and createparttable. If you need to use these parameters, set these parameters based on the description in the preceding table.

Write data to a standard Hologres result table in real time

Create a table in Hologres.
Create a Hologres table to which you want to write data. Sample statement:
```
 create table blink_test (a int, b text, c text, d float8, e bigint);
```

Create a real-time computing job.

Log on to the Realtime Compute for Apache Flink console.

Create a real-time computing job.

Hologres data sources are supported in Blink V3.6 and later in exclusive mode. You can call a Hologres data source when you create a real-time computing job. Sample statements:

create table randomSource (a int, b VARCHAR, c VARCHAR, d DOUBLE, e BIGINT) with (type = 'random');

create table test (
  a int,
  b VARCHAR,
  c VARCHAR,
  PRIMARY KEY (a)
) with (
  type = 'hologres',
  `endpoint` = '$ip:$port', -- The VPC endpoint and the port number of your Hologres instance. 
  `username` = 'The AccessKey ID of your Alibaba Cloud account',
  `password` = 'The AccessKey secret of your Alibaba Cloud account',
  `dbname` = 'The name of the Hologres database to which you want to connect',
  `tablename` = 'blink_test'-- The name of the Hologres table to which you want to write data. 
);

insert
  into test
select
  a,b,c
from
  randomSource;

Publish the job.
1. After you edit the job code, click Syntax Check in the upper part of the job editor. If the success message appears, the code syntax is correct.
2. Click Save.
3. Click Publish. In the dialog box that appears, set the parameters for publishing the job to the production environment.
Start the job.
After the job is published to the production environment, manually start the job.
In the top navigation bar of the Development Platform page, click Administration in the upper-right corner. On the Administration page, select the job that you want to start and click Start in the upper-right corner.
Query data in Hologres in real time.
You can query data written to the table that you created in Step 1. Sample statement:
```
select * from blink_test;
```

Merge data into a wide table

This section describes how to write data from multiple data streams to the same Hologres wide table.

In this example, you have a Hologres wide table named WIDE_TABLE. The table has the following columns: A, B, C, D, and E. Column A is configured as the primary key. In Realtime Compute for Apache Flink, one stream contains columns A, B, and C, and another stream contains columns A, D, and E.

Use Flink SQL to declare two Hologres result tables. For one of the tables, declare columns A, B, and C. For the other table, declare columns A, D, and E. Both the result tables are mapped to the WIDE_TABLE table.
Set the mutatetype parameter to insertorupdate for both the result tables.
Set the ignoredelete parameter to true for both the result tables. This prevents retraction messages from generating Delete requests.
Insert the data from the two streams into their result tables.

Take note of the following limits:

The wide table must have a primary key.
The data of each stream must contain all the data in the primary key column.
If the wide table is a column-oriented table and a large number of requests are initiated per second, the CPU utilization becomes high. We recommend that you disable dictionary encoding for the columns in the table.

Write data to a partitioned result table in Hologres in real time

You can call the HoloHub API to write data to a parent table. Then, the system routes data of different partitions to the corresponding child tables. You can also write data to a partitioned table. For more information about the HoloHub API, see Overview of the HoloHub API.

Take note of the following limits:

Hologres tables support only list partitioning.
If you create a partitioned table, you must specify partition key columns. The data of the partition key columns must be of the TEXT or INT4 type.
If you configure a primary key for a partitioned table, you must include the partition key columns in the primary key.
If you create a child table, you must specify fixed values as partition key values.
If you write data to a child table, make sure that the specified columns in the source data have values that fall within the partition key values that you specify for the child table. Otherwise, an error message is returned.
Hologres tables do not support default partitions.

Create a partitioned table in Hologres.
Create a partitioned table in Hologres to which you want to write data and create child tables that correspond to the partitioned table. Sample statements:
```
-- Create a parent table named test_message and child tables that correspond to the parent table. 

drop table if exists test_message;

begin;
create table test_message (
 "bizdate" text NOT NULL,
 "tag" text NOT NULL,
 "id" int4 NOT NULL,
 "title" text NOT NULL,
 "body" text,
PRIMARY KEY (bizdate,tag,id)
)
PARTITION BY LIST (bizdate);
commit;
```
Note
- When you execute the statements, assign a value to the ${bizdate} parameter.
- Only Blink V3.7 in exclusive mode can automatically create child tables. If the version of your Blink in exclusive mode is earlier than V3.7, you must manually create child tables in Hologres before you import data. Otherwise, the data import fails.

Create a real-time computing job when Blink in exclusive mode is used.

In this example, the following statements are executed to create a real-time computing job when Blink in exclusive mode is used:

Note

The following sample statements apply to Blink V3.7 and later in exclusive mode. If the version of your Blink in exclusive mode is earlier than V3.7, upgrade your Blink in exclusive mode to V3.7 or later, or delete `createparttable` = 'true' from the code that is used to automatically create child tables.

create table test_message_src(
  tag VARCHAR,
  id INTEGER,
  title VARCHAR,
  body VARCHAR
) with (
  type = 'random',
  `interval` = '10',
  `count` = '100'
);

create table test_message_sink (
  bizdate VARCHAR,
  tag VARCHAR,
  id INTEGER,
  title VARCHAR,
  body VARCHAR
) with (
  type = 'hologres',
  `endpoint` = '$ip:$port', -- The VPC endpoint of your Hologres instance. 
  `username` ='<AccessID>', -- The AccessKey ID of your Alibaba Cloud account. 
  `password` = '<AccessKey>', -- The AccessKey secret of your Alibaba Cloud account. 
  `dbname` = '<DBname>', -- The name of the Hologres database to which you want to connect. 
  `tablename` = '<Tablename>', -- The name of the Hologres table to which you want to write data. 
  `partitionrouter` = 'true', -- Write data to a partitioned table in Hologres. 
  `createparttable` = 'true', -- Automatically create child tables in Hologres. 
);

insert into test_message_sink select "20200327",* from test_message_src;
insert into test_message_sink select "20200328",* from test_message_src;

Publish and start the job.
For more information, see the Publish the job and Start the job steps in the Write data to a standard Hologres result table in real time section.
Query data in Hologres in real time.
You can query data written to the parent table and child tables that are created in Step 1. Sample statements:
```
select * from test_message;
select * from test_message where bizdate = '20200327';
```

Data type mappings

For more information about the mappings between the data types in Blink in exclusive mode and Hologres, see Data types.