This topic describes Hologres connector options in the WITH clause for Ververica Runtime (VVR) 8.0.x or earlier.
Connector options in the WITH clause
General
Option | Description | Data type | Required? | Default value | Remarks |
| The type of the connector. | String | Yes | No default value | Set this option to |
| The database name. | String | Yes | No default value | Hologres V2.0 introduces virtual warehouse instances as a new type of elastic and high-availability instances. Computing resources are divided into multiple virtual warehouses to implement high-availability deployments. Different virtual warehouses share the same endpoint. You can add a specific suffix to the value of the Note Virtual warehouses are supported only when JDBC-related modes are used for tables. For more information, see the |
| The table name. | String | Yes | No default value | If the schema is not public, set |
|
| String | Yes | No default value |
Important To enhance security, use variables instead of hardcoding your AccessKey pair. |
|
| String | Yes | No default value | |
| The endpoint of Hologres. | String | Yes | No default value | |
| Specifies whether to enable SSL-encrypted transmission and specifies the SSL-encrypted transmission mode to use. | String | No |
|
Note
|
| The path of the certificate if a CA certificate is used. | String | No | No default value | If you set Note
|
| The maximum number of retries allowed to read and write data if a connection failure occurs. | Integer | No |
| |
| The fixed waiting period for each retry. | Long | No |
| The actual waiting period for each retry is calculated by using the following formula: |
| The accumulated waiting period for each retry. | Long | No |
| The actual waiting period for each retry is calculated by using the following formula: |
| The maximum duration for which the JDBC connection can remain idle. | Long | No |
| If a JDBC connection stays idle for a period of time that exceeds the value of this option, the connection is closed and released. Unit: milliseconds. |
| The maximum time for storing the TableSchema information in the cache. | Long | No |
| Unit: milliseconds. |
| The factor for triggering automatic cache refresh. If the remaining time for storing data in the cache is less than the time for triggering an automatic refresh of the cache, the system automatically refreshes the cache. | Integer | No |
| The remaining time for storing data in the cache is calculated by using the following formula: Remaining time = Cache expiration time - Time for which data has been stored in the cache. The time for triggering an automatic refresh of the cache is calculated by using the following formula: jdbcMetaCacheTTL / jdbcMetaAutoRefreshFactor. After the cache is automatically refreshed, the duration for which data is cached is recalculated from 0. |
| Specifies whether to perform time type conversions between Realtime Compute for Apache Flink and Hologres. | Boolean | No |
|
Note
|
| The connector option version. | Integer | No |
| Valid values:
Note
|
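The automatic refresh rule for the metadata cache (jdbcMetaCacheTTL and jdbcMetaAutoRefreshFactor) can be sketched in Python. This is a minimal illustration of the two formulas in the remarks, not connector code. The function name `should_refresh` and the sample values are hypothetical and are not the connector defaults.

```python
def should_refresh(ttl_ms: int, refresh_factor: int, age_ms: int) -> bool:
    """Return True when the remaining cache time is below the refresh trigger time."""
    # Remaining time = cache expiration time - time the data has been cached.
    remaining_ms = ttl_ms - age_ms
    # Trigger time = jdbcMetaCacheTTL / jdbcMetaAutoRefreshFactor.
    trigger_ms = ttl_ms / refresh_factor
    return remaining_ms < trigger_ms


# Hypothetical values for illustration only.
ttl_ms = 60_000
factor = 4  # trigger time is 15000 ms

print(should_refresh(ttl_ms, factor, age_ms=30_000))  # False: 30000 ms remain
print(should_refresh(ttl_ms, factor, age_ms=50_000))  # True: 10000 ms remain
```

With these sample values, a larger factor shortens the trigger time, so the refresh happens closer to expiration.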
Source-specific
Option | Description | Data type | Required? | Default value | Remarks |
| The delimiter used between rows when data is being exported. | String | No |
| |
| Specifies whether to consume binary log data. | Boolean | No |
|
Note
|
| The SDK mode. | String | No |
|
For information about recommended values for different versions, see Precautions. |
| The slot name of the binary log source table in JDBC mode. | String | No | No default value | This option is effective only when Note If you use Hologres V2.1 or later and VVR 8.0.5 or later, you do not need to configure this option, and the connector does not attempt to automatically create a slot. |
| The number of retries after Realtime Compute for Apache Flink fails to read binary log data. | Integer | No |
| |
| The interval between retries after Realtime Compute for Apache Flink fails to read the binary log data. | Long | No |
| Unit: milliseconds. |
| The number of rows of binary log data that are read at a time. | Integer | No |
| |
| Specifies whether to read binary log data in CDC mode. | Boolean | No |
|
Note
|
| Specifies whether the source table reads a changelog stream that contains UPSERT messages. | Boolean | No |
| This option takes effect only in CDC mode.
Note If retraction operators exist in the sink table, such as the |
| The binary log consumption mode. | String | No |
|
Note The startTime option has a higher priority. This means if you configure the startTime option or select a start time point at job startup, the binlogStartupMode option is forcibly set to Note
|
| The start time when Hologres data is consumed. | String | No | No default value | The format is yyyy-MM-dd hh:mm:ss. If this option is not configured and jobs are not resumed from a state, Realtime Compute for Apache Flink starts to consume Hologres data from the earliest binary log. |
| The number of records that can be buffered during the scan operation. | Integer | No |
| |
| The timeout period of the scan operation. | Integer | No |
| Unit: seconds. |
| The timeout period for the transaction to which the scan operation belongs. | Integer | No |
| This option corresponds to the Hologres GUC parameter idle_in_transaction_session_timeout. The value |
| Specifies whether to perform filter pushdown during the full data reading phase. | Boolean | No |
|
|
| The mode in which binary logs in a partitioned table are consumed. | Enum | No |
|
|
| The maximum latency allowed before a timeout is triggered when data in a partitioned table is dynamically consumed. | Boolean | No |
|
|
| The partitions to be consumed when data in a partitioned table is consumed in static mode. | String | No | No default value |
|
Sink-specific
Option | Description | Data type | Required? | Default value | Remarks |
| The SDK mode. | String | No |
|
For information about recommended values in different VVR versions, see Precautions. |
| Specifies whether to write data in bulkload mode. | Boolean | No |
| This option takes effect only when the Note This option is supported when you use VVR 8.0.5 or later and Hologres V2.1 or later. |
| Specifies whether to use the Hologres connector in RPC mode. | Boolean | No |
|
Note
|
| The data writing mode. | String | No |
|
Note
|
| Specifies whether to write data to a partitioned table. | Boolean | No |
| |
| Specifies whether to automatically create non-existing partitioned tables based on partition values. | Boolean | No |
| In RPC mode, if partition values contain hyphens ( Note
|
| Specifies whether to ignore retraction messages. | Boolean | No |
| Note
|
| Specifies the strategy to process retraction messages. | String | No | No default value | Valid values:
Note
|
| The size of the JDBC connection pool that is created in a Realtime Compute for Apache Flink deployment. | Integer | No |
| The size of the JDBC connection pool is proportional to data throughput. If the deployment has poor performance, increase the size of the connection pool. |
| The maximum number of records that can be buffered by the sink operator in JDBC mode. | Integer | No |
| Unit: rows. Note If you specify all of the jdbcWriteBatchSize, jdbcWriteBatchByteSize, and jdbcWriteFlushInterval options, the system writes data to the Hologres sink table when any one of the related conditions is met. |
| The maximum number of bytes of data that can be buffered by the sink operator before they are processed at once in JDBC mode. | Long | No |
| Note You can specify one or more of the following options: jdbcWriteBatchSize, jdbcWriteBatchByteSize, and jdbcWriteFlushInterval. If you specify all of them, the system writes data to the Hologres sink table when any one of the conditions is met. |
| The maximum waiting time for the sink operator to buffer data before processing it at once in JDBC mode. | Long | No |
| Unit: milliseconds. Note You can specify one or more of the following options: jdbcWriteBatchSize, jdbcWriteBatchByteSize, and jdbcWriteFlushInterval. If you specify all of them, the system writes data to the Hologres sink table when any one of the conditions is met. |
| Specifies whether to ignore null values in the data that is written when mutatetype='insertOrUpdate' is specified. | Boolean | No |
|
Note If you set the |
| The name of the connection pool. In the same TaskManager, tables for which the same connection pool is configured can share the connection pool. | String | No | No default value | Set this option to any string other than Note
|
| Specifies whether to allow the Hologres connector to fill a default value if a null value is written to a non-null column for which the default value is not configured in the Hologres table. | Boolean | No |
|
|
| Specifies whether to allow the Hologres connector to remove the invalid characters \u0000 from STRING data written to the sink table. | Boolean | No |
|
Important
|
| Specifies whether to insert only the fields declared in the INSERT statement. | Boolean | No |
|
Note This option takes effect only if the |
| Specifies whether to perform deduplication when buffered data is written in JDBC or jdbc_fixed mode. | Boolean | No |
|
Note
|
| Specifies whether to enable the conditional update feature and configure the names of the fields that you want to check. | String | No | No default value | You must set this option to a field name in the Hologres table. Important
|
| The comparison operator for the conditional update operation. | String | No |
| This option specifies how the check field in the new data record is compared with the check field in the old data record in the table. The conditional update is performed only if the comparison satisfies the specified operator. Valid values: GREATER, GREATER_OR_EQUAL, EQUAL, NOT_EQUAL, LESS, LESS_OR_EQUAL, IS_NULL, and IS_NOT_NULL. Note Only VVR 8.0.11 or later supports this option. |
| The value that is used in place of the check field in the old data record if that field is NULL when a conditional update is performed. | String | No | No default value | In PostgreSQL, comparing any value with NULL yields NULL, which is treated as FALSE in a condition. Therefore, if the original data in the table is NULL, you must set the NULL-AS parameter when you perform the conditional update operation. The NULL-AS parameter is equivalent to the COALESCE function in Flink SQL. Note Only VVR 8.0.11 or later supports this option. |
| Specifies whether to enable the aggressive commit mode. | Boolean | No |
| If you set this option to Note
|
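The interaction of jdbcWriteBatchSize, jdbcWriteBatchByteSize, and jdbcWriteFlushInterval described in the table can be sketched as a simple "flush when any condition is met" check. This is a hedged Python illustration, not the connector's implementation. The class `FlushPolicy`, its field names, the sample thresholds, and the use of `>=` for each comparison are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FlushPolicy:
    batch_size: int         # stands in for jdbcWriteBatchSize (rows)
    batch_bytes: int        # stands in for jdbcWriteBatchByteSize (bytes)
    flush_interval_ms: int  # stands in for jdbcWriteFlushInterval (milliseconds)

    def should_flush(self, rows: int, size_bytes: int, elapsed_ms: int) -> bool:
        """Flush as soon as any one of the three thresholds is reached."""
        return (
            rows >= self.batch_size
            or size_bytes >= self.batch_bytes
            or elapsed_ms >= self.flush_interval_ms
        )


# Hypothetical thresholds for illustration only (not the connector defaults).
policy = FlushPolicy(batch_size=256, batch_bytes=2 * 1024 * 1024, flush_interval_ms=10_000)

print(policy.should_flush(rows=10, size_bytes=1_000, elapsed_ms=500))     # False
print(policy.should_flush(rows=256, size_bytes=1_000, elapsed_ms=500))    # True (row count)
print(policy.should_flush(rows=1, size_bytes=1_000, elapsed_ms=10_000))   # True (interval)
```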
Dimension table-specific
Option | Description | Data type | Required? | Default value | Remarks |
| The SDK mode. | String | No |
|
For more information about the recommended values for different VVR versions, see Precautions. |
| Specifies whether to use the Hologres connector in RPC mode. | Boolean | No |
| Valid values:
Note If you set this option to |
| The size of the JDBC connection pool that is created in a job. | Integer | No |
| If the job has poor performance, we recommend that you increase the size of the connection pool. The size of the JDBC connection pool is proportional to data throughput. |
| The name of the connection pool. In the same TaskManager, tables for which the same connection pool is configured can share the connection pool. | String | No | No default value | Set this option to any string other than Note
|
| The maximum number of records that can be buffered and processed in a single batch for a point lookup on a Hologres dimension table. | Integer | No |
| |
| The maximum number of queued requests allowed in a thread to perform a point query in a Hologres dimension table. | Integer | No |
| |
| The timeout period for performing a point query in a Hologres dimension table. | Long | No |
| The default value |
| The number of retries when a point query performed in a Hologres dimension table times out. | Integer | No |
| This option is different from |
| The number of records that a scan operation buffers and processes in a single batch when you perform a one-to-many table join, that is, a join in which no complete primary key is used. | Integer | No |
| |
| The maximum timeout period for a scan operation. | Integer | No |
| Unit: seconds. |
| The cache policy. | String | No |
| Valid values:
|
| The maximum number of rows of data that can be cached. | Integer | No |
| This option is available after you set |
| The interval at which the system refreshes the cache. | Long | No | See the remarks column. | Unit: milliseconds. The default value of the cacheTTLMs option varies based on the value of the cache parameter:
|
| Specifies whether to cache the JOIN queries whose return results are empty. | Boolean | No |
|
|
| Specifies whether to return data asynchronously. | Boolean | No |
|
Note Data is not ordered when it is returned in asynchronous mode. |
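How the cache-related options of a dimension table could fit together (a row limit, the cacheTTLMs interval, and the switch for caching empty JOIN results) can be sketched in Python. This is a rough illustration under stated assumptions, not the connector's cache. The class `LookupCache` is hypothetical, least-recently-used eviction is an assumption, and the limit counts cached keys rather than rows for simplicity.

```python
import time
from collections import OrderedDict


class LookupCache:
    """Bounded lookup cache with expiry; LRU eviction is an assumption."""

    def __init__(self, max_entries, ttl_ms, cache_empty, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_s = ttl_ms / 1000.0
        self.cache_empty = cache_empty
        self.clock = clock
        self._entries = OrderedDict()  # key -> (stored_at_seconds, rows)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None  # miss
        stored_at, rows = entry
        if self.clock() - stored_at >= self.ttl_s:
            del self._entries[key]  # expired entry is dropped
            return None
        self._entries.move_to_end(key)  # mark as recently used
        return rows

    def put(self, key, rows):
        if not rows and not self.cache_empty:
            return  # empty JOIN results are not cached
        self._entries[key] = (self.clock(), rows)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used key


cache = LookupCache(max_entries=2, ttl_ms=10_000, cache_empty=False)
cache.put("k1", [("k1", "v1")])
cache.put("k2", [])      # skipped because cache_empty is False
print(cache.get("k1"))   # [('k1', 'v1')]
print(cache.get("k2"))   # None
```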
Time zones of Realtime Compute for Apache Flink and Hologres
Time types
Service | Type | Description |
Flink | The date and time without the time zone. Data of the TIMESTAMP type is a timestamp that represents the year, month, day, hour, minute, second, and fractional second. Data of the TIMESTAMP type can be a string, such as | |
Used to describe an absolute point in time on the timeline. A value is stored as a LONG that holds the number of milliseconds that have elapsed since the epoch time and an INT that holds the number of nanoseconds within that millisecond. The epoch time refers to 00:00:00 UTC on January 1, 1970 in Java. Data of the TIMESTAMP_LTZ type is interpreted for calculations and visualization based on the time zone that is configured in the current session. The TIMESTAMP_LTZ type can be used for calculations across time zones because it represents the same absolute point in time in different time zones based on the epoch time. The same TIMESTAMP_LTZ value may reflect different local TIMESTAMP values in different time zones. For example, if a TIMESTAMP_LTZ value is | ||
Hologres | TIMESTAMP | The date and time without the time zone, which is similar to the |
TIMESTAMP WITH TIME ZONE (TIMESTAMPTZ) | The date and time with the time zone, which is similar to the For example, if the timestamp of the time zone of Beijing (UTC+8) is |
Time zone mappings
If you set the type-mapping.timestamp-converting.legacy option to false in VVR 8.0.6 or later, you can perform conversions of all time types between Realtime Compute for Apache Flink and Hologres.
Flink | Hologres | Description |
TIMESTAMP | TIMESTAMP | Time type conversions are performed without time zone conversions. We recommend that you use this type of time type conversion to read data from or write data to Hologres. |
TIMESTAMP_LTZ | TIMESTAMPTZ | Time type conversions are performed without time zone conversions. We recommend that you use this type of time type conversion to read data from or write data to Hologres. |
TIMESTAMP | TIMESTAMPTZ | Time type conversions are performed with time zone conversions. To maintain accuracy during conversion, you need to set the Flink time zone through table.local-time-zone. For more information, see How do I configure custom parameters for deployment running? For example, you specify 'table.local-time-zone': 'Asia/Shanghai' to set the time zone of Realtime Compute for Apache Flink to the time zone of Shanghai (UTC+8). After you write the data 2022-01-01 01:01:01.123456 of the TIMESTAMP type from Realtime Compute for Apache Flink to Hologres, the data is converted to 2022-01-01 01:01:01.123456+08 of the TIMESTAMPTZ type. |
TIMESTAMP_LTZ | TIMESTAMP | Time type conversions are performed with time zone conversions. To maintain accuracy during conversion, you need to set the Flink time zone through table.local-time-zone. |
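The Asia/Shanghai example for the TIMESTAMP to TIMESTAMPTZ conversion can be reproduced with a short Python sketch. It uses plain Python datetimes rather than the connector, and the helper `to_timestamptz` is hypothetical. A fixed UTC+8 offset stands in for the 'Asia/Shanghai' session time zone.

```python
from datetime import datetime, timedelta, timezone


def to_timestamptz(local_ts: datetime, session_offset_hours: int) -> datetime:
    """Interpret a zone-less TIMESTAMP in the session time zone."""
    session_tz = timezone(timedelta(hours=session_offset_hours))
    # The wall-clock fields are kept; only the offset is attached.
    return local_ts.replace(tzinfo=session_tz)


flink_ts = datetime(2022, 1, 1, 1, 1, 1, 123456)  # TIMESTAMP, no time zone
holo_ts = to_timestamptz(flink_ts, session_offset_hours=8)

print(holo_ts.isoformat(sep=" "))  # 2022-01-01 01:01:01.123456+08:00
```

If the session time zone were different, the same wall-clock value would map to a different absolute instant, which is why the session time zone has to match the intended one.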
If you use VVR 8.0.6 or later and specify type-mapping.timestamp-converting.legacy=true, or if you use VVR 8.0.5 or earlier, data deviation may occur during time type conversions, except for the conversions of the TIMESTAMP type.
Flink | Hologres | Notes |
TIMESTAMP | TIMESTAMP | Time type conversions are performed without time zone conversions. We recommend that you use this type of time type conversion to read data from or write data to Hologres. |
TIMESTAMP_LTZ | TIMESTAMPTZ | Data of the TIMESTAMP_LTZ and TIMESTAMPTZ types is expressed as the time without the time zone when Realtime Compute for Apache Flink reads data from or writes data to Hologres. This may cause data deviation. For example, if data of the TIMESTAMP_LTZ type in Realtime Compute for Apache Flink is 2024-03-19T04:00:00Z, the time without the time zone in Shanghai (UTC+8) is 2024-03-19T12:00:00. However, when data is written to Hologres, 2024-03-19T04:00:00 is used as the time without the time zone and is converted to 2024-03-19T04:00:00+08 of the TIMESTAMPTZ type in Hologres. This causes an 8-hour data deviation. |
TIMESTAMP | TIMESTAMPTZ | Time zone conversions are performed based on the time zone of JVM in the runtime environment instead of the time zone of Realtime Compute for Apache Flink. This is different from the time zone conversions in Realtime Compute for Apache Flink. If the time zone of Realtime Compute for Apache Flink is different from the time zone of JVM, data deviation may occur. We recommend that you read data from and write data to Hologres based on the time zone of Realtime Compute for Apache Flink. |
TIMESTAMP_LTZ | TIMESTAMP | Time zone conversions are performed based on the time zone of JVM in the runtime environment instead of the time zone of Realtime Compute for Apache Flink. If the time zone of Realtime Compute for Apache Flink is different from the time zone of JVM, data deviation may occur. |
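The 8-hour deviation in the TIMESTAMP_LTZ to TIMESTAMPTZ example can be checked with a small Python calculation. The sketch only replays the arithmetic described in the notes with plain Python datetimes. It does not call the connector, and the variable names are chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

SHANGHAI = timezone(timedelta(hours=8))  # fixed UTC+8 offset

# TIMESTAMP_LTZ value in Flink: an absolute instant.
instant = datetime(2024, 3, 19, 4, 0, 0, tzinfo=timezone.utc)

# Correct local wall-clock time in Shanghai for that instant.
expected_local = instant.astimezone(SHANGHAI).replace(tzinfo=None)

# Legacy behavior as described: the UTC wall clock is treated as a
# zone-less time and then labeled with +08 in Hologres.
written = instant.replace(tzinfo=None).replace(tzinfo=SHANGHAI)

deviation = instant - written

print(expected_local)  # 2024-03-19 12:00:00
print(written)         # 2024-03-19 04:00:00+08:00
print(deviation)       # 8:00:00
```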