Lindorm: Configure wide table connectors for the Lindorm streaming engine

Last Updated: Feb 28, 2024

When you use the Lindorm streaming engine to process computing tasks, you can use wide tables as dimension tables or result tables. This topic describes how to configure wide table connectors when you use Flink SQL to submit computing tasks.

Background information

When you use Flink SQL to submit a computing task in the Lindorm streaming engine, you must execute the CREATE TABLE statement and configure connector-related parameters.

Use wide tables in the Lindorm streaming engine

CREATE TABLE lindorm_table(
  c1 VARCHAR,
  c2 DOUBLE,
  c3 BIGINT,
  PRIMARY KEY (c1, c2) NOT ENFORCED -- The primary key specified in the statement must be the same as the primary key of the result table created in LindormTable.
)WITH(
      'connector'='lindorm',
      'seedServer'='ld-bp17pwu1541ia****-proxy-lindorm.lindorm.rds.aliyuncs.com:30020',
      'userName'='yourUser',
      'password'='yourPassword',
      'tableName'='yourTablename',
      'namespace'='yourNamespace'
    ); -- Configure connector-related parameters in the WITH clause.
Note

For more information about the CREATE TABLE statement, see CREATE TABLE.
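Once the table is declared, it can serve as a result table in an ordinary Flink SQL INSERT statement. The following is a minimal sketch, not a definitive job: source_events is a hypothetical source table, assumed to be declared elsewhere with columns c1, c2, and c3 whose types match lindorm_table.

```sql
-- Sketch only: source_events is a hypothetical source table that is
-- assumed to exist with columns c1 VARCHAR, c2 DOUBLE, c3 BIGINT.
INSERT INTO lindorm_table
SELECT c1, c2, c3
FROM source_events;
```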

Connector-related parameters

Common parameters

Parameter

Default value

Required

Description

seedServer

None

Yes

The endpoint that is used to connect to LindormTable by using HBase Java API. For more information, see View the endpoints of LindormTable.

namespace

None

Yes

The namespace to which the wide table belongs.

userName

None

Yes

The username used to connect to LindormTable.

password

None

Yes

The password used to connect to LindormTable.

tableName

None

Yes

The name of the wide table.

bufferSize

5000

No

The number of records that are buffered in each batch when data is written to the wide table.

flushIntervalMs

2000

No

The interval at which the flush operation is performed when data is written to the wide table. Unit: milliseconds.

If only a small amount of data is written to the wide table, so that the write buffer does not fill up, buffered data is flushed at the specified interval.

Note

The data volume below which flushing is driven by the interval rather than by the buffer varies with business scenarios.

columnFamily

f

No

The name of the column family.

Note

When you use a wide table in the Lindorm streaming engine, whether the columnFamily parameter is required depends on how the wide table is created.

  • If the wide table is created by using the ApsaraDB for HBase API for Java, this parameter is required.

  • If the wide table is created by using Lindorm SQL, this parameter is optional.
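The optional write parameters go in the same WITH clause as the required ones. The following sketch shows one way to set them. The table name lindorm_sink and the values 1000 and 500 are illustrative assumptions, not recommendations. Flink SQL requires every option value in the WITH clause to be a quoted string, so numeric values are quoted as well.

```sql
-- Sketch only: lindorm_sink and the tuning values are illustrative assumptions.
CREATE TABLE lindorm_sink (
  c1 VARCHAR,
  c2 DOUBLE,
  c3 BIGINT,
  PRIMARY KEY (c1, c2) NOT ENFORCED  -- must match the primary key of the wide table
) WITH (
  'connector' = 'lindorm',
  'seedServer' = 'ld-bp17pwu1541ia****-proxy-lindorm.lindorm.rds.aliyuncs.com:30020',
  'userName' = 'yourUser',
  'password' = 'yourPassword',
  'tableName' = 'yourTablename',
  'namespace' = 'yourNamespace',
  'bufferSize' = '1000',       -- default: 5000
  'flushIntervalMs' = '500',   -- default: 2000, in milliseconds
  'columnFamily' = 'f'         -- required if the wide table was created by using the HBase API
);
```

A smaller bufferSize and a shorter flushIntervalMs would generally be expected to reduce write latency at the cost of more frequent writes. The right balance depends on the workload.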

Spatio-temporal parameters

Parameter

Default value

Required

Description

cacheTTLMs

-1

No

The time to live (TTL) of the data cache. Unit: milliseconds. The default value -1 indicates that data is not cached. After the data cache expires, the spatio-temporal index is rebuilt when the next query is performed.

geomHint

None

No

The spatio-temporal query function to use in a Lookup Join query. The value of this parameter is in the <columnName>:<queryFunction> format. Example: fence:st_contains. When the Lookup Join query is executed, the spatio-temporal equality condition in the Join condition is replaced with the specified spatio-temporal query. For example, the fence=ST_MakePoint(x,y) condition is replaced with the ST_Contains(fence, ST_MakePoint(x,y)) query.

queryFunction supports the following functions:

  • ST_Contains

  • ST_Within

  • ST_DWithin

  • ST_DWithinSphere

  • ST_Intersects

  • ST_Overlaps

  • ST_Equals

geomIndex

None

No

The column based on which the spatio-temporal index is created. The Lookup Join queries for the table are accelerated by using the in-memory index.

An index can be created on only one column in a table. The following functions can be accelerated by using in-memory indexes:

  • ST_Contains

  • ST_Within

  • ST_DWithin

  • ST_DWithinSphere

  • ST_Intersects

  • ST_Overlaps

  • ST_Equals

Important

The geomIndex parameter must be specified together with the geomHint and cacheTTLMs parameters. The value of the cacheTTLMs parameter must be greater than 0.
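The three spatio-temporal parameters can be combined on a dimension table as in the following sketch. Several names here are assumptions made for illustration. fence_dim, fence_id, and the 60000 ms TTL are hypothetical. vehicle_points is a hypothetical source table assumed to be declared elsewhere with columns vehicle_id, x, y, and a processing-time attribute proc_time. The GEOMETRY column type is also an assumption, so check the type that your Lindorm streaming engine version uses for spatial columns.

```sql
-- Sketch only: table names, column names, the GEOMETRY type, and the TTL
-- value are assumptions for illustration.
CREATE TABLE fence_dim (
  fence_id VARCHAR,
  fence GEOMETRY,  -- assumed type for the spatial column
  PRIMARY KEY (fence_id) NOT ENFORCED
) WITH (
  'connector' = 'lindorm',
  'seedServer' = 'ld-bp17pwu1541ia****-proxy-lindorm.lindorm.rds.aliyuncs.com:30020',
  'userName' = 'yourUser',
  'password' = 'yourPassword',
  'tableName' = 'yourTablename',
  'namespace' = 'yourNamespace',
  'geomIndex' = 'fence',              -- build the in-memory index on this column
  'geomHint' = 'fence:st_contains',   -- <columnName>:<queryFunction>
  'cacheTTLMs' = '60000'              -- must be greater than 0 when geomIndex is set
);

-- Lookup Join: with the geomHint above, the equality condition on fence
-- is replaced with ST_Contains(fence, ST_MakePoint(x, y)).
SELECT p.vehicle_id, f.fence_id
FROM vehicle_points AS p
JOIN fence_dim FOR SYSTEM_TIME AS OF p.proc_time AS f
  ON f.fence = ST_MakePoint(p.x, p.y);
```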