Lindorm Streams pushes data changes of tables in ApsaraDB for HBase Performance-enhanced Edition (Lindorm) to the downstream Message Queue for Apache Kafka service.
Message delivery semantics
Order: Messages are currently sent in no guaranteed order. Key-level ordering will be supported in the future. After it is supported, multiple updates to records of the same primary key are sent in the same order in which the records were updated.
Repeatability: Lindorm Streams provides at-least-once delivery semantics, so the same message may be delivered more than once. Downstream consumers must tolerate duplicate messages.
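Because delivery is at-least-once, a downstream consumer usually needs some way to discard duplicates. The following Python code is a minimal sketch of one possible approach, not part of Lindorm Streams. It assumes that a redelivered message is byte-identical to the original, and it remembers the SHA-256 digests of recently seen payloads in a bounded in-memory window. The class name DuplicateFilter and its capacity parameter are hypothetical.

```python
import hashlib
from collections import OrderedDict


class DuplicateFilter:
    """Hypothetical helper: remembers digests of recently seen payloads.

    Assumption: a redelivered message has exactly the same bytes as the
    original delivery. This is not guaranteed by the source document.
    """

    def __init__(self, capacity: int = 10_000) -> None:
        self._capacity = capacity
        # Insertion-ordered mapping used as a least-recently-seen window.
        self._seen: "OrderedDict[str, None]" = OrderedDict()

    def is_new(self, payload: bytes) -> bool:
        """Return True the first time a payload is seen within the window."""
        digest = hashlib.sha256(payload).hexdigest()
        if digest in self._seen:
            # Duplicate: refresh its position so it stays in the window.
            self._seen.move_to_end(digest)
            return False
        self._seen[digest] = None
        if len(self._seen) > self._capacity:
            # Evict the least recently seen digest to bound memory use.
            self._seen.popitem(last=False)
        return True
```

A consumer would call is_new() on the raw value of each Kafka record and skip the record when the result is False. The window is bounded, so a duplicate that arrives after its digest has been evicted is processed again. For that reason, writes to the final store should also be idempotent where possible.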
Message format
Lindorm Streams sends messages that contain only the updated content. For example, if a table has three columns a, b, and c, and a user updates only column c, the message contains only the primary key data and the updated data of column c. In the future, Lindorm Streams will support querying the new values and old values of all the cells in the updated row. The following code shows the message format:
{
  "op" : <opType>, // The type of the operation. Valid values: Put and DeleteFamily.
  "table" : <tableName>, // The name of the table.
  "ts" : <defaultVersion>, // The default version. It is used for any column in data that does not contain the ts key.
  "keyOnly" : <keyOnly>, // Specifies whether only the primary key data is included.
  "data" : [
    {
      "type" : <type>, // The data type of the column.
      "name" : <name>, // The name of the column.
      "ts" : <version>, // The data version. If this value is not specified, the default version (the top-level ts value) is used.
      "value" : <value> // The value. If this value is not specified, the value in the table is null.
    },
    {
      "type" : <type>, // The data type of the column.
      "name" : <name>, // The name of the column.
      "ts" : <version>, // The data version. If this value is not specified, the default version (the top-level ts value) is used.
      "value" : <value> // The value. If this value is not specified, the value in the table is null.
    }
  ]
}
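The fallback rules in the format above can be sketched in Python as follows. This is an illustrative sketch, not an official client. It assumes that each message arrives as valid JSON without the explanatory comments, that versions are numeric, and that a column without the value key represents null. The function name resolve_cells is hypothetical.

```python
import json
from typing import Any, List, Tuple


def resolve_cells(raw: str) -> List[Tuple[str, Any, Any]]:
    """Return (column name, effective version, value) for each entry in data.

    Assumption: the message is plain JSON that follows the documented format.
    """
    msg = json.loads(raw)
    default_ts = msg.get("ts")  # Top-level default version.
    cells = []
    for cell in msg.get("data", []):
        # A column without its own ts key falls back to the default version.
        version = cell.get("ts", default_ts)
        # A column without the value key is treated as null (None in Python).
        cells.append((cell["name"], version, cell.get("value")))
    return cells
```

For example, given a message whose top-level ts is 100 and whose data contains one column without a ts key and one column with ts set to 105, the first column resolves to version 100 and the second to version 105.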
The following table describes the parameters that control the content of the message.
Parameter | Type | Required | Example | Description |
key_only | boolean | No | false | Specifies whether to synchronize only primary keys. Default value: false. |
unique_key | boolean | No | false | Specifies whether a primary key is sent only once for the same batch. Default value: false. |
Features available soon
Feature | Description |
Support for newImage and oldImage | Allows you to query the new values and old values of all the cells in an updated row. |
Order preserving at the key level | Allows you to export the incremental data of the same key in order. |
Data retention time and data playback | Allows you to set the retention period for incremental data and reset the synchronization offset to pull data of a specified time point. |
UDF | Supports user-defined functions (UDFs). You can transform or filter data based on your business logic. |
Streams API | Allows you to use the Streams API to directly pull Streams messages. You do not need to connect to downstream intermediate messaging or storage services. |
Function Compute | Supports connecting to Function Compute. |