The Lindorm streaming engine is designed to process streaming data. It provides streaming data storage and lightweight computing capabilities. You can use the engine to store streaming data in a Lindorm database and to process and consume that data based on your business requirements. This topic describes the scenarios and features of the Lindorm streaming engine.
Architecture
The following figure shows the architecture of the Lindorm streaming engine:
Scenarios
The Lindorm streaming engine stores streaming data, such as application logs and real-time IoT data, in the Lindorm wide table engine (also known as LindormTable) or the time series engine (also known as LindormTSDB). Compared with a traditional solution that combines Message Queue for Apache Kafka, Realtime Compute for Apache Flink, and databases, Lindorm integrates storage, computing, and query capabilities in one service, which simplifies O&M and reduces development costs. A typical use of the Lindorm streaming engine is to perform Extract, Load, and Transform (ELT) operations in the following steps:
- Write streaming data from data sources to the Lindorm streaming engine. The supported data formats include CSV, Avro, and JSON.
- Use the SQL statements for the Lindorm streaming engine to perform lightweight computing. For example, you can filter or convert data.
- Synchronize computing results to LindormTable or LindormTSDB.
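The three steps above can be sketched locally in plain Python. This is only an illustration of the data flow, not the Lindorm API: the function names (`parse_records`, `transform`, `sync`), the field names (`device_id`, `temp_c`), and the 30 °C threshold are all hypothetical, and an in-memory list stands in for LindormTable or LindormTSDB. Avro is left out because decoding it needs a third-party library.

```python
import csv
import io
import json


def parse_records(payload, fmt):
    """Step 1: decode a batch of raw messages into dict records."""
    if fmt == "json":
        # One JSON object per line.
        return [json.loads(line) for line in payload.splitlines() if line.strip()]
    if fmt == "csv":
        # First line is the header row.
        return list(csv.DictReader(io.StringIO(payload)))
    raise ValueError(f"unsupported format: {fmt}")


def transform(records):
    """Step 2: lightweight computing, here a filter followed by a conversion."""
    for record in records:
        temp_c = float(record["temp_c"])
        if temp_c < 30.0:  # filter: keep only readings of 30 degrees C or more
            continue
        # convert: Celsius to Fahrenheit
        yield {
            "device_id": record["device_id"],
            "temp_f": round(temp_c * 9 / 5 + 32, 1),
        }


def sync(rows, sink):
    """Step 3: stand-in for synchronizing results to a table (a list here)."""
    sink.extend(rows)
    return len(sink)


if __name__ == "__main__":
    table = []  # hypothetical stand-in for a LindormTable or LindormTSDB table
    batch = '{"device_id": "d1", "temp_c": 25.0}\n{"device_id": "d2", "temp_c": 35.0}\n'
    sync(transform(parse_records(batch, "json")), table)
    print(table)
```

In the actual engine, step 2 is expressed in SQL rather than in application code; the sketch only shows what "filter or convert" means for a single batch of records.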
Features
The Lindorm streaming engine provides the following features:
- SQL statements for the Lindorm streaming engine

  | Feature | Description |
  | --- | --- |
  | SQL client | Supports the Java Database Connectivity (JDBC) protocol and can integrate with SQL tools. |
  | SQL syntax | Supports basic DDL and DML operations. |
  | Support for various functions | Supports common SQL functions, Lindorm built-in functions, and user-defined functions. |
  | Visualization | Allows you to view the trace information about streaming data processing in the console. |
  | Support for window functions | Supports window functions for streaming data computing. |

- Schema management

  | Feature | Description |
  | --- | --- |
  | Schema mapping | Maps streaming data from data sources to table formats. Streaming data in the CSV, Avro, or JSON format can be written from data sources to the Lindorm streaming engine. |
  | Dirty data processing | During the schema mapping process, the data that does not conform to the schema is dirty data. Dirty data can be caused by a mismatched data type or an empty value in a primary key column. The Lindorm streaming engine provides specific built-in mechanisms for processing dirty data. For example, this engine blocks or ignores dirty data. This engine also supports dead-letter queues. |

- Data writing and data storage

  | Feature | Description |
  | --- | --- |
  | Full compatibility with the Apache Kafka protocol | Allows you to use an open source Apache Kafka client to write data to the Lindorm streaming engine. |
  | Compute-storage separation and high-capacity storage | Allows you to independently scale out storage resources to store petabytes of data. |
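
To make the dirty data handling described under schema management more concrete, the following is a minimal local sketch in Python. It is not the engine's implementation or configuration syntax. The schema layout, the policy names (`block`, `ignore`, `dead_letter`), and the helper functions are hypothetical and only mirror the behaviors named above: a record is dirty when a value does not match the column type or when a primary key column is empty.

```python
# Hypothetical schema: column name -> (Python type, is_primary_key)
SCHEMA = {
    "device_id": (str, True),
    "temp_c": (float, False),
}

POLICIES = {"block", "ignore", "dead_letter"}  # hypothetical policy names


class DirtyDataError(Exception):
    """Raised when a record does not conform to the schema."""


def map_record(record, schema):
    """Map one raw record to a typed row, or raise DirtyDataError."""
    row = {}
    for column, (col_type, is_pk) in schema.items():
        value = record.get(column)
        if value is None or value == "":
            if is_pk:
                raise DirtyDataError(f"empty primary key column: {column}")
            row[column] = None  # empty non-key columns are allowed in this sketch
            continue
        try:
            row[column] = col_type(value)
        except (TypeError, ValueError) as exc:
            raise DirtyDataError(
                f"type mismatch in column {column}: {value!r}"
            ) from exc
    return row


def ingest(records, schema, policy="ignore"):
    """Apply schema mapping to a batch and handle dirty data per policy."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    rows, dead_letters = [], []
    for record in records:
        try:
            rows.append(map_record(record, schema))
        except DirtyDataError as exc:
            if policy == "block":
                raise  # stop processing at the first dirty record
            if policy == "dead_letter":
                # keep the original record and the reason for later inspection
                dead_letters.append({"record": record, "reason": str(exc)})
            # policy == "ignore": drop the record and continue
    return rows, dead_letters
```

For example, calling `ingest(batch, SCHEMA, "dead_letter")` on a batch that contains a record with an empty `device_id` returns the clean rows plus a list holding that record and the reason it was rejected, while `"block"` raises on the same batch.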