Job development map

Understand upstream and downstream systems

Upstream (Source): The source system from which data is read.
- Examples include Kafka, MySQL CDC, Hologres, and Simple Log Service (SLS).
Downstream (Sink): The destination system to which processed results are written.
- Examples include databases (MySQL, PostgreSQL), data warehouses (ClickHouse, Doris, StarRocks), message queues, and data lakes (Paimon, OSS).
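
The source-to-sink flow described above can be sketched in plain Python. This is only an analogy built on generators, not the Flink API: the function names `source`, `transform`, `sink`, and `run_pipeline`, and the order record fields, are hypothetical and chosen for illustration.

```python
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, object]


def source(records: Iterable[Record]) -> Iterator[Record]:
    """Upstream: emit records one at a time, the way a source connector would."""
    for record in records:
        yield record


def transform(stream: Iterator[Record]) -> Iterator[Record]:
    """Processing: keep paid orders and convert the amount from cents."""
    for record in stream:
        if record["status"] == "paid":
            yield {
                "order_id": record["order_id"],
                "amount": record["amount_cents"] / 100,
            }


def sink(stream: Iterator[Record], target: List[Record]) -> int:
    """Downstream: write each processed record to the destination (a list here)."""
    written = 0
    for record in stream:
        target.append(record)
        written += 1
    return written


def run_pipeline(events: Iterable[Record]) -> List[Record]:
    """Wire source -> transform -> sink and return what the sink received."""
    results: List[Record] = []
    sink(transform(source(events)), results)
    return results
```

In a real job, the source and sink would be connectors to systems such as those listed above, and only the processing step would be written by the job developer.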

Realtime Compute for Apache Flink provides over 30 upstream and downstream connectors for databases, message queues, and data lakes, so you can build data pipelines without writing custom integration code. For more information, see Supported connectors.

Choose a job type according to your use case

Job type	Use cases
Flink SQL	Real-time extract, transform, and load (ETL), real-time metric computation, multi-stream joins, and streaming warehousing and lakehousing.
Data ingestion with Flink CDC	Real-time database synchronization, data migration, and automatic table synchronization.
DataStream API	Complex event processing (CEP), high-frequency external calls, complex window logic, and custom sources or sinks.

Job development

Flink SQL

ETL, data aggregations, and lookup joins.
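
To make "lookup join" and "aggregation" concrete, here is a minimal plain-Python sketch of the two operations. It is not Flink SQL and it ignores streaming concerns such as watermarks and late data. The names `lookup_join` and `tumbling_window_totals`, and the record fields, are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Optional, Tuple


def lookup_join(orders: Iterable[dict], products: Dict[str, dict]) -> Iterator[dict]:
    """Enrich each order with a category looked up in a dimension table.

    Orders without a matching product keep a category of None (left join).
    """
    for order in orders:
        dim = products.get(order["product_id"])
        yield {**order, "category": dim["category"] if dim else None}


def tumbling_window_totals(
    rows: Iterable[dict], window_seconds: int
) -> Dict[Tuple[int, Optional[str]], float]:
    """Sum amounts per (window start, category) over fixed, non-overlapping windows."""
    totals: Dict[Tuple[int, Optional[str]], float] = defaultdict(float)
    for row in rows:
        # Align the event timestamp to the start of its window.
        window_start = row["ts"] - row["ts"] % window_seconds
        totals[(window_start, row["category"])] += row["amount"]
    return dict(totals)
```

In Flink SQL, the same logic would be expressed declaratively as a join against a dimension table followed by a windowed GROUP BY.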

Data ingestion with Flink CDC

Real-time database synchronization and batch table ingestion.

DataStream API

CEP, custom states, and complex job logic.
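
As a rough illustration of CEP with custom per-key state, the following plain-Python sketch raises an alert after a run of consecutive failed logins for the same user. It is an analogy only and does not use the DataStream API or the Flink CEP library. The class `ConsecutiveFailureDetector`, the event fields, and the default threshold of 3 are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Optional


class ConsecutiveFailureDetector:
    """Emit an alert when one user fails to log in `threshold` times in a row."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        # Keyed state: current failure streak per user.
        self._streaks: Dict[str, int] = defaultdict(int)

    def process(self, event: dict) -> Optional[dict]:
        """Handle one event and return an alert dict, or None if no pattern matched."""
        user = event["user"]
        if event["type"] != "login_failed":
            # Any other event breaks the streak for this user.
            self._streaks.pop(user, None)
            return None
        self._streaks[user] += 1
        if self._streaks[user] < self.threshold:
            return None
        # Pattern matched: clear the state so the next streak starts from zero.
        self._streaks.pop(user, None)
        return {"user": user, "alert": "consecutive_failures", "count": self.threshold}
```

Each user's streak is tracked independently, which mirrors how keyed state is scoped to a key in a stream processing job.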

Typical scenarios

Query and test

Advanced usage

Ecosystem integration

O&M and optimization

Troubleshooting