Apache Flink FLIP-15: Smart Stream Iterations & Optimization

Learn Apache Flink FLIP-15 smart iterations with StreamScope and intelligent termination. Master backpressure optimization, deadlock prevention, and advanced loop processing for real-time analytics.

This article is part of the Technical Insights Series by Perry Ma, Product Lead, Real-time Compute for Apache Flink at Alibaba Cloud.

Introduction

Imagine a crowd control system in a massive shopping mall. Some customers pass through the same area several times (the food court, for example), and foot traffic flows between areas. The current system has a problem: it either sets a fixed waiting time for each area or has no way of knowing when an area can close. That approach is not flexible enough. FLIP-15 aims to solve this problem by letting the system handle such circular flows more intelligently.

Current Problems

The existing Flink stream processing iteration model has three main problems:


Problem	Manifestation	Impact
Unstructured Loops	Feedback edges can be added arbitrarily, with no scope restrictions	Correctness is hard to ensure; maintenance is difficult
Unreliable Termination	Depends on a fixed timeout	Jobs may end too early or wait too long
Poor Backpressure Handling	Circular data flows are prone to deadlock	Reduces system stability
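To make the termination problem concrete, here is a minimal plain-Java sketch (not Flink code) of a loop head that gives up once no feedback record has arrived within a fixed timeout. The class name TimeoutLoopHead and the arrival-time model are assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class TimeoutLoopHead {
    /**
     * Returns how many feedback records are consumed before the loop head stops.
     * arrivalTimesMs must be sorted ascending. The head stops as soon as the gap
     * since the previous record (or since time 0) exceeds timeoutMs.
     */
    public static int consumedBeforeTimeout(List<Long> arrivalTimesMs, long timeoutMs) {
        long lastSeen = 0L;
        int consumed = 0;
        for (long arrival : arrivalTimesMs) {
            if (arrival - lastSeen > timeoutMs) {
                break; // the timeout fired first, so later records are never processed
            }
            consumed++;
            lastSeen = arrival;
        }
        return consumed;
    }

    public static void main(String[] args) {
        // One slow iteration produces a 700 ms gap between the 2nd and 3rd record.
        List<Long> arrivals = Arrays.asList(100L, 200L, 900L, 1000L);
        System.out.println(consumedBeforeTimeout(arrivals, 500L));  // 2: ended too early
        System.out.println(consumedBeforeTimeout(arrivals, 5000L)); // 4: all records seen
    }
}
```

With the short timeout, two records are lost. With the long one, nothing is lost, but the job then sits idle for the full 5000 ms after the last record before it can finish. No single fixed value avoids both outcomes.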

New Solution: StreamScope and Smart Termination

The new solution is like designing an intelligent management system for the shopping mall, with three main improvements:


1. Introducing StreamScope

Each loop has its own "territory," like a functional area in a shopping mall, and operators belong to the scope of the loop that contains them.
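One way to picture a scope is as a node in a tree of nested loops. The sketch below is plain Java, not the FLIP's implementation: the names ScopeSketch, Scope, isWithin, and canCombine are hypothetical and only show how nesting and a same-scope check could be modeled.

```java
import java.util.Objects;

final class ScopeSketch {
    /** Hypothetical stand-in for a loop scope: a node in a tree of nested loops. */
    public static final class Scope {
        public final String name;
        public final Scope parent; // null for the top level, outside any loop

        public Scope(String name, Scope parent) {
            this.name = Objects.requireNonNull(name);
            this.parent = parent;
        }

        /** Nesting depth: 0 at the top level, 1 inside a loop, 2 inside a nested loop. */
        public int level() {
            return parent == null ? 0 : parent.level() + 1;
        }

        /** True if this scope is the other scope or is nested somewhere inside it. */
        public boolean isWithin(Scope other) {
            for (Scope s = this; s != null; s = s.parent) {
                if (s == other) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Models the rule that binary operations only combine streams of one scope. */
    public static boolean canCombine(Scope left, Scope right) {
        return left == right;
    }

    public static void main(String[] args) {
        Scope top = new Scope("top", null);
        Scope foodCourt = new Scope("food-court-loop", top);
        Scope checkout = new Scope("checkout-loop", top);
        Scope trayReturn = new Scope("tray-return-loop", foodCourt);

        System.out.println(trayReturn.level());               // 2
        System.out.println(trayReturn.isWithin(foodCourt));   // true
        System.out.println(canCombine(foodCourt, checkout));  // false
    }
}
```

Because every stream carries its scope, a check like canCombine can reject a union or connect across loops when the job graph is built rather than at runtime.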

2. New API Design

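The new API is designed to be more intuitive and safer. Here is an example that analyzes mall customer flow using the API as proposed in the FLIP; Customer and the function classes (AnalyzeCustomerFlow, NeedsRecheck, PrepareForNextIteration, IsComplete) are illustrative user-defined types: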

// Define loop logic
DataStream<Customer> result = customerFlow.iterate(new LoopFunction<Customer, Customer>() {
    @Override
    public Tuple2<DataStream<Customer>, DataStream<Customer>> loop(DataStream<Customer> input) {
        // Analyze customer flow
        DataStream<Customer> analysis = input.map(new AnalyzeCustomerFlow());
        
        // Feed part of customer flow back to loop start
        DataStream<Customer> feedback = analysis
            .filter(new NeedsRecheck())
            .map(new PrepareForNextIteration());
            
        // Return streams for continued processing and final results
        return new Tuple2<>(feedback, analysis.filter(new IsComplete()));
    }
});
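The contract of the loop function, one stream fed back and one stream emitted, can be emulated on an in-memory queue. The following is a plain-Java sketch with no Flink dependency; LoopContractSketch, Visit, and runLoop are hypothetical names, and a single-threaded queue stands in for what would be a distributed feedback edge.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

final class LoopContractSketch {
    /** Illustrative record: a customer and the number of passes still required. */
    public static final class Visit {
        public final String id;
        public final int remaining;

        public Visit(String id, int remaining) {
            this.id = id;
            this.remaining = remaining;
        }
    }

    /**
     * Applies the loop body to each pending record. Records matching needsRecheck
     * re-enter the loop (feedback); records matching isComplete leave it (output).
     */
    public static <T> List<T> runLoop(List<T> input, UnaryOperator<T> body,
                                      Predicate<T> needsRecheck, Predicate<T> isComplete) {
        Deque<T> pending = new ArrayDeque<>(input);
        List<T> results = new ArrayList<>();
        while (!pending.isEmpty()) {
            T analyzed = body.apply(pending.poll());
            if (needsRecheck.test(analyzed)) {
                pending.add(analyzed); // feedback edge
            }
            if (isComplete.test(analyzed)) {
                results.add(analyzed); // output edge
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<Visit> done = runLoop(
            Arrays.asList(new Visit("A", 3), new Visit("B", 1), new Visit("C", 2)),
            v -> new Visit(v.id, v.remaining - 1), // one pass through the loop body
            v -> v.remaining > 0,                  // still needs another pass
            v -> v.remaining == 0);                // finished
        StringBuilder order = new StringBuilder();
        for (Visit v : done) {
            order.append(v.id);
        }
        System.out.println(order); // BCA
    }
}
```

Customers leave in the order in which they finish (B, then C, then A), not in arrival order. The sketch also shows why termination matters: if needsRecheck never becomes false for some record, the while loop never ends.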

3. Smart Termination Mechanism

Instead of relying on fixed timeouts, the job uses distributed coordination to determine when to end processing.
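The decision rule behind such coordination can be reduced to a simple idea: a loop may end once its regular input has finished and no record is still circulating. The sketch below is a deliberately simplified, single-process illustration of that rule, not the FLIP's actual protocol; LoopTerminationSketch and its method names are hypothetical.

```java
final class LoopTerminationSketch {
    private boolean inputFinished = false;
    private long inFlight = 0;

    /** A record entered the loop from the regular (non-feedback) input. */
    public void onRecordEntered() {
        inFlight++;
    }

    /** A record left the loop for good: emitted as a result or dropped, not fed back. */
    public void onRecordLeft() {
        if (inFlight == 0) {
            throw new IllegalStateException("no record in flight");
        }
        inFlight--;
    }

    /** The upstream input signalled that it will produce no more records. */
    public void onInputFinished() {
        inputFinished = true;
    }

    /** Terminate only when nothing new can arrive and nothing is still circulating. */
    public boolean canTerminate() {
        return inputFinished && inFlight == 0;
    }

    public static void main(String[] args) {
        LoopTerminationSketch loop = new LoopTerminationSketch();
        loop.onRecordEntered();
        loop.onRecordEntered();
        loop.onInputFinished();
        System.out.println(loop.canTerminate()); // false: two records still circulate
        loop.onRecordLeft();
        loop.onRecordLeft();
        System.out.println(loop.canTerminate()); // true
    }
}
```

In a real distributed job the counters live on different tasks, so establishing that both conditions hold at the same moment is the hard part and is what requires coordination between tasks.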

Backpressure Handling Optimization

For loop scenarios, two strategies are provided for handling backpressure:

Strategy	Advantages	Disadvantages	Suitable Scenarios
Feedback Priority	Strong predictability, avoids disk writes	May reduce throughput	Low latency requirements
Dynamic Priority	High overall throughput	Single iteration latency may increase	High throughput requirements
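The difference between the two strategies comes down to which queue the loop head reads next. Here is a minimal plain-Java sketch under stated assumptions: LoopHeadSketch is a hypothetical class, and the "read from the fuller queue" rule used for dynamic priority is only a placeholder heuristic, since the selection rule is not spelled out above.

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class LoopHeadSketch {
    public enum Strategy { FEEDBACK_PRIORITY, DYNAMIC_PRIORITY }

    private final Queue<String> input = new ArrayDeque<>();
    private final Queue<String> feedback = new ArrayDeque<>();
    private final Strategy strategy;

    public LoopHeadSketch(Strategy strategy) {
        this.strategy = strategy;
    }

    public void offerInput(String record) {
        input.add(record);
    }

    public void offerFeedback(String record) {
        feedback.add(record);
    }

    /** Returns the next record to process, or null if both queues are empty. */
    public String next() {
        if (feedback.isEmpty()) {
            return input.poll();
        }
        if (input.isEmpty()) {
            return feedback.poll();
        }
        if (strategy == Strategy.FEEDBACK_PRIORITY) {
            // Always drain the loop first, so feedback never piles up.
            return feedback.poll();
        }
        // Assumed heuristic for illustration only: read from the fuller queue.
        return input.size() > feedback.size() ? input.poll() : feedback.poll();
    }

    public static void main(String[] args) {
        LoopHeadSketch head = new LoopHeadSketch(Strategy.FEEDBACK_PRIORITY);
        head.offerInput("in-1");
        head.offerInput("in-2");
        head.offerFeedback("fb-1");
        System.out.println(head.next()); // fb-1
        System.out.println(head.next()); // in-1
    }
}
```

Under feedback priority the feedback queue stays short, which is why that strategy can avoid writing feedback records to disk, at the cost of making new input wait.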

Current Status and Evolution

This FLIP has an interesting development history. It set out to solve iteration problems in Flink's streaming (DataStream) API, and the overall iteration architecture has since evolved along with Flink:

Early Stage: The FLIP proposed two main prototype branches:
- Loops + StreamScope API implementation
- Job termination mechanism improvements

Architecture Transition: Since Flink 1.12, the DataSet API has been soft-deprecated, with official recommendations for:
- Using Flink ML Iterations for machine learning iterations
- Using Table API and SQL for batch processing
- Using DataStream API's BATCH execution mode

Current Recommendations: For scenarios requiring iteration functionality:
- New projects should use Flink ML Iterations or Table API directly
- Existing projects using old iteration APIs should migrate following new best practices

Parallel Processing of Iterations

In distributed environments, iteration computation introduces an important concept: Superstep Synchronization.


This synchronization mechanism ensures:

- Each parallel task is on the same logical step
- Termination conditions are evaluated after all tasks complete the current superstep
- Data consistency is maintained
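These three guarantees can be demonstrated on a single machine with a barrier from the Java standard library. The sketch below uses threads and java.util.concurrent.CyclicBarrier as a stand-in for distributed tasks; SuperstepSketch and its halving workload are assumptions for illustration, not how Flink implements supersteps.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

final class SuperstepSketch {
    /**
     * Starts one task per element. In every superstep each task halves its own value,
     * then waits at the barrier. Returns the number of supersteps until all values are 0.
     */
    public static int run(int[] initial) {
        AtomicIntegerArray values = new AtomicIntegerArray(initial);
        AtomicInteger supersteps = new AtomicInteger();
        AtomicBoolean done = new AtomicBoolean(false);

        // The barrier action runs once per superstep, after every task has arrived,
        // so the termination condition is evaluated on a consistent snapshot.
        CyclicBarrier barrier = new CyclicBarrier(initial.length, () -> {
            supersteps.incrementAndGet();
            boolean allZero = true;
            for (int i = 0; i < values.length(); i++) {
                allZero &= values.get(i) == 0;
            }
            done.set(allZero);
        });

        Thread[] tasks = new Thread[initial.length];
        for (int t = 0; t < tasks.length; t++) {
            final int id = t;
            tasks[t] = new Thread(() -> {
                try {
                    while (!done.get()) {
                        values.set(id, values.get(id) / 2); // local work for this superstep
                        barrier.await();                     // wait for all peers
                    }
                } catch (Exception e) {
                    Thread.currentThread().interrupt();
                }
            });
            tasks[t].start();
        }
        try {
            for (Thread task : tasks) {
                task.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return supersteps.get();
    }

    public static void main(String[] args) {
        System.out.println(run(new int[] {5, 1, 8})); // 4
    }
}
```

The task that starts at 1 reaches zero after the first superstep but keeps participating until the task that starts at 8 finishes in the fourth. That waiting is the price of keeping every task on the same logical step.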

Impact on Existing Programs

If this proposal is adopted, existing programs are affected as follows:

- Iteration timeout settings must be removed from code
- Loop logic must be expressed with the new LoopFunction or CoLoopFunction
- Binary operations (such as union and connect) can only be used within the same loop context
- Operators from different scopes cannot be chained

Summary

FLIP-15 aims to make Flink more reliable and efficient at handling stream loops by introducing loop scopes and a smart termination mechanism. It is like upgrading a shopping mall's management system so that it monitors each area accurately while controlling customer flow flexibly. The proposal removes the dependence on fixed timeouts, provides a more structured programming model, and reduces deadlock risk. Although still under development, it represents an important step forward for Flink's stream processing capabilities.
