Flink Core Technology

Latest Progress of Flink Fault Tolerance 2.0

Mei Yuan | Head of the Alibaba Cloud Flink Storage Engine Team, Apache Flink Engine Architect, Apache Flink PMC Member & Committer

Flink is the industry-recognized standard for real-time computing in big data. However, we have not slowed our efforts to push real-time computing further. This talk reports the progress and achievements of Flink fault tolerance over the past year, as well as future plans.

The first part covers the improvements we have made to checkpointing over the past year. As generic log-based incremental checkpoints mature further, faster and more stable checkpoints are gradually becoming a reality.

In this talk, I will share experimental results on combining the changelog, unaligned checkpoints, and buffer debloating. I will also discuss improvements to rescaling in the cloud-native context, as well as the community's efforts to unify the concept and operations of snapshots.

Flink Shuffle 3.0: Vision, Roadmap and Progress

Song Xintong | Alibaba Cloud Senior Technical Expert, Apache Flink PMC Member & Committer

Since the birth of Flink, the evolution of its Shuffle architecture can be divided into two stages:

• The initial 1.0 stage supported only the most basic data transmission; in the 2.0 stage, a series of improvements were made around performance, stability, and the unified stream-batch architecture.

• Now, as Flink's application scenarios and deployment forms grow increasingly rich, the Shuffle architecture faces many new challenges and opportunities, and is about to enter the 3.0 stage.

In this talk, I will present our blueprint for Flink Shuffle 3.0 around the keywords cloud native, stream-batch unification, and adaptiveness, and introduce the planning and progress of the related work. Specific contents include: stream-batch adaptive Hybrid Shuffle, cloud-native multi-tier-storage adaptive Shuffle, and the overall architecture upgrade and consolidation of Shuffle functionality.

Adaptive and flexible execution management for batch jobs

Zhu Zhu | Alibaba Cloud Senior Technical Expert, Apache Flink PMC Member & Committer

To better support the execution of batch jobs and improve their ease of use, execution efficiency, and stability, we have improved Flink's execution control mechanisms:

1: Support for dynamic execution plans, enabling Flink to adjust the execution plan based on runtime information. On this basis, we gave Flink the ability to automatically set parallelism and automatically balance the amount of data each task processes.

2: Support for more flexible (finer-grained) execution control, enabling Flink to run multiple execution instances of the same task at the same time.

On this basis, we introduced speculative execution to Flink. These improved mechanisms also open up more possibilities for further optimizing Flink job execution in the future.
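The idea of setting parallelism from runtime information can be sketched as follows. This is an illustrative heuristic with hypothetical names, not Flink's actual implementation; the real adaptive batch scheduler derives each stage's parallelism from the sizes of the partitions produced by its upstream stages.

```python
import math

def decide_parallelism(input_bytes, bytes_per_task=128 * 1024 * 1024,
                       min_parallelism=1, max_parallelism=128):
    """Pick a parallelism so each task gets roughly `bytes_per_task` of data.

    Illustrative only -- in Flink, the runtime input size is only known
    after the upstream stage finishes, which is why the plan is dynamic.
    """
    wanted = math.ceil(input_bytes / bytes_per_task)
    return max(min_parallelism, min(max_parallelism, wanted))

# A small stage gets few tasks, a large one gets many (capped at the max).
print(decide_parallelism(50 * 1024 * 1024))    # 50 MiB input  -> 1
print(decide_parallelism(10 * 1024 ** 3))      # 10 GiB input  -> 80
print(decide_parallelism(100 * 1024 ** 3))     # huge input    -> capped at 128
```

Because the decision is made per stage at runtime, a job no longer needs one manually tuned parallelism that fits every stage.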

Flink OLAP Improvements in Resource Management and Runtime

Cao Dizhou | ByteDance infrastructure engineer

Job QPS and resource isolation are the biggest challenges facing Flink OLAP computing, and the biggest pain points for ByteDance's internal businesses that use Flink for OLAP. Building on last year's job scheduling and execution optimizations, the ByteDance Flink team has deeply optimized the Flink engine's architecture and implementation, raising the engine's QPS from 10 to over 100 for complex jobs, and from 30 to over 1000 for simple jobs on small data. This talk covers the analysis of Flink OLAP difficulties and bottlenecks, job scheduling, runtime execution, benefits, and future plans.

1: Difficulties and bottlenecks Flink encounters in OLAP scenarios

2: Job scheduling

• Transformed the registration, request, and release of job resources from slot granularity to TaskManager granularity

• Optimized the interaction between the JobManager and TaskManager modules during job initialization, supporting cross-job TaskManager connection reuse

• Optimized the deployment structure and serialization of computing tasks

3: Runtime execution

• Switched job submission and result fetching from a pull model to a push model

• Addressed the TaskManager CPU bottleneck and optimized high-concurrency initialization of computing tasks

• Optimized job execution memory in the JobManager and TaskManager
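The pull-to-push change for result fetching can be illustrated with a toy sketch (all names here are hypothetical, not ByteDance's code): instead of the client paying a polling round trip per result batch, the executor invokes a client callback as each batch is produced.

```python
# Pull model: the client drives; every batch costs one request/response
# round trip to ask "is the next batch ready?".
def fetch_results_pull(server_batches):
    results = []
    it = iter(server_batches)
    while True:
        batch = next(it, None)  # stands in for one client -> server poll
        if batch is None:       # server says: no more results
            break
        results.append(batch)
    return results

# Push model: the server drives; it calls the client's callback as soon as
# each batch is produced, with no per-batch polling.
def run_job_push(server_batches, on_batch):
    for batch in server_batches:
        on_batch(batch)

print(fetch_results_pull([[1, 2], [3]]))  # [[1, 2], [3]]
received = []
run_job_push([[1, 2], [3]], received.append)
print(received)  # [[1, 2], [3]]
```

For high-QPS OLAP workloads of short queries, removing the per-batch polling round trips is what makes the push model attractive.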


Benefits and future planning

• Improved scheduling performance

• Onboarding of core businesses

• Serverless capability building

• Further performance improvements

Latest Progress of PyFlink and Typical Application Scenarios

Fu Dian | Alibaba Cloud Senior Technical Expert, Apache Flink PMC Member & Committer

PyFlink was introduced in Flink 1.9. It lets users develop Flink jobs in Python, which is a great convenience for developers on the Python stack and lowers the barrier to developing Flink jobs. It also makes it easy to use the rich ecosystem of third-party Python libraries inside Flink jobs, greatly expanding Flink's application scenarios. After iteration from Flink 1.9 to Flink 1.16, PyFlink has become increasingly mature and now covers most of the functionality of Flink's Java & Scala APIs.

In this talk, we will first briefly introduce the current status of the PyFlink project and the new features in the newly released Flink 1.16; then we will use concrete examples to walk through PyFlink's typical application scenarios, giving users an intuitive understanding of its typical uses.

Apache Flink 1.16 Function Interpretation

Huang Xingbo | Alibaba Cloud Senior Development Engineer, Apache Flink Committer, Flink 1.16 Release Manager

Apache Flink has always been one of the most active and fastest-developing projects in the Apache community. Every past release has brought many new features, and the newly released Flink 1.16 is no exception.

In this talk, we will first give an overview of Flink 1.16; then we will explain its improvements along the unified stream-batch direction from three aspects:

1: More stable, easier-to-use, and higher-performance Flink batch processing;

2: Continued leadership of Flink stream processing;

3: A flourishing Flink ecosystem.

Apache Flink+Volcano: Practice of efficient scheduling capability in big data scenarios

Jiang Yikun | Volcano Reviewer, openEuler Infra Maintainer

Wang Leibo | Cloud Container Service Architect and Volcano Community Leader

Flink on Kubernetes has gained increasing attention and adoption. Because Kubernetes lacks support for batch (gang) scheduling, resource deadlocks often occur when scheduling big data workloads. Kubernetes also lacks advanced capabilities such as queues, priorities, resource reservation, and scheduling for diverse compute hardware.

This talk will introduce the latest progress and best practices of the Apache Flink community's FLIP-250 (Support Customized Kubernetes Schedulers) proposal.
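The resource-deadlock problem that motivates gang scheduling can be shown with a toy simulation (illustrative only; Volcano implements all-or-nothing admission via PodGroups with a `minMember` threshold): two jobs that each need 3 slots on a 4-slot cluster deadlock under per-pod scheduling, while all-or-nothing admission lets one job run.

```python
def schedule_per_pod(capacity, jobs):
    """Greedy per-pod scheduling: interleave pod placements across jobs."""
    allocated = {name: 0 for name in jobs}
    free = capacity
    progress = True
    while free > 0 and progress:
        progress = False
        for name, need in jobs.items():
            if allocated[name] < need and free > 0:
                allocated[name] += 1
                free -= 1
                progress = True
    # A job can only run when *all* of its pods are placed.
    running = [n for n, need in jobs.items() if allocated[n] == need]
    return allocated, running

def schedule_gang(capacity, jobs):
    """All-or-nothing admission: place a job only if it fits entirely."""
    free = capacity
    running = []
    for name, need in jobs.items():
        if need <= free:
            free -= need
            running.append(name)
    return running

jobs = {"job-a": 3, "job-b": 3}
alloc, running = schedule_per_pod(4, jobs)
print(alloc, running)          # each job holds 2 of 3 pods; neither runs: deadlock
print(schedule_gang(4, jobs))  # ['job-a'] -- job-b simply waits its turn
```

Under per-pod scheduling both jobs hold partial resources forever; gang scheduling trades a short queueing delay for guaranteed progress.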

Query Optimization and Production Practice of Flink OLAP at ByteDance

He Runkang | ByteDance infrastructure engineer

This talk is divided into five parts: Flink OLAP's business practice at ByteDance and the problems encountered; OLAP cluster operations and stability governance; SQL Query Optimizer optimization; Query Executor optimization; and benefits and future plans.

1: Application of Flink OLAP in ByteDance

• Overall architecture, and how Flink OLAP is deployed at ByteDance

• Requirements and problems around operations, monitoring, stability, and query performance encountered in business use

2: Flink OLAP cluster operation, maintenance and stability

• Improvements to the Flink OLAP monitoring system, compared with stream computing

• Flink OLAP release and stability governance, compared with stream computing

3: Query Optimizer Optimization

• Faster plan construction, including a Plan Cache, Catalog Cache, etc.

• Optimizer improvements and richer optimization rules, including enhanced push-down capabilities, join filter propagation, improved statistics, etc.

4: Query Executor Optimization

• Memory optimization, including Codegen cache and ClassLoader reuse

• RuntimeFilter optimization
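The idea behind a runtime filter can be sketched as follows: at execution time, build a filter from the join keys of the (small) build side and use it to discard probe-side rows before they reach the join. The exact Python set here stands in for the approximate filter (e.g. a bloom filter) a real engine would push down to the probe-side scan; all names are illustrative.

```python
def hash_join_with_runtime_filter(build_rows, probe_rows, build_key, probe_key):
    """Toy hash join that prunes probe rows with the build side's key set."""
    # Build phase: hash table over the small side.
    table = {}
    for row in build_rows:
        table.setdefault(row[build_key], []).append(row)

    # The runtime filter: keys actually observed on the build side.
    runtime_filter = set(table)

    joined, filtered_out = [], 0
    for row in probe_rows:
        if row[probe_key] not in runtime_filter:
            filtered_out += 1      # cheap early discard, no join work done
            continue
        for match in table[row[probe_key]]:
            joined.append({**match, **row})
    return joined, filtered_out

build = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
probe = [{"id": i, "v": i * 10} for i in range(100)]
rows, filtered_out = hash_join_with_runtime_filter(build, probe, "id", "id")
print(len(rows), filtered_out)  # 2 joined rows; 98 probe rows filtered early
```

In a real engine the win comes from pushing this filter below shuffles and scans, so the discarded rows are never deserialized or transferred at all.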

5: Benefits and future planning

• Product improvements, a vectorized engine, materialized views, and optimizer evolution

Yu Hangxiang | Apache Flink Contributor

The ChangelogStateBackend introduced by FLIP-158 provides more stable and faster checkpoints for user jobs, which in turn enables a faster failover process and lower end-to-end latency for jobs with transactional sinks.

This talk will introduce the basic mechanism, application scenarios, and future plans of ChangelogStateBackend against the history of checkpoint performance optimization, and share relevant performance test results.
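The basic mechanism can be sketched as follows (an illustrative model, not Flink's code): every state update is appended to a changelog that is uploaded continuously, so a checkpoint only has to reference the last materialized snapshot plus a short log suffix, and restore replays that suffix on top of the snapshot.

```python
import copy

class ChangelogState:
    """Toy changelog state backend: full state + an append-only update log."""

    def __init__(self):
        self.state = {}
        self.changelog = []          # uploaded continuously in real Flink
        self.materialized = {}       # last background snapshot of the state
        self.materialized_upto = 0   # log position covered by that snapshot

    def put(self, key, value):
        self.state[key] = value
        self.changelog.append((key, value))

    def materialize(self):
        """Slow full snapshot, done in the background off the critical path."""
        self.materialized = copy.deepcopy(self.state)
        self.materialized_upto = len(self.changelog)

    def checkpoint(self):
        """Fast: just a snapshot reference plus the recent log suffix."""
        return self.materialized, self.changelog[self.materialized_upto:]

    @staticmethod
    def restore(snapshot, log_suffix):
        state = copy.deepcopy(snapshot)
        for key, value in log_suffix:  # replay recent changes onto the snapshot
            state[key] = value
        return state

s = ChangelogState()
s.put("a", 1)
s.materialize()                  # background snapshot covers {'a': 1}
s.put("a", 2)
s.put("b", 3)
snap, suffix = s.checkpoint()    # only 2 log entries on the critical path
print(ChangelogState.restore(snap, suffix))  # {'a': 2, 'b': 3}
```

Because the checkpoint's critical path only persists the short log suffix, its duration becomes small and predictable regardless of total state size.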

Optimization and Practice of Flink Unaligned Checkpoints at Shopee

Fan Rui | Tech Lead of the Shopee Flink Runtime Team, Apache StreamPark Committer & Flink/Hadoop/HBase/RocksDB Contributor

In production, Shopee encountered many checkpoint-related problems and tried introducing Unaligned Checkpoints to solve some of them. However, investigation revealed a gap between the actual effect and expectations, so deep improvements were made in Shopee's internal version, most of which have been contributed back to the Flink community.

The talk will cover the following:

• Problems in Checkpoint

• Principle of Unaligned Checkpoint

• Greatly increasing the benefits of Unaligned Checkpoints

• Significantly reducing the risks of Unaligned Checkpoints

• Production practice and future plans of Unaligned Checkpoints at Shopee

Flink state optimization and remote state exploration

Zhang Yang | Bilibili Senior Development Engineer

Optimizations to Flink's RocksDB state backend streamline the compaction process and improve throughput for large-state tasks.

Exploration of remote state for Flink: built on Bilibili's internal KV storage, it separates storage from compute, speeds up task recovery, and better supports Flink's cloud-native deployments.

StateBackend performance improvement with TerarkDB

TerarkDB is an LSM-tree storage engine derived from ByteDance's internal optimization of RocksDB, aimed at improving storage-engine performance and reducing resource overhead. This talk covers five aspects: background, business pain points, Flink & TerarkDB integration, benefits, and future plans.

1: Problems encountered by RocksDBStateBackend in production practice

• Write amplification causes high CPU consumption and disk I/O

• The SST file cleaning strategy is not general enough, leading to continuous space growth

• Resonance between checkpointing and compaction causes sharp CPU spikes

2: Core optimizations of TerarkDB, compared with RocksDB

• KV separation: reduces write amplification and resource usage

• Scheduled TTL GC: speeds up file reclamation and prevents files from never being reclaimed

3: Flink & TerarkDB integration scheme

• Adapted the JNI interfaces between Flink and TerarkDB, supporting Flink's compaction filter

• Support for the coexistence of RocksDB and TerarkDB

• A new checkpoint mechanism that eliminates CPU spikes

4: Business benefits

5: Future planning
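The KV-separation idea (popularized by the WiscKey paper) can be sketched as follows: large values are written once to an append-only value log, and only a small pointer is kept in the LSM tree, so compactions rewrite tiny pointers instead of large values, cutting write amplification. Names and structure here are illustrative, not TerarkDB's implementation.

```python
class KVSeparatedStore:
    """Toy KV store that keeps large values out of the LSM tree."""

    POINTER_SIZE = 8  # assumed on-disk size of a value-log pointer

    def __init__(self, blob_threshold=64):
        self.blob_threshold = blob_threshold
        self.value_log = []  # append-only; each large value written once
        self.lsm = {}        # holds small values inline, or log pointers

    def put(self, key, value):
        if len(value) >= self.blob_threshold:
            self.value_log.append(value)                 # value goes to the log
            self.lsm[key] = ("ptr", len(self.value_log) - 1)
        else:
            self.lsm[key] = ("inline", value)

    def get(self, key):
        kind, payload = self.lsm[key]
        return self.value_log[payload] if kind == "ptr" else payload

    def compaction_bytes(self):
        """Bytes a compaction of the LSM tree would rewrite."""
        return sum(self.POINTER_SIZE if kind == "ptr" else len(payload)
                   for kind, payload in self.lsm.values())

store = KVSeparatedStore()
store.put("small", "x" * 10)     # stays inline in the LSM tree
store.put("big", "y" * 1000)     # goes to the value log; 8-byte pointer in LSM
print(store.get("big") == "y" * 1000)  # True
print(store.compaction_bytes())        # 18: 10 inline bytes + 8-byte pointer
```

Without separation, every compaction would rewrite the full 1000-byte value; with it, only the 8-byte pointer moves, which is where the write-amplification savings come from.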

Progress of Generic Log-Based Incremental Checkpoints at Meituan

Wang Feifan | Computing Engine Engineer of Meituan Data Platform, Apache Flink Contributor

Checkpointing based on the state changelog is an important evolution of the checkpoint mechanism. We believe it can solve some business pain points, so we have followed up on it and co-developed it. This talk covers the following aspects:

Relevant background

1: Meituan application scenarios and verification
2: Restore performance optimization for the State Changelog
3: Exploration of storage model selection for the State Changelog
4: Subsequent plans
