Introduction to Alibaba Cloud Realtime Compute for Apache Flink

This article introduces Alibaba Cloud Realtime Compute for Apache Flink.

By Wang Feng (Mowen)

Alibaba Cloud Realtime Compute for Apache Flink is powered by Ververica and developed by Alibaba Cloud based on Apache Flink. It is an enterprise-level, high-performance system that processes big data in real time. It was officially released by the founding team of Apache Flink under a globally unified commercial brand. The system is fully compatible with the APIs of open-source Apache Flink and provides a wide range of value-added features for enterprises.

Development of Apache Flink

Big data technology has developed rapidly for more than ten years, evolving from large-scale computing toward increasingly real-time computing.

For example, during Alibaba's Double 11 Global Shopping Festival, real-time dashboards display the number of transactions and the trading volume with millisecond-level latency. Every year, Chinese people all around the world watch the CCTV Spring Festival Gala, and real-time dashboards track the national ratings and audience profiles during the broadcast. In core business scenarios of the financial industry, such as banks and stock exchanges, real-time computing is used to monitor transactions and detect fraud or money laundering. In Taobao e-commerce scenarios, the real-time computing system tailors personalized recommendations to user behavior: it builds a customer profile in real time from the products a customer browsed within the previous minute or 30 seconds and recommends relevant items as the customer continues browsing. Real-time computing works in the background throughout the day to drive productivity in many everyday scenarios.

Implementing real-time computing requires powerful big data computing capabilities in the background, and Apache Flink, an open-source big data real-time computing technology, was developed to meet this need. Unlike traditional computing engines, it was designed around stream computing from the start. Engines such as Hadoop and Spark are essentially batch computing engines that process finite datasets, so their processing latency is often too high for real-time use. In contrast, Apache Flink, as a stream computing engine, can subscribe to real-time data, analyze and process it as it arrives, and produce results for use as quickly as possible.
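The difference between the two processing styles can be illustrated with a small sketch. This is plain Python rather than Flink API code, and the event shape, function names, and sample data are all hypothetical. The batch function needs the whole finite dataset before it returns anything, while the streaming function keeps running state and emits an updated result for every event it receives.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical event shape for illustration: (category, amount).
Event = Tuple[str, int]


def batch_totals(events: List[Event]) -> Dict[str, int]:
    """Batch style: the full finite dataset is read before a result is returned."""
    totals: Dict[str, int] = {}
    for category, amount in events:
        totals[category] = totals.get(category, 0) + amount
    return totals


def streaming_totals(events: Iterable[Event]) -> Iterator[Tuple[str, int]]:
    """Stream style: keep running state and emit an updated total per event."""
    totals: Dict[str, int] = {}
    for category, amount in events:
        totals[category] = totals.get(category, 0) + amount
        # A result is available as soon as each event has been processed.
        yield category, totals[category]


sample = [("books", 10), ("toys", 5), ("books", 7)]
batch_result = batch_totals(sample)
stream_updates = list(streaming_totals(iter(sample)))
```

Both styles arrive at the same final totals. The streaming version simply makes intermediate results available while data is still arriving, which is the property real-time dashboards depend on.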

Currently, Apache Flink is evolving from a pure stream computing engine into an engine with unified stream and batch processing. It can analyze and process streams such as log streams, clickstreams, and IoT data streams. It can also run fast batch analysis on finite datasets, such as data stored in databases and file systems.

Apache Flink is a very popular open-source big data technology and has ranked among the world's most active Apache open-source projects for three consecutive years. With consistent computing results and large-scale scalability, it delivers excellent overall performance. It supports multiple programming languages, such as SQL, Java, and Python, and provides a variety of APIs for different scenarios. Apache Flink has become the mainstream real-time big data computing technology for Internet enterprises worldwide and is the de facto standard in the real-time computing field.

Alibaba Cloud Realtime Compute for Apache Flink was tested and refined within Alibaba Group for many years before being offered as a cloud service, and it is now available on the cloud for SMEs in various industries. Alibaba began a large-scale rollout of real-time computing products in 2016, the third year since Apache Flink was donated to the Apache Software Foundation. Apache Flink was first applied in Alibaba's core scenarios, including search, recommendation, and advertising. These scenarios require processing large amounts of real-time data for real-time recommendations, real-time ranking, and real-time advertising. This application brought substantial improvements to the core e-commerce business.


In 2017, a real-time computing platform based on Apache Flink began to serve the entire Alibaba Group. During the 2017 Double 11 Global Shopping Festival, the platform supported real-time data processing for the entire group, including the data in the Double 11 dashboard. In 2018, the platform was officially launched on the cloud to serve Alibaba Group and SMEs on the cloud. This was the first time Realtime Compute for Apache Flink was used to provide external services on a public cloud.

In early 2019, Alibaba acquired Ververica, the company founded by the creators of Apache Flink. Alibaba's real-time computing team for Apache Flink joined forces with the founding team of Apache Flink at its German headquarters, forming a world-leading Apache Flink engineering team. This joint team works on the development of Apache Flink and contributes to its open-source community. Today, more than 200,000 developers participate in the Apache Flink community in China, and Apache Flink has become one of the most active projects of the Apache Software Foundation in the big data field.

Last year, mainstream cloud computing companies and big data companies worldwide launched their respective Flink products. For example, Cloudera, which started with Hadoop, released CDP and CDH, which fully integrate Apache Flink. Big data companies in China also released Flink-based real-time computing products.

Architecture of Alibaba Cloud Realtime Compute for Apache Flink

Compared to open-source Apache Flink, the product architecture of Alibaba Cloud Realtime Compute for Apache Flink has been significantly enhanced. Many developers currently use open-source Apache Flink to build their own real-time computing platforms in self-built data centers or on cloud virtual machines. What sets Alibaba Cloud Realtime Compute for Apache Flink apart?


The bottom layer of the product architecture is the cloud-native infrastructure of Alibaba Cloud. The product is containerized, with all Flink computing tasks running in the Kubernetes ecosystem, and containers provide multi-tenant isolation to ensure security. As a fully managed cloud service, it offers a high SLA and relieves users of O&M work. The service architecture also lets users flexibly adjust the proportion of each resource type to match business volume, without planning machine capacity. In short, Realtime Compute for Apache Flink is cloud-native by design.

In terms of the core computing engine, Alibaba Cloud has optimized many core functions relative to open-source Apache Flink, and these optimizations have been proven in Alibaba's internal services. Realtime Compute for Apache Flink currently supports the real-time data services of nearly 100 business units of Alibaba Group. This extensive production use has refined the product's storage, scheduling, and network transmission.

Dozens of built-in enhanced connectors can connect to mainstream data storage systems, including open-source systems such as MySQL, HBase, and HDFS, as well as Alibaba Cloud SLS. These connectors are integrated into the product and ready to use. Dataphin provides an all-in-one enterprise-level development platform with built-in development and O&M capabilities to save development time and improve user experience.

Alibaba Cloud Realtime Compute for Apache Flink supports development in multiple languages, such as SQL, Java, and Python. It provides full-lifecycle management for development tasks and supports enterprise-level security mechanisms based on OIDC and RBAC. It has an end-to-end monitoring and alerting system based on the Prometheus protocol and a smart tuning system based on AutoPilot, which tunes the parameters of Flink tasks and optimizes resource usage and parallelism. The product can adapt to business traffic without manual adjustment. Intelligent tuning is a core advantage of Realtime Compute for Apache Flink.

Differences between Alibaba Cloud Realtime Compute for Apache Flink and Open-Source Apache Flink

Alibaba Cloud Realtime Compute for Apache Flink has dozens of advantages over open-source Apache Flink across development, O&M, cost, and security.


In terms of development, Realtime Compute for Apache Flink has rich data connection capabilities and an all-in-one multi-language development environment, with multiple built-in function libraries to facilitate code debugging. It also supports multi-tenant development, task adjustment, and test simulation. In terms of O&M, it supports end-to-end monitoring and alerting and can automatically raise alerts about data latency, data exceptions, and service interruptions.

It supports automated intelligent diagnosis and tuning: it can automatically adjust jobs, parameters, and resources based on business traffic, and it can diagnose and resolve problems. Building on the open-source framework, the resource layer implements finer-grained resource allocation, so CPU and memory can be configured for each operator of each job. This significantly improves resource utilization, helps users save costs, improves service stability, and reduces the probability of O&M failures. Combined with the vendor-provided O&M service, a 99.9% SLA, end-to-end fault tolerance, and system stability assurance, this relieves users of operational worries.

Regarding cost, optimization on the cloud reduces users' overall TCO while improving performance. Performance is itself a core advantage of the product.

In the standard NexMark stream computing benchmark, Realtime Compute for Apache Flink delivers three times the performance of open-source Apache Flink. Drawing on the optimizations accumulated by Alibaba Group's R&D team in internal core business scenarios, the product is particularly strong at reducing users' basic costs.

Realtime Compute for Apache Flink supports the cloud-native scaling feature to help users save resources and improve resource utilization. It supports both subscription and pay-as-you-go billing modes to help users with different needs.

In terms of security, containerized tasks are isolated from one another, and tenant isolation, security isolation, and VPC isolation are supported. The product is also directly connected to Alibaba Cloud's account system, so users can apply consistent security controls across products using their Alibaba Cloud accounts. Realtime Compute for Apache Flink also supports role-based access control and OIDC-based authentication, which substantially improves business security.

Overall, compared to open-source Apache Flink, Realtime Compute for Apache Flink has better functionality and stability. In addition to its O&M advantages, it works out of the box, which makes it more convenient for users.



As a stream computing engine for real-time computing, Realtime Compute for Apache Flink can process a variety of real-time data, including online ECS service logs and sensor data in IoT scenarios. It can also subscribe to binlog updates in relational databases, such as RDS and PolarDB. Real-time data from DataHub, Log Service (SLS), and Message Queue for Apache Kafka can also be subscribed to and stored for real-time analysis and processing. The analysis results are written to different data services, such as MaxCompute, Hologres, PAI, and Elasticsearch. Users can select the most suitable data service based on business needs to improve data utilization.

The main use of Realtime Compute for Apache Flink is to subscribe to, process, and analyze data from different real-time data sources and write the results to other online storage systems for direct use in production. The system stands out for its high speed, data accuracy, cloud-native architecture, and intelligence, which makes it a very competitive enterprise-level product. It runs on IaaS systems, such as ECS in Alibaba Cloud, and is interconnected with other Alibaba Cloud systems, giving users more options in more scenarios.

Product Application Scenarios

This section describes four application scenarios of Realtime Compute for Apache Flink. Users can build a real-time computing solution for their business as needed.


1. Real-Time Data Warehouse

Real-time data warehouses are mainly used for transactional data scenarios, such as website PV/UV statistics, product sales statistics, and transaction statistics. By subscribing to real-time business data sources, the system analyzes information within seconds and presents the results on dashboards, where decision-makers use them to assess business operations and promotions. Decisions are thus driven by real-time operational data. In fast-changing businesses, decisions often depend on data from the last minute or even the last second, so data freshness is especially important. Realtime Compute for Apache Flink is well suited to these scenarios.
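As an illustration of the PV/UV statistics mentioned above, the following sketch groups page views into fixed one-minute windows and counts page views (PV) and distinct users (UV) per window. It is plain Python under stated assumptions, not Flink code: the event shape, the function name `pv_uv_per_window`, and the sample data are hypothetical, and the sketch works on a finite list, whereas a stream engine would emit each window's result as soon as that window closes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical event shape for illustration: (event time in seconds, user id).
PageView = Tuple[int, str]


def pv_uv_per_window(
    views: Iterable[PageView], window_seconds: int = 60
) -> Dict[int, Tuple[int, int]]:
    """Return {window start: (PV, UV)} for fixed, non-overlapping windows."""
    pv: Dict[int, int] = defaultdict(int)
    users: Dict[int, Set[str]] = defaultdict(set)
    for ts, user in views:
        # Align the event time to the start of the window that contains it.
        window_start = ts - ts % window_seconds
        pv[window_start] += 1
        users[window_start].add(user)
    return {start: (pv[start], len(users[start])) for start in sorted(pv)}


views = [(0, "alice"), (15, "bob"), (42, "alice"), (61, "carol"), (119, "carol")]
stats = pv_uv_per_window(views)
```

With this sample, the first minute contains three page views from two distinct users, and the second minute contains two page views from one user.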

2. Real-Time Recommendation

Real-time recommendation personalizes content based on user preferences, often with the help of AI, and is a mainstream product scenario. In common scenarios, such as short videos, e-commerce shopping, and content feeds, enterprises learn user preferences in real time from recent clicks and push targeted recommendations to increase user stickiness. Where timeliness matters, Realtime Compute for Apache Flink can implement real-time recommendations in conjunction with AI.

3. ETL Scenarios

Real-time ETL scenarios include data synchronization, where data is computed and processed while it is being synchronized. Examples include synchronizing and converting tables, synchronizing between different databases, and aggregating data. The results are written to data warehouses or data lakes for archiving to support subsequent operations, such as log analysis. Realtime Compute for Apache Flink makes this real-time synchronization and preprocessing very efficient.
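The processing step of such a pipeline can be sketched as a record-at-a-time transformation. The example below is plain Python, not a Flink connector or job; the field names (`order_id`, `amount_cents`, `currency`), the `UNKNOWN` default, and the sample input are assumptions made for illustration only. It parses raw JSON lines, drops malformed records, and converts the rest to a normalized shape that a sink could write to a warehouse or data lake.

```python
import json
from typing import Dict, Iterable, Iterator


def transform(raw_lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Clean and convert records one at a time as they arrive."""
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # Drop lines that are not valid JSON.
        if not isinstance(record, dict):
            continue  # Drop JSON values that are not objects.
        amount_cents = record.get("amount_cents")
        if "order_id" not in record or not isinstance(amount_cents, int):
            continue  # Drop records that lack the required fields.
        yield {
            "order_id": str(record["order_id"]),
            "amount": amount_cents / 100,  # Convert cents to currency units.
            "currency": str(record.get("currency", "UNKNOWN")).upper(),
        }


raw = [
    '{"order_id": 1, "amount_cents": 1250, "currency": "usd"}',
    "not json",
    '{"order_id": 2}',
    '{"order_id": 3, "amount_cents": 300}',
]
cleaned = list(transform(raw))
```

Because `transform` is a generator, each record is processed as soon as it is read, which mirrors how preprocessing happens during synchronization rather than in a separate pass afterwards.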

4. Real-Time Monitoring

Real-time monitoring is common in financial and transaction business scenarios. These industries require anti-fraud monitoring that determines, from real-time behavior over a short period, whether a user may be committing fraud, so that losses can be prevented early. These scenarios demand timeliness, because detecting abnormal data quickly minimizes losses. Realtime Compute for Apache Flink can collect metrics and logs from various systems and observe and monitor them in real time.
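A simple form of this monitoring is a per-account rate rule over a short time window. The sketch below is plain Python with a hypothetical rule (more than three transactions per account within 60 seconds), hypothetical names, and sample data; it is not the product's detection logic and it assumes events arrive in time order. In a Flink job, comparable logic would typically be built on keyed state and windowing.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Iterator, Tuple

# Hypothetical event shape for illustration: (event time in seconds, account id).
Txn = Tuple[int, str]


def flag_bursts(
    txns: Iterable[Txn], window_seconds: int = 60, max_txns: int = 3
) -> Iterator[Txn]:
    """Yield a transaction when its account exceeds max_txns within the window."""
    recent: Dict[str, Deque[int]] = defaultdict(deque)
    for ts, account in txns:
        times = recent[account]
        times.append(ts)
        # Evict timestamps that have fallen out of the sliding window.
        while times and times[0] <= ts - window_seconds:
            times.popleft()
        if len(times) > max_txns:
            yield ts, account


txns = [(0, "a"), (10, "a"), (20, "b"), (30, "a"), (40, "a"), (200, "a")]
alerts = list(flag_bursts(txns))
```

Here account `a` triggers one alert at its fourth transaction within a minute, and its later transaction at second 200 does not, because the earlier ones have left the window by then.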
