This topic describes the architecture of Realtime Compute for Apache Flink in exclusive mode.

Architecture

The following figure shows the architecture of Realtime Compute for Apache Flink in exclusive mode.
  • If you use Realtime Compute for Apache Flink in exclusive mode, all your purchased Elastic Compute Service (ECS) instances are fully hosted in the virtual private cloud (VPC) in which your Realtime Compute for Apache Flink cluster resides. In this mode, you cannot log on to your purchased ECS instances.
  • When you create a Realtime Compute for Apache Flink cluster, Realtime Compute for Apache Flink requests an elastic network interface (ENI) under your account. Through this ENI, the cluster can access all the resources in your VPC.
  • To allow your Realtime Compute for Apache Flink cluster to access the Internet, you can bind a network address translation (NAT) gateway and an elastic IP address (EIP) to the ENI. For more information, see Associate an EIP with a NAT gateway.
  • The ENI belongs to an independent security group under your account. To access the services of other security groups in the VPC, you must configure inbound and outbound rules for the security group.
Note: You are charged for the use of the ENI only when your Realtime Compute for Apache Flink cluster accesses the Internet.
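
The role of the inbound and outbound rules described above can be illustrated with a small model. The following Python sketch is not an Alibaba Cloud API. The `Rule` class and the `is_allowed` function are hypothetical names introduced only for illustration, and the model assumes simple allow rules that match on direction, CIDR block, and port range. Real security groups also support priorities, protocols, and deny rules, which are omitted here.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical allow rule: direction, peer CIDR block, and port range."""
    direction: str               # "inbound" or "outbound"
    cidr: str                    # peer address range, for example "192.168.0.0/16"
    port_range: Tuple[int, int]  # inclusive (first_port, last_port)


def is_allowed(rules: Iterable[Rule], direction: str, peer_ip: str, port: int) -> bool:
    """Return True if at least one rule permits the traffic.

    Simplifying assumption: traffic is denied unless an allow rule matches.
    """
    addr = ip_address(peer_ip)
    for rule in rules:
        first, last = rule.port_range
        if (
            rule.direction == direction
            and addr in ip_network(rule.cidr)
            and first <= port <= last
        ):
            return True
    return False


# Example: the ENI's security group lets the cluster reach a database
# on port 3306 inside the VPC, and nothing else.
eni_rules = [Rule("outbound", "192.168.0.0/16", (3306, 3306))]
print(is_allowed(eni_rules, "outbound", "192.168.1.20", 3306))
print(is_allowed(eni_rules, "outbound", "192.168.1.20", 5432))
```

In this model, a connection from the cluster to a service in another security group succeeds only if the outbound rules on the ENI side and the inbound rules on the service side both permit it, which is why both directions must be configured.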

Benefits

  • End-to-end real-time data computing and development
    • Provides real-time data processing capabilities based on Flink SQL, which implements automatic data recovery. This ensures accurate data processing even if failures occur.
    • Supports multiple built-in functions, such as string, date, and aggregate functions.
    • Supports various window types, such as tumbling, sliding, and session windows.
    • Provides accurate control over computing resources, which ensures resource isolation for jobs.
    • Provides the following key performance metrics that are superior to the metrics of Apache Flink:
      • Data computing latency is in the subsecond range.
      • The throughput of a single job can reach millions of records per second. A single cluster can consist of thousands of servers.
    • Deeply integrates with various cloud data storage systems such as DataHub, Log Service, ApsaraDB RDS, Tablestore, and AnalyticDB for MySQL. This allows you to conveniently read data from and write data to these systems.
  • Fully-managed real-time computing service
    • Uses a fully-managed stream computing engine.
    • Allows you to process and query streaming data without the need to provision or manage infrastructure.
    • Allows you to activate streaming data processing services with one click.
    • Integrates features such as data storage, data development, data O&M, and monitoring and alerting. This reduces both the trial and migration costs of stream processing.
    • Isolates and protects the managed and running services of different tenants.
  • Reduced manpower and cluster costs
    • Significantly optimizes the SQL execution engine to provide computing jobs that are more cost-effective than native Flink jobs.
    • Significantly reduces development and operation costs, which are much lower than the costs of open source streaming frameworks.
  • High availability
    If an ECS instance becomes abnormal, or if a Realtime Compute for Apache Flink job recovers from a failure or is resumed, the JobManager or a TaskManager on an available ECS instance in the same zone can be used to ensure high availability for jobs. The JobManager or a TaskManager on an available ECS instance in a different zone or region can also be used to ensure high availability across zones.
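
The window types listed under the first benefit (tumbling, sliding, and session windows) differ in how they assign a record to a time range. The following Python sketch illustrates the general idea only. It is not the Flink implementation, the function names are hypothetical, timestamps are plain integers, and the handling of records that fall exactly on a session boundary is a simplifying assumption.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # half-open range [start, end)


def tumbling_window(ts: int, size: int) -> Window:
    """Fixed-size, non-overlapping windows: each record belongs to exactly one."""
    start = ts - (ts % size)
    return (start, start + size)


def sliding_windows(ts: int, size: int, slide: int) -> List[Window]:
    """Fixed-size windows that start every `slide` units, so they can overlap."""
    windows = []
    start = ts - (ts % slide)  # latest window start that is <= ts
    while start > ts - size:   # the window [start, start + size) still contains ts
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


def session_windows(timestamps: List[int], gap: int) -> List[Window]:
    """Windows that stay open until no record arrives for `gap` units."""
    sessions: List[Window] = []
    for ts in sorted(timestamps):
        if sessions and ts < sessions[-1][1]:
            # The record arrives before the session expires: extend the session.
            sessions[-1] = (sessions[-1][0], ts + gap)
        else:
            sessions.append((ts, ts + gap))
    return sessions


print(tumbling_window(7, 5))
print(sliding_windows(7, 10, 5))
print(session_windows([1, 2, 10, 11, 30], 5))
```

A tumbling window places each record in one window, a sliding window can place the same record in several overlapping windows, and a session window has no fixed size because its length depends on how closely the records follow one another.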