Realtime Compute for Apache Flink: Security white paper

Last Updated: Feb 24, 2025

Realtime Compute for Apache Flink is fully compatible with Apache Flink APIs and provides security hardening features that protect your data across access control, network isolation, storage, and backup and restoration, as well as through integration with services such as ActionTrail.

Tenant isolation

Realtime Compute for Apache Flink supports multi-tenant scenarios. The Alibaba Cloud account authentication system uses symmetric-key signatures based on AccessKey pairs to authenticate each HTTP request from users. Data of different users is isolated and stored separately in a distributed file system. This meets the requirements for multi-user collaboration, data sharing, data confidentiality, and data security, and achieves true multi-tenant resource isolation.
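
As a rough illustration of how symmetric-key request signing works in general, the following Python sketch signs a request string with a shared secret and verifies the signature on the receiving side. It is not the exact Alibaba Cloud signature algorithm: the function names, the choice of HMAC-SHA256, and the format of the string to sign are assumptions made for illustration only.

```python
import base64
import hashlib
import hmac


def sign_request(secret: str, string_to_sign: str) -> str:
    """Return a Base64-encoded HMAC-SHA256 signature of the request string.

    Hypothetical helper: real services define their own canonical
    string-to-sign format and hash algorithm.
    """
    digest = hmac.new(
        secret.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_request(secret: str, string_to_sign: str, signature: str) -> bool:
    """Recompute the signature with the shared secret and compare it."""
    expected = sign_request(secret, string_to_sign)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)
```

Because both sides hold the same secret, the server can recompute the signature for each request and reject any request whose content or signature has been altered.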

Access control

Multi-dimensional access control methods are provided to ensure data security.

RAM

Alibaba Cloud provides Resource Access Management (RAM) so that you can manage the operation permissions of different RAM users on Realtime Compute for Apache Flink resources. You can also use RAM to log on to the Realtime Compute for Apache Flink console as a member in a resource directory (including as a CloudSSO user). For more information, see What is RAM? and Supported logon methods.

Namespace permission management

Realtime Compute for Apache Flink allows you to manage permissions on namespaces flexibly and securely. When multiple users develop deployments and perform O&M in the same namespace, you can define roles and configure fine-grained permissions based on your business requirements. For more information, see Grant permissions on namespaces.

Whitelist

By default, the upstream and downstream storage systems of Realtime Compute for Apache Flink deny access from external sources. Therefore, you must add the CIDR block of the vSwitch used by Realtime Compute for Apache Flink to the whitelist of each associated storage system. This also applies if your vSwitch is not in the same zone as the upstream and downstream storage systems: the network connection is established after you add the CIDR block of the vSwitch to the whitelist. For more information, see FAQ about network connectivity.
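
To show what a CIDR-based whitelist check amounts to, here is a minimal Python sketch that uses the standard ipaddress module. The function name and the example addresses are hypothetical. The storage systems perform this check themselves, so the sketch only helps you reason about which client addresses a given vSwitch CIDR block covers.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, whitelist: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any CIDR block in the whitelist."""
    addr = ipaddress.ip_address(client_ip)
    return any(
        # strict=False accepts blocks written with host bits set,
        # for example 192.168.0.5/24.
        addr in ipaddress.ip_network(cidr, strict=False)
        for cidr in whitelist
    )
```

For example, with a hypothetical vSwitch CIDR block of 192.168.0.0/24 in the whitelist, is_allowed("192.168.0.17", ["192.168.0.0/24"]) is true, while an address such as 10.0.0.1 is rejected.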

Access to a Hive cluster that supports Kerberos authentication

Kerberos is a computer-network authentication protocol used for identity authentication to ensure the security of communication. If the Hive cluster that your deployment needs to access supports Kerberos authentication, you must register the Kerberos-secured Hive cluster in the Realtime Compute for Apache Flink console and configure the Hive cluster in the deployment. For more information, see Register a Hive cluster that supports Kerberos authentication.

Network isolation

Realtime Compute for Apache Flink can access upstream and downstream storage services over virtual private clouds (VPCs) or the Internet. To ensure security, we recommend using VPCs. You can also manage the domain names of upstream and downstream storage services in the Realtime Compute for Apache Flink console.

VPCs

A VPC is a private network that is isolated from other networks at the network layer on top of physical-layer protocols. VPCs provide high security, reliability, flexibility, scalability, and ease of use. For more information, see What is a VPC?

Internet

You can use an Alibaba Cloud Network Address Translation (NAT) gateway to set up connections between a VPC and the Internet. This way, Realtime Compute for Apache Flink can access upstream and downstream services over the Internet. However, we do not recommend this access method. For more information, see FAQ about network connectivity.

Domain name management

You can manage the domain names of upstream and downstream services in the Realtime Compute for Apache Flink console.

Encryption

Variable management

You can use variables in various scenarios, such as SQL draft development, JAR or Python deployments, log export configuration, and UI-based parameter setups. This practice helps prevent security risks associated with plaintext AccessKey pairs or credentials. Additionally, variables facilitate reuse, reducing redundancy in code development or variable configuration. For more information, see Manage variables.
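
The idea behind variables can be sketched in Python as follows: the code or configuration contains only a named placeholder, and the secret value is substituted when the deployment is prepared. The placeholder syntax ${secret_values.<name>}, the function name, and the variable name db_password are assumptions for illustration. See Manage variables for the syntax that the console actually supports.

```python
import re
from typing import Mapping

# Hypothetical placeholder format: ${secret_values.<name>}
PLACEHOLDER = re.compile(r"\$\{secret_values\.([A-Za-z0-9_]+)\}")


def resolve_variables(template: str, secrets: Mapping[str, str]) -> str:
    """Replace each placeholder in template with its value from secrets."""

    def _lookup(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in secrets:
            # Fail loudly instead of leaving an unresolved placeholder behind.
            raise KeyError(f"undefined variable: {name}")
        return secrets[name]

    return PLACEHOLDER.sub(_lookup, template)


# The draft stores only the placeholder, never the plaintext credential.
draft = "'password' = '${secret_values.db_password}'"
```

Calling resolve_variables(draft, {"db_password": "example"}) produces the final option string, while the draft itself can be shared or reviewed without exposing the credential.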

Backup and restoration

Multiple backup methods are provided to persist and restore data.

Data backup

Realtime Compute for Apache Flink uses an architecture that decouples storage from computing. It stores data such as checkpoints, savepoints, logs, and JAR packages in Object Storage Service (OSS), and creates separate directories for this data in the associated OSS bucket. The default retention period is seven days. For more information, see Activate Realtime Compute for Apache Flink.

Data restoration

  • Manually create a savepoint: You can manually create a savepoint for a deployment at a specific point in time, such as when the deployment is running or when the deployment is canceled, and restore the deployment from the savepoint. This feature is useful in scenarios such as data restoration, quick business deployment, and data verification.

  • Configure scheduled savepoint creation: You can specify a schedule on which savepoints are created for a deployment. After you save the rule, the system automatically creates savepoints while the deployment is running.

  • Resume a deployment from a specific savepoint: To restore a deployment from a savepoint of another deployment, specify the savepoint that you want to restore from.

    Note

    If you want to share savepoints across deployments, make sure that the state data of the deployments is compatible, for example, by running a dual-run test.

Deployment status backup

To view the status set of a deployment, go to the Deployments page in the Realtime Compute for Apache Flink console and click the name of the deployment. Then, on the deployment details page, click the Status tab. For more information, see View the state generation overview.

Cross-zone high availability

The cross-zone high availability feature facilitates disaster recovery between zones within a region. In a namespace configured with cross-zone compute units (CUs), this feature allows for seamless failover to a secondary zone in case a fault occurs in the primary zone. This prevents service interruptions from single-zone failures and ensures continuity and high availability for deployments. For more information, see Cross-zone high availability.

ActionTrail

ActionTrail is a service that monitors and records the operations of your Alibaba Cloud account. The operations include your access to and use of cloud services by using the Alibaba Cloud Management Console, APIs, and SDKs. ActionTrail records these operations as events. You can download these events and deliver them to Simple Log Service Logstores or OSS buckets. Then, you can perform behavior analysis, security analysis, resource change tracking, and compliance auditing based on the events. For more information about ActionTrail, see What is ActionTrail?

Realtime Compute for Apache Flink is connected to ActionTrail. You can view resource operation events and related information in the ActionTrail console free of charge. For more information, see View audit events of Realtime Compute for Apache Flink.