A serverless platform provides very high elasticity, but this elasticity places heavy demands on the underlying infrastructure. Alibaba Cloud provides a solution that combines Container Service for Kubernetes (ACK) with other cloud services to optimize data access for workloads that run on Elastic Container Instance. This topic describes the challenges of accessing data in serverless cloud computing and the solution that addresses them.
Challenges of accessing data in serverless cloud computing
A serverless platform can quickly scale the resources or workloads of an application. An application is ready for use only a few seconds after it starts to scale out, and compute resources are scaled within seconds or even milliseconds. This speed places heavy demands on the infrastructure, and storage is among the most commonly used infrastructure resources. If the IO throughput of the storage system cannot keep up with the rate at which instances are scaled, the system as a whole cannot scale within seconds. For example, the system may be able to scale out container instances within two seconds but need tens of seconds or even several minutes to download data from the storage system.
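The gap in the example above can be estimated with simple arithmetic. The following is a minimal sketch, not a measurement: the function name, the 2 GiB dataset size, and the 100 MiB/s throughput are hypothetical values chosen only for illustration.

```python
def data_ready_seconds(dataset_mib: float, throughput_mib_s: float) -> float:
    """Return the seconds needed to download a dataset at a given throughput."""
    if throughput_mib_s <= 0:
        raise ValueError("throughput must be positive")
    return dataset_mib / throughput_mib_s


# Hypothetical values: instances scale out in 2 seconds, but each one
# must then pull a 2 GiB dataset at 100 MiB/s before it can do useful work.
scale_out_seconds = 2.0
download_seconds = data_ready_seconds(dataset_mib=2048, throughput_mib_s=100)

print(f"scale-out: {scale_out_seconds:.1f} s, data download: {download_seconds:.1f} s")
```

Under these assumptions, the download takes roughly ten times longer than the scale-out, so storage throughput rather than compute determines when the application is actually ready.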
Serverless containers place the following requirements on traditional storage systems:
High-concurrency access: Compute resources only process data, and all data is stored in the storage system. When many instances access the storage system at the same time, the overhead of data access increases, which affects system stability and increases bandwidth usage.
Low network latency: An architecture that decouples computing from storage increases the latency of access to business data, because the data must be read over the network.
Elastic IO throughput: The bandwidth of a traditional storage system increases only with its storage capacity. A large number of containers that access the storage system concurrently can trigger throttling, which creates a conflict between elastic compute resources and limited storage bandwidth.
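The conflict described in the last item can be approximated with a simple model. The sketch below assumes that total bandwidth is proportional to provisioned capacity and that concurrent containers share it evenly. The function name, the 10 TiB capacity, and the 100 MiB/s per TiB ratio are hypothetical and do not describe any specific storage service.

```python
def per_container_mib_s(capacity_tib: float, mib_s_per_tib: float, containers: int) -> float:
    """Evenly split a capacity-bound bandwidth budget across concurrent containers."""
    if containers < 1:
        raise ValueError("containers must be at least 1")
    # Assumption: bandwidth grows only with provisioned capacity,
    # not with the number of clients that read from the storage system.
    total_mib_s = capacity_tib * mib_s_per_tib
    return total_mib_s / containers


# Hypothetical storage: 10 TiB provisioned at 100 MiB/s per TiB.
for n in (1, 10, 100, 1000):
    share = per_container_mib_s(capacity_tib=10, mib_s_per_tib=100, containers=n)
    print(f"{n:>4} containers -> {share:.1f} MiB/s each")
```

In this model the total bandwidth stays fixed while the number of containers grows, so the share available to each container shrinks in proportion to the scale-out.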
Solution for optimizing data access
To better support serverless cloud computing, the ACK team works with the basic software and operating system team, the Elastic Container Instance team, and the Data Lake team to provide a solution that optimizes data access based on Elastic Container Instance. The solution follows these design principles:
Complies with the existing standards to ensure a consistent user experience. For example, Sidecar and Device Plugin in Kubernetes are used as standards to expose APIs and user interfaces.
Supports fine-grained Linux privilege control.
Stays in sync with kernel updates and underlying updates from open source Kubernetes. All design decisions are consistent with open source Kubernetes.
The architecture that Fluid, an open source data orchestration project for Kubernetes, uses in serverless cloud computing consists of a data plane and a control plane.
Data plane: The data plane is composed of FUSE containers that correspond to the different cache runtimes. The FUSE containers are deployed together with applications as sidecar containers and handle the data access of those applications.
Control plane: The control plane consists of the injector, cache runtime controller, and application controller.
Injector: The injector transforms data access and runtime implementation information into information that the sidecar can read, and injects the information into applications. The injector also controls the sequence in which the containers of a workload are launched. The workload can be a pod or a big data or AI computing workload, such as a Spark Job, TensorFlow Job, or MPI Job.
Cache runtime controller: The cache runtime controller scales data caches based on the throughput of the FUSE sidecar containers, and also manages data access permissions.
Application controller: The application controller terminates the FUSE containers in a pod when the containers of a batch Job, TensorFlow Job, or Spark Job in the same pod are terminated.
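The responsibilities of the injector and the application controller can be sketched with plain data structures. The following is an illustrative model only, not Fluid's implementation: the pod is a simplified dictionary rather than a real Kubernetes object, and the names fuse-sidecar, inject_fuse_sidecar, and should_terminate_sidecar are hypothetical.

```python
from copy import deepcopy

FUSE_SIDECAR_NAME = "fuse-sidecar"  # hypothetical container name


def inject_fuse_sidecar(pod: dict, fuse_image: str, mount_path: str) -> dict:
    """Return a copy of the pod with a FUSE sidecar placed before the app containers."""
    patched = deepcopy(pod)
    sidecar = {
        "name": FUSE_SIDECAR_NAME,
        "image": fuse_image,
        "mountPath": mount_path,
    }
    # Listing the sidecar first models the launch order that the injector
    # controls: the data access layer starts before the application.
    patched["containers"] = [sidecar] + patched["containers"]
    return patched


def should_terminate_sidecar(container_states: dict) -> bool:
    """Return True when all app containers have terminated but the sidecar still runs."""
    app_states = [
        state for name, state in container_states.items() if name != FUSE_SIDECAR_NAME
    ]
    sidecar_running = container_states.get(FUSE_SIDECAR_NAME) == "running"
    return bool(app_states) and sidecar_running and all(
        state == "terminated" for state in app_states
    )


# Hypothetical batch pod with a single training container.
pod = {"name": "train-job", "containers": [{"name": "trainer", "image": "example/trainer"}]}
patched = inject_fuse_sidecar(pod, fuse_image="example/fuse", mount_path="/data")
print([c["name"] for c in patched["containers"]])

# Once the trainer exits, the sidecar is the only container that keeps the pod alive.
print(should_terminate_sidecar({"trainer": "terminated", FUSE_SIDECAR_NAME: "running"}))
```

The second function shows why a dedicated application controller is useful for batch workloads: a sidecar that keeps running would prevent the Job from ever being reported as complete, so something has to stop it after the application containers exit.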