Best practice for deploying Function Compute in Apsara Stack

Last Updated: Feb 26, 2025

This guide describes how Alibaba Cloud Function Compute and its related services are installed, to help you complete the product deployment.

Architecture

Function Compute is an enterprise-level distributed serverless computing platform developed by Alibaba Cloud. Function Compute features event triggering, real-time scaling, high availability (HA), and high performance.

With Function Compute, you no longer need to manage infrastructure such as servers. You only need to write and upload code. Function Compute prepares computing resources for you and runs your code in an elastic and reliable manner. Logging, monitoring, and alerting are also provided. Function Compute has the following overall characteristics:

  • Adopts a distributed cluster architecture.

  • Uses data distribution and load balancing technology to achieve high performance and scalability of the system.

  • Uses etcd-based multi-replica technology to achieve high system availability.

  • Implements reliable asynchronous invocations by using ApsaraMQ.

  • Delivers high performance.

  • Delivers high reliability.

  • Scales horizontally.

  • Supports smooth version upgrades.

  • Supports multiple computing resource pools such as virtual machines, bare metal servers, containers, and Kubernetes.

The following figure shows the architecture of a Function Compute cluster.

[Figure: Architecture of a Function Compute cluster]

In the preceding architecture, a Function Compute cluster consists of metadata storage, Server Load Balancer (SLB), and the modules of the Function Compute system. The Function Compute system has four main modules: API service, resource scheduling, asynchronous event distribution, and the function execution engine. The API operations that Function Compute provides fall into two groups: metadata operations (create, delete, modify, and query) and function invocations. Function Compute supports synchronous invocations, which are used in online scenarios, and asynchronous invocations, which are used in offline task scenarios.
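The difference between the two invocation types can be illustrated with a toy in-memory model. This is only a sketch: the class `ToyFunctionService` and its method names are hypothetical and are not part of the Function Compute API. A synchronous call returns the result to the caller, while an asynchronous call is only queued and is run later by a separate dispatch step.

```python
import queue


class ToyFunctionService:
    """Hypothetical in-memory model; not the Function Compute API."""

    def __init__(self):
        self.functions = {}           # metadata: function name -> handler
        self.pending = queue.Queue()  # asynchronous events waiting for dispatch
        self.async_results = []       # results produced by dispatched events

    def create_function(self, name, handler):
        # Metadata operation: register (or overwrite) a function.
        self.functions[name] = handler

    def invoke_sync(self, name, event):
        # Synchronous invocation: run now and return the result to the caller.
        return self.functions[name](event)

    def invoke_async(self, name, event):
        # Asynchronous invocation: only enqueue the event, return immediately.
        self.pending.put((name, event))

    def dispatch_pending(self):
        # Stand-in for asynchronous event distribution: drain the queue.
        while not self.pending.empty():
            name, event = self.pending.get()
            self.async_results.append(self.functions[name](event))


service = ToyFunctionService()
service.create_function("double", lambda event: event * 2)
sync_result = service.invoke_sync("double", 21)
service.invoke_async("double", 5)
queued_before_dispatch = service.pending.qsize()
service.dispatch_pending()
```

In a real deployment the queue would be a durable message service rather than process memory, so that queued events survive a module restart.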

Deploy Function Compute

Function Compute depends on multiple Alibaba Cloud components to provide easy-to-use, high-performance, and highly available services. For a private deployment, you can replace specific components with similar components that you are familiar with. The components are as follows:

  • F5 load balancer: implements disaster recovery (DR) and load balancing for the api controller module.

  • Kubernetes cluster: used for node resource management. Kubernetes must be version 1.17 or later. You must deploy at least 3 master nodes in HA mode and at least 3 nodes, each configured with at least 8 CPU cores and 16 GB of memory. Docker on the nodes must be version 18.09 or later.

  • MySQL: used to manage metadata and must be deployed in HA mode.

  • Elasticsearch: used to store and search logs and must be deployed in HA mode.

  • GlusterFS: used for function file sharing and must be deployed in HA mode. This component is optional.

  • Kafka: used to trigger functions. Kafka must be version 1.1.1 or later and must be deployed in HA mode. This component is optional.
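The version and sizing requirements in the list above can be checked before installation. The following is a minimal sketch, not an official installer check. The `Node` record, the `role` labels, and the function names are hypothetical, and it assumes that the 8-core and 16 GB minimum applies to the non-master nodes.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the leading dotted number from strings such as 'v1.17.3' or '18.09.1-ce'."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


@dataclass
class Node:
    name: str
    role: str            # "master" or "worker" (hypothetical labels)
    cpu_cores: int
    memory_gb: int
    docker_version: str


def check_prerequisites(kubernetes_version: str,
                        nodes: List[Node],
                        kafka_version: Optional[str] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if parse_version(kubernetes_version) < (1, 17):
        problems.append(f"Kubernetes {kubernetes_version} is older than 1.17")

    masters = [n for n in nodes if n.role == "master"]
    workers = [n for n in nodes if n.role != "master"]
    if len(masters) < 3:
        problems.append(f"only {len(masters)} master nodes, need at least 3")
    if len(workers) < 3:
        problems.append(f"only {len(workers)} non-master nodes, need at least 3")

    for node in workers:
        # Assumption: the sizing minimum applies to the non-master nodes.
        if node.cpu_cores < 8 or node.memory_gb < 16:
            problems.append(f"{node.name} is below 8 CPU cores / 16 GB of memory")
    for node in nodes:
        if parse_version(node.docker_version) < (18, 9):
            problems.append(f"{node.name} runs Docker {node.docker_version}, need 18.09 or later")

    # Kafka is optional, so it is checked only when a version is supplied.
    if kafka_version is not None and parse_version(kafka_version) < (1, 1, 1):
        problems.append(f"Kafka {kafka_version} is older than 1.1.1")
    return problems
```

Calling `check_prerequisites("v1.17.0", nodes, "1.1.1")` with three suitable masters and three suitable non-master nodes returns an empty list. Each unmet requirement adds one message.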

The modules of the Function Compute system are released as container images and are co-located with the Kubernetes master nodes (hybrid deployment). By default, three replicas of each module are deployed, and they share the host network of the master nodes. Different modules can run together on the same node.

| Module | Role | Required nodes | DR method |
| --- | --- | --- | --- |
| api controller | Access module | >=3 | Stateless and multi-active |
| dispatch controller | Asynchronous message processing | >=3 | Stateless and multi-active |
| resource controller | Routing and scheduling | >=3 | Stateful and multi-active |
| resource manager | Resource management | >=3 | Stateful and primary/secondary |
| node agent | Node computing module | >=2 | Heartbeat and replacement |
| dev controller | Deployment modification | 1 | Peripheral and restarts |

The dependent modules of Function Compute are released as container images and deployed by using Kubernetes YAML files. Stateless modules are deployed as Deployments, and stateful modules are deployed as StatefulSets. Except for the rocketmq-broker module, DR is implemented by Kubernetes: by default, three replicas are deployed across servers, and load balancing is implemented by using Kubernetes Services. rocketmq-broker uses rocketmq-nameserver and rocketmq-controller to implement primary/secondary switchover for DR. All dependent modules are deployed in the fc-system namespace of Kubernetes.
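The split between Deployments and StatefulSets can be sketched as a small manifest generator. This is an illustration under stated assumptions, not the manifests that ship with the product. The image name is hypothetical, and pod anti-affinity on the `kubernetes.io/hostname` topology key is one common way to place replicas on different servers; the source does not say which mechanism is actually used.

```python
import json


def build_workload(name, image, stateful, replicas=3, namespace="fc-system"):
    """Build a Deployment (stateless) or StatefulSet (stateful) manifest as a dict."""
    labels = {"app": name}
    spec = {
        "replicas": replicas,
        "selector": {"matchLabels": labels},
        "template": {
            "metadata": {"labels": labels},
            "spec": {
                # Assumption: anti-affinity keeps replicas on different servers.
                "affinity": {
                    "podAntiAffinity": {
                        "requiredDuringSchedulingIgnoredDuringExecution": [
                            {
                                "labelSelector": {"matchLabels": labels},
                                "topologyKey": "kubernetes.io/hostname",
                            }
                        ]
                    }
                },
                "containers": [{"name": name, "image": image}],
            },
        },
    }
    if stateful:
        # A StatefulSet names its governing Service (defined separately, not shown).
        spec["serviceName"] = name
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet" if stateful else "Deployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


# Hypothetical image reference, for illustration only.
manifest = build_workload("mns", "registry.example.com/fc/mns:latest", stateful=False)
manifest_json = json.dumps(manifest, indent=2)
```

The result is emitted as JSON to avoid a third-party YAML dependency; Kubernetes accepts JSON manifests as well as YAML.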

| Module | Role | Required nodes | DR method |
| --- | --- | --- | --- |
| mns | ApsaraMQ access | >=3 | Stateless and multi-active |
| rocketmq-controller | ApsaraMQ control | >=3 | Stateless and multi-active |
| rocketmq-nameserver | ApsaraMQ service discovery | >=3 | Stateless and multi-active |
| rocketmq-broker | ApsaraMQ persistence | >=3 | Stateful and primary/secondary |
| zookeeper | ApsaraMQ dependency | >=3 | Stateful and leader election |
| minio | Code persistence | >=3 | Stateful and multi-active |
| kafka connector | Kafka function trigger | >=3 | Stateless and multi-active |
| etcd | FC module dependency | >=3 | Stateful and leader election |
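For the modules that use leader election, etcd and ZooKeeper both need a majority of members to be available. The arithmetic below is general to majority-quorum systems and is not taken from the Function Compute documentation. It shows how many member failures a cluster of a given size can tolerate; the function names are illustrative.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Members that can fail while a majority is still available."""
    return members - quorum(members)


# Cluster size -> (quorum, tolerated failures)
table = {n: (quorum(n), tolerated_failures(n)) for n in (1, 3, 4, 5)}
```

A three-member cluster keeps a majority with one member down. A fourth member raises the quorum to three without tolerating any additional failure, which is why odd cluster sizes are the usual choice.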

Fault recovery

By default, Function Compute deploys three replicas of each module to provide HA and rapid fault recovery, including self-recovery.

  • Node faults of Function Compute modules: The modules keep no local persistent data, so a node recovers by itself after it is restarted.

  • Network faults of Function Compute modules: You can deploy the Function Compute modules across networks to achieve DR.

  • Node faults of dependent modules: Stateless modules have no persistent data, and their nodes recover by themselves after a restart. Stateful modules such as zookeeper, etcd, and rocketmq-broker have their own data synchronization mechanisms and recover automatically after a restart.

  • Network faults of dependent modules: Kubernetes can detect the network topology, and mutual exclusion (anti-affinity) rules based on that topology can be used to implement DR.

  • Faults of Function Compute node modules: The Function Compute system modules have a retry mechanism, so requests are retried when a node fails.
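The retry behavior in the last item can be sketched as follows. The internals of the actual retry mechanism are not described here, so this is only an illustration of the general technique. The function `call_with_retry`, the use of `ConnectionError` as the failure signal, and the backoff values are all assumptions.

```python
import time


def call_with_retry(endpoints, send, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Try replicas in turn, backing off exponentially between failed attempts."""
    last_error = None
    for attempt in range(attempts):
        # Rotate through the replicas so a retry lands on a different node.
        endpoint = endpoints[attempt % len(endpoints)]
        try:
            return send(endpoint)
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"all {attempts} attempts failed") from last_error


# Usage with a stand-in transport in which the first replica is down.
calls = []


def fake_send(endpoint):
    calls.append(endpoint)
    if endpoint == "node-a":
        raise ConnectionError("node-a is unreachable")
    return f"ok from {endpoint}"


result = call_with_retry(["node-a", "node-b", "node-c"], fake_send, sleep=lambda s: None)
```

Passing `sleep` as a parameter keeps the backoff testable without real delays. Retrying against a different replica only helps when the operation is safe to repeat.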

Success stories

Function Compute and Knative help Digital Hainan, the Hainan government, and the People's Bank of China build cloud platforms on Apsara Stack and deliver business capabilities quickly.

The platforms accelerate the construction of digital infrastructure in Hainan Province and of financial infrastructure at the People's Bank of China. After applications are migrated to a unified government or financial cloud platform, operation and maintenance costs are reduced and security is improved. Digital Hainan strives to build a central platform for government affairs, empower industry applications, respond quickly to government business innovations, and support the development of the local ecosystem.