Kubernetes Application Management: Stateful Services

This article describes how to deploy and maintain a set of highly available MySQL services through the native k8s resource object StatefulSet and the MySQL Operator.

By Wu Bo (Bruce Wu)


With Deployments and ReplicationControllers, users can conveniently deploy highly available and scalable distributed stateless services in Kubernetes. These types of applications do not store data locally, so requests can be distributed among replicas with simple load balancing policies. With the growing popularity of k8s and the rise of cloud-native architectures, more and more people want to orchestrate stateful services like databases by using k8s. However, this is not easy due to the complexity of stateful services. This article uses MySQL, the most popular open-source database, as an example to describe how to deploy and maintain stateful services on k8s. The content of this article is based on k8s 1.13.

Use StatefulSets to Deploy MySQL

This section uses the sample in the official k8s tutorial Run a Replicated Stateful Application to describe how to deploy highly available MySQL services by using StatefulSets.

StatefulSet Overview

Deployments and ReplicationControllers are designed for stateless services. The Pod names, hostnames, and storage they provide are not stable, and their Pods are started and destroyed in random order. Therefore, they are not suitable for stateful applications like databases. K8s provides the StatefulSet workload to manage stateful services. The Pods it manages have the following features:

1. Uniqueness: For a StatefulSet with N replicas, each Pod in the StatefulSet will be assigned a unique integer ordinal, from 0 up through N-1.
2. Sequence: By default, Pods in a StatefulSet are started, updated and destroyed sequentially.
3. Stable network identity: The hostname and DNS of a Pod will not change after the Pod is rescheduled.
4. Stable persistent storage: When a Pod is rescheduled, it can still mount the original PersistentVolume to ensure data integrity and consistency.

Service Deployment

In this example, the highly available MySQL service consists of one master node and multiple slave nodes that asynchronously replicate data from the master node (that is, the one-master-multiple-slave replication model). The master node can process read/write requests from users, while the slave nodes can only process read requests from users.

To deploy such a service, many other k8s resource objects are required in addition to StatefulSets, including ConfigMaps, Headless Services, and ClusterIP Services. Together, these objects allow stateful services like MySQL to run on k8s.



ConfigMap

To make application configuration easy to maintain, large systems and distributed applications usually adopt centralized configuration management policies. In k8s, users can separate configuration from Pods by using a ConfigMap, which keeps the workload portable and simplifies configuration changes and management.

The sample contains a ConfigMap called mysql. When a Pod in the StatefulSet is started, it will read proper configuration from the ConfigMap based on its own role.
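As a rough illustration, the ConfigMap can be written as a plain Python dict and serialized to JSON, which kubectl accepts as well as YAML. The two keys and their contents (log-bin for the master, super-read-only for the slaves) are assumed from the tutorial sample of that era and may differ in your copy; the helper name build_mysql_configmap is hypothetical.

```python
import json


def build_mysql_configmap():
    """Return the sample's ConfigMap as a dict (contents assumed from the tutorial)."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "mysql", "labels": {"app": "mysql"}},
        "data": {
            # Applied only on the master: enable the binary log for replication.
            "master.cnf": "[mysqld]\nlog-bin\n",
            # Applied only on slaves: make the server read-only.
            "slave.cnf": "[mysqld]\nsuper-read-only\n",
        },
    }


if __name__ == "__main__":
    # The JSON output can be piped to `kubectl apply -f -`.
    print(json.dumps(build_mysql_configmap(), indent=2))
```

Keeping both role-specific files in one ConfigMap means every Pod mounts the same object and only the init step decides which file to apply.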

Headless Service

A Headless Service provides each associated Pod with a corresponding DNS address of the form <pod-name>.<service-name>. This lets a client reach any specific application instance and gives the instances in a distributed environment a stable way to identify one another.

The sample contains a Headless Service called mysql, which is associated with Pods. These Pods are assigned the following DNS addresses: mysql-0.mysql, mysql-1.mysql, and mysql-2.mysql. By doing this, the client can access the master node through mysql-0.mysql and the slave nodes through mysql-1.mysql or mysql-2.mysql.

ClusterIP Service

To simplify access in read-only scenarios, the sample provides an ordinary service called mysql-read. This service has its own cluster IP and sends requests to associated Pods (including the master and the slaves) to hide Pod access details from users.
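The two Services in the sample can be sketched as Python dicts that differ in a single field. This is a minimal sketch, not the sample's exact manifest: the app=mysql label selector and port 3306 are assumptions, and build_service is a hypothetical helper.

```python
import json


def build_service(name, headless):
    """Build a Service selecting the MySQL Pods (label app=mysql is assumed)."""
    spec = {
        "ports": [{"name": "mysql", "port": 3306}],
        "selector": {"app": "mysql"},
    }
    if headless:
        # clusterIP "None" makes the Service headless: no virtual IP is
        # allocated, and DNS resolves to the individual Pods instead.
        spec["clusterIP"] = "None"
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "labels": {"app": "mysql"}},
        "spec": spec,
    }


headless = build_service("mysql", headless=True)
read_only = build_service("mysql-read", headless=False)

if __name__ == "__main__":
    for service in (headless, read_only):
        print(json.dumps(service, indent=2))
```

Because both Services use the same selector, mysql-read balances across the master and all slaves, while the headless mysql Service exists only to give each Pod its own DNS name.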


StatefulSet

A StatefulSet is a critical part of service deployment. Each Pod that a StatefulSet manages is assigned a unique name of the form <statefulset-name>-<ordinal-index>. In this example, the name of the StatefulSet is mysql. Therefore, Pods in the StatefulSet are named mysql-0, mysql-1, and mysql-2 respectively. By default, they are created in ordinal order and destroyed in reverse ordinal order.
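The naming rules can be sketched in a few lines of Python. The function names are hypothetical; the patterns are the ones given in the text.

```python
def pod_names(statefulset, replicas):
    """Pod names are <statefulset-name>-<ordinal>, with ordinals 0..replicas-1."""
    return [f"{statefulset}-{i}" for i in range(replicas)]


def pod_dns_names(statefulset, service, replicas):
    """Short DNS names from the Headless Service: <pod-name>.<service-name>."""
    return [f"{pod}.{service}" for pod in pod_names(statefulset, replicas)]


names = pod_names("mysql", 3)
print(names)                                # default creation order
print(list(reversed(names)))                # default termination order
print(pod_dns_names("mysql", "mysql", 3))   # addresses clients can use
```

Because the names are derived only from the StatefulSet name and the ordinal, a rescheduled Pod gets the same name and DNS address back.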

As shown in the following figure, each Pod contains two init containers and two app containers, and is bound through its own PersistentVolumeClaim (PVC) to a PersistentVolume (PV) provided by the volume vendor.


The functions of Pod-related components are as follows:

  • The init-mysql container generates configuration files. It extracts the Pod ordinal from the hostname and writes it to the /mnt/conf.d/server-id.cnf file. It also applies either master.cnf or slave.cnf (depending on the node type) from the ConfigMap by copying the contents into /mnt/conf.d.
  • The clone-mysql container clones data. The clone-mysql container in Pod N+1 clones data from Pod N to its own bound PersistentVolume.
  • After the init containers complete successfully, the app containers run. The mysql container runs the actual mysqld server.
  • The xtrabackup container acts as a sidecar. It waits for mysqld in the mysql container to be ready and then runs the START SLAVE command to initialize data replication on the slave. The xtrabackup container also listens for connections from other Pods requesting a data clone.
  • The StatefulSet associates a unique PVC with each Pod by using volumeClaimTemplates. In this sample, Pod N is associated with a PVC named data-mysql-N, which in turn is bound to a PV provided by the storage system. This mechanism ensures that a rescheduled Pod can still mount the original data.
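The decisions made by init-mysql can be sketched in Python. The sample itself uses a shell script, so this is an illustrative restatement under assumptions: the offset of 100 added to the ordinal (which keeps server-id non-zero for Pod 0), the claim template name data, and the function names are all assumed here.

```python
import re


def parse_ordinal(hostname):
    """Extract the trailing ordinal from a StatefulSet hostname such as mysql-2."""
    match = re.search(r"-([0-9]+)$", hostname)
    if match is None:
        raise ValueError(f"hostname {hostname!r} has no ordinal suffix")
    return int(match.group(1))


def init_config(hostname, id_offset=100, claim_template="data"):
    """Return (server-id.cnf contents, role config file name, PVC name).

    id_offset and claim_template are assumptions made for this sketch.
    """
    ordinal = parse_ordinal(hostname)
    server_id_cnf = f"[mysqld]\nserver-id={id_offset + ordinal}\n"
    # Pod 0 is the master; every other Pod is a slave.
    role_cnf = "master.cnf" if ordinal == 0 else "slave.cnf"
    # PVC names follow <claim-template-name>-<pod-name>.
    pvc_name = f"{claim_template}-{hostname}"
    return server_id_cnf, role_cnf, pvc_name


print(init_config("mysql-0"))
print(init_config("mysql-2"))
```

Deriving everything from the hostname is what lets identical Pod templates take on different roles without any per-Pod configuration.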

Service Maintenance

To ensure service performance and improve system reliability, proper maintenance is required after the deployment completes successfully. Common maintenance work related to database services includes service fault recovery, service scaling, service status monitoring, and data backup and recovery.

Service Fault Recovery

Whether a service can recover by itself from a fault is one of the key indicators of how automated a system is. In the current architecture, the MySQL service is restored automatically when a host goes down or when the master or a slave node stops responding. In these cases, k8s reschedules and restarts the affected Pods. The StatefulSet ensures that the names, hostnames, and volumes of these Pods remain the same as before.

Service Scaling

In the one-master-multiple-slave MySQL replication model, scaling means adjusting the number of slaves. Thanks to the Pod startup and destruction ordering guarantee provided by the StatefulSet, the number of slaves can be scaled simply by using the following command.

kubectl scale statefulset mysql --replicas=<NumOfReplicas>
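To see why the ordering guarantee makes scaling safe, the following sketch lists which Pods would be created or deleted, and in what order, when the replica count changes. It assumes the default ordered Pod management policy, and scale_plan is a hypothetical helper rather than part of any k8s client.

```python
def scale_plan(statefulset, current, desired):
    """List the Pods created or deleted, in order, when the replica count changes."""
    if desired >= current:
        # Scale up: new Pods are created one at a time in ascending ordinal order.
        return [("create", f"{statefulset}-{i}") for i in range(current, desired)]
    # Scale down: Pods are removed in descending ordinal order.
    return [("delete", f"{statefulset}-{i}")
            for i in range(current - 1, desired - 1, -1)]


print(scale_plan("mysql", 3, 5))
print(scale_plan("mysql", 5, 3))
```

Since Pods are always removed from the highest ordinal down, mysql-0 (the master) is the last Pod to go and is untouched as long as at least one replica remains.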

Service Status Monitoring

Monitoring service status is an essential part of ensuring service stability. In addition to readiness probes and liveness probes, more fine-grained metrics are often required to assess service health. Users can expose the key MySQL metrics to Prometheus by using mysqld-exporter and implement monitoring and alerting based on Prometheus. We recommend that users deploy mysqld-exporter as a sidecar in the same Pod as the mysqld container.

Data Backup and Recovery

Data backup and recovery is an effective means to ensure data security. Users can implement data backup and recovery by using either volume interfaces or VolumeSnapshots. The following part describes the two methods.

Use Volume Interfaces

Many volume vendors provide the features to save data snapshots and recover data based on snapshots. These features are usually exposed to users in the form of interfaces. This requires users to be familiar with operation interfaces provided by the corresponding volume vendors. For example, if a service uses Alibaba Cloud disks as external volumes, users need to understand the snapshot interface provided for disks.

Use VolumeSnapshots

Three snapshot-related resource objects were introduced in k8s v1.12: VolumeSnapshot, VolumeSnapshotContent, and VolumeSnapshotClass. These objects provide standard methods to perform snapshot operations. Users can create snapshots of the volumes that store MySQL data, or recover data from snapshots, without needing to know the details of the underlying external volumes.

Using VolumeSnapshots is obviously a better method than directly using underlying volume interfaces. However, the VolumeSnapshot is still in the Alpha stage, and only a limited number of external volumes support standard snapshot operations. These factors limit the application scenarios of VolumeSnapshot. For more information about VolumeSnapshots, see the Volume Snapshots document.
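For orientation, a snapshot request for the master's volume might look like the following, written as a Python dict that serializes to JSON. The field layout is assumed from the alpha (v1alpha1) API of that period and changed in later API versions; the snapshot name and snapshot class name are hypothetical, and build_volume_snapshot is an illustrative helper.

```python
import json


def build_volume_snapshot(name, pvc_name, snapshot_class):
    """Build an alpha-stage VolumeSnapshot that snapshots one PersistentVolumeClaim."""
    return {
        # Alpha API group/version assumed for k8s 1.12/1.13.
        "apiVersion": "snapshot.storage.k8s.io/v1alpha1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "snapshotClassName": snapshot_class,
            "source": {"name": pvc_name, "kind": "PersistentVolumeClaim"},
        },
    }


snapshot = build_volume_snapshot(
    name="mysql-0-snapshot",           # hypothetical snapshot name
    pvc_name="data-mysql-0",           # PVC of Pod mysql-0 in the sample
    snapshot_class="example-snapclass" # hypothetical VolumeSnapshotClass
)

if __name__ == "__main__":
    print(json.dumps(snapshot, indent=2))
```

The request names only a PVC and a snapshot class, which is what keeps it independent of the storage vendor's own interface.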

Deploy MySQL by Using Operators

Although users can deploy and maintain a set of highly available MySQL services in k8s based on StatefulSets, the process is relatively complex. It requires users to familiarize themselves with various k8s resource objects, learn many MySQL operation details, and maintain a set of complex management scripts. Kubernetes Operators are designed to lower the barrier to deploying complex applications on k8s.

Operator Introduction

An Operator is a method introduced by CoreOS to package, deploy and manage a complex application running on Kubernetes. Operators express the maintainers' knowledge of software operations in the form of code and comprehensively use various k8s resource objects to deploy and maintain complex applications.

An Operator defines new resource objects for a service by using a CustomResourceDefinition and ensures that applications are in the expected state by using custom controllers.


The workflow of an Operator can be divided into three steps:

1. Observe: Observe the current state of the target object by using the k8s API.
2. Analyze: Find the differences between the desired state and the current state.
3. Act: Take the necessary steps to make the running state of the application match its desired state.
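The three steps above can be sketched as a tiny reconcile loop. This is a toy model under stated assumptions: a plain dict stands in for the k8s API, the only state tracked is the list of Pod names, and all function names are hypothetical.

```python
def observe(cluster):
    """Observe: read the current state (a dict stands in for the k8s API)."""
    return len(cluster["pods"])


def analyze(desired_members, current_members):
    """Analyze: a positive result means Pods are missing, a negative one surplus."""
    return desired_members - current_members


def act(cluster, diff):
    """Act: add or remove Pods until the difference is gone."""
    for _ in range(diff):       # runs only when diff > 0
        cluster["pods"].append(f"{cluster['name']}-{len(cluster['pods'])}")
    for _ in range(-diff):      # runs only when diff < 0
        cluster["pods"].pop()


def reconcile(cluster, desired_members):
    """Run one observe-analyze-act pass and report whether anything changed."""
    diff = analyze(desired_members, observe(cluster))
    if diff != 0:
        act(cluster, diff)
    return diff != 0


cluster = {"name": "mysql", "pods": ["mysql-0"]}
reconcile(cluster, 3)
print(cluster["pods"])
```

A real Operator runs such a pass repeatedly, so a second call with the same desired state finds no difference and does nothing.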

Oracle MySQL Operator

Many excellent open-source Operator solutions are already available for MySQL services, including grtl/mysql-operator, oracle/mysql-operator, presslabs/mysql-operator, and kubedb/mysql. The Oracle MySQL Operator described in this section is a typical example of these open-source solutions.

How the Oracle MySQL Operator Works

The Oracle MySQL Operator supports the following two MySQL deployment modes.

  • Primary: In this mode, the service group consists of a single read-write primary node and multiple read-only secondary nodes.
  • Multi-Primary: In multi-primary mode, each node in the cluster plays the same role and the notion of primary-secondary does not apply. Each node can process read/write requests from users.

The following figure shows how the Operator works in Multi-Primary mode.


The following steps help explain how the Operator works:

1. Use k8s CustomResourceDefinitions (CRDs) to define several resource objects related to MySQL deployment and maintenance.
  • mysqlclusters - Describes the expected cluster state, including the deployment mode and the number of nodes.
  • mysqlbackups - Describes an on-demand backup policy and configures where backup data is stored (for example, in AWS S3).
  • mysqlrestores - Describes a data recovery policy, which must specify the backup data to restore and the target cluster.
  • mysqlbackupschedules - Describes a regular backup policy and configures the time interval between backups.

2. Deploy an instance of the Operator in k8s. The Operator constantly monitors CRUD operations on these resource objects and observes their state.

3. When a user performs an operation (for example, creating a MySQL cluster), a new MySQLCluster resource object is created. When the Operator detects the MySQLCluster creation event, it creates a cluster that matches the user's configuration. In this example, it creates a highly available MySQL cluster based on Group Replication by using native k8s resource objects like StatefulSets and Headless Services.

4. When the Operator finds a difference between the desired state and the current state, it performs the orchestration operations needed to bring the two back in line.

Service Deployment

Because the Operator encapsulates complex deployment details, it is now very easy to create a cluster. For example, a user can easily create a multi-primary MySQL cluster consisting of three nodes by using the following configuration.

apiVersion: mysql.oracle.com/v1alpha1
kind: Cluster
metadata:
  name: mysql-multimaster-cluster
spec:
  multiMaster: true
  members: 3

Service Maintenance

When Operators are used, maintenance is also necessary, including service fault recovery, service scaling, service status monitoring, and data backup and recovery.

Service Fault Recovery

Because each cluster is backed by a StatefulSet, k8s reschedules a MySQL service instance when it fails to respond. In addition, if the StatefulSet is accidentally deleted, the Operator recreates it.

Service Scaling

Users can easily scale services by changing the spec.members field of the MySQLCluster resource object. Only the MySQLCluster is exposed to users and underlying k8s resource objects are hidden.

Service Status Monitoring

Prometheus can be deployed on k8s to monitor the state of Operators and individual MySQL clusters. For more information, see Monitoring.

Data Backup and Recovery

MySQLBackups and MySQLRestores can be used to back up and recover data, eliminating differences in operations on different volumes. MySQLBackupSchedules can also be used to create scheduled backup tasks.

For example, the following BackupSchedule excerpt (metadata and storage settings omitted) backs up the test database in the mysql-cluster MySQL cluster every 30 minutes.

kind: BackupSchedule
spec:
  schedule: '*/30 * * * *'
  backupTemplate:
    cluster:
      name: mysql-cluster
    executor:
      provider: mysqldump
      databases:
        - test


This article describes how to deploy and maintain a set of highly available MySQL services through the native k8s resource object StatefulSet and the MySQL Operator. As shown, the Operator hides the orchestration details of complex applications and greatly lowers the barrier to running them in k8s. If you need to deploy other complex applications, we recommend that you consider using an Operator.

