Local Container Upgrade through PouchContainer

PouchContainer provides an upgrade interface at the container engine layer, significantly reducing the number of API requests and improving the container upgrade efficiency.

Most containers within Alibaba run in rich container mode. Some of these rich containers, which are operated in the traditional virtual machine O&M style, are still stateful. Updating and upgrading stateful services are frequent daily operations within Alibaba. For container technology, which delivers services as images, a service update or upgrade corresponds to two container operations: deleting the container that runs the old image and creating a container that runs the new image.

Upgrading a stateful service requires the new container to inherit all resources of the old container, such as network and storage. The following two cases illustrate service provision scenarios for rich containers:

Customer Case 1: In a database service, remote data is downloaded to the local host as the initial data of the database when the container service is created for the first time. Database initialization takes a long time, so in subsequent service upgrades the new container must inherit the data stored in the old container to shorten the service provision time.
Customer Case 2: In a middleware service, the service registration mode is used. That is, the IP address of every newly scaled-up container must be registered in the server list; otherwise, the container is invalid. Every time the service containers are upgraded, the new containers must inherit the IP addresses of the old containers; otherwise, the new services are invalid.

Many companies use Moby as the container engine; however, no single Moby API can perform a container upgrade on its own. Combining several APIs increases the number of API requests, for example, requests to the container creation and deletion APIs and requests to the IP address reservation APIs. It also increases the risk that the upgrade fails partway through.

To solve these problems, PouchContainer provides an upgrade interface at the container engine layer to implement local upgrade of containers. That is, the container upgrade function is pushed down to the container engine layer. This makes it easier to operate on container resources and significantly reduces the number of API requests, which improves container upgrade efficiency.

Bottom-layer Storage of Containers

The bottom layer of PouchContainer is connected to Containerd v1.0.3, whose storage architecture is very different from that of Moby. Before looking at how PouchContainer implements local container upgrade, it helps to understand the container storage architecture in PouchContainer.

Compared with the container storage architecture of Moby, PouchContainer has the following characteristics:

PouchContainer drops the GraphDriver and Layer concepts and introduces Snapshotter and Snapshot in the new storage architecture, to align with the design of containerd, a CNCF project. A Snapshotter is a storage driver, for example, overlay, devicemapper, or btrfs. A Snapshot is an image snapshot, and there are two types: read-only snapshots, which hold the read-only data of each layer of the container image, and read-write snapshots, which serve as the read-write layer of a container. All incremental data of a container is stored in its read-write snapshot.
Containerd stores the metadata of containers and images in boltdb. Therefore, when the service restarts, only boltdb needs to be initialized; there is no need to read host file directories to initialize container and image data.
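
To make the two snapshot types concrete, here is a toy Go model. It is only an illustrative sketch and not the Containerd Snapshotter API: the snapshot type and the lookup function are hypothetical names invented for this example. Each read-only snapshot stands for one image layer, and the read-write snapshot on top collects the container's incremental data.

```go
package main

import "fmt"

// snapshot is a hypothetical, simplified stand-in for a snapshot.
// It is NOT the real Containerd type.
type snapshot struct {
	parent   *snapshot         // snapshot this one is layered on; nil for the base
	readOnly bool              // true for image layers, false for the container layer
	files    map[string]string // path -> content stored in this snapshot only
}

// write stores data in the snapshot and refuses to modify read-only snapshots.
func (s *snapshot) write(path, content string) error {
	if s.readOnly {
		return fmt.Errorf("snapshot is read-only, cannot write %s", path)
	}
	s.files[path] = content
	return nil
}

// lookup searches the top snapshot first and then walks down the parents,
// so data written by the container shadows data from the image layers.
func lookup(top *snapshot, path string) (string, bool) {
	for s := top; s != nil; s = s.parent {
		if content, ok := s.files[path]; ok {
			return content, true
		}
	}
	return "", false
}

func main() {
	base := &snapshot{readOnly: true, files: map[string]string{"/etc/os-release": "ubuntu"}}
	app := &snapshot{parent: base, readOnly: true, files: map[string]string{"/app/version": "v1"}}
	rw := &snapshot{parent: app, files: map[string]string{}}

	// Incremental data lands in the read-write snapshot only.
	if err := rw.write("/app/version", "v1-patched"); err != nil {
		fmt.Println("unexpected error:", err)
	}
	// Image layers reject writes.
	fmt.Println("image layer write error:", app.write("/app/version", "v2"))

	v, _ := lookup(rw, "/app/version")
	fmt.Println("container sees:", v)
}
```

In this model, replacing the image of a container amounts to building a new read-write snapshot on top of a different chain of read-only snapshots, which is the idea the upgrade procedure later in this article relies on.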

Upgrade Requirements

When designing any system or function, it is necessary to investigate which user pain points it is meant to solve. After investigating the local container upgrade scenarios within Alibaba, we set three requirements for the upgrade function:

Data consistency
Flexibility
Robustness

Data consistency means that after the upgrade, some data must be unchanged:

Network: The network configurations of the container must be unchanged after the upgrade.
Storage: The new container must inherit all volumes of the old container.
Config: The new container must inherit some configurations of the old container, for example, Env and Labels.

Flexibility means that new configurations are allowed to be introduced based on the configurations of the old container during the upgrade operation:

The CPU and memory information of the new container can be modified.
A new Entrypoint can be specified for the new image, and the Entrypoint of the old container can also be inherited.
New volumes can be added to the container. The new image may include new volume information. In the creation of a new container, the new volume information needs to be parsed and new volumes are created.
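
One way to picture how the consistency and flexibility requirements combine is a merge in which values from the upgrade request override or extend the inherited ones. The following Go sketch is an illustration under stated assumptions, not PouchContainer's actual merge code: containerSpec and mergeSpec are hypothetical names, and the fields are a small subset chosen for the example.

```go
package main

import "fmt"

// containerSpec is a hypothetical, simplified view of a container's settings.
type containerSpec struct {
	Image   string
	Labels  map[string]string
	Memory  int64 // bytes; 0 means "not set in the request"
	Volumes []string
}

// mergeSpec starts from the old container's settings (consistency) and lets
// the upgrade request override or extend them (flexibility).
func mergeSpec(old, req containerSpec) containerSpec {
	merged := containerSpec{
		Image:   old.Image,
		Labels:  map[string]string{},
		Memory:  old.Memory,
		Volumes: append([]string{}, old.Volumes...),
	}
	for k, v := range old.Labels {
		merged.Labels[k] = v // inherited labels
	}
	for k, v := range req.Labels {
		merged.Labels[k] = v // labels from the request win
	}
	if req.Image != "" {
		merged.Image = req.Image
	}
	if req.Memory != 0 {
		merged.Memory = req.Memory
	}
	// Keep every old volume and add only volumes that are new.
	seen := map[string]bool{}
	for _, v := range merged.Volumes {
		seen[v] = true
	}
	for _, v := range req.Volumes {
		if !seen[v] {
			merged.Volumes = append(merged.Volumes, v)
			seen[v] = true
		}
	}
	return merged
}

func main() {
	old := containerSpec{
		Image:   "ubuntu:14.04",
		Labels:  map[string]string{"app": "db"},
		Memory:  512 << 20,
		Volumes: []string{"/data"},
	}
	req := containerSpec{Image: "busybox:latest", Memory: 1 << 30, Volumes: []string{"/data", "/logs"}}
	fmt.Printf("%+v\n", mergeSpec(old, req))
}
```

The old volumes and labels survive the merge untouched unless the request explicitly overrides them, while the image, memory limit, and extra volumes come from the request.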

Robustness refers to the ability to handle exceptions during the local container upgrade. A rollback policy must be supported so that the configurations of the old container can be restored if the upgrade fails.

Upgrade Implementation

Definition of the Upgrade API

The upgrade API defines which parameters can be modified in the upgrade operation. The following is the definition of ContainerUpgradeConfig. In the container upgrade, the ContainerConfig and HostConfig parameters can be modified. If you view the definitions of the two parameters under the apis/types directory in the PouchContainer GitHub code repository, you will find that the upgrade operation can modify all the related configurations of the old container.

// ContainerUpgradeConfig ContainerUpgradeConfig is used for API "POST /containers/upgrade".
// It wraps all kinds of config used in container upgrade.
// It can be used to encode client params in client and unmarshal request body in daemon side.
//
// swagger:model ContainerUpgradeConfig

type ContainerUpgradeConfig struct {
    ContainerConfig

    // host config
    HostConfig *HostConfig `json:"HostConfig,omitempty"`
}
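
To show what a request body built from this structure might look like, the sketch below encodes and decodes it with encoding/json. The ContainerConfig and HostConfig types here are trimmed-down stand-ins with a few illustrative fields, and encodeUpgradeRequest and decodeUpgradeRequest are hypothetical helpers; the real definitions are the ones under apis/types. Because ContainerConfig is embedded, its fields appear at the top level of the JSON object.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ContainerConfig is a trimmed-down stand-in (assumption) for the real type;
// only a few illustrative fields are kept.
type ContainerConfig struct {
	Image      string   `json:"Image"`
	Entrypoint []string `json:"Entrypoint,omitempty"`
	Cmd        []string `json:"Cmd,omitempty"`
}

// HostConfig is likewise a trimmed-down stand-in (assumption).
type HostConfig struct {
	Memory int64 `json:"Memory,omitempty"`
}

// ContainerUpgradeConfig mirrors the shape shown above.
type ContainerUpgradeConfig struct {
	ContainerConfig

	HostConfig *HostConfig `json:"HostConfig,omitempty"`
}

// encodeUpgradeRequest is a hypothetical client-side helper that produces
// the JSON request body.
func encodeUpgradeRequest(cfg ContainerUpgradeConfig) (string, error) {
	b, err := json.Marshal(cfg)
	return string(b), err
}

// decodeUpgradeRequest is a hypothetical daemon-side helper that parses
// the request body back into the struct.
func decodeUpgradeRequest(body string) (ContainerUpgradeConfig, error) {
	var cfg ContainerUpgradeConfig
	err := json.Unmarshal([]byte(body), &cfg)
	return cfg, err
}

func main() {
	cfg := ContainerUpgradeConfig{
		ContainerConfig: ContainerConfig{
			Image: "registry.hub.docker.com/library/busybox:latest",
			Cmd:   []string{"top"},
		},
		HostConfig: &HostConfig{Memory: 1 << 30},
	}
	body, err := encodeUpgradeRequest(cfg)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(body)

	decoded, err := decodeUpgradeRequest(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(decoded.Image, decoded.HostConfig.Memory)
}
```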

Upgrade Procedure

A container upgrade deletes the old container while keeping its network configurations and original volumes, and then creates a new container from the new image. The upgrade operation proceeds as follows:

Before the upgrade, back up the configurations of the old container. If the upgrade fails, this backup is used for the rollback.
Update the container configuration parameters. Merge the new configuration parameters from the request into the parameters of the old container so that the new configurations take effect.
Handle the image Entrypoint parameter specially. If the upgrade request includes an Entrypoint, use it. Otherwise, check the Entrypoint of the old container: if it was specified in the container configurations rather than taken from the old image, use it as the Entrypoint of the new container. If an Entrypoint is neither included in the request nor specified in the configurations of the old container, use the Entrypoint of the new image. This handling keeps the service entry parameters of the container continuous across upgrades.
Check the container status. If the container is running, stop it first. Then create a snapshot based on the new image to serve as the read-write layer of the new container.
After the snapshot is created, check the status of the old container again. If the old container was running, start the new container. Otherwise, do nothing.
Clean up the container resources that are no longer needed, including the old snapshot, and save the latest configurations.
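
The Entrypoint rule in the procedure above can be sketched as a small Go function. This is a minimal illustration of the three-way decision, not the code PouchContainer ships: resolveEntrypoint and equalStrings are hypothetical names, and the sketch assumes that "specified in configurations but not contained in the old image" can be detected by comparing the Entrypoint of the old container with that of the old image.

```go
package main

import "fmt"

// equalStrings reports whether two string slices have the same elements
// in the same order.
func equalStrings(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// resolveEntrypoint picks the Entrypoint for the upgraded container:
//  1. an Entrypoint in the upgrade request wins;
//  2. otherwise an Entrypoint that was set explicitly on the old container
//     (that is, it differs from the old image's Entrypoint) is inherited;
//  3. otherwise the new image's Entrypoint is used.
func resolveEntrypoint(requested, oldContainer, oldImage, newImage []string) []string {
	if len(requested) > 0 {
		return requested
	}
	if len(oldContainer) > 0 && !equalStrings(oldContainer, oldImage) {
		return oldContainer
	}
	return newImage
}

func main() {
	oldImage := []string{"/old-image-init"}
	newImage := []string{"/new-image-init"}

	// Case 1: the request specifies an Entrypoint.
	fmt.Println(resolveEntrypoint([]string{"/req"}, []string{"/custom"}, oldImage, newImage))
	// Case 2: the old container had a user-specified Entrypoint.
	fmt.Println(resolveEntrypoint(nil, []string{"/custom"}, oldImage, newImage))
	// Case 3: the old container simply used the old image's Entrypoint.
	fmt.Println(resolveEntrypoint(nil, oldImage, oldImage, newImage))
}
```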

Upgrade Rollback

Errors may occur during the upgrade operation. Currently, when an error occurs, a rollback is performed to restore the old container. First, we need to define what counts as an upgrade failure:

If creating resources for the new container fails, perform a rollback: a rollback is performed when the creation of the snapshot or volumes for the new container fails.
If a system error occurs while starting the new container, perform a rollback: a rollback is performed when creating the container through the Containerd API fails. If the API returns a normal result but the program in the container runs abnormally and causes the container to exit, no rollback is performed.

The following is the basic operation of rollback:

defer func() {
    if !needRollback {
        return
    }

    // rollback to old container.
    c.meta = &backupContainerMeta

    // create a new containerd container.
    if err := mgr.createContainerdContainer(ctx, c); err != nil {
        logrus.Errorf("failed to rollback upgrade action: %s", err.Error())
        if err := mgr.markStoppedAndRelease(c, nil); err != nil {
            logrus.Errorf("failed to mark container %s stop status: %s", c.ID(), err.Error())
        }
    }
}()

If an error occurs during upgrade, the newly created resources such as snapshot are cleared. In the rollback stage, only the old container configurations need to be restored, and the new container starts with the restored configuration file.
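
For readers who want to run the pattern in isolation, here is a self-contained Go sketch of a defer-based rollback. It is a simplified illustration and not the PouchContainer implementation: the container type, upgrade, failingSnapshot, and workingSnapshot are hypothetical names, and the sketch triggers the rollback from a named error return instead of the needRollback flag used in the fragment above.

```go
package main

import (
	"errors"
	"fmt"
)

// container is a hypothetical, minimal stand-in for container metadata.
type container struct {
	image    string
	snapshot string
}

var errSnapshot = errors.New("snapshot creation failed")

// failingSnapshot and workingSnapshot simulate the resource-creation step.
func failingSnapshot(image string) (string, error) { return "", errSnapshot }
func workingSnapshot(image string) (string, error) { return "snap-of-" + image, nil }

// upgrade switches the container to newImage. If any step fails, the
// deferred function restores the backed-up metadata before returning.
func upgrade(c *container, newImage string, createSnapshot func(string) (string, error)) (err error) {
	backup := *c // back up the old container's metadata

	defer func() {
		if err != nil {
			*c = backup // roll back to the old container
		}
	}()

	c.image = newImage
	snap, err := createSnapshot(newImage)
	if err != nil {
		return fmt.Errorf("upgrade to %s: %w", newImage, err)
	}
	c.snapshot = snap
	return nil
}

func main() {
	c := &container{image: "ubuntu:14.04", snapshot: "snap-of-ubuntu:14.04"}

	err := upgrade(c, "busybox:latest", failingSnapshot)
	fmt.Println("failed upgrade:", err, "-> container:", *c)

	err = upgrade(c, "busybox:latest", workingSnapshot)
	fmt.Println("successful upgrade:", err, "-> container:", *c)
}
```

Using a named return value means every early return that carries an error triggers the rollback automatically, so individual failure paths do not need to remember to set a flag.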

Demonstration of the Upgrade Function

Create a new container by using the ubuntu image:

$ pouch run --name test -d -t registry.hub.docker.com/library/ubuntu:14.04 top
43b75002b9a20264907441e0fe7d66030fb9acedaa9aa0fef839ccab1f9b7a8f

$ pouch ps
Name   ID       Status         Created          Image                                            Runtime
test   43b750   Up 3 seconds   3 seconds ago   registry.hub.docker.com/library/ubuntu:14.04   runc

Upgrade the image of the test container to busybox:

$ pouch upgrade --name test registry.hub.docker.com/library/busybox:latest top
test
$ pouch ps
Name   ID       Status         Created          Image                                            Runtime
test   43b750   Up 3 seconds   34 seconds ago   registry.hub.docker.com/library/busybox:latest   runc

As the demonstration shows, the upgrade interface replaces the container's image with the new one while the container keeps its name and ID, and its other configurations are unchanged.

Conclusion

In a company's production environment, container upgrades are performed as frequently as container scale-up and scale-down. However, neither the Moby community nor the Containerd community provides an API for the upgrade operation. PouchContainer is the first to implement this function, solving the problem of updating and provisioning stateful services with container technology. We are working to integrate PouchContainer more closely with downstream dependent components such as Containerd, and we plan to contribute the upgrade function to the Containerd community to enrich the functions of Containerd.

Alibaba Cloud Native Community