Dragonfly Releases the Nydus Container Image Acceleration Service

This article introduces the architecture and advantages of the Nydus Container Image Acceleration Service.

Challenges for Container Deployment Posed by Images

In production, small container images can be deployed and started quickly. Once an application image reaches several gigabytes, however, downloading it to a node usually takes a long time. Dragonfly, a CNCF incubating project, improves the efficiency of large-scale image distribution by introducing a P2P network. Even so, users still have to wait for the complete image to be downloaded locally before they can create containers. We therefore want to reduce the time spent downloading images so that users can deploy container applications faster. At the same time, better protection of users' data has become an important concern for the container industry in recent years.

To this end, we introduced a container image acceleration service called Nydus for the Dragonfly project. Nydus shortens image download time and provides end-to-end image data consistency verification, so users can manage container applications more safely and quickly. Nydus is jointly developed by engineers from Alibaba Cloud and Ant Group and is deployed on a large scale in their internal production environments. Its performance there gave us the confidence to make the project open source, so that more container users can benefit from fast startup and secure loading of containers.

Nydus: Dragonfly Container Image Service

The Nydus project implements a user-space file system on top of a container image format that improves on the current OCI image specification. These optimizations give Nydus the following features:

  • Container images are downloaded on demand. Users do not have to download the complete image before starting a container.
  • Block-level image data deduplication saves storage resources.
  • An image contains only the final, usable data, so outdated data does not need to be stored or downloaded.
  • End-to-end data consistency verification provides better data protection.
  • Compatible with the OCI distribution standard and artifacts standard (available out of the box).
  • Different container image storage backends are supported. Image data can be stored in an image registry, on NAS, or in an S3-like Object Storage Service (OSS).
  • Integrates well with Dragonfly.

In terms of architecture, Nydus consists mainly of a new image format and a Filesystem in Userspace (FUSE) file system responsible for parsing images in that format.

Nydus supports both the FUSE and virtiofs protocols, so it can serve traditional runc containers as well as Kata containers. Container registries, OSS, NAS, and the super nodes and peer nodes of Dragonfly can all be used as image data sources for Nydus. Nydus can also be configured with a local cache so that it does not have to pull data from remote data sources every time it starts.

In terms of image format, Nydus divides a container image into two layers: metadata and data. The metadata layer is a self-verifying hash tree. Each file and directory is a node with a hash value in the tree. The hash value of a file node is determined by the data of the file, and the hash value of a directory node is determined by the hash values of all files and directories under that directory. The data of each file is sliced into fixed-size chunks and saved to the data layer. Chunks can be shared between different files in the same image and between files in different images.
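The metadata/data split described above can be illustrated with a minimal Python sketch. It is not the actual Nydus on-disk format: the function name `build_node`, the 4-byte `CHUNK_SIZE`, the use of SHA-256, and the choice to mix entry names into directory hashes are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, tiny on purpose so the demo data stays readable


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_node(tree, chunk_store):
    """Return (hash, metadata) for a file (bytes) or a directory (dict of name -> tree)."""
    if isinstance(tree, bytes):
        chunk_ids = []
        for offset in range(0, len(tree), CHUNK_SIZE):
            chunk = tree[offset:offset + CHUNK_SIZE]
            chunk_id = digest(chunk)
            chunk_store.setdefault(chunk_id, chunk)  # an identical chunk is stored only once
            chunk_ids.append(chunk_id)
        # File node: the hash depends only on the file's data.
        return digest(tree), {"type": "file", "chunks": chunk_ids}

    children = {}
    hasher = hashlib.sha256()
    for name in sorted(tree):  # sorted so the directory hash is deterministic
        child_hash, child_meta = build_node(tree[name], chunk_store)
        children[name] = {"hash": child_hash, **child_meta}
        # Assumption: entry names are hashed together with the child hashes.
        hasher.update(name.encode() + b"\0" + child_hash.encode())
    # Directory node: the hash depends on the hashes of everything beneath it.
    return hasher.hexdigest(), {"type": "dir", "children": children}


# Two toy images that share most of their content.
chunk_store = {}
image_a = {"bin": {"app": b"AAAABBBB"}, "etc": {"conf": b"AAAACCCC"}}
image_b = {"bin": {"app": b"AAAABBBB"}, "etc": {"conf": b"AAAADDDD"}}
root_a, meta_a = build_node(image_a, chunk_store)
root_b, meta_b = build_node(image_b, chunk_store)

print(len(chunk_store))  # 4 unique chunks stored for 8 chunk references
print(root_a != root_b)  # True: one changed file changes the root hash
```

In this sketch the shared `chunk_store` plays the role of the data layer, while the nested dictionaries returned by `build_node` stand in for the metadata layer. Because every directory hash covers its children, comparing a single root hash is enough to detect a change anywhere in the tree.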

1. What Does Nydus Bring to Users?

For users who deploy the Nydus image service, the most visible improvement is that containers start much faster: a long wait becomes a nearly instantaneous startup. In our testing, Nydus shortened the startup time of common images from minutes to seconds.

Another less obvious (but equally important) improvement is that Nydus provides data consistency verification at container runtime. With traditional images, image data is first decompressed to the local file system and then accessed by container applications. The integrity of the image data is checked before decompression, but it can no longer be checked afterwards. As a result, if the decompressed image data is unintentionally or maliciously modified, the user has no way of noticing. With Nydus, the image is not decompressed to the local file system, and an integrity check can be performed on each data access. If the data has been tampered with, it can be pulled again from the remote data source.
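The verify-on-access behavior described above can be sketched as follows. This is an illustrative model only, not Nydus code: the class name `VerifiedChunkReader`, the dictionary-based cache, the `fetch_remote` callable, and the use of SHA-256 are assumptions.

```python
import hashlib


class VerifiedChunkReader:
    """Checks every chunk against a trusted digest on each read."""

    def __init__(self, expected_digests, local_cache, fetch_remote):
        self.expected = expected_digests  # chunk id -> SHA-256 hex digest from trusted metadata
        self.cache = local_cache          # chunk id -> bytes; may have been modified locally
        self.fetch_remote = fetch_remote  # callable(chunk_id) -> bytes

    def _is_valid(self, chunk_id, data):
        return data is not None and hashlib.sha256(data).hexdigest() == self.expected[chunk_id]

    def read(self, chunk_id):
        data = self.cache.get(chunk_id)
        if not self._is_valid(chunk_id, data):
            # Missing or tampered locally: pull the chunk again from the remote source.
            data = self.fetch_remote(chunk_id)
            if not self._is_valid(chunk_id, data):
                raise IOError(f"chunk {chunk_id} failed verification after re-fetch")
            self.cache[chunk_id] = data
        return data


remote = {"c1": b"hello nydus"}
expected = {"c1": hashlib.sha256(b"hello nydus").hexdigest()}
cache = {"c1": b"hello nydvs"}  # simulate a locally corrupted chunk

reader = VerifiedChunkReader(expected, cache, remote.__getitem__)
print(reader.read("c1"))  # b'hello nydus'
```

The key point of the sketch is that the check runs on every `read`, not once at pull time, so a modification made after the data reached the node is still caught and repaired from the remote source.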

2. Planning

We have introduced the architecture and advantages of Nydus. Over the past year, we have worked with our internal product teams to make Nydus more stable, secure, and easy to use. With Nydus open source, we will put more effort into adapting it to the cloud-native container ecosystem. Our vision is that users who deploy Dragonfly and Nydus in a cluster can run their container applications quickly and easily, regardless of image size, without having to worry about the security of container image data.

3. OCI Community Container Image Specifications

We have deployed Nydus on a large scale in our internal production environment. We believe that improving the OCI image specification requires extensive community effort, so we actively participated in the OCI community's discussion of the next-generation image specification. In the process, we found that Nydus meets the community's requirements for a next-generation image format in many respects. We therefore propose Nydus as an example implementation of the next-generation image specification in the OCI community. We look forward to working with more cloud-native industry leaders to promote the formulation and implementation of that specification.

4. FAQ

Q: What are the problems with the existing OCI image specification?

  • Aleksa Sarai is a senior software engineer at SUSE, a German-based open-source software company. In his blog post "The Road to OCIv2 Images: What's Wrong with Tar?", he discusses a series of problems with the existing OCI image format and concludes that the outdated tar format used by the OCI image specification is not suitable as a container image format.

Q: What is the difference between Nydus and CRFS?

  • CRFS is an image format designed by the Go build team. Nydus and CRFS share very similar core design ideas. In addition, Nydus supports chunk-level data deduplication and end-to-end data consistency verification, so it can be considered a further improvement on the CRFS stargz format.

Q: What is the difference between Nydus and Azure Teleport?

  • Azure Teleport is better described as a deployment of the existing OCI image specification through a snapshotter based on the SMB file-sharing protocol. It supports on-demand download of container image data, but it retains all the defects of the current tar-based OCI image format. In contrast, Nydus abandons the outdated tar format and uses a Merkle tree format to provide more advanced features.

Q: What if the network is broken when running Nydus-based containers?

  • With an existing OCI image, if the network breaks before the image has been completely downloaded, the container cannot start at all. Nydus changes the container startup process significantly: you can start a container without waiting for the complete image data to be downloaded. However, if the network is cut off while the container is running, image data that has not yet been downloaded to the local machine cannot be accessed. To address this, Nydus supports downloading container image data in the background after the container has started. Once the image data has been completely downloaded to the local machine, Nydus-based containers are no longer affected by network interruptions.
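The interaction between on-demand loading, background download, and network loss can be modeled with a short Python sketch. It is a simplified illustration, not how Nydus is implemented: the `LazyImage` class, the in-memory cache, and the simulated `fetch` function are hypothetical.

```python
import threading


class LazyImage:
    """On-demand chunk access with an optional background download."""

    def __init__(self, chunk_ids, fetch_remote):
        self.chunk_ids = list(chunk_ids)
        self.fetch_remote = fetch_remote  # callable(chunk_id) -> bytes, may raise when offline
        self.cache = {}
        self.lock = threading.Lock()

    def read(self, chunk_id):
        with self.lock:
            if chunk_id in self.cache:
                return self.cache[chunk_id]  # already local: no network needed
        data = self.fetch_remote(chunk_id)   # not local yet: requires the network
        with self.lock:
            return self.cache.setdefault(chunk_id, data)

    def prefetch_all(self):
        """Download every remaining chunk in a background thread."""
        worker = threading.Thread(target=self._prefetch, daemon=True)
        worker.start()
        return worker

    def _prefetch(self):
        for chunk_id in self.chunk_ids:
            self.read(chunk_id)


network = {"online": True}
remote = {"c1": b"aa", "c2": b"bb"}


def fetch(chunk_id):
    if not network["online"]:
        raise ConnectionError("network is down")
    return remote[chunk_id]


image = LazyImage(remote.keys(), fetch)
first = image.read("c1")     # the container starts after fetching only what it needs
image.prefetch_all().join()  # the background download runs to completion
network["online"] = False
print(image.read("c2"))      # b'bb', served from the local cache while offline
```

A second `LazyImage` created while `network["online"]` is `False` would raise `ConnectionError` on its first read, which corresponds to the case in the answer above where data that has not yet been downloaded becomes inaccessible.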

In June 2020, the OCI community spent roughly a month discussing the shortcomings of the current OCI image specification and which requirements the OCIv2 image format needs to meet. OCIv2 is an improvement of the current OCI image specification, not a new image specification.

This discussion of the image format started with an email and a shared document, and it led to many online OCI community meetings. The conclusions were encouraging. The OCIv2 image format needs to meet the following requirements:

  • Less duplicate data
  • Recreatable image format
  • Clear file system metadata and less file system metadata
  • File system formats that can be mounted
  • Image content list
  • On-demand loading of image data
  • Scalability
  • Checkable and/or repairable
  • Less data to upload
  • Working on untrusted storage

You can find a detailed description of each requirement in this shared document. We participated in the whole discussion of the OCIv2 image format requirements and found that Nydus meets all of them well. This prompted us to make the Nydus project open source to provide a basis for community discussions.

This article was originally published on the CNCF WeChat official account, Dragonfly.
