By Cloud-Native SIG
When it comes to network downloads, the client/server (C/S) mode built on the TCP/IP protocol suite is probably the first thing that comes to mind. In this mode, each client establishes a TCP connection with the server, and the server listens for incoming connections and responds to each of them, as shown in the following figure:
At the end of the last century, application layer protocols such as HTTP and FTP were developed based on the C/S mode. However, the disadvantages of the C/S mode are clear: the server bears the entire load, so the download rate drops as the number of clients grows. As the Internet scales up and the demands on download size and download rate keep rising, these disadvantages are constantly magnified.
Against this background, the P2P download mode was proposed by combining the ideas of P2P networking and load balancing. In this mode, the server no longer bears all the download pressure. It is only responsible for passing the file metadata, and the actual file download connections are established between clients. At the same time, a file can be divided into multiple blocks, and different blocks of the same file can be downloaded from different clients. The file therefore flows dynamically through the P2P network, which significantly improves download efficiency, as shown in the following figure:
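To make the block idea concrete, here is a minimal Python sketch of splitting a file into blocks, spreading them across clients, and reassembling them after verification. The function names, the round-robin assignment, and the use of SHA-256 digests are assumptions made for illustration. They are not taken from any specific P2P protocol or from Dragonfly.

```python
import hashlib


def split_into_blocks(data, block_size):
    """Split a payload into fixed-size blocks, each carrying a digest for verification."""
    blocks = []
    for index, offset in enumerate(range(0, len(data), block_size)):
        chunk = data[offset:offset + block_size]
        blocks.append({
            "index": index,
            "data": chunk,
            "sha256": hashlib.sha256(chunk).hexdigest(),
        })
    return blocks


def assign_round_robin(blocks, clients):
    """Map each block index to the client that will serve it (illustrative policy only)."""
    return {b["index"]: clients[b["index"] % len(clients)] for b in blocks}


def reassemble(blocks):
    """Verify every block, then concatenate the blocks in index order."""
    out = bytearray()
    for b in sorted(blocks, key=lambda b: b["index"]):
        if hashlib.sha256(b["data"]).hexdigest() != b["sha256"]:
            raise ValueError(f"block {b['index']} failed verification")
        out.extend(b["data"])
    return bytes(out)
```

Because each block is verified independently, blocks can arrive from different clients in any order and still be reassembled into the original file.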
Decentralized P2P download (based on DHT technology) uses a distributed network to store and retrieve information. All information is stored as hash table entries scattered across the nodes, which together form one large distributed hash table spanning the whole network. This removes the dependency on a single central server: the hash table determines how the load is allocated, and the load of the entire network is spread evenly across multiple machines.
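The way a hash table spreads entries across nodes can be sketched with a simple consistent-hash ring. This is only an illustration of the principle, under the assumption that each key is owned by the first node clockwise from the key's hash. Real DHTs use their own routing schemes (Kademlia, for example, uses XOR distance), and the `HashRing` class and node names below are hypothetical.

```python
import bisect
import hashlib


def _hash(key):
    """Hash a string onto a large integer ring using SHA-1."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big")


class HashRing:
    """Toy consistent-hash ring: each key is owned by the next node clockwise."""

    def __init__(self, nodes):
        self._ring = sorted((_hash(n), n) for n in nodes)
        self._positions = [h for h, _ in self._ring]

    def node_for(self, key):
        # Find the first node position after the key's hash, wrapping around the ring.
        idx = bisect.bisect_right(self._positions, _hash(key)) % len(self._positions)
        return self._ring[idx][1]
```

A useful property of this layout is that when a node leaves, only the entries it owned move to another node. Entries owned by the remaining nodes stay where they are.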
Dragonfly is a P2P-based intelligent image and file distribution tool. It aims to improve the efficiency and rate of large-scale file transfer and maximize the use of network bandwidth. It is widely used in application distribution, cache distribution, log distribution, and image distribution.
Dragonfly combines the advantages of C/S architecture and P2P architecture. It provides a customer-oriented C/S download mode. At the same time, it provides a P2P back-to-source mode for server clusters. Unlike the traditional P2P technology, the peer-to-peer networks of Dragonfly are built inside Scheduler. The goal is to maximize P2P internal download efficiency, as shown in the following figure:
Dragonfly focuses on image distribution and file distribution, and it combines the ideas of a P2P network and a server cluster to provide users with stable and efficient download services. Dragonfly builds a P2P network inside the server side and assigns the server's host nodes one of four roles: Manager, Scheduler, Seed Peer, and Peer. Each role provides a different function.
The Manager provides the overall configuration function to pull the configurations of other roles and enable communications between the four roles. The Scheduler provides the download scheduling function. The scheduling result directly affects the download rate. Seed Peer is responsible for back-to-source downloads and pulls required images or files from external networks. As a server in the C/S architecture, Peer provides download functions to customers through various protocols. The architecture diagram is shown below:
In the architecture, Seed Peer supports downloads from external networks using multiple protocols. Seed Peer can also be used as a Peer in a cluster. Peer provides download services based on multiple protocols. It also provides a proxy service for image registry or other download tasks.
When multiple P2P clusters are deployed, Manager acts as the administrator and provides a frontend console for users to operate P2P clusters visually. Its main functions are dynamic configuration management, maintaining cluster stability, and maintaining the relationships between multiple P2P clusters. The Manager and each service use Keepalive to maintain overall cluster stability, so that abnormal instances are removed when an instance fails. Dynamic configuration management lets us operate the control units of each component from the Manager, such as the load of Peer and Seed Peer and the number of Parents scheduled by Scheduler. Manager also maintains the relationships between multiple P2P clusters. A complete P2P cluster consists of one Scheduler Cluster, one Seed Peer Cluster, and several Peers, and different P2P clusters can be network isolated. Usually, each data center has one P2P cluster, and one Manager manages multiple P2P clusters.
The main responsibility of Scheduler is to find the optimal parent peer for the current download peer and to trigger back-to-source downloads, normally in Seed Peer and, when appropriate, in Peer. When Scheduler starts, it registers with the Manager, initializes the dynamic configuration client, and pulls the dynamic configuration from the Manager. It then starts the services that Scheduler requires.
The core responsibility of Scheduler is to select the optimal Parent peer for the current download peer. Scheduler is Task-oriented, where a Task is one complete download task. The Task information and the DAG of the corresponding P2P download network are stored in Scheduler. Scheduling first filters out abnormal Parent peers along multiple dimensions. For example, to decide whether a peer is a bad node, Scheduler assumes that the response duration of each peer follows a normal distribution. If the current response duration of a peer falls outside the 6σ range, the peer is considered a bad node and is eliminated. The remaining candidate Parent peers are then scored based on historical download feature values, and the group of Parent peers with the highest scores is returned to the current download peer.
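The filter-then-score flow can be sketched in a few lines of Python. This is a simplified illustration, not Dragonfly's actual code. It assumes that the 6σ range means the interval of total width 6σ centered on the mean (mean ± 3σ), that only unusually slow responses count as bad, and that each candidate already carries a precomputed `score`. The function names and the dictionary fields are hypothetical.

```python
from statistics import mean, pstdev


def is_bad_node(history_ms, current_ms, sigmas=3.0):
    """Return True if the current response duration exceeds mean + sigmas * stddev."""
    if len(history_ms) < 2:
        return False  # too few samples to estimate a distribution
    mu = mean(history_ms)
    sigma = pstdev(history_ms)
    return current_ms > mu + sigmas * sigma


def pick_parents(candidates, limit):
    """Drop bad nodes, then return the ids of the highest-scoring candidates."""
    healthy = [
        c for c in candidates
        if not is_bad_node(c["history_ms"], c["current_ms"])
    ]
    healthy.sort(key=lambda c: c["score"], reverse=True)
    return [c["id"] for c in healthy[:limit]]
```

Note that filtering happens before scoring, so a peer with an excellent historical score is still excluded if its latest response duration is an outlier.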
Seed Peer and Peer share many similarities. They are both based on Dfdaemon, but the difference is that Seed Peer adopts the Seed Peer mode and supports active back-to-source downloads. As a server in the C/S architecture, Peer adopts the Peer mode, provides users with the download function, and supports the back-to-source download passively triggered by Scheduler. This shows that the relationship between Peer and Seed Peer is not fixed. A Peer can make itself a Seed Peer by back-to-source operations. Seed Peer can also change itself to Peer by changing the running status. Scheduler will dynamically change the corresponding DAG. In addition, both Seed Peer and Peer need to participate in the scheduling download process. Scheduler may select Seed Peer or Peer as the Parent peer to provide download functions to other Peers.
Dfcache is Dragonfly's cache client that communicates with dfdaemon and operates on files in the P2P network. The P2P network here acts as a cache system, and the corresponding Task and DAG can be stored in the Scheduler.
Dfstore is Dragonfly's storage client that relies on different types of object storage services as backends to provide stable storage solutions. It supports S3 and OSS. Dfstore can enable fast write and read by relying on the backend OSS with the acceleration features of P2P. At the same time, the back-to-source and cross-data center traffic can also be saved, reducing the pressure on the origin server.
Dragonfly automatically isolates abnormal peers to improve download stability. Each component in Dragonfly contacts Manager through Keepalive. The Manager can ensure that the Scheduler address returned to Peer and the Seed Peer address returned to Scheduler are available. The unavailable Scheduler and Seed Peer addresses will not be pushed by the Manager to the Peer or Scheduler that needs to perform download tasks, achieving the purpose of isolating abnormal peers. This is also the exception isolation in the instance dimension, as shown in the following figure:
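Instance-level isolation of this kind can be illustrated with a small keepalive registry. The sketch below is hypothetical: the `KeepaliveRegistry` class, the timeout value, and the explicit `now` argument (used here so the example is deterministic) are assumptions and do not describe the Manager's real interface.

```python
class KeepaliveRegistry:
    """Tracks the last heartbeat of each instance and hides instances that went silent."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self._last_seen = {}  # address -> time of the most recent heartbeat

    def heartbeat(self, address, now):
        """Record that the instance at `address` was alive at time `now`."""
        self._last_seen[address] = now

    def available(self, now):
        """Return only the addresses whose last heartbeat is within the timeout."""
        return sorted(
            address
            for address, seen in self._last_seen.items()
            if now - seen <= self.timeout_s
        )
```

An instance that stops sending heartbeats simply disappears from the list handed out to Peers and Schedulers, and it reappears as soon as its heartbeats resume.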
In addition, Dragonfly executes the scheduling with Tasks as the units, which ensures the stability of the entire scheduling process. After receiving a new Task scheduling request, the Scheduler triggers the Seed Peer to perform back-to-source downloads. After receiving a scheduling request from an existing Task, the Scheduler dispatches the optimal Parent Peer set and returns it to the Peer. This logic ensures that Dragonfly can process a Task regardless of whether it has been downloaded. In addition, during Scheduler scheduling, Peers with long response duration are considered abnormal peers and will not be returned as Parent Peers. This is also called exception isolation in the Task dimension.
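The Task-oriented branching described above can be sketched as follows. This is a toy in-memory model, not Dragonfly's scheduler: the `TaskScheduler` class, the action strings, and the single hard-coded seed peer id are hypothetical, and the abnormal-peer filtering is left out to keep the example short.

```python
class TaskScheduler:
    """Toy Task-oriented scheduler that remembers which peers hold each Task."""

    def __init__(self, seed_peer_id="seed-peer"):
        self.seed_peer_id = seed_peer_id
        self.task_peers = {}  # task_id -> peers known to hold the Task, in join order

    def schedule(self, task_id, peer_id):
        if task_id not in self.task_peers:
            # New Task: the seed peer goes back to source and becomes the first parent.
            self.task_peers[task_id] = [self.seed_peer_id]
            action = "trigger_seed_peer_back_to_source"
        else:
            # Existing Task: reuse the peers that already hold it.
            action = "reuse_existing_task"
        parents = [p for p in self.task_peers[task_id] if p != peer_id]
        if peer_id not in self.task_peers[task_id]:
            self.task_peers[task_id].append(peer_id)
        return action, parents
```

Either branch ends with a usable set of parents, which is why a request can be served whether or not the Task has been downloaded before.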
Dragonfly uses P2P to perform internal back-to-source on the server. The P2P download enables load allocation and minimizes the load on each server node. The following details ensure the efficiency of Dragonfly download:
Dragonfly provides multiple deployment methods: Helm Charts, Docker Compose, Docker images, and binaries. Users can run a quick one-click deployment for a simple POC or carry out a large-scale production deployment based on Helm Charts. Dragonfly's services expose complete metrics, and ready-made Grafana templates are provided so that users can observe P2P traffic trends.
Dragonfly is a standard solution in the CNCF image acceleration field. Together with its subproject Nydus, which performs on-demand loading, it can maximize image download speed. In the future, we will continue to work hard to build the ecosystem in the image acceleration field. Thank you to everyone who has participated in building the community. We hope more people interested in image acceleration or P2P will join our community!
OpenAnolis Cloud-native SIG Address:
Developer Group Email: