An Analysis of the Principles of Docker
Docker has become synonymous with container technology, but Docker is really just a packaging of Linux container technology. Why did container technology appear, and what problems does it solve? Why did Docker then appear, and what problems does Docker solve? What is Docker's core principle? In this article, we analyze how Docker emerged and how it works.
1. Why Docker
1. The emergence of virtualization technology
In software engineering, environment configuration is a perennial problem. Every computer's environment is different. Developers often say: "It works on my machine." But in reality, the same software may well fail on a different computer. Ensuring that software runs as expected across different machines is a problem every developer must consider.
Why does such a problem occur?
Software engineering is a delicate business. A program relies on the coordination of many parts of the computer system, from the operating system version to the compilation environment, the database version, and so on. Different versions of the same software can be incompatible with one another, and even the same version with different configurations may fail to run correctly.
Is there a way around that?
The solution to this problem is virtualization, and one implementation of virtualization is the virtual machine. A virtual machine can simulate another computer's complete system and environment configuration on a single physical machine: the required versions of software can be installed in, say, a Linux guest, and from the program's point of view it is running on a real Linux operating system. The program cannot tell the difference, so it runs normally inside the virtual machine. To the host, the virtual machine is just a file: it can be deleted outright, or packaged up and migrated. Virtual machine technology thus solves the "it doesn't work on your machine" problem.
2. The problem of virtualization technology
It seemed that the virtual machine perfectly solved the problem of the program's runtime environment, and more and more people used virtual machines to install and debug programs. But at the same time, more and more problems emerged.
Problem 1: Low resource utilization
The virtual machine solves the environment problem, but it runs a complete operating system, and an operating system itself needs substantial resources to run. A full operating system includes user management, bundled software, and so on, most of which the program does not need at all.
For example, as shown in the figure, a machine with 16 cores and 32 GB of memory can run only 3 virtual machines of 4 cores and 8 GB each, because the host operating system itself needs some resources. If each virtual machine runs 3 program processes of 1 core and 2 GB each, the machine runs only 9 processes in total. Run directly on the host operating system, the same machine could easily run more than 12 such processes. Virtual machines solve the environment problem, but resource utilization is very low.
Problem 2: The program starts slowly
Since a virtual machine runs a complete operating system, booting that operating system always takes considerable time, and some system-level boot steps cannot be skipped. This makes running a program in a virtual machine very wasteful: the program itself often starts in far less time than the guest operating system does. The redundant, lengthy startup can even become a bottleneck during periods of rapid iteration.
3. Linux Container
As these problems became more and more prominent, another virtualization technology was developed for Linux: Linux Containers (LXC).
Linux containers share the host's operating system kernel while isolating application processes from the rest of the system. A container ensures that the application has the libraries, dependencies, and files it needs to run; it is equivalent to wrapping a shell around the program, so that the program believes it is running on a complete operating system.
Solving problem 1: load only the environment the program needs, and share the underlying system
A Linux container does not need to boot a complete operating system the way a virtual machine does; it contains only the libraries the program depends on, and multiple containers share the underlying operating system. Containers therefore occupy few resources and are small in size, while still packaging the program's dependent environment, which solves the environment-configuration problem. Multiple programs can each run in their own container as if on a complete operating system, and the containers are isolated from one another.
Solving problem 2: start only a process, not an operating system
To the operating system, a Linux container is just a process. The program runs inside the container, so starting the container is simply starting the program; none of the redundant boot steps of a full operating system are needed, and startup is fast.
2. What is Docker
Docker is a package of Linux containers that provides an easy-to-use container interface.
Docker is the world's leading software container platform.
Docker is implemented in Go, the language created at Google. It builds on Linux kernel features such as cgroups, namespaces, and union filesystems like AUFS (cgroups and namespaces are explained in section 3 of this article) to encapsulate and isolate processes; it is an operating-system-level virtualization technology. Because an isolated process is independent of the host and of other isolated processes, it is called a container. Docker's original implementation was based on LXC.
Docker automates repetitive tasks like setting up and configuring a development environment, freeing developers to focus on what really matters: building great software.
Users can easily create and use containers, and put their own applications into containers. Containers can also be versioned, copied, shared, and modified, just like normal code.
2. Features of Docker
Lightweight: multiple Docker containers running on one machine share that machine's operating system kernel; they start quickly and require very little compute and memory. Images are constructed from filesystem layers and share common files, which minimizes disk usage and speeds up image downloads.
Standard: Docker containers are based on open standards and run on all major Linux distributions, on Microsoft Windows, and on any infrastructure, including virtual machines, bare-metal servers, and clouds.
Secure: Docker isolates applications not only from one another but also from the underlying infrastructure. Docker provides strong isolation by default, so a problem in one application is confined to a single container rather than affecting the entire machine.
3. Why use Docker
A consistent runtime environment: a Docker image provides a complete runtime environment (everything except the kernel), ensuring consistency of the application's environment, so problems like "this code works fine on my machine" no longer occur.
Faster startup: containers can start in seconds or even milliseconds, greatly saving development, testing, and deployment time.
Isolation: on a shared server, an application's resources can be affected by other users; containers keep each application's resources isolated.
Elastic scaling and rapid expansion: containers are well suited to handling concentrated bursts of server load.
Easy migration: an application running on one platform can be migrated to another easily, without worrying that a change in the runtime environment will break it.
Continuous delivery and deployment: by customizing application images, Docker enables continuous integration, continuous delivery, and deployment.
3. The core principle of Docker: how a program is made to appear to have exclusive access to system resources
There is only one operating system, so how can the resources that processes need be partitioned and handed to different processes in isolation? "Resources" here means the basic resources a process needs to run: CPU, memory, disk, and network. Linux provides a feature called cgroups (control groups) that can isolate and limit these resources.
Running a program requires not only basic resources but also many facilities of the system kernel, such as process IDs and user groups. In Linux, namespaces partition these kernel resources, and the partitioning makes each process appear to have its own complete, global copy of them.
Supplementary note: cgroups and namespaces are not unique to Docker. They can fairly be called the cornerstones of containerization, and understanding them is the best way to understand how containerization is implemented.
1. Cgroups: Restrict resources
Cgroups is a mechanism provided by the Linux kernel for limiting the resources used by a single process or a group of processes, with fine-grained control over resources such as CPU and memory. As shown in the figure, cgroups are hierarchical: they form a tree-structured resource-limit hierarchy, and a subsystem (resource controller) must be attached to a cgroup hierarchy to enforce limits on its resource. Let us briefly look at the definition of the cgroup structure in the kernel source.
Subsystems can be attached to hierarchies, and each subsystem attached to a hierarchy has its own algorithms and statistics for limiting the resources used by the process group. The subsys field exists to give each subsystem a place to store the state that limits and accounts for the group's resource usage: it is an array, and each element represents one subsystem's statistics.
From the implementation point of view, a cgroup merely organizes processes into a control group; it is each subsystem that actually restricts resource usage.
Next, let us look at how the kernel associates processes with the cgroup hierarchy.
After a node (a cgroup structure) is created in a cgroup hierarchy, a process can be added to that node's control task list. All processes on a node's control list are subject to the current node's resource limits. A process can also be added to nodes in different cgroup hierarchies, because different hierarchies can be responsible for different system resources.
Since a process can be added to multiple cgroups at once (provided those cgroups belong to different hierarchies) for resource control, and those cgroups have different resource-control subsystems attached, a structure is needed to gather the subsystems' resource-control state in one place, so that the process can quickly find the state for a given subsystem by its subsystem ID. The css_set structure does exactly this. Let us take a brief look at its definition to better understand how css_set works.
Knowledge Base Team