A Beginner's Guide to Kubernetes

By Alex Mungai Muchiri, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud's incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.

Kubernetes was developed by Google as a solution for managing containerized applications in clustered environments. It is particularly useful when you need to manage related components that are distributed across varied infrastructure.

An Overview of Kubernetes

In a nutshell, Kubernetes runs and coordinates applications across machine clusters in a predictable, scalable, and highly available manner. It allows you to determine your applications' behavior and interactions with the outside world. Specifically, it allows you to:

Scale services
Make seamless updates
Manage traffic to application versions
Roll back faulty deployments

Kubernetes provides interfaces and composable platform primitives that let you define and manage your applications flexibly and reliably.

Kubernetes Architecture

Kubernetes uses a layered architecture that abstracts away the complexity beneath the user level. At the base level, a shared network lets the machines in a cluster communicate with each other. Components, capabilities, and workloads are configured at the cluster level.

Each machine in a Kubernetes cluster is assigned a role. One server, or a small group of machines, acts as the master server. The master server is both the gateway to the cluster and the manager of its other machines: it monitors their health and balances the workload across them. It also exposes the APIs through which clients access the cluster, and it coordinates communication between components.

Apart from the master server, the cluster includes nodes that accept and run workloads using local or external resources. Kubernetes runs workloads in containers, managed by a container runtime such as Docker, to isolate workload instances from one another. When nodes receive instructions from the master server, they create or destroy containers accordingly. To run an application, a client submits a plan in JSON or YAML to the master server, which then decides how and where the application runs within the cluster.

Components of the Master Server

As noted above, the master server acts as the primary control point for the cluster. It includes components that accept requests, schedule workloads, authenticate users, control networking, and manage health and scaling. The master can be a single machine or a distributed group of machines. Below are the main components:

etcd

etcd is a lightweight, distributed key-value store that can span multiple nodes. Kubernetes uses it to store configuration data, where it can be read by the rest of the cluster. This data allows components to reconfigure themselves and to discover new services, and it helps maintain cluster state through features such as distributed locking and leader election. etcd provides a simple HTTP/JSON API and can run on a single machine or be distributed across a cluster of machines.
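The idea of a shared configuration store that components read from and watch for changes can be sketched with a toy in-memory model. This is an illustration only: the class `ToyKeyValueStore`, its methods, and the keys used are hypothetical and are not the etcd API.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class ToyKeyValueStore:
    """In-memory stand-in for a configuration store (hypothetical, not the etcd API)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: Dict[str, List[Callable[[str, str], None]]] = defaultdict(list)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value
        # Notify every watcher whose prefix matches the changed key.
        for prefix, callbacks in self._watchers.items():
            if key.startswith(prefix):
                for callback in callbacks:
                    callback(key, value)

    def get_prefix(self, prefix: str) -> Dict[str, str]:
        # Return all entries whose key starts with the given prefix.
        return {k: v for k, v in self._data.items() if k.startswith(prefix)}

    def watch(self, prefix: str, callback: Callable[[str, str], None]) -> None:
        self._watchers[prefix].append(callback)


store = ToyKeyValueStore()
seen = []
store.watch("/services/", lambda key, value: seen.append((key, value)))
store.put("/services/web", "10.0.0.5:80")  # triggers the watcher
store.put("/nodes/node-1", "ready")        # does not match the watched prefix
print(seen)
```

A component that watches a prefix learns about new services without polling, which is the discovery behavior described above.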

kube-apiserver

The API server is the main management point of the cluster: it lets users configure workloads and organizational units in Kubernetes. It also keeps the etcd store consistent with the service details of deployed containers, and it acts as the bridge between components. The API server exposes a RESTful interface; kubectl is the Kubernetes client that connects to it from a local machine.
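To make the RESTful interface more concrete, the sketch below builds the URL path for a namespaced resource. The path layout follows the Kubernetes API convention (the core group under `/api`, named groups under `/apis`); the helper `resource_path` itself is hypothetical and not part of any client library.

```python
def resource_path(group: str, version: str, namespace: str, plural: str, name: str = "") -> str:
    """Build the REST path for a namespaced resource (hypothetical helper)."""
    # The core group (empty string) lives under /api; named groups live under /apis.
    base = f"/api/{version}" if group == "" else f"/apis/{group}/{version}"
    path = f"{base}/namespaces/{namespace}/{plural}"
    return f"{path}/{name}" if name else path


print(resource_path("", "v1", "default", "pods"))
print(resource_path("apps", "v1", "default", "deployments", "web"))
```

A client such as kubectl issues HTTP requests against paths of this shape, for example a GET to list pods or a POST to create one.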

kube-controller-manager

The controller manager runs the controllers that regulate workload life cycles, perform routine procedures, and keep the cluster in its desired state. It watches the cluster state stored in etcd and, when it detects a change, carries out the steps needed to act on it, such as scaling an application or updating endpoints.
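The behavior described here is often summarized as a reconciliation loop: compare the desired state with the observed state and act on the difference. The following is a minimal sketch of one such comparison, assuming state is reduced to a replica count; the function `reconcile` and the pod names are hypothetical.

```python
from typing import List


def reconcile(desired_replicas: int, running_pods: List[str]) -> List[str]:
    """Return the actions that move the observed state toward the desired state."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods are running: create the missing ones.
        return ["create"] * diff
    if diff < 0:
        # Too many pods are running: delete the surplus (the last ones listed).
        return [f"delete {pod}" for pod in running_pods[diff:]]
    return []  # Observed state already matches desired state.


print(reconcile(3, ["web-a"]))
print(reconcile(1, ["web-a", "web-b", "web-c"]))
```

A real controller runs this kind of comparison continuously, so the cluster converges on the desired state even after failures.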

kube-scheduler

The scheduler assigns workloads to nodes in the cluster. To do so, it needs each workload's operating requirements and an up-to-date view of the cluster state, including the capacity available on each node. It then distributes workloads across the hosts so that no node is scheduled beyond its capacity.
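A toy function can illustrate the two ideas in this paragraph: rule out nodes that lack capacity, then choose among the rest. This is a simplified sketch and not the kube-scheduler algorithm; the function `pick_node`, the node names, and the CPU figures are hypothetical.

```python
from typing import Dict, Optional


def pick_node(cpu_request: float, free_cpu: Dict[str, float]) -> Optional[str]:
    """Pick a node for a workload, or None if no node has enough free CPU."""
    # Filter: keep only nodes with enough free CPU for the request.
    feasible = {node: cpu for node, cpu in free_cpu.items() if cpu >= cpu_request}
    if not feasible:
        return None
    # Score: prefer the node with the most free CPU, which spreads load.
    return max(feasible, key=feasible.get)


nodes = {"node-a": 0.5, "node-b": 2.0, "node-c": 1.0}
print(pick_node(1.0, nodes))
print(pick_node(4.0, nodes))
```

Returning `None` models a workload that stays unscheduled until capacity becomes available.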

cloud-controller-manager

The cloud controller manager governs how Kubernetes interacts with different infrastructure providers. Kubernetes can be deployed in many environments, and the cloud controller manager lets it keep a generic internal structure while still accommodating each provider's capabilities, features, and API conventions. That way, Kubernetes can continually gather state information from the cloud provider and adjust resource provisioning to satisfy workload requirements. It is also how provider-specific resources, such as attachable storage and load balancers, are mapped to their generic Kubernetes representations.

Components of the Node Server

As mentioned previously, Kubernetes nodes are those servers that run containers. Nodes communicate with the master server, run workloads and configure container networking. Below are the major components:

Container Runtime

Every node must have a container runtime, such as Docker, rkt, or runc. The container runtime starts and manages containers, which run applications in an isolated operating environment. A workload can run in a single container or across multiple containers.

kubelet

The kubelet is a service that runs on each node and acts as the node's contact point with the rest of the cluster. It exchanges information and instructions with the master server and, through the API server, reads configuration details from the etcd store or writes new values to it. It is responsible for maintaining the state of work on the node, launching and destroying containers as needed.

kube-proxy

kube-proxy is a small proxy service that runs on each node. It manages subnetting on the host and makes services available to other components. It keeps the network environment predictable and accessible, and it forwards requests to the correct containers.

Objects and Workloads on Kubernetes

The Kubernetes object model provides primitives that users combine to define workloads. Below are the main objects, each of which adds a layer of abstraction over the container interface:

Pods

A pod encapsulates one or more containers and is the basic unit that Kubernetes works with. Kubernetes manages the containers in a pod as a single application: they share a life cycle and are always scheduled together. They are managed as one unit and share an environment, volumes, and IP address space. A pod typically has a main container that fulfills the general purpose of the workload and, optionally, helper containers for related tasks.
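As an illustration, a pod with a main container and one helper container that share a volume might be described as follows. The manifest is built as a Python dictionary and printed as JSON, one of the plan formats Kubernetes accepts. The names, images, and paths are hypothetical choices made for this sketch.

```python
import json

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web", "labels": {"app": "web"}},
    "spec": {
        # An emptyDir volume lives as long as the pod and is shared by its containers.
        "volumes": [{"name": "shared-logs", "emptyDir": {}}],
        "containers": [
            {
                # Main container: serves the workload.
                "name": "web",
                "image": "nginx:1.25",
                "ports": [{"containerPort": 80}],
                "volumeMounts": [{"name": "shared-logs", "mountPath": "/var/log/nginx"}],
            },
            {
                # Helper container: follows the log file written by the main container.
                "name": "log-helper",
                "image": "busybox:1.36",
                "command": ["sh", "-c", "tail -F /logs/access.log"],
                "volumeMounts": [{"name": "shared-logs", "mountPath": "/logs"}],
            },
        ],
    },
}

print(json.dumps(pod, indent=2))
```

Both containers mount the same volume at different paths, which is how tightly coupled containers in one pod exchange files.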

Replication Controllers and Replica Sets

Replication controllers define pod templates and control the horizontal scaling of pod replicas. Working with groups of identical pods, rather than single pods, is preferred in Kubernetes because it simplifies workload distribution. The replication controller uses its template to create new pods so that the number of running pods matches the configured count: it starts new pods when existing pods fail, and it starts or kills pods when the configured number of replicas changes.

Replica sets are a newer, more flexible iteration on replication controllers. They are increasingly popular because they offer more flexible ways to select the pods they manage. However, they cannot perform rolling updates on their own and are therefore best used inside higher-level controllers.

Deployments

Deployments make it easier to manage the life cycles of replicated pods. They are higher-level objects built on replica sets to offer better update rollouts. When you modify a deployment's configuration, Kubernetes adjusts the underlying replica sets, manages the transition between application versions, and keeps track of rollout history. Without deployments, users carry much more of that responsibility themselves, such as creating plans for new versions, tracking history, and recovering from failures. Deployments are therefore the workload object you are most likely to work with in Kubernetes.
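A deployment manifest might look like the sketch below, again written as a Python dictionary and printed as JSON. The field names follow the `apps/v1` Deployment schema; the application name, image, replica count, and rollout limits are hypothetical values chosen for illustration.

```python
import json

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 3,
        # The selector must match the labels in the pod template below.
        "selector": {"matchLabels": {"app": "web"}},
        "strategy": {
            "type": "RollingUpdate",
            # Replace pods gradually: at most one extra and one missing at a time.
            "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 1},
        },
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {"containers": [{"name": "web", "image": "nginx:1.25"}]},
        },
    },
}

print(json.dumps(deployment, indent=2))
```

Changing the image in the template and resubmitting the manifest is what triggers a rollout to the new version.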

Stateful Sets

Stateful sets are pod controllers that guarantee ordering and uniqueness. Use them when you need finer control, such as a specific deployment order, stable network identities, or persistent data. A common use is database applications, where they allow data to persist and remain accessible even when a pod is rescheduled to a different node.
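The ordering and uniqueness guarantees can be pictured with a small sketch. Pods in a stateful set are named after the set plus an ordinal index and, by default, are started in ascending order and removed in descending order. The helper `stateful_pod_names` and the set name `db` are hypothetical.

```python
from typing import List


def stateful_pod_names(set_name: str, replicas: int) -> List[str]:
    """Return the stable pod names for a stateful set, in startup order."""
    return [f"{set_name}-{ordinal}" for ordinal in range(replicas)]


startup_order = stateful_pod_names("db", 3)
# Scale-down and deletion proceed from the highest ordinal to the lowest.
shutdown_order = list(reversed(startup_order))

print(startup_order)
print(shutdown_order)
```

Because a replacement pod reuses the same name, other components can keep addressing it by a stable identity.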

Daemon Sets

A daemon set is a pod controller that runs a copy of a pod on each node in the cluster, or on a selected subset of nodes, typically to perform maintenance tasks and provide services to the nodes themselves. Daemon sets are commonly used for log collection and forwarding, metrics aggregation, and other processes that extend the capabilities of a node. Because daemon sets usually provide fundamental services on each node, they can bypass some of the scheduling restrictions that apply to other controllers. For example, they can run pods on the master server even when it is configured to be unavailable for normal workloads.
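A daemon set for a log collection agent might be declared as in the sketch below. The toleration shown is the commonly documented way to let daemon pods run on control-plane (master) nodes that carry a `NoSchedule` taint. The agent name and the image `example.com/log-agent:1.0` are hypothetical placeholders.

```python
import json

daemon_set = {
    "apiVersion": "apps/v1",
    "kind": "DaemonSet",
    "metadata": {"name": "log-agent"},
    "spec": {
        "selector": {"matchLabels": {"name": "log-agent"}},
        "template": {
            "metadata": {"labels": {"name": "log-agent"}},
            "spec": {
                # Tolerate the control-plane taint so the agent also runs there.
                "tolerations": [
                    {
                        "key": "node-role.kubernetes.io/control-plane",
                        "operator": "Exists",
                        "effect": "NoSchedule",
                    }
                ],
                # Hypothetical image used only for illustration.
                "containers": [{"name": "log-agent", "image": "example.com/log-agent:1.0"}],
            },
        },
    },
}

print(json.dumps(daemon_set, indent=2))
```

Note that a daemon set has no replica count: the number of pods follows the number of matching nodes.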

Jobs and Cron Jobs

In Kubernetes, containers that are expected to exit after completing their task are run as jobs. Jobs are useful for batch processing and one-off tasks, as opposed to continuously running services. Cron jobs build on jobs by adding a scheduling component. You can use them to run a job at a specific time in the future or on a recurring schedule.
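A recurring task could be declared with a cron job manifest such as the following sketch. The schedule uses standard five-field cron syntax, here 02:00 every day. The job name, image, and command are hypothetical.

```python
import json

cron_job = {
    "apiVersion": "batch/v1",
    "kind": "CronJob",
    "metadata": {"name": "nightly-report"},
    "spec": {
        # minute hour day-of-month month day-of-week
        "schedule": "0 2 * * *",
        "jobTemplate": {
            "spec": {
                "template": {
                    "spec": {
                        # Job pods must not restart forever; retry only on failure.
                        "restartPolicy": "OnFailure",
                        "containers": [
                            {
                                "name": "report",
                                "image": "busybox:1.36",
                                "command": ["sh", "-c", "echo generating report"],
                            }
                        ],
                    }
                }
            }
        },
    },
}

print(json.dumps(cron_job, indent=2))
```

The `jobTemplate` is an ordinary job specification, so the same inner structure can be used for a one-off job.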

Additional Components

Kubernetes includes several other abstractions that help applications run efficiently, control networking, and persist data.

Services

Kubernetes uses a component called a service for internal load balancing and for linking pods together. A service groups pods that perform the same function and presents them as a single entity. This makes it easy to deploy, track, and route traffic to a particular type of backend container. A service keeps a stable endpoint while the backend work units behind it are replaced as needed. For instance, the service's IP address does not change even when the pods it routes to change. A simple use case is giving web servers that need to store and retrieve data access to database pods.

By default, a service is reachable only through an IP address that is routable inside the cluster. To make a service accessible from outside the cluster, you need a service of type NodePort or LoadBalancer. A NodePort service opens a static port on the external network interface of every node. Traffic arriving at that port is automatically routed to the service through its internal cluster IP.

A LoadBalancer service uses the cloud provider's Kubernetes load balancer integration to provision an external load balancer that routes traffic to the service.
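A NodePort service might be declared as in the sketch below. The selector ties the service to every pod labeled `app: web`, and the node port falls in the default range of 30000 to 32767. The service name, label, and port numbers are hypothetical.

```python
import json

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web"},
    "spec": {
        "type": "NodePort",
        # Traffic is routed to every pod carrying this label.
        "selector": {"app": "web"},
        "ports": [
            {
                "port": 80,         # port on the service's internal cluster IP
                "targetPort": 80,   # port on the backing pods
                "nodePort": 30080,  # static port opened on every node
            }
        ],
    },
}

print(json.dumps(service, indent=2))
```

Changing `type` to `LoadBalancer` asks the cloud provider integration for an external load balancer instead.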

Volumes and Persistent Volumes

Sharing data between containers and keeping it available across container restarts is a challenge in containerized environments. Kubernetes addresses it with the volume abstraction: all containers in a pod can share a volume, and its data remains available until the pod is terminated. Tightly coupled containers in a pod therefore do not need complex external mechanisms to share data, and the failure of one container does not affect the availability of the data. However, the volume is destroyed when the pod is terminated, so it is not suitable for data that must outlive the pod.

Persistent volumes enable data persistence beyond a pod's life cycle. Administrators use them to configure storage for the cluster, and a persistent volume is retained until it is no longer required.
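From the user's side, persistent storage is typically requested with a persistent volume claim, which a pod then references by name. The sketch below shows both pieces as Python dictionaries. The claim name, size, and volume name are hypothetical.

```python
import json

claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "db-data"},
    "spec": {
        # ReadWriteOnce: the volume can be mounted read-write by a single node.
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}

# Entry for a pod's spec.volumes list that refers to the claim by name.
pod_volume = {"name": "data", "persistentVolumeClaim": {"claimName": "db-data"}}

print(json.dumps(claim, indent=2))
print(json.dumps(pod_volume, indent=2))
```

Because the claim exists independently of any pod, the data it is bound to survives when the pod that mounted it is deleted.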

Labels and Annotations

Labels in Kubernetes mark objects as part of a group. Objects identified by a semantic tag, or label, are then easier to manage and route. For instance, controllers use labels to identify the pods they operate on.

Annotations work much like labels in that they attach arbitrary key-value pairs to objects. The difference is that labels carry semantic information used to select objects, such as pods, whereas annotations add free-form, unstructured metadata that is not used for selection.
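Label-based selection can be sketched in a few lines. The function below implements simple equality matching, where an object is selected if it carries every key-value pair in the selector. The function `matches`, the pod names, and the labels are hypothetical, and annotations are deliberately ignored because they play no part in selection.

```python
from typing import Dict


def matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """Return True if the labels contain every key-value pair in the selector."""
    return all(labels.get(key) == value for key, value in selector.items())


pods = {
    "web-1": {"app": "web", "tier": "frontend"},
    "web-2": {"app": "web", "tier": "frontend"},
    "db-1": {"app": "db", "tier": "backend"},
}

selector = {"app": "web"}
selected = sorted(name for name, labels in pods.items() if matches(selector, labels))
print(selected)
```

Controllers and services rely on this kind of matching to decide which pods belong to them.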

Conclusion

In conclusion, Kubernetes lets you run scalable, highly available containerized workloads on an abstracted platform. Its architecture is powerful, flexible, and highly scalable. Understanding the inner workings of Kubernetes helps you run and manage workloads at scale and make full use of the platform's capabilities.

To learn more about Kubernetes on Alibaba Cloud, visit www.alibabacloud.com/product/kubernetes
