This topic describes how to fairly allocate resources to tenants in a shared cluster to prevent malicious tenants from attacking other tenants.

Background information

Multi-tenancy is classified into soft multi-tenancy and hard multi-tenancy based on the level of isolation.
  • Soft multi-tenancy is suitable for trusted tenants. For example, soft multi-tenancy can separate a Kubernetes cluster among multiple departments of an enterprise. In this scenario, the intention of multi-tenancy is to protect the workloads of each department and prevent potential threats.
  • Hard multi-tenancy is suitable for untrusted tenants. For example, a service provider may need to provide infrastructure resources to different organizations. Hard multi-tenancy enforces stricter tenant isolation by preventing a tenant from accessing other tenants or the Kubernetes system.

Soft multi-tenancy

You can use the following Kubernetes-native resources to implement soft multi-tenancy: namespaces, roles, role bindings, and network policies. This enables logical isolation among tenants. For example, you can use role-based access control (RBAC) to prevent tenants from accessing or manipulating the resources of other tenants. You can use quotas and limit ranges to manage the amount of cluster resources that can be consumed by each tenant. You can also use network policies to prevent applications that are deployed in different namespaces from communicating with each other.
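As a rough illustration of the RBAC part of this setup, the following Python sketch builds a namespaced Role and RoleBinding as plain dictionaries and prints them as JSON, which `kubectl apply -f -` accepts. The namespace `tenant-a`, the group `tenant-a-developers`, and the role name `tenant-editor` are hypothetical names chosen for the example; the resource and verb lists are assumptions that you would adapt to your own tenants.

```python
import json

RBAC_GROUP = "rbac.authorization.k8s.io"


def tenant_rbac(namespace, group):
    """Build a Role and a RoleBinding that confine a tenant group to one namespace."""
    role = {
        "apiVersion": RBAC_GROUP + "/v1",
        "kind": "Role",
        "metadata": {"name": "tenant-editor", "namespace": namespace},
        "rules": [
            {
                # "" is the core API group (pods, services); "apps" and
                # "batch" cover Deployments and Jobs.
                "apiGroups": ["", "apps", "batch"],
                "resources": ["pods", "services", "deployments", "jobs"],
                "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"],
            }
        ],
    }
    binding = {
        "apiVersion": RBAC_GROUP + "/v1",
        "kind": "RoleBinding",
        "metadata": {"name": "tenant-editor-binding", "namespace": namespace},
        # Hypothetical group name; it must match what your identity provider reports.
        "subjects": [{"kind": "Group", "name": group, "apiGroup": RBAC_GROUP}],
        "roleRef": {"kind": "Role", "name": "tenant-editor", "apiGroup": RBAC_GROUP},
    }
    return [role, binding]


if __name__ == "__main__":
    manifests = {
        "apiVersion": "v1",
        "kind": "List",
        "items": tenant_rbac("tenant-a", "tenant-a-developers"),
    }
    print(json.dumps(manifests, indent=2))
```

Because both objects are namespaced, the permissions granted here do not extend to the resources of any other tenant's namespace.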

However, these controls do not prevent pods of different tenants from sharing a node. If you require stronger isolation, you can use node selectors, anti-affinity rules, taints, and tolerations to force pods of different tenants onto separate nodes. These nodes are known as sole tenant nodes. The complexity and cost of sole tenant nodes increase significantly when a cluster is shared by a large number of tenants.
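The node selector and toleration approach could look like the following sketch, which assumes that the tenant's nodes have already been labeled `tenant=<name>` and tainted `tenant=<name>:NoSchedule`. The label and taint key `tenant`, the pod name, and the image are illustrative assumptions, not values defined by this topic.

```python
import json


def sole_tenant_pod(name, namespace, tenant, image):
    """Build a Pod manifest that can only land on nodes reserved for one tenant."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Restricts scheduling to nodes that carry the label tenant=<tenant>.
            "nodeSelector": {"tenant": tenant},
            # Tolerates the taint tenant=<tenant>:NoSchedule. Pods of other
            # tenants lack this toleration and are kept off these nodes.
            "tolerations": [
                {
                    "key": "tenant",
                    "operator": "Equal",
                    "value": tenant,
                    "effect": "NoSchedule",
                }
            ],
            "containers": [{"name": "app", "image": image}],
        },
    }


if __name__ == "__main__":
    pod = sole_tenant_pod("web", "tenant-a", "tenant-a", "nginx:1.25")
    print(json.dumps(pod, indent=2))
```

The taint keeps other tenants' pods away from the node, and the node selector keeps this tenant's pods from drifting onto shared nodes; both halves are needed for the nodes to be truly dedicated.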

If you use namespaces to implement soft multi-tenancy, you cannot provide tenants with a filtered list of namespaces. This is because namespaces are globally scoped resources. If a tenant is allowed to view a specific namespace in a cluster, the tenant can view all namespaces in the cluster.

When soft multi-tenancy is used, tenants can by default query CoreDNS for all services that run in the cluster. Attackers can exploit this capability by running dig SRV *.*.svc.cluster.local from any pod in the cluster. If you want to limit access to DNS records of services that run in a cluster, you can use the firewall or policy plug-ins for CoreDNS. For more information, see kubernetes-metadata-multi-tenancy-policy.

  • Enterprise setting

    The soft multi-tenancy model is widely used by Kubernetes enterprise users. If all tenants of a Kubernetes cluster belong to the same enterprise, the identities and roles of the cluster users are known and manageable. This helps enterprises gain control over the security of their business. Each tenant corresponds to an administrative division, such as a department or team.

    In this scenario, cluster administrators are responsible for creating namespaces and managing policies. The cluster administrators can use a delegated administration model where specific tenants are granted permissions on namespaces. The tenants are granted permissions to perform CRUD operations on non-policy related objects, such as Deployments, Services, pods, and Jobs.

    The isolation that Docker provides may be sufficient for this scenario. You can also use other methods, such as Pod Security Policies (PSPs, which were removed in Kubernetes 1.25), based on your business requirements. You may want to limit communication among services in different namespaces if stricter isolation is required.

  • Kubernetes as a Service (KaaS)

    You can use soft multi-tenancy in scenarios in which you want to provide Kubernetes as a service. In KaaS environments, your applications are hosted in a shared cluster. The shared cluster contains controllers and CustomResourceDefinitions (CRDs) that provide a set of Platform as a Service (PaaS) services. Tenants interact with the Kubernetes API server and are allowed to perform CRUD operations on non-policy objects. A self-service feature is provided. For example, tenants are allowed to create and manage their namespaces. In this type of environment, tenants are considered to be running untrusted code.

    If you want to isolate tenants in this type of environment, you may need to use network policies and pod sandboxing. For more information, see Sandboxed containers.

  • Software as a Service (SaaS)

    In SaaS environments, each tenant is associated with a specific instance of an application that runs in the cluster. Each instance has its own data and uses access control policies that are separate from Kubernetes RBAC.

    Tenants in a SaaS environment do not interact with the Kubernetes API server. The SaaS application interacts with the Kubernetes API server to create objects that are required by each tenant.

Kubernetes configurations

Kubernetes is a single-tenant orchestration platform that is used to manage containerized workloads. When you use Kubernetes, the control plane is shared among all tenants in a cluster. You can use various Kubernetes objects to isolate tenants. For example, you can use namespaces and RBAC to logically isolate tenants from each other. You can also use quotas and limit ranges to control the amount of cluster resources that can be consumed by each tenant.

However, the cluster is the only construct that provides a strong security boundary. Attackers that gain access to a host in the cluster can retrieve all Secrets, ConfigMaps, and volumes that are mounted to the host. The attackers can also impersonate the kubelet, which allows them to manipulate the attributes of the node or move laterally within the cluster. The following Kubernetes-native resources can help you mitigate the risks of running multiple tenants on a single-tenant orchestration platform such as Kubernetes, and help you isolate tenants in the scenarios that are described in the preceding section.

  • Namespaces

    Namespaces are the basis for implementing soft multi-tenancy. You can use namespaces to divide a cluster into logical partitions. Quotas, network policies, service accounts, and other objects that are required to implement soft multi-tenancy must be scoped to a namespace.

  • AuthN&AuthZ&Admission

    The authorization of Container Service for Kubernetes (ACK) clusters consists of Resource Access Management (RAM) authorization and RBAC authorization. You can use RAM authorization to regulate cluster-level access control, including CRUD operations on a cluster. For example, you can manage the visibility of a cluster, scale a cluster, and add nodes to a cluster by granting the required permissions to a RAM user. RBAC authorization is used to control access to Kubernetes resources in a cluster. You can use RBAC authorization to enforce fine-grained access control on a specified resource in a namespace. ACK provides templates of different predefined roles for users in a tenant. You can also associate multiple custom cluster roles with a user and grant permissions to multiple users at a time. For more information, see Authorization overview.

  • Network policies

    By default, all pods in a Kubernetes cluster are allowed to communicate with each other. You can use network policies to modify this default setting.

    Network policies limit communication among pods based on labels or CIDR blocks. If you require network isolation among tenants in a multi-tenant environment, add the following rules:
      • A default rule that denies communication among pods.
      • A rule that allows pods to send DNS queries to the DNS server.

  • Quotas and limit ranges

    Quotas are used to define limits on workloads that are hosted in your cluster. You can use quotas to cap the total amount of CPU and memory resources that the pods in a namespace can request or consume. You can also use quotas to limit the number of objects, such as pods or Services, that can be created in a namespace. Limit ranges allow you to specify the minimum, maximum, and default request and limit values for each pod or container in a namespace.

    To maximize resource utilization, you can overcommit resources in a shared cluster. However, if resource consumption in the cluster is not limited, resources in the cluster may be exhausted. This degrades the performance of the cluster and affects the availability of your applications. If the resource requests of pods are set to small values and the actual resource usage exceeds the capacity of the node, the CPU or memory resources of the node may be exhausted. When the CPU or memory resources are exhausted, pods may be restarted or evicted from the node.

    To prevent this issue, you must add quotas for namespaces in a multi-tenant environment. This forces tenants to specify request and limit thresholds when they schedule pods in a cluster. In addition, the amount of resources that can be consumed by a pod is limited. This mitigates the risk of service unavailability.

    In KaaS scenarios, you can use quotas to allocate cluster resources to meet the requirements of tenants.

  • Pod priority and preemption

    If you want to provide different quality of service (QoS) levels for customers, you can use pod priority and preemption. For example, you can specify a higher priority value for pods of Customer A than for pods of Customer B. If the resource capacity is insufficient to schedule a pod of Customer A, the scheduler preempts (evicts) lower-priority pods of Customer B to make room for it. You can use this method in a SaaS environment to provide a higher QoS level for customers that pay a premium fee.
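The network policy rules and the namespace quotas described in the preceding list can be combined into a per-tenant baseline. The following Python sketch is one possible shape for that baseline, not a definitive configuration. It assumes that cluster DNS runs in `kube-system` with the conventional `k8s-app: kube-dns` pod label and that namespaces carry the automatic `kubernetes.io/metadata.name` label. The object names and all CPU and memory values are placeholders.

```python
import json


def default_deny(namespace):
    """Deny all ingress and egress traffic for every pod in the namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny", "namespace": namespace},
        # An empty podSelector selects all pods in the namespace.
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def allow_dns(namespace):
    """Allow egress on port 53 to the cluster DNS pods (labels are assumptions)."""
    dns_peer = {
        "namespaceSelector": {"matchLabels": {"kubernetes.io/metadata.name": "kube-system"}},
        "podSelector": {"matchLabels": {"k8s-app": "kube-dns"}},
    }
    ports = [{"protocol": "UDP", "port": 53}, {"protocol": "TCP", "port": 53}]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-dns", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [{"to": [dns_peer], "ports": ports}],
        },
    }


def resource_quota(namespace, cpu, memory):
    """Cap the aggregate requests and limits of all pods in the namespace."""
    hard = {
        "requests.cpu": cpu,
        "requests.memory": memory,
        "limits.cpu": cpu,
        "limits.memory": memory,
    }
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "tenant-quota", "namespace": namespace},
        "spec": {"hard": hard},
    }


def limit_range(namespace):
    """Give containers default requests and limits, plus a per-container maximum."""
    return {
        "apiVersion": "v1",
        "kind": "LimitRange",
        "metadata": {"name": "tenant-limits", "namespace": namespace},
        "spec": {
            "limits": [
                {
                    "type": "Container",
                    "defaultRequest": {"cpu": "100m", "memory": "128Mi"},
                    "default": {"cpu": "500m", "memory": "512Mi"},
                    "max": {"cpu": "2", "memory": "2Gi"},
                }
            ]
        },
    }


def tenant_baseline(namespace, cpu="8", memory="16Gi"):
    """Return the four baseline objects for one tenant namespace."""
    return [
        default_deny(namespace),
        allow_dns(namespace),
        resource_quota(namespace, cpu, memory),
        limit_range(namespace),
    ]


if __name__ == "__main__":
    items = tenant_baseline("tenant-a")
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

The limit range complements the quota: after a quota on requests and limits is in place, pods that do not declare those values are rejected, and the limit range supplies defaults so that such pods are still admitted.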

Mitigation methods

The major concern of administrators in a multi-tenant environment is preventing attackers from gaining access to underlying hosts. You can use one of the following methods to prevent attackers from gaining access to underlying hosts:

  • Sandboxed-Container

    Sandboxed-Container is an alternative to the Docker runtime. Sandboxed-Container allows you to run applications in a sandboxed and lightweight virtual machine that has a dedicated kernel. This enhances resource isolation and improves security.

    Sandboxed-Container is suitable in scenarios such as untrusted application isolation, fault isolation, performance isolation, and load isolation among multiple users. Sandboxed-Container provides enhanced security, has minor impacts on application performance, and offers the same user experience as Docker in terms of logging, monitoring, and elastic scaling. For more information, see Sandboxed-Container overview.

  • Open Policy Agent (OPA) & Gatekeeper

    Open Policy Agent (OPA) is a general-purpose policy engine. OPA decouples policy decisions from the services that enforce them and is widely used with Kubernetes. If the security requirements of enterprise applications are not met after RBAC is used to perform isolation at the namespace level, you can control access policies at the object level by using OPA. Gatekeeper is a Kubernetes admission controller that enforces OPA policies when resources are created or updated, for example during application deployment. For more information, see Gatekeeper.

    In addition, OPA supports Layer 7 network policies and access control across namespaces based on labels and annotations. You can use OPA to manage Kubernetes network policies in an efficient manner.

  • Kyverno

    Kyverno is a Kubernetes-native policy engine. Kyverno provides policies that can be used to validate, modify, and generate configurations for Kubernetes resources. Kyverno uses Kustomize-style overlays for validation and supports strategic merge patch for mutation. In addition, Kyverno can clone resources across namespaces based on flexible triggers. For more information, see Kyverno.

    You can use Kyverno to isolate namespaces, enforce pod security and other best practices, and generate default configurations, such as network policies. For more information, see Policy repository.
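To make the Kyverno item in the preceding list more concrete, the following sketch builds a ClusterPolicy with a generate rule that creates a default-deny network policy in every new namespace. It follows the `kyverno.io/v1` ClusterPolicy layout as used in Kyverno's published sample policies; field names can differ between Kyverno versions, so treat the structure as an assumption to verify against the version that you run. The policy and rule names are hypothetical.

```python
import json


def default_deny_generator():
    """Build a Kyverno ClusterPolicy that generates a default-deny NetworkPolicy."""
    generated_policy = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "name": "default-deny",
        # Kyverno variable: resolves to the name of the namespace being created.
        "namespace": "{{request.object.metadata.name}}",
        # Ask Kyverno to keep the generated object in sync with this policy.
        "synchronize": True,
        "data": {
            "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]}
        },
    }
    return {
        "apiVersion": "kyverno.io/v1",
        "kind": "ClusterPolicy",
        "metadata": {"name": "add-default-deny"},
        "spec": {
            "rules": [
                {
                    "name": "default-deny",
                    # Trigger the rule for Namespace objects.
                    "match": {"any": [{"resources": {"kinds": ["Namespace"]}}]},
                    "generate": generated_policy,
                }
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(default_deny_generator(), indent=2))
```

With a policy of this kind, tenants that create namespaces through a self-service workflow receive the baseline network isolation without administrator intervention.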

Hard multi-tenancy

You can implement hard multi-tenancy by creating a separate cluster for each tenant. Hard multi-tenancy provides strong isolation among tenants, but it has the following disadvantages:
  • The cost of hard multi-tenancy significantly increases if you manage a large number of tenants. You are charged a control plane fee for each cluster that you use. In addition, computing resources cannot be shared among clusters. As a result, fragmentation occurs in scenarios in which specific clusters are underutilized or overutilized.
  • You may need to purchase or build special tooling to manage the clusters. Hundreds of clusters need to be managed over a long period of time.
  • Compared with namespaces, clusters for tenants require a longer period of time to be created.

Hard multi-tenancy is suitable for highly regulated industries and SaaS environments that require strong isolation.

Future trend

The Kubernetes community recognizes the disadvantages of soft multi-tenancy and the challenges with hard multi-tenancy. The Multi-Tenancy Special Interest Group (SIG) attempts to address the disadvantages by using several projects:

  • The Virtual Cluster proposal describes a mechanism to create separate instances of the control plane services for each tenant in the cluster, including the API server, controller manager, and scheduler. This approach is also known as Kubernetes on Kubernetes. For more information, visit Virtual Cluster.
  • The Hierarchical Namespace Controller (HNC) proposal, which is a Kubernetes Enhancement Proposal (KEP), describes a way to create parent-child relationships between namespaces with policy object inheritance. HNC also allows tenant administrators to create subnamespaces. For more information, see HNC.
  • The Multi-Tenancy Benchmarks proposal provides guidelines on how to share clusters after the clusters are isolated and segmented by using namespaces. The proposal also describes how to use the kubectl-mtb CLI to ensure compliance with the guidelines. For more information, see Multi-Tenancy Benchmarks.