
Kubernetes Eviction Policies for Handling Low RAM and Disk Space Situations - Part 2

In this tutorial, we'll show you how you can deal with low RAM and disk space situations with Kubernetes eviction policies.

By Alwyn Botha, Alibaba Cloud Community Blog author.

Hard RAM Eviction Thresholds

Start minikube, this time with only hard thresholds.

minikube start --extra-config=kubelet.eviction-hard="memory.available<650Mi" --extra-config=kubelet.feature-gates="ExperimentalCriticalPodAnnotation=true"  --extra-config=kubelet.eviction-pressure-transition-period="30s" 

Give the node startup processes time to complete (2 minutes should be enough).

After 15 minutes, in my case, the available memory was:

memory.available_in_mb 1196
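
In case you no longer have the memory-available.sh helper script from Part 1, here is a minimal sketch of one way to approximate the kubelet's memory.available calculation on a cgroup v1 node (as used by the minikube VM here); treat the exact cgroup paths as an assumption for your environment:

#!/bin/bash
# Approximate the kubelet's memory.available calculation (cgroup v1).
# memory.available = node capacity - working set
# working set      = root memory cgroup usage - total_inactive_file
memory_capacity_in_kb=$(grep MemTotal /proc/meminfo | awk '{print $2}')
memory_capacity_in_bytes=$((memory_capacity_in_kb * 1024))
memory_usage_in_bytes=$(cat /sys/fs/cgroup/memory/memory.usage_in_bytes)
memory_total_inactive_file=$(grep total_inactive_file /sys/fs/cgroup/memory/memory.stat | awk '{print $2}')

memory_working_set=$((memory_usage_in_bytes - memory_total_inactive_file))
if [ "$memory_working_set" -lt 0 ]; then
  memory_working_set=0
fi

memory_available_in_bytes=$((memory_capacity_in_bytes - memory_working_set))
echo "memory.available_in_mb $((memory_available_in_bytes / 1024 / 1024))"

Run it inside the minikube VM (minikube ssh) so that it sees the node's cgroups.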

Get Pods:

kubectl get pods
NAME     READY   STATUS    RESTARTS   AGE
myram2   1/1     Running   0          15m

Kubernetes kept the Pod spec we attempted to start at the end of the Part 1 tutorial.

Upon a fresh node start, the kubelet starts up all Pods for which it still has specs.

Create a Pod from myrampod3.yaml, the 50 MB stress Pod defined in Part 1.
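If you no longer have that file, a minimal sketch of its likely contents (an assumption based on the request and usage figures reported later in this tutorial):

apiVersion: v1
kind: Pod
metadata:
  name: myram3
spec:
  containers:
  - name: myram-container-1
    image: mytutorials/centos:bench
    imagePullPolicy: IfNotPresent

    # allocate 50 MB and hold it, same pattern as the other stress Pods
    command: ['sh', '-c', 'stress --vm 1 --vm-bytes 50M --vm-hang 3000 -t 3600']

    resources:
      limits:
        memory: "600Mi"
      requests:
        memory: "10Mi"

  restartPolicy: Never
  terminationGracePeriodSeconds: 0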

kubectl create -f myrampod3.yaml
pod/myram3 created

kubectl get pods

NAME     READY   STATUS    RESTARTS   AGE
myram2   1/1     Running   0          16m
myram3   1/1     Running   0          18s

Check the kubelet-calculated memory.available_in_mb:

source ./memory-available.sh | tail -n1

memory.available_in_mb 1140

Roughly another 60 MB is now in use.

No MemoryPressure yet.

kubectl describe node minikube | grep MemoryPressure

  MemoryPressure   False   Fri, 01 Feb 2019 08:30:04 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
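
As an alternative to grepping the describe output, you can query the condition directly with jsonpath (same node and condition names as used throughout this tutorial):

kubectl get node minikube -o jsonpath='{.status.conditions[?(@.type=="MemoryPressure")].status}'

This prints just True or False.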

Create a third 50 MB Pod:

kubectl create -f myrampod4.yaml
pod/myram4 created


kubectl get pods
NAME     READY   STATUS    RESTARTS   AGE
myram2   1/1     Running   0          17m
myram3   1/1     Running   0          83s
myram4   1/1     Running   0          6s

No MemoryPressure yet.

kubectl describe node minikube | grep MemoryPressure
Fri Feb  1 08:30:55 SAST 2019
  MemoryPressure   False   Fri, 01 Feb 2019 08:30:54 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available

a minute later ...

memory.available_in_mb 1081

a minute later ...

memory.available_in_mb 699

A mystery process now uses around 300 MB of RAM.

(Based on several reboot tests, I learned that after a few 50 MB Pods, Kubernetes needs to allocate some RAM for internal use.)

We now have a MemoryPressure condition.

(The eviction-hard threshold of 650 MiB is roughly 680 MB.)

kubectl describe node minikube | grep MemoryPressure

  MemoryPressure   True    Fri, 01 Feb 2019 08:31:44 +0200   KubeletHasInsufficientMemory   kubelet has insufficient memory available

We expect Pods to be swiftly evicted.

kubectl get pods

NAME     READY   STATUS    RESTARTS   AGE
myram2   0/1     Evicted   0          18m
myram3   1/1     Running   0          2m34s
myram4   1/1     Running   0          77s

Logs from the kubelet:


06:31:23 attempting to reclaim memory
06:31:23 must evict pod(s) to reclaim memory

06:31:23 pods ranked for eviction: 
 myram2_default,
 myram3_default,
 myram4_default,
 metrics-server-6486d4db88-t6krr_kube-system,
 kubernetes-dashboard-5bff5f8fb8-jxslb_kube-system,

06:31:24 pod myram2_default is evicted successfully
06:31:24 pods myram2_default evicted, waiting for pod to be cleaned up
06:31:26 pods myram2_default successfully cleaned up

Check MemoryPressure status:

kubectl describe node minikube | grep MemoryPressure
  MemoryPressure   False   Fri, 01 Feb 2019 08:33:05 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available

No MemoryPressure. 700 MB available; threshold is 680 MB.

memory.available_in_mb 701

5 minutes later: still no more Pods evicted.

kubectl get pods

NAME     READY   STATUS    RESTARTS   AGE
myram2   0/1     Evicted   0          23m
myram3   1/1     Running   0          7m
myram4   1/1     Running   0          5m43s

Delete myram2 so that we can have a neat kubectl get pods list.

kubectl delete -f myrampod2.yaml
pod "myram2" deleted

Define a 100 MB Pod:

nano myrampod7.yaml

apiVersion: v1
kind: Pod
metadata:
  name: myram7
spec:
  containers:
  - name: myram-container-1
    image: mytutorials/centos:bench
    imagePullPolicy: IfNotPresent
    
    command: ['sh', '-c', 'stress --vm 1 --vm-bytes 100M --vm-hang 3000 -t 3600']
    
    resources:
      limits:
        memory: "600Mi"
      requests:
        memory: "10Mi"
    
  restartPolicy: Never
  terminationGracePeriodSeconds: 0

Create Pod:

kubectl create -f myrampod7.yaml
pod/myram7 created

Check the kubelet-calculated memory.available_in_mb:

memory.available_in_mb 541

Available RAM is below the 680 MB threshold. We have a MemoryPressure situation.

Seconds later, the Pods get evicted:

kubectl get pods
NAME     READY   STATUS    RESTARTS   AGE
myram3   0/1     Evicted   0          9m39s
myram4   0/1     Evicted   0          8m22s
myram7   0/1     Evicted   0          19s

kubelet logs:

06:38:47 attempting to reclaim memory
06:38:47 must evict pod(s) to reclaim memory

06:38:47 pods ranked for eviction: 
 myram4_default,
 myram3_default,
 myram7_default,
 metrics-server-6486d4db88-t6krr_kube-system,
 kubernetes-dashboard-5bff5f8fb8-jxslb_kube-system,

06:38:48 pod myram4_default is evicted successfully
06:38:48 pods myram4_default evicted, waiting for pod to be cleaned up
06:38:50 pods myram4_default successfully cleaned up

Recall the memory profile of our Pods:

  • myram4 Pod requests 10 MB, uses 50 MB
  • myram3 Pod requests 10 MB, uses 50 MB
  • myram7 Pod requests 10 MB, uses 100 MB

Based on https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#evicting-end-user-pods

pods are ranked by Priority, and then usage above request.

This is not what we observe here: the 9- and 8-minute-old Pods get evicted first. Only then is myram7, the Pod that used considerably more RAM than it requested, evicted. Based on the official Kubernetes documentation I would have expected myram7 to be evicted first. See the logs below.

06:38:50 attempting to reclaim memory
06:38:50 must evict pod(s) to reclaim memory

06:38:50 pods ranked for eviction:
 myram3_default,
 myram7_default,
 metrics-server-6486d4db88-t6krr_kube-system,
 kubernetes-dashboard-5bff5f8fb8-jxslb_kube-system,

06:38:50 pod myram3_default is evicted successfully
06:38:50 pods myram3_default evicted, waiting for pod to be cleaned up
06:38:52 pods myram3_default successfully cleaned up

06:38:52 attempting to reclaim memory
06:38:52 must evict pod(s) to reclaim memory

- - - - 

06:38:52 pods ranked for eviction: 
 myram7_default,
 metrics-server-6486d4db88-t6krr_kube-system,
 kubernetes-dashboard-5bff5f8fb8-jxslb_kube-system,
 kube-proxy-gnffr_kube-system,

06:38:53 pod myram7_default is evicted successfully
06:38:53 pods myram7_default evicted, waiting for pod to be cleaned up
06:38:55 pods myram7_default successfully cleaned up

Check available RAM:

memory.available_in_mb 719

We no longer have a MemoryPressure condition. (I neglected to run the actual grep command at this point.)

Let's read the describe output to see how an eviction gets reported:

kubectl describe pod/myram7

Name:               myram7
Start Time:         Fri, 01 Feb 2019 08:38:37 +0200
Status:             Failed
Reason:             Evicted
Message:            The node was low on resource: memory. Container myram-container-1 was using 103668Ki, which exceeds its request of 10Mi.
IP:
Containers:
  myram-container-1:
    Command:
      stress --vm 1 --vm-bytes 100M --vm-hang 3000 -t 3600
    Requests:
      memory:     10Mi
QoS Class:       Burstable
Events:
  Type     Reason     Age    From               Message
  ----     ------     ----   ----               -------
  Normal   Scheduled  3m18s  default-scheduler  Successfully assigned default/myram7 to minikube
  Normal   Pulled     3m17s  kubelet, minikube  Container image "centos:bench" already present on machine
  Normal   Created    3m17s  kubelet, minikube  Created container
  Normal   Started    3m17s  kubelet, minikube  Started container
  Warning  Evicted    3m3s   kubelet, minikube  The node was low on resource: memory. Container myram-container-1 was using 103668Ki, which exceeds its request of 10Mi.
  Normal   Killing    3m2s   kubelet, minikube  Killing container with id docker://myram-container-1:Need to kill Pod

The last two event lines explain it adequately.

Clean up other Pods:

kubectl delete -f myrampod3.yaml
pod "myram3" deleted

kubectl delete -f myrampod4.yaml
pod "myram4" deleted

kubectl delete -f myrampod7.yaml
pod "myram7" deleted

kubectl get pods
No resources found.

Evictions and Priority Classes

We saw earlier that Pods are ranked for eviction by priority.

This part of the tutorial demonstrates that.

Kubelet eviction thresholds as before:

minikube stop

minikube start --extra-config=kubelet.eviction-hard="memory.available<650Mi" --extra-config=kubelet.feature-gates="ExperimentalCriticalPodAnnotation=true"  --extra-config=kubelet.eviction-pressure-transition-period="30s"

We need 3 priority classes:

nano low-priority.yaml

apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata:
  name: low-priority
value: 10
globalDefault: true
description: "low priority class"

nano med-priority.yaml

apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata:
  name: med-priority
value: 500
globalDefault: false
description: "med priority class"

nano high-priority.yaml

apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000
globalDefault: false
description: "high priority class"

kubectl create -f high-priority.yaml
kubectl create -f med-priority.yaml
kubectl create -f low-priority.yaml
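
Verify that the classes exist (the built-in system classes will also appear in the list):

kubectl get priorityclasses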

Below are the YAML files for 6 Pods. Note that they all use priorityClassName. Create them all.

nano mylow35.yaml

apiVersion: v1
kind: Pod
metadata:
  name: mylow35
spec:
  containers:
  - name: myram-container-1
    image: mytutorials/centos:bench
    imagePullPolicy: IfNotPresent
    
    command: ['sh', '-c', 'stress --vm 1 --vm-bytes 35M --vm-hang 3000 -t 3600']
    
    resources:
      limits:
        memory: "600Mi"
      requests:
        memory: "1Mi"
    
  restartPolicy: Never
  terminationGracePeriodSeconds: 0
  
  priorityClassName: low-priority

nano mylow15.yaml

apiVersion: v1
kind: Pod
metadata:
  name: mylow15
spec:
  containers:
  - name: myram-container-1
    image: mytutorials/centos:bench
    imagePullPolicy: IfNotPresent
    
    command: ['sh', '-c', 'stress --vm 1 --vm-bytes 15M --vm-hang 3000 -t 3600']
    
    resources:
      limits:
        memory: "600Mi"
      requests:
        memory: "1Mi"
    
  restartPolicy: Never
  terminationGracePeriodSeconds: 0
  
  priorityClassName: low-priority

nano mymed35.yaml

apiVersion: v1
kind: Pod
metadata:
  name: mymed35
spec:
  containers:
  - name: myram-container-1
    image: mytutorials/centos:bench
    imagePullPolicy: IfNotPresent
    
    command: ['sh', '-c', 'stress --vm 1 --vm-bytes 35M --vm-hang 3000 -t 3600']
    
    resources:
      limits:
        memory: "600Mi"
      requests:
        memory: "1Mi"
    
  restartPolicy: Never
  terminationGracePeriodSeconds: 0
  
  priorityClassName: med-priority

nano mymed15.yaml

apiVersion: v1
kind: Pod
metadata:
  name: mymed15
spec:
  containers:
  - name: myram-container-1
    image: mytutorials/centos:bench
    imagePullPolicy: IfNotPresent
    
    command: ['sh', '-c', 'stress --vm 1 --vm-bytes 15M --vm-hang 3000 -t 3600']
    
    resources:
      limits:
        memory: "600Mi"
      requests:
        memory: "1Mi"
    
  restartPolicy: Never
  terminationGracePeriodSeconds: 0
  
  priorityClassName: med-priority

nano myhigh35.yaml

apiVersion: v1
kind: Pod
metadata:
  name: myhigh35
spec:
  containers:
  - name: myram-container-1
    image: mytutorials/centos:bench
    imagePullPolicy: IfNotPresent
    
    command: ['sh', '-c', 'stress --vm 1 --vm-bytes 35M --vm-hang 3000 -t 3600']
    
    resources:
      limits:
        memory: "600Mi"
      requests:
        memory: "1Mi"
    
  restartPolicy: Never
  terminationGracePeriodSeconds: 0
  
  priorityClassName: high-priority

nano myhigh15.yaml

apiVersion: v1
kind: Pod
metadata:
  name: myhigh15
spec:
  containers:
  - name: myram-container-1
    image: mytutorials/centos:bench
    imagePullPolicy: IfNotPresent
    
    command: ['sh', '-c', 'stress --vm 1 --vm-bytes 15M --vm-hang 3000 -t 3600']
    
    resources:
      limits:
        memory: "600Mi"
      requests:
        memory: "1Mi"
    
  restartPolicy: Never
  terminationGracePeriodSeconds: 0
  
  priorityClassName: high-priority

Summary:

  • myhigh15 ... High Priority Class ... request memory: "1Mi" : use 15Mi
  • myhigh35 ... High Priority Class ... request memory: "1Mi" : use 35Mi
  • mymed15 ... Medium Priority Class ... request memory: "1Mi" : use 15Mi
  • mymed35 ... Medium Priority Class ... request memory: "1Mi" : use 35Mi
  • mylow15 ... Low Priority Class ... request memory: "1Mi" : use 15Mi
  • mylow35 ... Low Priority Class ... request memory: "1Mi" : use 35Mi

kubectl create -f myhigh15.yaml
kubectl create -f myhigh35.yaml

kubectl create -f mymed15.yaml
kubectl create -f mymed35.yaml

kubectl create -f mylow15.yaml
kubectl create -f mylow35.yaml
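
To confirm the priority each Pod actually received, you can print the resolved values with custom-columns (.spec.priority is filled in automatically from the class name):

kubectl get pods -o custom-columns='NAME:.metadata.name,CLASS:.spec.priorityClassName,PRIORITY:.spec.priority'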

MemoryPressure?

kubectl describe node minikube | grep MemoryPressure
  MemoryPressure   False   Fri, 01 Feb 2019 10:13:24 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available

No.

memory.available_in_mb 1078

A minute later: the mystery background process has consumed roughly another 300 MB of memory.

memory.available_in_mb 772

(The eviction-hard threshold of 650 MiB is roughly 680 MB.)

You can also specify eviction thresholds in other units. From https://github.com/kubernetes/apimachinery/blob/master/pkg/api/resource/quantity.go the supported suffixes are:

Ki | Mi | Gi | Ti | Pi | Ei (binary)

k | M | G | T | P | E (decimal)

List running Pods - no evictions - as expected.

kubectl get pods

NAME       READY   STATUS    RESTARTS   AGE
myhigh15   1/1     Running   0          74s
myhigh35   1/1     Running   0          74s
mylow15    1/1     Running   0          75s
mylow35    1/1     Running   0          75s
mymed15    1/1     Running   0          74s
mymed35    1/1     Running   0          75s

Run our 100 MB Pod. That will surely push our node into MemoryPressure.

kubectl create -f myrampod7.yaml
pod/myram7 created

Mere seconds later, 2 Pods are evicted (the hard threshold acts immediately).

kubectl get pods

NAME       READY   STATUS    RESTARTS   AGE
myhigh15   1/1     Running   0          112s
myhigh35   1/1     Running   0          112s
mylow15    1/1     Running   0          113s
mylow35    0/1     Evicted   0          113s
mymed15    1/1     Running   0          112s
mymed35    1/1     Running   0          113s
myram7     0/1     Evicted   0          9s

Check RAM available:

memory.available_in_mb 667

We are now in a MemoryPressure condition.

kubectl describe node minikube | grep MemoryPressure
  MemoryPressure   True    Fri, 01 Feb 2019 10:15:14 +0200   KubeletHasInsufficientMemory   kubelet has insufficient memory available

Let's investigate the kubelet eviction manager to see how it ranks which Pods to evict.

08:15:01 attempting to reclaim memory
08:15:01 must evict pod(s) to reclaim memory

08:15:01 pods ranked for eviction: 
 mylow35_default, 
 myram7_default, 
 mylow15_default, 
 metrics-server-6486d4db88-t6krr_kube-system,
 kubernetes-dashboard-5bff5f8fb8-jxslb_kube-system, 
 mymed35_default, 
 mymed15_default, 
 myhigh35_default, 
 myhigh15_default, 
 kube-proxy-nrfkm_kube-system, 

08:15:01 pod mylow35_default is evicted successfully
08:15:01 pods mylow35_default evicted, waiting for pod to be cleaned up
08:15:04 pods mylow35_default successfully cleaned up

  • Low priority Pods are listed first - will be evicted first
  • Medium priority Pods are listed second
  • High priority Pods are listed last - will be evicted last

The 35 MB Pods are listed before the 15 MB Pods every time.

The official theory:

pods are ranked by Priority, and then usage above request.

This is exactly what we see here.

Important tip: specify accurate RAM requests for your Pods so that they are less likely to be evicted. Also prevent users from vastly overstating RAM requests just to avoid eviction.
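
One way to audit this on a live cluster is to compare each Pod's memory request with its actual usage, for example (assuming metrics-server is healthy; the grep context length is arbitrary):

kubectl top pods

kubectl describe node minikube | grep -A 12 'Non-terminated Pods'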

In the next eviction pass, our myram7 Pod, which uses 100 MB, is listed right at the top and will be evicted first.

08:15:04 attempting to reclaim memory
08:15:04 must evict pod(s) to reclaim memory

08:15:04 pods ranked for eviction: 
 myram7_default,
 mylow15_default,
 metrics-server-6486d4db88-t6krr_kube-system,
 kubernetes-dashboard-5bff5f8fb8-jxslb_kube-system,
 mymed35_default,
 mymed15_default,
 myhigh35_default,
 myhigh15_default,
 kube-proxy-nrfkm_kube-system,
 kube-addon-manager-minikube_kube-system,
 coredns-576cbf47c7-s4mwb_kube-system,
 coredns-576cbf47c7-r8f6v_kube-system

08:15:04 pod myram7_default is evicted successfully
08:15:04 pods myram7_default evicted, waiting for pod to be cleaned up
08:15:06 pods myram7_default successfully cleaned up

Thirty seconds later, more Pods are evicted.

Let's again investigate the kubelet eviction manager to see how it ranks the remaining Pods for eviction.

Based on what you just saw, you can probably predict the eviction ranking correctly. Pods are evicted in ranked order:

  • mylow15_default
  • metrics-server-6486d4db88-t6krr_kube-system
  • kubernetes-dashboard-5bff5f8fb8-jxslb_kube-system
  • mymed35_default

08:15:07 attempting to reclaim memory
08:15:07 must evict pod(s) to reclaim memory

08:15:07 pods ranked for eviction: 
 mylow15_default,
 metrics-server-6486d4db88-t6krr_kube-system,
 kubernetes-dashboard-5bff5f8fb8-jxslb_kube-system,
 mymed35_default,
 mymed15_default,
 myhigh35_default,
 myhigh15_default,
 
08:15:07 pod mylow15_default is evicted successfully
08:15:07 pods mylow15_default evicted, waiting for pod to be cleaned up
08:15:09 pods mylow15_default successfully cleaned up

08:15:09 attempting to reclaim memory
08:15:09 must evict pod(s) to reclaim memory

08:15:09 pods ranked for eviction: 
 metrics-server-6486d4db88-t6krr_kube-system,
 kubernetes-dashboard-5bff5f8fb8-jxslb_kube-system,
 mymed35_default,
 mymed15_default,
 myhigh35_default,
 myhigh15_default,
 
08:15:09 pod metrics-server-6486d4db88-t6krr_kube-system is evicted successfully
08:15:09 pods metrics-server-6486d4db88-t6krr_kube-system evicted, waiting for pod to be cleaned up
08:15:12 pods metrics-server-6486d4db88-t6krr_kube-system successfully cleaned up

08:15:12 attempting to reclaim memory
08:15:12 must evict pod(s) to reclaim memory

08:15:12 pods ranked for eviction: 
 kubernetes-dashboard-5bff5f8fb8-jxslb_kube-system,
 mymed35_default,
 mymed15_default,
 myhigh35_default,
 myhigh15_default,

08:15:13 pod kubernetes-dashboard-5bff5f8fb8-jxslb_kube-system is evicted successfully
08:15:13 pods kubernetes-dashboard-5bff5f8fb8-jxslb_kube-system evicted, waiting for pod to be cleaned up
08:15:15 pods kubernetes-dashboard-5bff5f8fb8-jxslb_kube-system successfully cleaned up

08:15:25 attempting to reclaim memory
08:15:25 must evict pod(s) to reclaim memory

08:15:25 pods ranked for eviction:
 mymed35_default,
 mymed15_default,
 myhigh35_default,
 myhigh15_default,

08:15:25 pod mymed35_default is evicted successfully
08:15:25 pods mymed35_default evicted, waiting for pod to be cleaned up
08:15:29 pods mymed35_default successfully cleaned up

Determine Pod status:

kubectl get pods
NAME       READY   STATUS    RESTARTS   AGE
myhigh15   1/1     Running   0          2m14s
myhigh35   1/1     Running   0          2m14s
mylow15    0/1     Evicted   0          2m15s
mylow35    0/1     Evicted   0          2m15s
mymed15    1/1     Running   0          2m14s
mymed35    0/1     Evicted   0          2m15s
myram7     0/1     Evicted   0          31s

Check RAM available:

memory.available_in_mb 652

Still under MemoryPressure.

kubectl describe node minikube | grep MemoryPressure

  MemoryPressure   True    Fri, 01 Feb 2019 10:17:04 +0200   KubeletHasInsufficientMemory   kubelet has insufficient memory available

a minute later ...

kubectl describe node minikube | grep MemoryPressure
  MemoryPressure   False   Fri, 01 Feb 2019 10:17:45 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available

List Pods ... all evicted.

kubectl get pods

NAME       READY   STATUS    RESTARTS   AGE
myhigh15   0/1     Evicted   0          5m18s
myhigh35   0/1     Evicted   0          5m18s
mylow15    0/1     Evicted   0          5m19s
mylow35    0/1     Evicted   0          5m19s
mymed15    0/1     Evicted   0          5m18s
mymed35    0/1     Evicted   0          5m19s
myram7     0/1     Evicted   0          3m35s

It is worthwhile to investigate the eviction manager here, since there is an actual problem.

Several repeating phrases have been deleted from the output below:

  • attempting to reclaim memory
  • must evict pod(s) to reclaim memory
  • evicted, waiting for pod to be cleaned up
  • successfully cleaned up

metrics-server, kubernetes-dashboard and mymed15 are evicted - no problem so far.

08:16:09 pods ranked for eviction:
 metrics-server-6486d4db88-n7tdt_kube-system,
 kubernetes-dashboard-5bff5f8fb8-vg5hw_kube-system,
 mymed15_default,
 myhigh35_default,
 myhigh15_default,
 kube-proxy-nrfkm_kube-system,

08:16:11 pod metrics-server-6486d4db88-n7tdt_kube-system is evicted successfully

- - - 

08:16:15 pods ranked for eviction: 
 kubernetes-dashboard-5bff5f8fb8-vg5hw_kube-system,
 mymed15_default,
 myhigh35_default,
 myhigh15_default,
 kube-proxy-nrfkm_kube-system,

08:16:15 pod kubernetes-dashboard-5bff5f8fb8-vg5hw_kube-system is evicted successfully

- - - 

08:16:18 pods ranked for eviction: 
 mymed15_default,
 myhigh35_default,
 myhigh15_default,
 kube-proxy-nrfkm_kube-system,

08:16:19 pod mymed15_default is evicted successfully

A replacement metrics-server and kubernetes-dashboard needed eviction.

The system is spinning its wheels - two steps forward, two steps back (2 Pods are evicted, they get replaced, and the 2 replacement Pods need eviction again).

08:17:01 pods ranked for eviction: 
 metrics-server-6486d4db88-q2pkp_kube-system,
 kubernetes-dashboard-5bff5f8fb8-mnmxk_kube-system,
 myhigh35_default,
 myhigh15_default,
 kube-proxy-nrfkm_kube-system,

08:17:01 pod metrics-server-6486d4db88-q2pkp_kube-system is evicted successfully

- - - 

08:17:03 pods ranked for eviction: 
 kubernetes-dashboard-5bff5f8fb8-mnmxk_kube-system,
 myhigh35_default,
 myhigh15_default,
 kube-proxy-nrfkm_kube-system,

08:17:03 pod kubernetes-dashboard-5bff5f8fb8-mnmxk_kube-system is evicted successfully

- - - 

08:17:05 pods ranked for eviction: 
 myhigh35_default,
 myhigh15_default,
 kube-proxy-nrfkm_kube-system,

08:17:05 pod myhigh35_default is evicted successfully

Another 2 replacement Pods need eviction: kubernetes-dashboard, metrics-server.

08:17:49 pods ranked for eviction: 
 kubernetes-dashboard-5bff5f8fb8-x4r6t_kube-system,
 myhigh15_default,
 kube-proxy-nrfkm_kube-system,
 metrics-server-6486d4db88-7fs9w_kube-system,

08:17:50 pod kubernetes-dashboard-5bff5f8fb8-x4r6t_kube-system is evicted successfully

- - - 

08:17:53 pods ranked for eviction: 
 metrics-server-6486d4db88-7fs9w_kube-system,
 myhigh15_default,
 kube-proxy-nrfkm_kube-system,

08:17:53 pod metrics-server-6486d4db88-7fs9w_kube-system is evicted successfully

- - - 

08:17:55 pods ranked for eviction:
 myhigh15_default,
 kube-proxy-nrfkm_kube-system,

08:17:55 pod myhigh15_default is evicted successfully

During the next few minutes there was a brief moment when the available memory was just enough (by a mere 500 kilobytes).

uptime;source ./mem | tail -n1

 10:17:51 up 8 min,  1 user,  load average: 1.67, 0.88, 0.47
memory.available_in_mb 639

 10:18:53 up 9 min,  1 user,  load average: 0.67, 0.73, 0.44
memory.available_in_mb 649 .... on border of threshold 

 10:19:39 up 10 min,  1 user,  load average: 0.63, 0.72, 0.45
memory.available_in_mb 642

 10:20:22 up 11 min,  1 user,  load average: 1.29, 0.92, 0.54
memory.available_in_mb 626
Fri Feb  1 10:19:02 SAST 2019
  MemoryPressure   False   Fri, 01 Feb 2019 10:18:55 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
  
kubectl describe node minikube | grep MemoryPressure
Fri Feb  1 10:19:43 SAST 2019
  MemoryPressure   False   Fri, 01 Feb 2019 10:19:35 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
  
kubectl describe node minikube | grep MemoryPressure
Fri Feb  1 10:20:16 SAST 2019
  MemoryPressure   True    Fri, 01 Feb 2019 10:20:15 +0200   KubeletHasInsufficientMemory   kubelet has insufficient memory available

Delete the evicted Pods to make RAM available:

kubectl delete -f mylow35.yaml
pod "mylow35" deleted

kubectl delete -f mylow15.yaml
pod "mylow15" deleted

kubectl delete -f mymed35.yaml
pod "mymed35" deleted

kubectl delete -f mymed15.yaml
pod "mymed15" deleted

kubectl delete -f myhigh35.yaml
pod "myhigh35" deleted

kubectl delete -f myhigh15.yaml
pod "myhigh15" deleted

Still not enough RAM available:

memory.available_in_mb 623

I did not do a before and after, but it seems evicted Pods use very little RAM anyway.

Node now permanently under MemoryPressure:

kubectl describe node minikube | grep MemoryPressure
  MemoryPressure   True    Fri, 01 Feb 2019 10:22:26 +0200   KubeletHasInsufficientMemory   kubelet has insufficient memory available

A minute later: the kubelet STILL has insufficient memory available.

kubectl describe node minikube | grep MemoryPressure
  MemoryPressure   True    Fri, 01 Feb 2019 10:23:06 +0200   KubeletHasInsufficientMemory   kubelet has insufficient memory available

2 minutes later: the kubelet STILL has insufficient memory available.

  MemoryPressure   True    Fri, 01 Feb 2019 10:24:56 +0200   KubeletHasInsufficientMemory   kubelet has insufficient memory available

5 minutes later: the kubelet STILL has insufficient memory available.

  MemoryPressure   True    Fri, 01 Feb 2019 10:30:47 +0200   KubeletHasInsufficientMemory   kubelet has insufficient memory available

7 minutes later: the kubelet STILL has insufficient memory available.

  MemoryPressure   True    Fri, 01 Feb 2019 10:37:08 +0200   KubeletHasInsufficientMemory   kubelet has insufficient memory available

Over 15 minutes later, memory.available stays below the kubelet.eviction-hard="memory.available<650Mi" threshold.

memory.available_in_mb 624

5 minutes later: the kubelet STILL has insufficient memory available.

memory.available_in_mb 643

5 minutes later: the kubelet STILL has insufficient memory available.

memory.available_in_mb 642

6 minutes later: the kubelet STILL has insufficient memory available.

memory.available_in_mb 627

During the next 25 minutes, this is how the kubelet eviction manager tries to fix the low-RAM situation.

It repeatedly evicts non-critical Pods. However, those Pods get recreated automatically since they are Kubernetes system Pods.

Kubernetes seems to adapt somewhat to this situation, which is beyond any hope of repair.

Initially it evicts kube-proxy every 30 seconds. It then takes progressively longer to recreate the replacement Pods. After 30 minutes, one kube-proxy eviction-and-recreation cycle takes 14 minutes.

From https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy

Exited Containers that are restarted by the kubelet are restarted with an exponential back-off delay (10s, 20s, 40s …) capped at five minutes.

This behavior seems similar to the exponential back-off delay of exited containers.

08:21:19 pod kube-proxy-cgk9f_kube-system is evicted successfully
08:21:49 pod kube-proxy-7zql6_kube-system is evicted successfully
08:22:20 pod kube-proxy-2glmn_kube-system is evicted successfully
08:22:50 pod kube-proxy-qwjgw_kube-system is evicted successfully
08:23:20 pod kube-proxy-vj6sc_kube-system is evicted successfully
08:23:23 pod kube-addon-manager-minikube_kube-system is evicted successfully
08:23:53 pod coredns-576cbf47c7-s4mwb_kube-system is evicted successfully
08:23:57 pod coredns-576cbf47c7-r8f6v_kube-system is evicted successfully

08:24:11 pod kube-proxy-m6c8p_kube-system is evicted successfully
08:25:43 pod metrics-server-6486d4db88-jx556_kube-system is evicted successfully
08:25:46 pod kubernetes-dashboard-5bff5f8fb8-qmd9p_kube-system is evicted successfully
08:25:49 pod coredns-576cbf47c7-hn7g6_kube-system is evicted successfully
08:25:51 pod coredns-576cbf47c7-xk2vq_kube-system is evicted successfully

08:27:04 pod kube-proxy-zzcm2_kube-system is evicted successfully
08:32:38 pod kube-proxy-84thj_kube-system is evicted successfully
08:33:22 pod kubernetes-dashboard-5bff5f8fb8-mqxst_kube-system is evicted successfully
08:33:24 pod metrics-server-6486d4db88-n8rd7_kube-system is evicted successfully
08:33:26 pod coredns-576cbf47c7-ntg9b_kube-system is evicted successfully
08:33:30 pod coredns-576cbf47c7-mxl4p_kube-system is evicted successfully
08:33:33 pod coredns-576cbf47c7-5scvv_kube-system is evicted successfully

08:43:39 pod kube-proxy-qrbks_kube-system is evicted successfully
08:46:00 pod coredns-576cbf47c7-xc9c4_kube-system is evicted successfully
08:46:04 pod kubernetes-dashboard-5bff5f8fb8-7rmnh_kube-system is evicted successfully
08:46:06 pod metrics-server-6486d4db88-wl2vl_kube-system is evicted successfully
08:46:08 pod kube-addon-manager-minikube_kube-system is evicted successfully
08:46:38 pod coredns-576cbf47c7-z7dzr_kube-system is evicted successfully
08:46:41 pod coredns-576cbf47c7-qxgdr_kube-system is evicted successfully

These are the top processes by resident RAM usage:

  PID USER      PR  NI    VIRT    RES  %CPU  %MEM     TIME+ S COMMAND
 3278 root      20   0  544.2m 472.2m   5.9  22.1   2:52.31 S kube-apiserver
 3340 root      20   0  176.3m 125.1m   5.9   5.9   3:12.81 S kube-controller
 2856 root      20   0 1815.9m 103.0m   0.0   4.8   1:53.50 S kubelet
 3276 root      20   0   10.1g  80.9m   0.0   3.8   0:54.08 S etcd
 2429 root      20   0  784.0m  69.0m   0.0   3.2   0:46.09 S dockerd
 2224 root      20   0 1070.9m  36.4m   0.0   1.7   0:19.23 S containerd
 3393 root      20   0   50.2m  34.2m   0.0   1.6   0:56.39 S kube-scheduler
 2438 root      20   0 1701.0m  28.3m   0.0   1.3   0:18.84 S docker-containe

A 2200 MB RAM node minus around 1200 MB for the processes above leaves roughly 1000 MB available.

However, our script shows about 600 MB available. Let us call the missing (in-use) 400 MB Kubernetes and Linux system overhead.

These are the running Kubernetes services.

systemctl list-units --type=service --state=active

UNIT                                   LOAD   ACTIVE SUB     DESCRIPTION
containerd.service                     loaded active running containerd container runtime
docker.service                         loaded active running Docker Application Container Engine
kubelet.service                        loaded active running kubelet: The Kubernetes Node Agent

Restarting the kubelet does not fix the problem.

systemctl restart kubelet
memory.available_in_mb 621

kube-apiserver uses around 500 MB of RAM at startup, so it did not balloon out of control either.

Clean up: delete the remaining Pod:

kubectl delete -f myrampod7.yaml
pod "myram7" deleted

Eviction Thresholds Syntax

Reference : https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#eviction-thresholds

This tutorial specified memory thresholds using the binary suffix Mi, for example:

memory.available<600Mi

600 MiB = 629.1456 MB

You can also specify the threshold in megabytes (decimal suffix M):

memory.available<630M

Thresholds may also be specified as percentages; for example, a 12 GB node may have:

memory.available<10%
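
For example, all three forms can be passed to the kubelet the same way we did at the start of this tutorial (a hedged sketch; use only one form per signal):

minikube start --extra-config=kubelet.eviction-hard="memory.available<650Mi"
minikube start --extra-config=kubelet.eviction-hard="memory.available<680M"
minikube start --extra-config=kubelet.eviction-hard="memory.available<10%"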

Eviction Monitoring Interval

From https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/

The kubelet evaluates eviction thresholds per its configured housekeeping interval.

housekeeping-interval is the interval between container housekeepings.

From https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/

--housekeeping-interval duration

Interval between container housekeepings (default 10s)

This tutorial did not change this interval from its default value.
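
If you want to experiment with it, the flag can be passed to minikube the same way as the other kubelet settings in this tutorial (a hedged example; 5s is just an arbitrary test value):

minikube start --extra-config=kubelet.housekeeping-interval="5s" --extra-config=kubelet.eviction-hard="memory.available<650Mi"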

eviction-pressure-transition-period

From https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/

--eviction-pressure-transition-period duration
Duration for which the kubelet has to wait before transitioning out of an eviction pressure condition. (default 5m0s)

From https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#oscillation-of-node-conditions

eviction-pressure-transition-period is the duration for which the kubelet has to wait before transitioning out of an eviction pressure condition.

The kubelet would ensure that it has not observed an eviction threshold being met for the specified pressure condition for the period specified before toggling the condition back to false.

We deliberately set this value to 30 seconds so that we could quickly see transitions into and out of MemoryPressure.

30 seconds is probably too low a value for production use. The official Kubernetes default of 5 minutes seems a very good default.
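
On clusters where you manage a kubelet configuration file instead of command-line flags, the settings used in this tutorial would look roughly like this (a hedged sketch of a KubeletConfiguration fragment):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "650Mi"
evictionPressureTransitionPeriod: "30s"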

Evicting Guaranteed Pods

From https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#evicting-end-user-pods

Guaranteed pods and Burstable pods whose usage is beneath requests are evicted last.

(Pods are Guaranteed only when requests and limits are specified for all the containers and they are equal.)

Such pods are guaranteed to never be evicted because of another Pod's resource consumption.
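
For reference, a Guaranteed-QoS version of the stress Pod would set requests equal to limits for both CPU and memory; a minimal sketch (hypothetical, not used in this tutorial):

apiVersion: v1
kind: Pod
metadata:
  name: myguaranteed
spec:
  containers:
  - name: myram-container-1
    image: mytutorials/centos:bench
    imagePullPolicy: IfNotPresent
    command: ['sh', '-c', 'stress --vm 1 --vm-bytes 50M --vm-hang 3000 -t 3600']
    resources:
      # requests == limits for every resource => QoS class Guaranteed
      limits:
        cpu: "100m"
        memory: "100Mi"
      requests:
        cpu: "100m"
        memory: "100Mi"
  restartPolicy: Never
  terminationGracePeriodSeconds: 0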

Summary of the other conditions:

  • When the node only has Guaranteed Pods, or Burstable Pods using less than their requests, remaining
  • a node under RAM pressure MUST choose to evict such Pods
  • in that case the kubelet eviction manager evicts the Pods of lowest priority first

This tutorial did not test any Guaranteed Pods.

Devise a set of tests that exercise these eviction conditions.

It is much easier to do two tests that investigate two conditions each.

Do not attempt to test all these conditions in one complex test: the interactions will hide the simple cause-and-effect relationships you wish to observe.

You need to know (from experience) how Guaranteed Pods are handled: they are guaranteed.

To really develop your mastery of this topic, devise and run more tests where you:

  • investigate hard thresholds
  • investigate soft thresholds
  • set hard and soft thresholds in the same test run (see the sketch after this list)
  • use the 3 priority classes above
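
As a starting point for the combined hard-and-soft threshold test, the kubelet flags can be combined like this (a hedged sketch; a soft threshold must have a matching grace period):

minikube start --extra-config=kubelet.eviction-hard="memory.available<650Mi" --extra-config=kubelet.eviction-soft="memory.available<800Mi" --extra-config=kubelet.eviction-soft-grace-period="memory.available=60s" --extra-config=kubelet.eviction-pressure-transition-period="30s"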