Container Service for Kubernetes (ACK) provides the topology-aware CPU scheduling
feature based on the new Kubernetes scheduling framework. This feature can improve
the performance of CPU-sensitive workloads. This topic describes how to enable topology-aware
CPU scheduling.
Prerequisites
- An ACK Pro cluster is created. For more information, see Create an ACK Pro cluster.
Notice Topology-aware CPU scheduling is available only for ACK Pro clusters. To enable topology-aware
CPU scheduling for ACK dedicated clusters, submit a ticket to apply to be added to the whitelist.
- ack-slo-manager is installed. For more information, see Usage notes.
Note ack-slo-manager is upgraded and optimized based on resource-controller. You must uninstall
resource-controller after you install ack-slo-manager.
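If ack-slo-manager was installed by using Helm, you can confirm the installation with a command such as the following (the release name may differ in your cluster; this is only an illustrative check):
helm list -A | grep ack-slo-manager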
Background information
Multiple pods can run on a node in a Kubernetes cluster, and some of the pods may
belong to CPU-intensive workloads. In this case, pods compete for CPU resources. When
the competition becomes intense, the CPU cores that are allocated to each pod may
frequently change. This situation intensifies when Non-Uniform Memory Access (NUMA)
nodes are used. These changes degrade the performance of the workloads. The Kubernetes
CPU manager provides a CPU scheduling solution to fix this issue within a node. However,
the Kubernetes CPU manager cannot find an optimal allocation of CPU cores within a
cluster. The Kubernetes CPU manager works only on guaranteed pods and does not apply
to other types of pods. In a guaranteed pod, each container is configured with requests
and limits on CPU resources, and the request and limit are set to the same value for
each container.
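For reference, the following is a minimal sketch of a pod in the Guaranteed QoS class. The pod name and image are placeholders and are not part of the examples in this topic; the point is that every container sets its requests equal to its limits for CPU and memory, which is the type of pod that the Kubernetes CPU manager handles.
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-demo # Hypothetical name, for illustration only.
spec:
  containers:
  - name: app
    image: nginx:latest
    resources:
      requests:
        cpu: 2        # Requests equal limits for every container, ...
        memory: 4Gi
      limits:
        cpu: 2        # ... so the pod falls into the Guaranteed QoS class.
        memory: 4Gi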
Topology-aware CPU scheduling applies to the following scenarios:
- The workload is compute-intensive.
- The application is CPU-sensitive.
- The workload runs on multi-core Elastic Compute Service (ECS) bare metal instances
with Intel CPUs or AMD CPUs.
To test the effect of topology-aware CPU scheduling, stress tests are performed on
two NGINX applications that each request 4 CPU cores and 8 GB of memory. One application
is deployed on an ECS bare metal instance with 104 Intel CPU cores, and the other is
deployed on an ECS bare metal instance with 256 AMD CPU cores. The results show that
application performance is improved by 22% to 43% when topology-aware CPU scheduling
is enabled. The following table shows the details.


| Performance metrics | Intel | AMD |
| --- | --- | --- |
| QPS | Improved by 22.9% | Improved by 43.6% |
| AVG RT | Reduced by 26.3% | Reduced by 42.5% |
When you enable topology-aware CPU scheduling for an application, you can also set cpu-policy to static-burst in the pod annotations to configure an automatic CPU core binding policy. This policy
is suitable for compute-intensive workloads: it reduces CPU core contention among processes
and cross-NUMA memory access, maximizes the utilization of fragmented CPU resources, and
optimizes resource allocation for compute-intensive workloads without changes to the hardware
or VM resources. This further improves CPU utilization.
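Only two pod annotations are involved. Both appear in the full examples later in this topic:
annotations:
  cpuset-scheduler: 'true'   # Enable topology-aware CPU scheduling.
  cpu-policy: 'static-burst' # Optional: configure the automatic CPU core binding policy.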
Limits
The following table describes the versions that are required for the system components.
| Component | Required version |
| --- | --- |
| Kubernetes | ≥ 1.18 |
| ack-slo-manager | ≥ 0.2.0 |
| Helm | ≥ 3.0 |
| Kernel and OS | Alibaba Cloud Linux 2, CentOS 7.6, and CentOS 7.7 |
Considerations
- Before you enable topology-aware CPU scheduling, make sure that ack-slo-manager is
deployed.
- When you enable topology-aware CPU scheduling, make sure that cpu-policy=none is configured for the node, which is the default kubelet CPU manager policy. You can verify this as shown after this list.
- If you want to constrain pod scheduling, add the nodeSelector field.
Notice Do not add the nodeName field. The pod scheduler cannot parse the nodeName field when topology-aware CPU scheduling is enabled.
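One way to check the current CPU manager policy of a node is to query the kubelet configuration through the API server. The following is a sketch only: the node name is a placeholder, and it assumes the kubelet exposes its configuration at the configz endpoint. Expect "cpuManagerPolicy":"none" before you enable topology-aware CPU scheduling.
# Query the kubelet configuration of the node and extract the CPU manager policy.
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/configz" | grep -o '"cpuManagerPolicy":"[^"]*"'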
Enable topology-aware CPU scheduling
To enable topology-aware CPU scheduling, set the annotations and containers parameters when you configure pods. Set the parameters in the following ways:
- annotations: Set cpuset-scheduler to true to enable topology-aware CPU scheduling.
- containers: Set resources.limits.cpu to an integer.
- Create a file named cal-pi.yaml by using the following template. You can use this file to create a pod with topology-aware
CPU scheduling enabled.
apiVersion: v1
kind: Pod
metadata:
  name: cal-pi
  annotations:
    cpuset-scheduler: 'true' # Enable topology-aware CPU scheduling.
    #cpu-policy: 'static-burst' # Configure automatic vCPU binding policies and improve the utilization of fragmented CPU resources.
spec:
  restartPolicy: Never
  containers:
  - image: registry.cn-zhangjiakou.aliyuncs.com/xianlu/java-pi
    name: cal-pi
    resources:
      requests:
        cpu: 4
      limits:
        cpu: 4 # Specify the value of resources.limits.cpu.
    env:
    - name: limit
      value: "20000"
    - name: threadNum
      value: "3000"
Notice When you enable topology-aware CPU scheduling, you can set cpu-policy to static-burst in the annotations section to configure automatic vCPU binding policies. To add this configuration, delete the # before cpu-policy.
- Create a file named go-demo.yaml by using the following template. You can use this file to create a Deployment with
topology-aware CPU scheduling enabled.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: go-demo
spec:
  replicas: 4
  selector:
    matchLabels:
      app: go-demo
  template:
    metadata:
      annotations:
        cpuset-scheduler: "true" # Enable topology-aware CPU scheduling.
        #cpu-policy: 'static-burst' # Configure automatic vCPU binding policies and improve the utilization of fragmented CPU resources.
      labels:
        app: go-demo
    spec:
      containers:
      - name: go-demo
        image: registry.cn-hangzhou.aliyuncs.com/polinux/stress/go-demo:1k
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 1
          limits:
            cpu: 4 # Specify the value of resources.limits.cpu.
Notice
- When you enable topology-aware CPU scheduling, you can set cpu-policy to static-burst in the annotations section to configure automatic vCPU binding policies. To add this configuration, delete the # before cpu-policy.
- Configure the annotations of the pod in the template.metadata section.
- Run the following command to create the pod and the Deployment:
kubectl create -f cal-pi.yaml -f go-demo.yaml
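To confirm that the workloads were created, you can query them by name, for example:
kubectl get pod cal-pi
kubectl get deployment go-demo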
Test the application performance
In this example, the following conditions apply:
- The Kubernetes version of the ACK Pro cluster is 1.20.
- Two cluster nodes are used in the test. One is used to perform stress tests. The other
runs the workload and serves as the tested machine.
- Run the following command to add a label to the tested machine. The label value must match the nodeSelector that is used in the NGINX Deployment (intel7 in this example):
kubectl label node 192.168.XX.XX policy=intel7
- Deploy the NGINX service on the tested machine.
- Use the following YAML templates to create resources for the NGINX service:
- service.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service-nodeport
spec:
  selector:
    app: nginx
  ports:
  - name: http
    port: 8000
    protocol: TCP
    targetPort: 80
    nodePort: 32257
  type: NodePort
- configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configmap
data:
  nginx_conf: |-
    user  nginx;
    worker_processes  4;
    error_log  /var/log/nginx/error.log warn;
    pid        /var/run/nginx.pid;
    events {
        worker_connections  65535;
    }
    http {
        include       /etc/nginx/mime.types;
        default_type  application/octet-stream;
        log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                          '$status $body_bytes_sent "$http_referer" '
                          '"$http_user_agent" "$http_x_forwarded_for"';
        access_log  /var/log/nginx/access.log  main;
        sendfile        on;
        #tcp_nopush     on;
        keepalive_timeout  65;
        #gzip  on;
        include /etc/nginx/conf.d/*.conf;
    }
- nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      annotations:
        #cpuset-scheduler: "true" # Topology-aware CPU scheduling is disabled by default.
      labels:
        app: nginx
    spec:
      nodeSelector:
        policy: intel7
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 4
            memory: 8Gi
          limits:
            cpu: 4
            memory: 8Gi
        volumeMounts:
        - mountPath: /etc/nginx/nginx.conf
          name: nginx
          subPath: nginx.conf
      volumes:
      - name: nginx
        configMap:
          name: nginx-configmap
          items:
          - key: nginx_conf
            path: nginx.conf
- Run the following command to create the resources that are provisioned for the NGINX
service:
kubectl create -f service.yaml -f configmap.yaml -f nginx.yaml
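To check that the NGINX pods are scheduled to the labeled node, you can, for example, run:
kubectl get pods -l app=nginx -o wide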
- Log on to the node that is used to perform stress tests and run the following commands to download wrk:
wget "https://caishu-oss.oss-cn-beijing.aliyuncs.com/wrk?versionId=CAEQEBiBgMCGk565xxciIDdiNzg4NWIzMzZhZTQ1OTlhYzZhZjFhNmQ2MDNkMzA2" -O wrk
chmod 755 wrk
mv wrk /usr/local/bin
- Run the following command to perform stress tests and record the test data:
taskset -c 32-45 wrk --timeout 2s -t 20 -c 100 -d 60s --latency http://<IP address of the tested machine>:32257
Expected output:
20 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 600.58us 3.07ms 117.51ms 99.74%
Req/Sec 10.67k 2.38k 22.33k 67.79%
Latency Distribution
50% 462.00us
75% 680.00us
90% 738.00us
99% 0.90ms
12762127 requests in 1.00m, 10.10GB read
Requests/sec: 212350.15
Transfer/sec: 172.13MB
- Run the following command to delete the NGINX Deployment:
kubectl delete deployment nginx-deployment
Expected output:
deployment "nginx-deployment" deleted
- Use the following YAML template to deploy an NGINX Deployment with topology-aware
CPU scheduling enabled:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      annotations:
        cpuset-scheduler: "true"
      labels:
        app: nginx
    spec:
      nodeSelector:
        policy: intel7
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 4
            memory: 8Gi
          limits:
            cpu: 4
            memory: 8Gi
        volumeMounts:
        - mountPath: /etc/nginx/nginx.conf
          name: nginx
          subPath: nginx.conf
      volumes:
      - name: nginx
        configMap:
          name: nginx-configmap
          items:
          - key: nginx_conf
            path: nginx.conf
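Save the template to a file and create the Deployment. Assuming the template is saved as nginx.yaml again, run:
kubectl create -f nginx.yaml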
- Run the following command to perform stress tests and record the test data for comparison:
taskset -c 32-45 wrk --timeout 2s -t 20 -c 100 -d 60s --latency http://<IP address of the tested machine>:32257
Expected output:
20 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 345.79us 1.02ms 82.21ms 99.93%
Req/Sec 15.33k 2.53k 25.84k 71.53%
Latency Distribution
50% 327.00us
75% 444.00us
90% 479.00us
99% 571.00us
18337573 requests in 1.00m, 14.52GB read
Requests/sec: 305119.06
Transfer/sec: 247.34MB
Compare the data of the two tests. The comparison indicates that the throughput of the
NGINX service is improved by about 43% after topology-aware CPU scheduling is enabled.
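This figure follows from the recorded Requests/sec values: (305119.06 − 212350.15) / 212350.15 ≈ 0.437, that is, an improvement of about 43.7%.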
Verify that the automatic vCPU binding policies improve performance
In this example, a CPU policy is configured for a workload that runs on a node with
64 vCPUs. After you configure automatic CPU core binding policies for an application
with topology-aware CPU scheduling enabled, the CPU usage can be further improved
by 7% to 8%.
- After the pod or Deployment is started, run the following command to query the pod:
kubectl get pods | grep cal-pi
Expected output:
NAME READY STATUS RESTARTS AGE
cal-pi-d**** 1/1 Running 1 9h
- Run the following command to query the log of the cal-pi-d**** application:
kubectl logs cal-pi-d****
Expected output:
computing Pi with 3000 Threads...computed the first 20000 digets of pi in 620892 ms!
the first digets are: 3.14159264
writing to pi.txt...
finished!
- Enable topology-aware CPU scheduling.
- Create a file named cal-pi.yaml by using the following template. You can use this file to create a pod with topology-aware
CPU scheduling enabled.
apiVersion: v1
kind: Pod
metadata:
  name: cal-pi
  annotations:
    cpuset-scheduler: 'true' # Enable topology-aware CPU scheduling.
    cpu-policy: 'static-burst' # Configure automatic vCPU binding policies and improve the utilization of fragmented CPU resources.
spec:
  restartPolicy: Never
  containers:
  - image: registry.cn-zhangjiakou.aliyuncs.com/xianlu/java-pi
    name: cal-pi
    resources:
      requests:
        cpu: 4
      limits:
        cpu: 4 # Specify the value of resources.limits.cpu.
    env:
    - name: limit
      value: "20000"
    - name: threadNum
      value: "3000"
- Create a file named go-demo.yaml by using the following template. You can use this file to create a Deployment with
topology-aware CPU scheduling enabled.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: go-demo
spec:
  replicas: 4
  selector:
    matchLabels:
      app: go-demo
  template:
    metadata:
      annotations:
        cpuset-scheduler: "true" # Enable topology-aware CPU scheduling.
        cpu-policy: 'static-burst' # Configure automatic vCPU binding policies and improve the utilization of fragmented CPU resources.
      labels:
        app: go-demo
    spec:
      containers:
      - name: go-demo
        image: registry.cn-hangzhou.aliyuncs.com/polinux/stress/go-demo:1k
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 1
          limits:
            cpu: 4 # Specify the value of resources.limits.cpu.
- Create the pod and the Deployment by using the updated YAML files.
- After the pod or Deployment is started, run the following command to query the pod. cal-pi is used in this example:
kubectl get pods | grep cal-pi
Expected output:
NAME READY STATUS RESTARTS AGE
cal-pi-e**** 1/1 Running 1 9h
- Run the following command to query the log of the cal-pi-e**** application:
kubectl logs cal-pi-e****
Expected output:
computing Pi with 3000 Threads...computed the first 20000 digets of pi in 571221 ms!
the first digets are: 3.14159264
writing to pi.txt...
finished!
Compare the log data with the log data in Step 2. The result shows that the performance of the pod that is configured with the CPU policy is improved by about 8%.
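This figure follows from the computation times in the two logs: the time to compute the first 20,000 digits drops from 620892 ms to 571221 ms, and (620892 − 571221) / 620892 ≈ 0.080, a reduction of about 8%.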