Alibaba Cloud Linux 2 with kernel version kernel-4.19.91-24.al7 or later supports the group identity feature. You can use this feature to assign different identities to CPU control groups (cgroups) and thereby prioritize the tasks in those cgroups.
Background information
- High-priority tasks have the minimum wakeup latency.
- Low-priority tasks do not affect the performance of high-priority tasks.
- The wakeup of low-priority tasks does not affect the performance of high-priority tasks.
- Low-priority tasks do not share hardware through simultaneous multithreading (SMT) and therefore have no negative impact on the performance of high-priority tasks.
How the group identity feature works
The group identity feature allows you to configure identities for CPU cgroups to prioritize tasks in the cgroups. The group identity feature relies on a dual red-black tree architecture. A low-priority red-black tree is added based on the red-black tree of the Completely Fair Scheduler (CFS) scheduling queue to store low-priority tasks.
Identity | Description |
---|---|
ID_HIGHCLASS |
Identifies a high-priority task. A high-priority task has more opportunities to preempt
resources than a low-priority task.
When the CFS schedules high-priority tasks, the following situations may occur:
|
ID_NORMAL |
Identifies a normal-priority task. A normal-priority task has more opportunities to
preempt resources than a low-priority task.
When the CFS schedules normal-priority tasks, the following situations may occur:
|
ID_UNDERCLASS |
Identifies a low-priority task.
When the CFS schedules low-priority tasks, the following situations may occur: If the peer SMT scheduler has run the |
- For tasks in cgroups of the same level, identity priorities take effect.
- For tasks in parent cgroups, identity priorities do not take effect. For tasks in child cgroups, identity priorities take effect.
- Tasks whose identities have the same priority compete for resources in compliance with CFS policies. Note that the runtime of tasks identified by ID_UNDERCLASS or ID_NORMAL may not reach the minimum value.
Identity | Description |
---|---|
ID_SMT_EXPELLER |
Identifies the SMT expeller. The SMT expeller evicts the tasks that are identified
by ID_UNDERCLASS from the peer CPU when the SMT scheduler runs.
|
ID_IDLE_SEEKER |
Specifies that when a task wakes up, the task attempts to find idle CPUs within the limits of scheduler policies. |
ID_IDLE_SAVER |
Used with the sched_idle_saver_wmark kernel parameter. You can use sched_idle_saver_wmark to set a water mark for CPU idle time. When a task identified by ID_IDLE_SAVER wakes up, the task attempts to find only an idle CPU whose idle time exceeds the
specified water mark.
|
Interfaces
- Interfaces used to configure identities
The group identity feature provides two interfaces for you to configure task identities: /sys/fs/cgroup/cpu/$cg/cpu.identity and /sys/fs/cgroup/cpu/$cg/cpu.bvt_warp_ns. The $cg variable indicates the child cgroup directory node where a task is located. Before you use the interfaces, take note of the following items:
- The cpu.bvt_warp_ns interface is a quick configuration interface. A value written to this interface is converted to a cpu.identity value.
- Both the cpu.identity and cpu.bvt_warp_ns interfaces can be used to change the identities of cgroups.
- After data is written to the cpu.identity interface, the last value written to the cpu.bvt_warp_ns interface is overwritten. This overwrite is not reflected in the cpu.bvt_warp_ns interface.
- After data is written to the cpu.bvt_warp_ns interface, the last value written to the cpu.identity interface is overwritten. This overwrite is not reflected in the cpu.identity interface.
- Use only one of the interfaces to configure task identities. We recommend that you do not configure both interfaces.
- If you are unfamiliar with operations related to the operating system kernel, we recommend that you do not use the cpu.identity interface.
Interface | Description |
---|---|
cpu.identity |
The default value is 0, which indicates the ID_NORMAL identity. The interface is a 5-bit field. The bits have the following meanings:
- If the interface is empty (no bit is set), it indicates the ID_NORMAL identity.
- Bit 0: indicates the ID_UNDERCLASS identity.
- Bit 1: indicates the ID_HIGHCLASS identity.
- Bit 2: indicates the ID_SMT_EXPELLER identity.
- Bit 3: indicates the ID_IDLE_SAVER identity.
- Bit 4: indicates the ID_IDLE_SEEKER identity.
For example, to set the identity of a cgroup to ID_HIGHCLASS and ID_IDLE_SEEKER, set bit 1 and bit 4 to 1 and the other bits to 0. This yields the binary value 10010, which is 18 in decimal. Then, run the echo 18 > /sys/fs/cgroup/cpu/$cg/cpu.identity command to write 18 to cpu.identity.
|
cpu.bvt_warp_ns |
The default value is 0, which indicates the ID_NORMAL identity. Valid values:
- 2: indicates the ID_SMT_EXPELLER, ID_IDLE_SEEKER, and ID_HIGHCLASS identities. The corresponding cpu.identity value is 22.
- 1: indicates the ID_HIGHCLASS and ID_IDLE_SEEKER identities. The corresponding cpu.identity value is 18.
- 0: indicates the ID_NORMAL identity. The corresponding cpu.identity value is 0.
- -1: indicates the ID_UNDERCLASS and ID_IDLE_SAVER identities. The corresponding cpu.identity value is 9.
- -2: indicates the ID_UNDERCLASS and ID_IDLE_SAVER identities. The corresponding cpu.identity value is 9.
|
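The bit arithmetic behind cpu.identity can be checked with a small shell helper before anything is written to the kernel. The following is a minimal sketch, not an official tool: the function names identity_value and warp_to_identity are hypothetical, and the bit positions and the cpu.bvt_warp_ns mapping are copied from the interface descriptions in this topic.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical. Bit positions and the
# cpu.bvt_warp_ns mapping follow the interface descriptions in this topic.

# Print the decimal cpu.identity value for the given identity names.
identity_value() {
    value=0
    for name in "$@"; do
        case "$name" in
            ID_NORMAL)       continue ;;   # no bit set
            ID_UNDERCLASS)   bit=0 ;;
            ID_HIGHCLASS)    bit=1 ;;
            ID_SMT_EXPELLER) bit=2 ;;
            ID_IDLE_SAVER)   bit=3 ;;
            ID_IDLE_SEEKER)  bit=4 ;;
            *) echo "unknown identity: $name" >&2; return 1 ;;
        esac
        value=$(( value | (1 << $bit) ))
    done
    echo "$value"
}

# Print the cpu.identity value that corresponds to a cpu.bvt_warp_ns value.
warp_to_identity() {
    case "$1" in
        2)     identity_value ID_SMT_EXPELLER ID_IDLE_SEEKER ID_HIGHCLASS ;;
        1)     identity_value ID_HIGHCLASS ID_IDLE_SEEKER ;;
        0)     identity_value ID_NORMAL ;;
        -1|-2) identity_value ID_UNDERCLASS ID_IDLE_SAVER ;;
        *)     echo "unsupported cpu.bvt_warp_ns value: $1" >&2; return 1 ;;
    esac
}

identity_value ID_HIGHCLASS ID_IDLE_SEEKER   # prints 18

# To apply the value (as root, with $cg set to an existing child cgroup):
#   echo "$(identity_value ID_HIGHCLASS ID_IDLE_SEEKER)" > /sys/fs/cgroup/cpu/$cg/cpu.identity
```

The write itself is left as a comment so that the helper can be run safely on any machine. Computing the value first makes it easier to confirm which identities a number such as 22 or 9 stands for.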
- Interfaces used to enable or disable scheduling features
You can run the following command to view the default settings of kernel scheduling features by using the sched_features interface:
cat /sys/kernel/debug/sched_features
The following table describes the scheduling features.
Scheduling feature | Description | Default value |
---|---|---|
ID_IDLE_AVG | This feature is used with the ID_IDLE_SAVER identity to count the runtime of ID_UNDERCLASS tasks towards idle time. This ensures that no CPUs remain idle when only ID_UNDERCLASS tasks are running and prevents resource waste. | ID_IDLE_AVG: indicates that the feature is enabled. |
ID_RESCUE_EXPELLEE | This feature is used in load balancing scenarios. If tasks cannot find CPU resources available for use, CPUs that are evicting ID_UNDERCLASS tasks are load-balanced. This feature helps ID_UNDERCLASS tasks get out of the evicted state as soon as possible. | ID_RESCUE_EXPELLEE: indicates that the feature is enabled. |
ID_EXPELLEE_NEVER_HOT | After this feature is enabled, when a task that is being evicted decides to migrate to another CPU, hot cache does not cause the migration request to be denied. This feature helps ID_UNDERCLASS tasks get out of the evicted state as soon as possible. | NO_ID_EXPELLEE_NEVER_HOT: indicates that the feature is disabled. |
ID_LOOSE_EXPEL | After this feature is enabled, CPUs do not update their eviction states every time they select tasks but have the states automatically updated at the interval specified by the sched_expel_update_interval kernel parameter. The configuration of this feature affects only state updates when CPUs select tasks, not the updates of IPI interrupts. | NO_ID_LOOSE_EXPEL: indicates that the feature is disabled. |
ID_LAST_HIGHCLASS_STAY | After this feature is enabled, the last ID_HIGHCLASS task that runs on a CPU cannot be migrated to another CPU. | ID_LAST_HIGHCLASS_STAY: indicates that the feature is enabled. |
- Interfaces used by sysctl to configure kernel parameters
Some capabilities of the group identity feature depend on the values of kernel parameters. The following table describes the parameters.
Kernel parameter | Description | Unit | Default value |
---|---|---|---|
/proc/sys/kernel/sched_expel_update_interval | The interval at which the eviction state is automatically updated when a CPU selects tasks. This parameter is valid only when the ID_LOOSE_EXPEL feature is enabled. | ms | 10 |
/proc/sys/kernel/sched_expel_idle_balance_delay | The minimum idle balance interval when a CPU is evicting tasks. A value of -1 indicates that idle balance is not allowed. If only ID_UNDERCLASS tasks exist on a CPU and the tasks are being evicted, the CPU is idle. Idle balance is performed on this CPU to improve load-balancing effects. However, this may negatively affect ID_UNDERCLASS tasks. You can set the sched_expel_idle_balance_delay parameter to alleviate this issue. | ms | -1 |
/proc/sys/kernel/sched_idle_saver_wmark | The water mark for CPU idle time. When an ID_IDLE_SAVER task wakes up, the task attempts to find an idle CPU whose idle time exceeds the specified water mark. | ns | 0 |
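To check the state of a single scheduling feature, the output of sched_features can be scanned for the feature name or its NO_ prefixed form. The following is a minimal sketch: the function name feature_state is hypothetical, and it assumes that the file lists features as whitespace-separated tokens, as the default values in the table above suggest. The commented commands rely on the standard Linux behavior of toggling a feature by writing its name (or NO_ plus its name) to sched_features, and on the standard mapping between /proc/sys/kernel/ entries and sysctl keys. The value 20 is only an illustration.

```shell
#!/bin/sh
# Sketch only: feature_state is a hypothetical helper.
# $1 = feature name, $2 = file to read (defaults to the debugfs interface).
feature_state() {
    name=$1
    file=${2:-/sys/kernel/debug/sched_features}
    # The file is assumed to hold whitespace-separated NAME / NO_NAME tokens.
    for token in $(cat "$file"); do
        case "$token" in
            "$name")    echo enabled;  return 0 ;;
            "NO_$name") echo disabled; return 0 ;;
        esac
    done
    echo unknown
    return 1
}

# Example usage (as root):
#   feature_state ID_LOOSE_EXPEL
#   echo ID_LOOSE_EXPEL > /sys/kernel/debug/sched_features      # enable
#   echo NO_ID_LOOSE_EXPEL > /sys/kernel/debug/sched_features   # disable
#   sysctl -w kernel.sched_expel_update_interval=20             # illustrative value, in ms
```

The optional second argument lets you run the helper against a saved copy of the file, which is useful when you compare settings across instances.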
Information output
cat /proc/sched_debug
The following table describes the output parameters.
Parameter | Description |
---|---|
nr_high_running |
The number of ID_HIGHCLASS tasks that are running on the current CPU.
|
nr_under_running |
The number of ID_UNDERCLASS tasks that are running on the current CPU.
|
nr_expel_immune |
The number of non-ID_UNDERCLASS tasks that are running on the current CPU.
|
smt_expeller |
Indicates whether ID_SMT_EXPELLER tasks are running on the current CPU. A value of 1 indicates that ID_SMT_EXPELLER
tasks are running on the current CPU. A value of 0 indicates that no ID_SMT_EXPELLER
tasks are running on the current CPU.
|
on_expel |
Indicates whether ID_SMT_EXPELLER tasks are running on the peer SMT scheduler. A value of 1 indicates that ID_SMT_EXPELLER
tasks are running on the peer SMT scheduler. A value of 0 indicates that no ID_SMT_EXPELLER
tasks are running on the peer SMT scheduler.
|
high_exec_sum |
The cumulative runtime of ID_HIGHCLASS tasks on the current CPU.
|
under_exec_sum |
The cumulative runtime of ID_UNDERCLASS tasks on the current CPU.
|
h_nr_expel_immune |
The number of non-ID_UNDERCLASS tasks that are running on cfs_rq.
|
expel_start |
The difference between the minimum vruntimes of the two red-black trees when the CPU starts to evict tasks. |
expel_spread |
The cumulative difference between the minimum vruntimes of the two red-black trees caused by CPU eviction states. |
min_under_vruntime |
The minimum vruntime of the low-priority red-black tree. |
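Because /proc/sched_debug prints many unrelated scheduler fields, it can help to filter the output down to the group identity parameters listed in the table above. The following is a minimal sketch: the function name identity_fields is hypothetical, and it assumes that each parameter appears on its own line and that per-CPU sections start with a line beginning with cpu#.

```shell
#!/bin/sh
# Sketch only: identity_fields is a hypothetical helper.
# Keeps per-CPU header lines and the group identity parameters from the table.
# Reads the files given as arguments, or standard input if none are given.
identity_fields() {
    grep -E '^cpu#|nr_high_running|nr_under_running|nr_expel_immune|smt_expeller|on_expel|high_exec_sum|under_exec_sum|expel_start|expel_spread|min_under_vruntime' "$@"
}

# Example usage:
#   identity_fields /proc/sched_debug
```

The nr_expel_immune pattern also matches h_nr_expel_immune, so both counters are kept.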
FAQ
How do I upgrade the kernel version from kernel-4.19.91-25.1.al7 to kernel-4.19.91-25.6.al7 or later?
- Log on to the instance.
For more information, see Connect to a Linux instance by using a password or key.
- Run the following command to query the kernel version:
uname -r
- Run the following command to upgrade the kernel version:
yum update kernel
- Run the following command to restart the instance to make the new kernel version take
effect:
reboot
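After the instance restarts, you can compare the running kernel with the target version instead of reading the uname -r output by eye. The following is a minimal sketch: the function name kernel_at_least is hypothetical, it relies on the version sort (-V) option of GNU sort, and it assumes that uname -r reports the version without the kernel- prefix, for example 4.19.91-25.6.al7.x86_64.

```shell
#!/bin/sh
# Sketch only: kernel_at_least is a hypothetical helper.
# $1 = release to check (for example, the output of uname -r)
# $2 = minimum required release
# Succeeds if $1 sorts at or after $2 in GNU sort's version order.
kernel_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Example usage:
#   if kernel_at_least "$(uname -r)" 4.19.91-25.6.al7; then
#       echo "kernel is new enough"
#   else
#       echo "kernel upgrade still required"
#   fi
```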