Updated on 2025-09-19 GMT+08:00

Soft Binding of CPUs

Background

With NUMA affinity, if two VMs are heavily loaded while the other VMs are idle, the busy VMs cannot use the idle CPUs. Without NUMA affinity, performance deteriorates significantly, especially when the entire system is busy. Soft binding is provided to address this dilemma.

  • preferred CPU: CPUs that are used preferentially as long as their usage is lower than the threshold.
  • allowed CPU: CPUs from which cores are selected when the usage of preferred CPUs exceeds the threshold. Allowed CPUs come from the list of CPUs bound through sched_setaffinity or cgroup.
  • This soft binding solution also applies to containers.
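
The rule described above can be sketched as a small shell function. This is only an illustration of the documented behavior, not the scheduler's implementation. The function name pick_cpu_set and its two arguments are hypothetical, and boundary values of the threshold are not modeled.

```shell
# Illustrative only: mirrors the documented rule, not the kernel's logic.
# pick_cpu_set UTIL_PCT THRESHOLD_PCT   (hypothetical helper)
#   Prints "preferred" when the preferred CPU usage is lower than the
#   threshold, otherwise prints "allowed".
pick_cpu_set() {
    util_pct=$1
    threshold_pct=$2
    if [ "$util_pct" -lt "$threshold_pct" ]; then
        echo preferred
    else
        # Assumption: usage equal to the threshold falls back to allowed CPUs.
        echo allowed
    fi
}
```

For example, pick_cpu_set 40 85 selects the preferred CPUs, and pick_cpu_set 90 85 falls back to the allowed CPUs.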

To make the execution of soft binding observable, soft binding scheduling actions are counted. A switch is also provided so that soft binding can be enabled or disabled without stopping the servers.

Parameter Description

  • If soft binding is enabled, /sys/fs/cgroup/cpuacct/sub-cgroup/cpuacct.nr_select_allowed_cpus is added to a cgroup to collect statistics on how many times allowed cores are selected by tasks in the cgroup.
  • If soft binding is enabled, /sys/fs/cgroup/cpuacct/sub-cgroup/cpuacct.nr_select_prefer_cpus is added to a cgroup to collect statistics on how many times preferred cores are selected by tasks in the cgroup.
  • If soft binding is enabled, /sys/fs/cgroup/cpuacct/sub-cgroup/cpuacct.nr_smoothed_prefer_cpus is added to a cgroup to collect statistics on failures of switching from preferred cores to allowed cores due to a smoothing algorithm when tasks in the cgroup are selecting cores.
  • /proc/sys/kernel/sched_dynamic_affinity_disable is provided to disable soft binding. 0 indicates soft binding is enabled. 1 indicates soft binding is disabled.
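
Because the switch uses inverted semantics (0 means enabled), a small wrapper can make its state easier to read. The following is a minimal sketch. The function name soft_binding_state is hypothetical, and the optional argument exists only so that the function can be tried against a different file.

```shell
# soft_binding_state [SWITCH_FILE]   (hypothetical helper)
# Prints "enabled" for 0 and "disabled" for 1, as documented for
# /proc/sys/kernel/sched_dynamic_affinity_disable.
soft_binding_state() {
    switch_file=${1:-/proc/sys/kernel/sched_dynamic_affinity_disable}
    if [ ! -r "$switch_file" ]; then
        # The file is missing or unreadable, for example on a kernel
        # without soft binding support.
        echo "unavailable"
        return 1
    fi
    value=$(cat "$switch_file")
    case "$value" in
        0) echo enabled ;;
        1) echo disabled ;;
        *) echo "unknown ($value)"; return 1 ;;
    esac
}
```

Running soft_binding_state without arguments reads the switch at its documented path.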

How to Use

You can perform the following operations to configure soft binding.

  • Procedure
    1. Use /proc/$PID/task/$TID/preferred_cpuset or cpuset.preferred_cpus in /sys/fs/cgroup/cpuset/ to configure the soft binding CPU list for a process or cgroup.
    2. Use /proc/sys/kernel/sched_util_low_pct to set the usage threshold for preferred CPUs. If the usage is lower than the threshold, services select cores from preferred CPUs. Otherwise, they select cores from allowed CPUs. The preferred CPU usage can be calculated in two ways. For details, see the next step.
    3. Enable or disable DA_UTIL_TASKGROUP in /sys/kernel/debug/sched_features to determine whether to select cores based on the preferred CPU usage of a taskgroup or the total preferred CPU usage.
  • preferred_cpuset can be used to configure the CPU list for soft binding of a thread. A CPU list contains logical CPU IDs, separated by commas (,). For example, CPU{1,3,5,6,7} indicates logical CPU IDs 1,3,5,6,7. Consecutive numbers can be abbreviated in range format. For example, 5,6,7 can be abbreviated as 5-7.
    1. preferred_cpuset must be a subset of allowed cpuset.
    2. If preferred_cpuset is not set, is left empty, or is the same as the allowed cpuset, soft binding does not take effect.
      Checking the parameter:
      cat /proc/$PID/task/$TID/preferred_cpuset
      Example of assigning a value to the parameter:
      echo 5-7 > /proc/$PID/task/$TID/preferred_cpuset
  • cpuset.preferred_cpus in each directory of a cpuset cgroup can be used to configure soft binding for the cgroup. The parameter format is the same as that of preferred_cpuset for a thread.
    1. Currently, cpuset.preferred_cpus of a cgroup must be a subset of the allowed CPUs (cpuset.cpus).
    2. If cpuset.preferred_cpus is not set, is left empty, or is the same as cpuset.cpus, soft binding does not take effect.
    3. The value of cpuset.preferred_cpus in a directory has no impact on that in any other directory.
      Checking the parameter:
      cat /sys/fs/cgroup/cpuset/sub-cgroup/cpuset.preferred_cpus
      Example of assigning a value to the parameter:
      echo 5-7 > /sys/fs/cgroup/cpuset/sub-cgroup/cpuset.preferred_cpus
  • The value of sched_util_low_pct ranges from 0 to 100 (unit: %). The default value is 85. 0 indicates that no threshold applies, but idle preferred CPUs are still selected preferentially. 100 indicates that cores are selected from cpuset.cpus only when the usage exceeds the capacity of preferred_cpus.
    Checking the parameter:
    cat /proc/sys/kernel/sched_util_low_pct
    Example of assigning a value to the parameter:
    echo 90 > /proc/sys/kernel/sched_util_low_pct
  • DA_UTIL_TASKGROUP is added to /sys/kernel/debug/sched_features. By default, DA_UTIL_TASKGROUP is enabled. If it is enabled, core selection is determined by checking the usage of preferred_cpus for a taskgroup. If it is disabled, core selection is determined by checking the total usage of preferred_cpus (preferred CPUs used by non-taskgroup processes are also counted).
    Enable: echo DA_UTIL_TASKGROUP > /sys/kernel/debug/sched_features
    Disable: echo NO_DA_UTIL_TASKGROUP > /sys/kernel/debug/sched_features
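
Because a preferred CPU list must be a subset of the allowed CPU list, it can help to validate a list before writing it. The following is a minimal sketch under stated assumptions: expand_cpu_list and is_cpu_subset are hypothetical helpers, not system commands, and they assume a well-formed list of IDs and ranges in the format described above.

```shell
# expand_cpu_list LIST   (hypothetical helper)
# Expands a CPU list such as "1,3,5-7" into one CPU ID per line.
expand_cpu_list() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r first last; do
        # Skip empty entries, for example when LIST itself is empty.
        [ -n "$first" ] || continue
        # A single ID is treated as a range of length one.
        seq "$first" "${last:-$first}"
    done
}

# is_cpu_subset PREFERRED ALLOWED   (hypothetical helper)
# Succeeds when every CPU in PREFERRED also appears in ALLOWED.
is_cpu_subset() {
    allowed=" $(expand_cpu_list "$2" | tr '\n' ' ')"
    for cpu in $(expand_cpu_list "$1"); do
        case "$allowed" in
            *" $cpu "*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}
```

For example, is_cpu_subset "5-7" "0-7" succeeds, whereas is_cpu_subset "5-9" "0-7" fails. Note that the check only covers the subset rule. A preferred list that is empty or identical to the allowed list passes the check, but soft binding does not take effect in those cases.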

You can observe soft binding by checking how many times preferred or allowed CPUs are selected. You can also enable or disable soft binding as needed.

  • If soft binding is enabled, you can check and reset the statistics.
    Checking the parameters:
    cat /sys/fs/cgroup/cpuacct/sub-cgroup/cpuacct.nr_select_allowed_cpus # Number of times that allowed cores are selected
    cat /sys/fs/cgroup/cpuacct/sub-cgroup/cpuacct.nr_select_prefer_cpus # Number of times that preferred cores are selected
    cat /sys/fs/cgroup/cpuacct/sub-cgroup/cpuacct.nr_smoothed_prefer_cpus # Failures of switching from preferred cores to allowed cores due to a smoothing algorithm
    Resetting a parameter:
    echo 0 > /sys/fs/cgroup/cpuacct/sub-cgroup/cpuacct.nr_smoothed_prefer_cpus
  • You can run cat to check /proc/sys/kernel/sched_dynamic_affinity_disable or run echo to change its value.
    Checking the parameter: cat /proc/sys/kernel/sched_dynamic_affinity_disable # The value can be 0 or 1. 0 indicates that soft binding is enabled, and 1 indicates that it is disabled.
    Assigning a value to the parameter: echo 0 > /proc/sys/kernel/sched_dynamic_affinity_disable
  • /proc/$pid/task/$pid/selected_cpuset is added for each process so that you can check the CPUs selected for it.
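
The two selection counters can be combined into a single figure that shows how often tasks in a cgroup stayed on preferred CPUs. The following is a minimal sketch. The function name preferred_share is hypothetical, and the code assumes that each counter file contains a single integer.

```shell
# preferred_share CGROUP_DIR   (hypothetical helper)
# Prints the percentage of core selections that landed on preferred CPUs,
# using cpuacct.nr_select_prefer_cpus and cpuacct.nr_select_allowed_cpus.
# Assumption: each counter file holds one integer.
preferred_share() {
    dir=$1
    prefer=$(cat "$dir/cpuacct.nr_select_prefer_cpus") || return 1
    allowed=$(cat "$dir/cpuacct.nr_select_allowed_cpus") || return 1
    total=$((prefer + allowed))
    if [ "$total" -eq 0 ]; then
        echo "no selections recorded"
        return 0
    fi
    # Integer division, so the result is rounded down.
    echo "$((prefer * 100 / total))%"
}
```

For example, preferred_share /sys/fs/cgroup/cpuacct/sub-cgroup prints the share for the sub-cgroup used in the commands above. A low share suggests that the preferred CPUs are frequently above the usage threshold.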

If CPU subgroups are not configured for cgroup v1 or the target process is not in a CPU subgroup, cores will be selected based on the total usage of preferred_cpus regardless of whether DA_UTIL_TASKGROUP is enabled or disabled.

Constraints

  1. Only the root user can enable, disable, and configure soft binding. Because root has the highest privilege in the system, follow the operation guide when performing operations as root to avoid system management or security risks caused by improper operations.
  2. After preferred_cpuset is set, cores are selected only when a process is woken up or periodic load balancing is performed.
  3. Soft binding depends on the Per-Entity Load Tracking (PELT) algorithm when selecting preferred CPUs or allowed CPUs. The PELT calculation result is updated every 1 ms. If the load changes frequently, the selection between preferred CPUs and allowed CPUs is also performed frequently.