HCE-specific Kernel Parameters
Compared with CentOS 8, HCE 2.0 has some custom kernel parameters.
Parameters in the /proc/sys/kernel Directory
The following parameters are from the files in the /proc/sys/kernel directory.
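All of the parameters below are regular sysctl entries, so they can be inspected and changed in the usual way. A minimal sketch (using kernel.numa_balancing_scan_delay_ms from the first table as the example; the persistence file name /etc/sysctl.d/99-hce.conf is arbitrary):

```bash
# Read a value (equivalent to: cat /proc/sys/kernel/numa_balancing_scan_delay_ms)
sysctl kernel.numa_balancing_scan_delay_ms

# Change it for the running kernel only (lost on reboot)
sysctl -w kernel.numa_balancing_scan_delay_ms=2000

# Persist across reboots, assuming a standard sysctl.d layout
echo 'kernel.numa_balancing_scan_delay_ms = 2000' >> /etc/sysctl.d/99-hce.conf
sysctl --system
```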
Task scan
Automatic NUMA balancing scans a task's address space and unmaps pages to detect whether pages are properly placed or whether the data should be migrated to a memory node local to where the task is running. After each scan delay, the task scans the next scan-size chunk of pages in its address space. When the end of the address space is reached, the scanner restarts from the beginning.
The scan delay and scan size determine the scan rate: when the scan delay decreases, the scan rate increases. The scan delay, and therefore the scan rate, of every task is adaptive and depends on historical behavior: if pages are properly placed, the scan delay increases; otherwise, it decreases. The scan size is not adaptive, but a larger scan size means a higher scan rate.
A higher scan rate incurs higher system overhead, because page faults must be trapped and data must be migrated. However, the higher the scan rate, the faster a task's memory is migrated to the local node, which minimizes the performance impact of remote memory accesses when the workload pattern changes. The following parameters control the thresholds for the scan delay and the number of pages scanned.
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.numa_balancing_scan_delay_ms | Specifies the starting scan delay used for a task when it initially forks. | The default value is 1000, in milliseconds. |
| kernel.numa_balancing_scan_period_max_ms | Specifies the maximum time to scan a task's virtual memory. It effectively controls the minimum scan rate for each task. | The default value is 60000, in milliseconds. |
| kernel.numa_balancing_scan_period_min_ms | Specifies the minimum time to scan a task's virtual memory. It effectively controls the maximum scan rate for each task. | The default value is 1000, in milliseconds. |
| kernel.numa_balancing_scan_size_mb | Specifies how many megabytes of a task's address space are scanned in each pass. | The default value is 256, in MB. |
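Since the effective scan rate follows from the scan size divided by the scan delay, lowering the minimum scan period raises the maximum scan rate. A hedged tuning sketch (the values are illustrative, not recommendations):

```bash
# Halve the minimum scan period, doubling the maximum scan rate,
# while keeping the per-pass scan size at its default.
sysctl -w kernel.numa_balancing_scan_period_min_ms=500
sysctl -w kernel.numa_balancing_scan_size_mb=256
```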
CFS
The Completely Fair Scheduler (CFS) uses nanosecond granularity accounting and does not rely on any jiffies or other HZ detail. Thus, it has no notion of "time slices" in the way the previous scheduler had, and has no heuristic algorithms. There is only one central tunable (you have to switch on CONFIG_SCHED_DEBUG):
/proc/sys/kernel/sched_min_granularity_ns
Due to its design, CFS is not prone to any of the "attacks" that exist today against the heuristics of conventional schedulers (such as fiftyp.c, thud.c, chew.c, ring-test.c, and massive_intr.c): these programs all work fine, do not impact interactivity, and produce the expected behavior.
CFS has a much stronger handling of nice levels and SCHED_BATCH than the previous vanilla scheduler: both types of workloads are isolated much more aggressively. SMP load balancing has been reworked/sanitized: the runqueue-walking assumptions are gone from the load-balancing code now, and iterators of the scheduling modules are used. As a result, the load balancing code becomes simpler.
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.sched_min_granularity_ns | Tunes the scheduler from "desktop" (low latency) to "server" (good batch processing) workloads. The default value is suitable for desktop workloads. SCHED_BATCH is handled by the CFS scheduler module too. | The default value is 3000000, in nanoseconds. |
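A sketch of shifting this tunable toward a "server" profile (assumes the kernel was built with CONFIG_SCHED_DEBUG so the file is exposed; the 10 ms value is illustrative):

```bash
# Check that the tunable exists, then raise the minimum granularity
# to favor batch throughput over desktop latency.
sysctl kernel.sched_min_granularity_ns
sysctl -w kernel.sched_min_granularity_ns=10000000   # 10 ms
```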
Fault locating
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.net_res_debug_enhance | When a large number of packets are sent and received, kernel stack resources may run out or exceed their thresholds. As a result, user-mode socket and send interfaces fail to return, or packets are lost. If this option is enabled, fault locating information is recorded in the system log. The value 1 indicates that this option is enabled, and 0 indicates that it is disabled. | The default value is 0. |
OOM event fault locating
Service software or the OS may run out of memory for some reason, triggering an out-of-memory (OOM) event. Kbox can record the time when an OOM event occurs, details about the OOM process, and system process information to the storage device, facilitating fault locating.
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.oom_enhance_enable | Controls whether enhanced OOM information printing is enabled. The value 1 indicates that this option is enabled, and 0 indicates that it is disabled. | The default value is 1. |
| kernel.oom_print_file_info | Controls whether file system information is printed. The value 1 indicates that file system information is printed, and 0 indicates that it is not. | The default value is 1. |
| kernel.oom_show_file_num_in_dir | Specifies the number of files per directory to include in the printed file system information. | The default value is 10. |
SMT expeller
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.qos_offline_wait_interval_ms | Specifies the sleep time (in milliseconds) of offline tasks before they return to user mode when the system is overloaded. | The value ranges from 100 to 1000, and the default value is 100. |
| kernel.qos_overload_detect_period_ms | Specifies a period of time, in milliseconds. If an online task has occupied a CPU's resources for longer than this period, the process of resolving priority inversion is triggered. | The value ranges from 100 to 100000, and the default value is 5000. |
The two parameters are new openEuler kernel parameters. For details, see SMT Expeller Free of Priority Inversion in the openEuler Technical White Paper.
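A hedged example of adjusting both parameters within their documented ranges (the values are illustrative):

```bash
# Let offline tasks sleep 200 ms before returning to user mode, and
# treat 10 s of continuous online-task CPU occupation as overload.
sysctl -w kernel.qos_offline_wait_interval_ms=200      # range: 100-1000
sysctl -w kernel.qos_overload_detect_period_ms=10000   # range: 100-100000
```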
Compute statistics
Due to factors such as turbo frequency, tuning, SMT, and big and small cores on the same node, the CPU usage collected by the cpuacct subsystem cannot reflect the compute power actually consumed. The compute power represented by the same CPU usage can differ by more than 30% across nodes. Compute statistics resolves the problem that the duty cycle cannot reflect actual CPU compute usage: a cgroup's CPU usage derived from actual compute power represents service performance metrics better than the duty cycle does. Compute statistics mainly accounts for the impact of SMT.
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.normalize_capacity.sched_normalize_util | Controls whether to dynamically enable CPU normalization. The value 0 indicates that this option is disabled, and 1 indicates that it is enabled. | The default value is 0. |
| kernel.normalize_capacity.sched_single_task_factor | Specifies the compute coefficient of a logical core running alone in hyper-threading scenarios. The value ranges from 1 to 100. A larger value indicates higher compute power for the logical core. | The default value is 100. |
| kernel.normalize_capacity.sched_multi_task_factor | Specifies the compute coefficient of logical cores running in parallel in hyper-threading scenarios. The value ranges from 1 to 100. A larger value indicates higher compute power for the logical cores. | The default value is 60. |
| kernel.normalize_capacity.sched_normalize_adjust | Read/write interface. Controls whether to dynamically enable compute compensation. The value 0 indicates that this option is disabled, and 1 indicates that it is enabled. | The default value is 0. |
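A sketch of enabling CPU normalization with the documented SMT coefficients (the factors shown are simply the defaults, restated for clarity):

```bash
# Rate a lone logical core at 100 and parallel SMT siblings at 60 each,
# then turn on dynamic CPU normalization.
sysctl -w kernel.normalize_capacity.sched_single_task_factor=100
sysctl -w kernel.normalize_capacity.sched_multi_task_factor=60
sysctl -w kernel.normalize_capacity.sched_normalize_util=1
```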
Usage of preferred CPU
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.sched_util_low_pct | Specifies the usage threshold of the preferred CPUs. When preferred CPU usage is lower than the threshold, tasks select cores only from the preferred CPUs; once usage exceeds the threshold, tasks can also select cores from the allowed CPUs. | The default value is 85. |
Watchdog detection period
The Linux kernel can act as a watchdog to detect both soft and hard lockups. If a lockup incorrectly resets the watchdog, the watchdog becomes ineffective. This enhancement records or displays valid information in that case, helping R&D and O&M personnel quickly locate faults. A sysctl parameter is added for the watchdog detection period, so the detection and alarm log printing periods can be dynamically adjusted based on product requirements.
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.watchdog_enhance_enable | Controls all watchdog enhancement functions. The value 0 indicates that the functions are disabled, and 1 indicates that they are enabled. | The value can be 0 or 1, and the default value is 1. |
| kernel.watchdog_softlockup_divide | Adjusts the watchdog detection interval, in seconds. The interval equals kernel.watchdog_thresh × 2 / kernel.watchdog_softlockup_divide. | The value ranges from 1 to 60, and the default value is 5. |
| kernel.watchdog_print_period | Specifies the interval (in seconds) for printing process information after the watchdog detects that a process is not scheduled. | The value ranges from 1 to 60, and the default value is 10. |
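As a worked example of the formula above (assuming the upstream default kernel.watchdog_thresh of 10 seconds): with the default divide of 5, the detection interval is 10 × 2 / 5 = 4 seconds.

```bash
# Inspect both inputs to the formula.
sysctl kernel.watchdog_thresh kernel.watchdog_softlockup_divide

# Shorten the detection interval to 2 seconds: 10 * 2 / 10 = 2.
sysctl -w kernel.watchdog_softlockup_divide=10
```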
CPU QoS interface compatibility
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.sched_qos_level_0_has_smt_expell | Controls whether CPU QoS interface compatibility is enabled after the number of priority levels is increased from 2 to 5 for CCE clusters that use HCE. The value 0 indicates that this option is disabled, meaning there are five priority levels. If compatibility is required, set the value to 1 so that the semantics of priority 0 remain unchanged when both online and offline workloads exist. | The default value is 0. |
Dynamic adjustment of frequency
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.actual_hz | Dynamically adjusts the timer frequency. The value 0 indicates that this option is disabled and the kernel's original 1000 HZ is used. | The value ranges from 0 to 1000, and the default value is 0. |
Kernel scheduler
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.sched_latency_ns | Defines the initial value for the scheduler period, a period of time during which all runnable tasks should be allowed to run at least once. | The default value is 24000000, in nanoseconds. |
| kernel.sched_migration_cost_ns | Specifies the amount of time after the last execution during which a task is considered "cache hot" in migration decisions. A "hot" task is less likely to be migrated, so increasing this variable reduces task migrations. | The default value is 500000, in nanoseconds. |
| kernel.sched_nr_migrate | If a SCHED_OTHER task spawns a large number of other tasks, they all run on the same CPU. The migration task or softirq tries to balance these tasks so that they can run on idle CPUs. This option specifies the number of tasks moved at a time. | The default value is 32. |
| kernel.sched_tunable_scaling | Controls whether the scheduler can adjust sched_latency_ns. The adjustment is based on the number of CPUs and increases logarithmically or linearly, as implied by the available values, because with more CPUs there is an apparent reduction in perceived latency. | 0: no adjustment; 1: logarithmic adjustment; 2: linear adjustment. The default value is 1. |
| kernel.sched_wakeup_granularity_ns | Specifies the preemption granularity when tasks wake up. | The default value is 4000000, in nanoseconds. |
| kernel.sched_autogroup_enabled | Controls whether to enable autogroup scheduling. By default, it is disabled. If enabled, all members of an autogroup belong to the same kernel scheduler task group. The CFS scheduler uses an algorithm to evenly allocate CPU clock cycles among task groups. | 0: disabled; 1: enabled. The default value is 0. |
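These are the standard upstream CFS tunables, so conventional tuning patterns apply. A hedged latency-oriented example (values are illustrative only):

```bash
# Shorten the scheduler period, make wakeup preemption easier, and keep
# "cache hot" tasks on their CPU a little longer.
sysctl -w kernel.sched_latency_ns=12000000
sysctl -w kernel.sched_wakeup_granularity_ns=2000000
sysctl -w kernel.sched_migration_cost_ns=1000000
```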
Dynamic affinity in scheduler
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.sched_dynamic_affinity_disable | Controls whether to disable dynamic affinity in the scheduler. The value 0 indicates that dynamic affinity is enabled, and 1 indicates that it is disabled. | The default value is 0. |
CPU QoS priority-based load balancing
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.sched_prio_load_balance_enabled | Specifies whether to enable CPU QoS priority-based load balancing. The value 0 indicates that this option is disabled, and 1 indicates that it is enabled. | The value is 0 or 1, and the default value is 0. |

For details about these new openEuler kernel options, see CPU QoS Priority-based Load Balancing in the openEuler Technical White Paper.
Core suspension detection
CPU core suspension is a special issue: when it occurs, the CPU core can neither execute instructions nor respond to interrupt requests, so kernel tests cannot cover it, and chip vendors need a simulator to locate the root cause. To improve fault locating efficiency, core suspension detection is provided in the kernel to check whether a core is suspended.
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.corelockup_thresh | If the threshold is set to x and a CPU receives neither hrtimer nor NMI interrupts for x consecutive checks, the CPU core is considered suspended. | Default value: 5 |
Idle polling control
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.halt_poll_threshold | If idle polling is enabled, the guest OS kernel polls for this duration before entering the idle state, avoiding a VM-exit. kernel.halt_poll_threshold determines the idle polling duration. During idle polling, task scheduling does not incur extra inter-processor interrupt (IPI) overhead. | Value range: 0 to max(uint64). Default value: 0 |
Reset failures caused by uncorrectable errors (UCEs)
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.machine_check_safe | Ensure that CONFIG_ARCH_HAS_COPY_MC is enabled in the kernel. If /proc/sys/kernel/machine_check_safe is set to 1, machine check safe handling is enabled. If it is set to 0, it is disabled. Other values are invalid. | Values: 0 or 1. Default value: 1 |
Printing source IP addresses upon network packet checksum errors
When hardware errors occur or networks are under attack, the kernel receives network packets with checksum errors and discards them, so the sources of the packets cannot be located. This feature helps locate faults in this case: after a checksum check fails, the source IP addresses of the packets are printed in the system logs.
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.net_csum_debug | The value 1 indicates that source IP address printing is enabled, and 0 indicates that it is disabled. This parameter is only available on Arm. | The default value is 0. |
Cluster scheduling
| Parameter | Description | Value |
| --- | --- | --- |
| kernel.sched_cluster | The value 1 indicates that cluster scheduling is enabled, and 0 indicates that it is disabled. | The default value is 1. |
Parameters in the /proc/sys/net Directory
The following parameters are from the files in the /proc/sys/net directory.
TCP socket buffer control
| Parameter | Description | Value |
| --- | --- | --- |
| net.ipv4.tcp_rx_skb_cache | Controls a per-TCP-socket cache of one SKB, which may help improve the performance of some workloads. This option can be dangerous on systems with a large number of TCP sockets because it increases memory usage. | The default value is 0, indicating that this option is disabled. |
Network namespace control
| Parameter | Description | Value |
| --- | --- | --- |
| net.netfilter.nf_namespace_change_enable | If this option is set to 0, netfilter parameters are read-only in any network namespace other than the initial one. | The default value is 0. |
Querying VF information and displaying the broadcast address
| Parameter | Description | Value |
| --- | --- | --- |
| net.core.vf_attr_mask | Determines whether to display the broadcast address when netlink is used to query VF link information (for example, by the ip link show command). The value 1 indicates that the broadcast address is displayed, which is the same behavior as in the community kernel. | The default value is 1. |
Custom TCP retransmission rules
TCP packet retransmission in EulerOS follows the exponential backoff principle. On a low-quality network, this leads to a low packet arrival rate and high latency. To address this issue, HCE allows custom TCP retransmission rules through APIs. Users can specify the number of linear backoff attempts, the maximum number of retransmissions, and the maximum retransmission interval to increase the packet arrival rate and reduce latency on a low-quality network.
| Parameter | Description | Value |
| --- | --- | --- |
| net.ipv4.tcp_sock_retrans_policy_custom | Determines whether custom TCP retransmission rules can be configured. The value 0 indicates that custom rules cannot be configured, and 1 indicates that they can be configured. | The default value is 0. |
IPv6 re-path to avoid congestion
Cloud DCN networks have various paths and use Equal Cost Multi Path (ECMP) to balance loads and reduce conflicts. With dynamic bursts or fluctuating flow sizes, load imbalance and congestion hotspots can still occur.
IPv6 re-path balances network loads through device-side congestion detection and path switching to improve service flow throughput and reduce transmission latency. The sender side dynamically detects the network congestion status and evaluates whether path switching is necessary. If it is, the sender performs re-path to switch to a lightly loaded path, avoiding network congestion hotspots.
| Parameter | Description | Value |
| --- | --- | --- |
| net.ipv6.tcp_repath_cong_thresh | Number of consecutive congestion rounds allowed during idle hours. If the number is exceeded, path switching is performed. | Value range: 1 to 8192. Default value: 10 |
| net.ipv6.tcp_repath_enabled | The value 0 indicates that IPv6 re-path is disabled, and 1 indicates that it is enabled. | Values: 0 or 1. Default value: 0 |
| net.ipv6.tcp_repath_idle_rehash_rounds | Number of consecutive congestion rounds allowed during non-idle hours. If the number is exceeded, path switching is performed. | Value range: 3 to 31. Default value: 3 |
| net.ipv6.tcp_repath_rehash_rounds | Percentage of lost packets allowed in a round. If the threshold is exceeded, the round is considered congested. | Value range: 3 to 31. Default value: 3 |
| net.ipv6.tcp_repath_times_limit | Maximum number of path switching operations per second. | Value range: 1 to 10. Default value: 2 |
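A minimal sketch of turning the feature on within the documented ranges (the values are illustrative):

```bash
# Enable IPv6 re-path and allow up to 5 path switches per second.
sysctl -w net.ipv6.tcp_repath_enabled=1
sysctl -w net.ipv6.tcp_repath_times_limit=5   # range: 1-10
```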
Parameters in the /proc/sys/vm Directory
The following parameters are from the files in the /proc/sys/vm directory.
Periodic reclamation
| Parameter | Description | Value |
| --- | --- | --- |
| vm.cache_reclaim_s | Specifies the interval for periodically reclaiming memory. When periodic memory reclamation is enabled, memory is reclaimed every cache_reclaim_s seconds. | The default value is 0. |
| vm.cache_reclaim_weight | Speeds up page cache reclamation. When periodic memory reclamation is enabled, the amount of memory reclaimed each time is: reclaim_amount = cache_reclaim_weight × SWAP_CLUSTER_MAX × nr_cpus_node(nid). A workqueue is used to reclaim memory; if the reclamation task is time-consuming, subsequent work is blocked, which may affect time-sensitive work. | The default value is 1. |
| vm.cache_reclaim_enable | Specifies whether to enable periodic memory reclamation. | The default value is 1. |
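As a worked example of the formula (assuming the upstream SWAP_CLUSTER_MAX of 32 pages and a node with 4 CPUs): with cache_reclaim_weight=1, each pass reclaims 1 × 32 × 4 = 128 pages, i.e. 512 KB with 4 KB pages. A sketch of enabling a 60-second cycle:

```bash
# Reclaim page cache once a minute; the per-pass amount follows
# reclaim_amount = weight * SWAP_CLUSTER_MAX * nr_cpus_node(nid).
sysctl -w vm.cache_reclaim_enable=1
sysctl -w vm.cache_reclaim_s=60
sysctl -w vm.cache_reclaim_weight=1
```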
Page cache upper limit
| Parameter | Description | Value |
| --- | --- | --- |
| vm.cache_limit_mbytes | Limits the page cache amount, in MB. If the page cache exceeds the limit, it is periodically reclaimed. | The default value is 0. |
Maximum batch size and high watermark
| Parameter | Description | Value |
| --- | --- | --- |
| vm.percpu_max_batchsize | Specifies the maximum batch size and high watermark per CPU in each zone. | |
Maximum fraction of pages
| Parameter | Description | Value |
| --- | --- | --- |
| vm.percpu_pagelist_fraction | Defines the maximum fraction (with high watermark pcp->high) of pages in each zone that can be allocated for each per-CPU page list. The minimum value is 8, which means that up to 1/8th of the pages in each zone can be allocated for each per-CPU page list. This entry only changes the value of hot per-CPU page lists. A user can specify a number like 100 to allocate 1/100th of the pages in each zone for each per-CPU list. The batch value of each per-CPU page list is updated accordingly and set to pcp->high/4. The upper limit of batch is (PAGE_SHIFT × 8). The initial value is zero; the kernel does not use this value to set the high watermark for each per-CPU page list at startup. If the user writes 0 to this sysctl, it reverts to the default behavior. | The default value is 0. |
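A worked example of the fraction semantics: writing 100 lets each per-CPU page list hold up to 1/100th of a zone's pages, and the kernel then sets the list's batch value to pcp->high/4.

```bash
# Cap each per-CPU page list at 1/100th of the zone's pages.
sysctl -w vm.percpu_pagelist_fraction=100

# Writing 0 reverts to the default behavior.
sysctl -w vm.percpu_pagelist_fraction=0
```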
Memory priority classification
| Parameter | Description | Value |
| --- | --- | --- |
| vm.memcg_qos_enable | Dynamically enables memory priority classification. The value 0 indicates that this option is disabled, and 1 indicates that it is enabled. | The default value is 0. |
Virtual address cycle for mmap loading
| Parameter | Description | Value |
| --- | --- | --- |
| vm.mmap_rnd_mask | Sets selected bits of mmap-allocated virtual addresses to 0, controlling the cycle of virtual addresses returned by mmap. | The default value is null. |
Hugepage management
| Parameter | Description | Value |
| --- | --- | --- |
| vm.hugepage_mig_noalloc | When hugepages are migrated from one NUMA node to another and the destination node does not have enough available hugepages, this parameter determines whether new hugepages are allocated. The value 1 forbids new hugepage allocation during migration when hugepages on the destination node run out; 0 allows hugepage allocation during migration as usual. | The default value is 0. |
| vm.hugepage_nocache_copy | On x86, when hugepages are migrated to a NUMA node where Intel AEP is used, this parameter determines how hugepages are copied. If the value is 1, the NT (non-temporal) instruction is used. If the value is 0, the native MOV instruction is used. | The default value is 0. |
| vm.hugepage_pmem_allocall | During hugepage allocation on a NUMA node where Intel AEP is used, this parameter determines whether to limit the number of hugepages. If the value is 0, the number of pages that can be converted to hugepages is limited by the kernel threshold. If the value is 1, all available memory can be requested as hugepages. | The default value is 0. |
vmemmap memory source
| Parameter | Description | Value |
| --- | --- | --- |
| vm.vmemmap_block_from_dram | Controls where vmemmap memory is allocated when AEP memory is hot-added to a system NUMA node. If the value is 1, the memory comes from DRAM. If the value is 0, the memory comes from the corresponding AEP device. | The default value is 0. |
Memory overcommitment
| Parameter | Description | Value |
| --- | --- | --- |
| vm.swap_madvised_only | Specifies whether to enable memory overcommitment. The value 1 indicates that this option is enabled, and 0 indicates that it is disabled. | The default value is 0. |
QEMU hot replacement
| Parameter | Description | Value |
| --- | --- | --- |
| vm.enable_hotreplace | Specifies whether to enable QEMU hot replacement, which supports quick QEMU version upgrades without interrupting services. It can be used with host OS hot patches and cannot be enabled for the guest OS. The value can be 0 or 1. | The default value is 0, indicating that this option is disabled. |
Slab allocation
Slab allocation is used to cache kernel data, reducing memory fragmentation and improving system performance. However, over time slabs may occupy a large amount of memory. Enabling drop_slabs releases this cache to increase the available memory of the host.
| Parameter | Description | Value |
| --- | --- | --- |
| vm.drop_slabs | The value is 0 or 1. The value 0 indicates that the cache is not released, and 1 indicates that the cache is released. | Default value: 1 |
| vm.drop_slabs_limit | Specifies the priority. The value is an integer from 0 to 12. A larger value means fewer slabs are cleared, preventing the CPU from being occupied for a long time when too many slabs are cleared. | Default value: 7 |
cgroup isolation for ZRAM memory compression
This feature binds memcgs and ZRAM devices. A specified memcg can use a specified ZRAM device, and the memory used by the ZRAM device is obtained from the memory of the container in the group.
| Parameter | Description | Value |
| --- | --- | --- |
| vm.memcg_swap_qos_enable | Read/write interface. The value 0 indicates that the feature is disabled. If the value is 1, memory.swapfile of all memcgs is set to all. If the value is 2, memory.swapfile of all memcgs is set to none. To change the value from 1 or 2 to another value, first set it back to 0. | Values: 0, 1, or 2. Default value: 0 |
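A sketch of the documented mode-switching rule: moving between modes 1 and 2 requires passing through 0 first.

```bash
sysctl -w vm.memcg_swap_qos_enable=1   # all memcgs: memory.swapfile = all
sysctl -w vm.memcg_swap_qos_enable=0   # reset before selecting a new mode
sysctl -w vm.memcg_swap_qos_enable=2   # all memcgs: memory.swapfile = none
```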
Disabling swappiness globally
In the community kernel, even if swappiness is set to 0, the kernel only avoids swapping as much as possible. The swap_extension option is provided to disable swapping globally and forcibly prevent anonymous pages from being swapped out.
| Parameter | Description | Value |
| --- | --- | --- |
| vm.swap_extension | Disables swapping globally to forcibly prevent anonymous pages from being swapped out. Set swap_extension to 1 to disable anonymous page reclamation; anonymous pages are then not swapped out unless a process proactively calls madvise to use the swap space. Anonymous page swapping of processes in the cgroup is not affected. If swap_extension is set to 2, the swapcache is cleared when the swap space is used up; this function is independent of the first one. If swap_extension is set to 3, disabling anonymous page reclamation and clearing the swapcache are both enabled. | Values: 0, 1, 2, or 3. Default value: 0 |
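The values combine like a small bitmask, as described above: 1 stops anonymous-page reclamation, 2 clears the swapcache when swap runs out, and 3 does both. A minimal sketch:

```bash
# Enable both behaviors, then verify the setting.
sysctl -w vm.swap_extension=3
sysctl vm.swap_extension
```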
Kernel memory watermarks
| Parameter | Description | Value |
| --- | --- | --- |
| vm.lowmem_reserve_dma_ratio | If a GFP flag (such as GFP_DMA) is not used to express how memory should be allocated, then when memory in ZONE_HIGHMEM is insufficient, memory can be allocated from the lower ZONE_NORMAL, and when memory in ZONE_NORMAL is insufficient, memory can be allocated from the lower ZONE_DMA32. "Lower" means that the zone's physical memory addresses are smaller; "insufficient" means that the free memory of the current zone is less than the requested memory. Except for ZONE_HIGHMEM, each zone reserves memory for the zones with higher physical memory addresses; this reserved memory is called the lowmem reserve. The defaults are DMA/normal/HighMem: 256 320, in pages. | The default value is 0. |
OOM incident management
In some cases, you may not want the OOM killer to kill processes. Instead, the black box can report the incident and trigger a crash to locate the fault.
| Parameter | Description | Value |
| --- | --- | --- |
| vm.enable_oom_killer | The value 1 indicates that OOM incident management is enabled, and 0 indicates that it is disabled. | Default value: 0 |
Memory UCE error collection and reporting
After this feature is enabled, the system collects error information and sends an alarm to the alarm forwarding service when a memory UCE error occurs. Then, the programs that subscribe to the memory UCE fault event can receive the alarm and handle it. This way, memory UCE errors can be detected in real time and handled in a timely manner.
| Parameter | Description | Value |
| --- | --- | --- |
| vm.uce_handle_event_enable | Controls whether the kernel collects and reports memory UCE errors. The value 0 indicates that this option is disabled; other values indicate that it is enabled. This parameter is only available on Arm. | The default value is 0. |
Parameters in the /proc/sys/mce Directory
The following parameters are from the files in the /proc/sys/mce directory.
UCE mechanism enhancement
| Parameter | Description | Value |
| --- | --- | --- |
| mce.mce_kernel_recover | Controls whether to enable the kernel UCE mechanism enhancement. The value 1 indicates that this option is enabled. You can disable it by running: echo 0 > /proc/sys/mce/mce_kernel_recover | The default value is 1. |
Parameters in the /proc/sys/debug Directory
The following parameters are from the files in the /proc/sys/debug directory.
System exception notification
When an oops occurs, the kernel enters the die and panic processes, during which the callback functions registered with the die and panic notification chains are called. If a callback function itself causes an oops, the kernel re-enters the oops process; when the callbacks are called again there, another oops occurs, producing a nested oops. As a result, the system is suspended.
To improve fault locating and reliability, the system exception notification chain is introduced. When a nested oops occurs in a panic or die notification chain registered by a user, error logs are printed and the crash process is executed to reset the system.
| Parameter | Description | Value |
| --- | --- | --- |
| debug.nest_oops_enhance | Controls whether to enable the nested oops enhancement interface of the panic notification chain. | The default value is 1, indicating that this option is enabled. |
| debug.nest_panic_enhance | Controls whether to enable the nested oops enhancement interface of the die notification chain. | The default value is 1, indicating that this option is enabled. |