What Should I Do If I/O Suspension Occasionally Occurs When SCSI EVS Disks Are Used?
Symptom
When SCSI EVS disks are used and containers are frequently created and deleted on a CentOS node, the disks are frequently mounted and unmounted. The read/write rate of the system disk may surge instantaneously. As a result, the system is suspended, affecting normal node operation.
When this problem occurs, the following information is displayed in the dmesg log:
Attached SCSI disk task jbd2/xxx blocked for more than 120 seconds.
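To confirm the symptom, you can search the kernel log for the hung-task warning. A minimal check (the exact task name varies; jbd2 is the ext4 journaling thread typically reported):

```shell
# Search the kernel ring buffer for hung-task warnings; jbd2 (the ext4
# journal thread) is the task typically reported for this symptom.
dmesg 2>/dev/null | grep -i 'blocked for more than 120 seconds' \
    || echo "no hung-task messages in the kernel log"
```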
Possible Cause
After a PCI device is hot-added to bus 0, the Linux kernel traverses all PCI bridges mounted on bus 0 multiple times, and the bridges cannot work properly during this period. If the PCI bridge used by a device is updated during this window, a kernel defect causes the device to treat the bridge as faulty; the device then enters a fault mode and stops working. If, at that moment, the front end is writing to the PCI configuration space to notify the back end to process disk I/Os, the write may be lost. The back end therefore never receives the notification to process new requests on the I/O ring, and I/O on the front end is suspended.
This problem is caused by a Linux kernel defect. For details, see the corresponding defect reports of the Linux distributions.
Impact
CentOS Linux kernels of versions earlier than 3.10.0-1127.el7 are affected.
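To check whether a node falls in the affected range, the running kernel can be compared against the first fixed version with a version sort. A minimal sketch; `is_affected` is a hypothetical helper introduced here for illustration:

```shell
# is_affected is a hypothetical helper: returns 0 (affected) if the given
# kernel version sorts strictly below the first fixed CentOS kernel,
# 3.10.0-1127.el7.
is_affected() {
    fixed="3.10.0-1127.el7"
    oldest=$(printf '%s\n%s\n' "$1" "$fixed" | sort -V | head -n1)
    [ "$oldest" = "$1" ] && [ "$1" != "$fixed" ]
}

if is_affected "$(uname -r)"; then
    echo "kernel $(uname -r) is affected: install the kernel patch"
else
    echo "kernel $(uname -r) is not affected"
fi
```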
Solution
Install the patch.
CCE cluster versions that support the patch are as follows:
- v1.15.6-r1
- v1.15.11-r1
- v1.17.9-r0
For existing nodes
- Log in to the worker node as user root.
- Run the following command to install the patch:
cd /root; curl http://{obs_package_bucket}/cluster-versions/CCE-Kernel-Patch-1127.18.2.1.tgz -1 -O; tar -zxf CCE-Kernel-Patch-1127.18.2.1.tgz; bash update.sh; script_path=`cat /etc/crontab | grep 'run/update.sh' | awk -F " " '{print $8}'`; if [ $script_path"x" != "x" ];then sed -i "/grub2-mkconfig/d" $script_path; fi
obs_package_bucket is the address resolved from the domain name of the OBS bucket. For details, see the Appendix.
- Restart the OS after the patch is installed.
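For readability, the one-liner above can be broken into steps. The following sketch wraps them in a function; `OBS_BUCKET` stands in for the `{obs_package_bucket}` placeholder and must be set to the address for your region:

```shell
# Readable equivalent of the patch one-liner, as a function (a sketch;
# OBS_BUCKET must be set to the bucket address for your region).
install_kernel_patch() {
    cd /root || return 1
    # Download and unpack the kernel patch package.
    curl "http://${OBS_BUCKET}/cluster-versions/CCE-Kernel-Patch-1127.18.2.1.tgz" -O
    tar -zxf CCE-Kernel-Patch-1127.18.2.1.tgz
    # Run the patch installer shipped in the package.
    bash update.sh
    # If update.sh registered a script in /etc/crontab, remove the
    # grub2-mkconfig line from that script so the boot configuration
    # is not rewritten on the next cron run.
    script_path=$(grep 'run/update.sh' /etc/crontab | awk '{print $8}')
    if [ -n "$script_path" ]; then
        sed -i '/grub2-mkconfig/d' "$script_path"
    fi
}
```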
For newly created nodes
- Log in to the CCE console. In the navigation pane, choose Resource Management > Clusters. In the card view of the cluster to which you will add nodes, click Buy Node.
- Enter the following command in the Post-installation Script text box:
cd /root; curl http://{obs_package_bucket}/cluster-versions/CCE-Kernel-Patch-1127.18.2.1.tgz -1 -O; tar -zxf CCE-Kernel-Patch-1127.18.2.1.tgz; bash update.sh; script_path=`cat /etc/crontab | grep 'run/update.sh' | awk -F " " '{print $8}'`; if [ $script_path"x" != "x" ];then sed -i "/grub2-mkconfig/d" $script_path; fi
obs_package_bucket is the address resolved from the domain name of the OBS bucket. For details, see the Appendix.
- After the node is created, log in to the node and restart the OS for the kernel patch to take effect. Before restarting the node, you are advised to evict the workloads on the node to avoid service interruption.
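Evicting the workloads before the restart can be done with standard kubectl commands. A sketch; `prepare_node_for_reboot` is a hypothetical helper, and the node name is passed in by the caller:

```shell
# prepare_node_for_reboot is a hypothetical helper that evicts workloads
# from a node before it is rebooted, using standard kubectl commands.
prepare_node_for_reboot() {
    node="$1"
    # Mark the node unschedulable so no new pods land on it.
    kubectl cordon "$node"
    # Evict running pods (DaemonSet pods are skipped; emptyDir data is lost).
    kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
}
# After the node has been rebooted, re-enable scheduling:
#   kubectl uncordon <node-name>
```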
For existing BMS nodes
- Log in to the node as user root and ensure that the current BMS kernel version is h275.
Run the following command on the node:
uname -r
- Run the following command to install the patch:
mkdir -p /root/upgrade_ovs/;cd /root/upgrade_ovs/;wget https://{obs_package_bucket}/package/canal-agent/canal-agent-20.6.0.B005.sp1.tgz;tar zxvf canal-agent-20.6.0.B005.sp1.tgz;tar zxvf canal-agent/package/openvswitch-20.6.0.B003-x86_64.tar.gz;bash openvswitch/can_ovs.sh;bash openvswitch/can_ovs.sh uninstall;bash openvswitch/can_ovs.sh install;
obs_package_bucket is the address resolved from the domain name of the OBS bucket.
- After the command is executed, run the modinfo openvswitch command and check whether the installed openvswitch module version string is 3.10.0-514.44.5.10.h142.x86_64.
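The version check above can be narrowed to the relevant fields of the modinfo output. A sketch, guarded so it degrades gracefully on hosts without the module:

```shell
# Inspect the installed openvswitch kernel module; on a patched BMS node
# the version/vermagic fields should contain 3.10.0-514.44.5.10.h142.
modinfo openvswitch 2>/dev/null | grep -Ei 'version|vermagic' \
    || echo "openvswitch module not found on this host"
```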

Appendix
| Name | Value |
|---|---|
| CN East-Shanghai2 | obs.cn-east-2.myhuaweicloud.com/cce-east |
| CN North-Beijing1 | obs.cn-north-1.myhuaweicloud.com/cce-north |
| CN North-Beijing4 | cce-north-4.obs.cn-north-4.myhuaweicloud.com |
| CN-Hong Kong | obs.ap-southeast-1.myhuaweicloud.com/cce-ap-southeast |
| CN South-Guangzhou | obs.cn-south-1.myhuaweicloud.com/cce-south |
| AP-Bangkok | obs.ap-southeast-2.myhuaweicloud.com/cce-ap-southeast-2 |
| CN Southwest-Guiyang1 | obs.cn-southwest-2.myhuaweicloud.com/cce-statics.cn-southwest-2 |
| CN South-Shenzhen | obs.cn-south-2.myhuaweicloud.com/cce-south-2 |
| LA-Sao Paulo1 | obs.sa-brazil-1.myhuaweicloud.com/cce-statics.sa-brazil-1 |
| AF-Johannesburg | obs.af-south-1.myhuaweicloud.com/cce-statics.af-south-1 |
| AP-Singapore | obs.ap-southeast-3.myhuaweicloud.com/cce-statics.ap-southeast-3 |
| RU-Moscow2 | obs.ru-northwest-2.myhuaweicloud.com/cce-statics.ru-northwest-2 |
| LA-Santiago | cce.la-south-2.obs.la-south-2.myhuaweicloud.com |
| CN East-Shanghai1 | obs.cn-east-3.myhuaweicloud.com/cce-statics.cn-east-3 |
| CN North-Ulanqab202 | obs.cn-north-6.myhuaweicloud.com/cce-statics.cn-north-6 |
| CN North-Beijing2 | obs.cn-north-2.myhuaweicloud.com/cce-north-2 |
| CN South-Dongguan201 | obs.cn-south-3.myhuaweicloud.com/cce-statics.cn-south-3 |
| CN Southwest-Guiyang201 | obs.cn-southwest-1.myhuaweicloud.com/cce-statics.cn-southwest-1 |
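To use the table, substitute the value for your region into the `{obs_package_bucket}` placeholder of the download commands. A sketch with CN North-Beijing4 chosen as an example value:

```shell
# Build the patch download URL from a region's bucket address; the value
# here is the CN North-Beijing4 entry from the table above (example only).
OBS_BUCKET="cce-north-4.obs.cn-north-4.myhuaweicloud.com"
URL="http://${OBS_BUCKET}/cluster-versions/CCE-Kernel-Patch-1127.18.2.1.tgz"
echo "$URL"
```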