How Do I Handle an IB Network Failure?
RDMA Communication Failure Between Two IB ECSs
- Check whether the Pkeys on the two ECSs are consistent.
Run the following command on each ECS to list the Pkeys allocated to it:
cat /sys/class/infiniband/mlx5_0/ports/1/pkeys/* | grep -v "0x0000"
Figure 1 Checking Pkey consistency
- If only one Pkey is returned, contact technical support.
- If two Pkeys are returned, verify that both ECSs report the same two Pkeys.
- Run the following command to check whether the firewall is disabled:
Figure 2 Checking the firewall
If the firewall is not disabled, run the following command to disable it:
service firewalld stop
- Check whether the RDMA communication command is correct.
Run the following command on ECS 2 (server) first:
ib_write_lat -x 0 --pkey_index 0
Then run the following command on ECS 1 (client), specifying the IP address of the server:
ib_write_lat -x 0 --pkey_index 0 192.168.0.218
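The Pkey check and the ib_write_lat invocation described above could be scripted roughly as follows. This is only a sketch: the helper names (list_pkeys, check_pkeys, ib_lat_cmd) and the messages are hypothetical, and the handling of any Pkey count other than one or two is an assumption, because this article covers only those two cases. The sketch relies on the usual perftest convention that the server runs without an address and the client is given the server's IP address.

```shell
#!/bin/sh
# Hypothetical helpers; names and messages are illustrative only.

# Print the non-zero Pkeys found under a pkeys directory, sorted.
# Defaults to the path used in this article; another path can be passed as $1.
list_pkeys() {
    dir="${1:-/sys/class/infiniband/mlx5_0/ports/1/pkeys}"
    # Each file holds one Pkey; 0x0000 marks an unused slot.
    cat "$dir"/* 2>/dev/null | grep -v "0x0000" | sort
}

# Classify the Pkey count following the two cases described in this article.
check_pkeys() {
    count=$(list_pkeys "$1" | wc -l | tr -d ' ')
    case "$count" in
        1) echo "1 Pkey found: contact technical support" ;;
        2) echo "2 Pkeys found: compare them with the other ECS" ;;
        *) echo "$count Pkeys found: case not covered by this article" ;;
    esac
}

# Print the ib_write_lat command line for a role.
# The server takes no address; the client takes the server's IP address.
ib_lat_cmd() {
    case "$1" in
        server) echo "ib_write_lat -x 0 --pkey_index 0" ;;
        client)
            if [ -z "$2" ]; then
                echo "client role needs the server IP address" >&2
                return 1
            fi
            echo "ib_write_lat -x 0 --pkey_index 0 $2" ;;
        *)
            echo "usage: ib_lat_cmd server | client SERVER_IP" >&2
            return 2 ;;
    esac
}
```

Running list_pkeys on both ECSs and comparing the two outputs (for example with diff) shows whether the Pkeys match. The output is sorted so that the comparison does not depend on the order of the slots.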
No IP Address for the ECS IB Port
Running the ifconfig command shows that no IP address has been assigned to the ECS InfiniBand (IB) port.
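As a rough illustration, the symptom can also be confirmed from a script. The sketch below uses the iproute2 ip command rather than ifconfig, and it assumes the IB interface is named ib0, which this article does not state. The helper names has_ipv4 and ib_ip_status are hypothetical.

```shell
#!/bin/sh
# Succeed if stdin (output of "ip -4 -o addr show") contains an IPv4 address.
has_ipv4() {
    grep -q 'inet [0-9]'
}

# Report whether an interface has an IPv4 address.
# The interface name is an assumption; ib0 is only a common default.
ib_ip_status() {
    iface="${1:-ib0}"
    if ip -4 -o addr show dev "$iface" 2>/dev/null | has_ipv4; then
        echo "$iface: IPv4 address assigned"
    else
        echo "$iface: no IPv4 address"
    fi
}
```

Calling ib_ip_status with no argument checks ib0. Pass a different interface name if the IB port is named otherwise on your ECS.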
- Run the following command to check for the Pkey:
cat /sys/class/infiniband/mlx5_0/ports/1/pkeys/* | grep -v "0x0000"
Figure 3 Checking Pkey
If only one Pkey is obtained, contact technical support.
- Run the following command to assign an IP address to the ECS IB port:
If no command output is displayed, the IP address cannot be obtained using DHCP.
- Contact technical support.
If the IB network still cannot be used for communication, or the IB port still cannot obtain an IP address, after you have performed the preceding steps, contact technical support and provide the following information:
Item | Description | Example | Your Value
VPC1 ID | ID of VPC 1 | fef65559-c154-4229-afc4-9ad0314437ea | N/A
VM1 ID | ID of ECS 1 in VPC 1 | f7619b12-3683-4203-9271-f34f283cd740 | N/A
VM2 ID | ID of ECS 2 in VPC 1 | f75df766-68aa-4ef3-a493-06cdc26ac37a | N/A