Error Message "no socket interface found" Displayed in Logs
Symptom
With NCCL debug logging enabled in the code:
import os
os.environ["NCCL_DEBUG"] = "INFO"
the error message "no socket interface found" is displayed in the logs.
Possible Causes
The environment variables NCCL_IB_TC, NCCL_IB_GID_INDEX, and NCCL_IB_TIMEOUT are not configured. As a result, communication becomes slow and unstable, and InfiniBand (IB) communication is interrupted.
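To confirm this cause before changing anything, the job can report which of the three variables are missing. The following is a minimal sketch, not part of the original procedure; the helper name missing_nccl_ib_vars is hypothetical, and it only inspects the environment of the current process.

```python
import os

# The three variables named in the possible cause above.
REQUIRED_VARS = ("NCCL_IB_TC", "NCCL_IB_GID_INDEX", "NCCL_IB_TIMEOUT")


def missing_nccl_ib_vars(env=None):
    """Return the names of required variables that are unset or empty.

    `env` is any mapping of variable names to values; it defaults to
    the environment of the current process (hypothetical helper).
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_nccl_ib_vars()
    if missing:
        print("Not configured:", ", ".join(missing))
    else:
        print("All three NCCL IB variables are configured.")
```

Run the check inside the training process itself (or at the top of the training script), because variables exported in a different shell or container are not visible to it.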
Solution
Add the following environment variables to the code, before NCCL is initialized.
import os
os.environ["NCCL_IB_TC"] = "128"
os.environ["NCCL_IB_GID_INDEX"] = "3"
os.environ["NCCL_IB_TIMEOUT"] = "22"
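If some of these variables may already be set by the platform or a launch script, a variant that fills in only the missing ones avoids overwriting them. This is a sketch under that assumption; the helper name apply_nccl_ib_settings is hypothetical, and the values are the ones given in the solution above.

```python
import os

# Values taken from the solution above; adjust them for your own cluster.
NCCL_IB_SETTINGS = {
    "NCCL_IB_TC": "128",
    "NCCL_IB_GID_INDEX": "3",
    "NCCL_IB_TIMEOUT": "22",
}


def apply_nccl_ib_settings(env=None):
    """Set each variable only if it is not already present.

    Returns the list of names that were added (hypothetical helper).
    """
    env = os.environ if env is None else env
    added = []
    for name, value in NCCL_IB_SETTINGS.items():
        if name not in env:
            env[name] = value
            added.append(name)
    return added


if __name__ == "__main__":
    # Call this at the top of the training script, before any
    # distributed initialization that creates NCCL communicators.
    print("Added:", apply_nccl_ib_settings())
```

Leaving existing values untouched is a design choice of this sketch; to force the documented values regardless of what is already set, assign them directly as in the solution.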