Error Message "BrokenPipeError: Broken pipe" Is Displayed When OBS Data Is Copied
Symptom
When a training job uses MoXing to copy data, the error message "BrokenPipeError: [Errno xx] Broken pipe" is printed in the log.
Possible Causes
The possible causes are as follows:
- In a large-scale distributed job, multiple nodes concurrently copy files from the same bucket, which triggers traffic control on the OBS bucket.
- There is a large number of OBS client connections. During the polling between processes or threads, an OBS client connection times out if the server does not respond within 30 seconds. As a result, the server releases the connection.
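To tell the two causes apart, the job log can be scanned for the 503 status that accompanies traffic control (the Solution section shows such a log line). The following is a minimal sketch, not part of MoXing: the helper name is_traffic_control and the regular expression are assumptions based only on the sample log line in this document.

```python
import re

# Matches the status field in file_io.py error lines, for example "stat:503".
# The pattern is an assumption derived from the sample log line in this document.
STAT_RE = re.compile(r"\bstat:(\d{3})\b")


def is_traffic_control(log_line: str) -> bool:
    """Return True if the log line reports an HTTP 503 status from OBS."""
    match = STAT_RE.search(log_line)
    return match is not None and match.group(1) == "503"


sample = (
    "[ModelArts Service Log]2021-01-21 11:35:42,178 - file_io.py[line:658] - "
    "ERROR: stat:503 errorCode:None errorMessage:None reason:Service Unavailable"
)
print(is_traffic_control(sample))
```

If no line in the log matches, the connection timeout described in the second cause is the more likely explanation.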
Solution
- If the issue is caused by traffic control, the following error code is displayed. In this case, . For details about OBS error codes, see OBS Server-Side Error Codes.
[ModelArts Service Log]2021-01-21 11:35:42,178 - file_io.py[line:658] - ERROR: stat:503 errorCode:None errorMessage:None reason:Service Unavailable
- If the issue is caused by a large number of client connections: files larger than 5 GB cannot be copied by calling OBS APIs directly, so they are copied in multiple threads. Because the OBS server times out a connection after 30 seconds, reduce the number of processes as follows:
import os

# Configure the number of processes. Environment variable values must be strings.
os.environ['MOX_FILE_LARGE_FILE_TASK_NUM'] = '1'

import moxing as mox

# Copy files.
mox.file.copy_parallel(src_url=your_src_dir, dst_url=your_target_dir, threads=0, is_processing=False)
When creating a training job, you can use the environment variable MOX_FILE_PARTIAL_MAXIMUM_SIZE to configure the threshold (in bytes) for downloading large files in multiple parts. If the size of a file exceeds the threshold, the file will be downloaded in multiple parts concurrently.
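As an illustration of the threshold described above, the value is a plain byte count passed as a string. The sketch below sets it from within a script. This rests on assumptions: that setting the variable in the process environment before MoXing is imported has the same effect as configuring it on the training job, and that 1 GiB is a suitable value. The helper set_partial_download_threshold is hypothetical.

```python
import os

GIB = 1024 ** 3  # bytes in one gibibyte


def set_partial_download_threshold(size_bytes: int) -> str:
    """Store the multipart download threshold (in bytes) in the environment."""
    if size_bytes <= 0:
        raise ValueError("threshold must be a positive number of bytes")
    value = str(size_bytes)  # environment variable values must be strings
    os.environ["MOX_FILE_PARTIAL_MAXIMUM_SIZE"] = value
    return value


# Illustrative value only: files larger than 1 GiB would be downloaded in parts.
threshold = set_partial_download_threshold(1 * GIB)
print(threshold)
```

When the variable is configured on the training job itself, the same byte count is entered as the value of the environment variable.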
Summary and Suggestions
- Use the notebook environment for online debugging. For details, see Using JupyterLab to Develop Models.
- Use a local IDE (PyCharm or VS Code) to access the cloud environment for debugging. For details, see Using a Local IDE to Develop Models.