Updated on 2025-08-22 GMT+08:00

Error Message "BrokenPipeError: Broken pipe" Is Displayed When OBS Data Is Copied

Symptom

When a training job uses MoXing to copy data, the error message "BrokenPipeError: [Errno xx] Broken pipe" is printed in the log.

Possible Causes

The possible causes are as follows:

  • In a large-scale distributed job, multiple nodes concurrently copy files from the same bucket, which triggers traffic control on the OBS bucket.
  • There are a large number of OBS client connections. While processes or threads take turns using these connections, a connection times out if it gets no response within 30 seconds, and the server then releases it.
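To tell the two causes apart, the job log can be checked for the 503 status that accompanies traffic control (a sample of that log record appears under Solution). The following is a minimal sketch, assuming the record always contains a line of the form `stat:503` as in that sample; the function name `has_obs_503` is hypothetical and not part of MoXing or OBS.

```python
import re

# Assumption: a throttling record contains a line such as "stat:503",
# matching the sample log shown under Solution in this document.
_STAT_503 = re.compile(r"^\s*stat:\s*503\s*$", re.MULTILINE)


def has_obs_503(log_text: str) -> bool:
    """Return True if the log text contains an OBS 'stat:503' line."""
    return _STAT_503.search(log_text) is not None
```

If the check returns False for a log that contains the BrokenPipeError, the connection timeout cause is the more likely candidate, although this sketch cannot confirm it.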

Solution

  1. If the issue is caused by traffic control, the following error code is also displayed in the log. For details about OBS error codes, see OBS Server-Side Error Codes.
    [ModelArts Service Log]2021-01-21 11:35:42,178 - file_io.py[line:658] - ERROR:
    		stat:503
    		errorCode:None
    		errorMessage:None
    		reason:Service Unavailable
  2. If the issue is caused by a large number of client connections, reduce the number of processes used for copying. This applies especially to files larger than 5 GB, which cannot be copied by calling OBS APIs directly and must be copied using multiple threads. The timeout set on the OBS server is 30 seconds. Run the following code to reduce the number of processes:
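    import os

    # Configure the number of processes. Environment variable values must be
    # strings, and the variable is set before MoXing is imported.
    os.environ['MOX_FILE_LARGE_FILE_TASK_NUM'] = '1'
    import moxing as mox

    # Copy files. your_src_dir and your_target_dir are placeholders for your own paths.
    mox.file.copy_parallel(src_url=your_src_dir, dst_url=your_target_dir, threads=0, is_processing=False)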

    When creating a training job, you can use the environment variable MOX_FILE_PARTIAL_MAXIMUM_SIZE to configure the threshold (in bytes) for downloading large files in multiple parts. If the size of a file exceeds the threshold, the file will be downloaded in multiple parts concurrently.
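Because the threshold is given in bytes, it helps to compute the value rather than type it by hand. The following is a minimal sketch, assuming the variable can also be set inside the training script before MoXing is imported (this page only describes setting it when the job is created). The helper name `set_partial_download_threshold` and the 1 GiB example value are hypothetical, not recommended settings.

```python
import os

GIB = 1024 ** 3  # number of bytes in one GiB


def set_partial_download_threshold(threshold_bytes: int) -> str:
    """Set MOX_FILE_PARTIAL_MAXIMUM_SIZE (in bytes) and return the stored string."""
    if threshold_bytes <= 0:
        raise ValueError("threshold_bytes must be a positive number of bytes")
    value = str(int(threshold_bytes))  # os.environ accepts only strings
    os.environ["MOX_FILE_PARTIAL_MAXIMUM_SIZE"] = value
    return value


# Example: files larger than 1 GiB would be downloaded in multiple parts.
# Assumption: this runs before "import moxing as mox" in the training script.
set_partial_download_threshold(1 * GIB)
```

When the variable is set in the training job configuration instead, the same byte value returned by the helper can be entered there.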

Summary and Suggestions

Before creating a training job, use the ModelArts development environment to debug your training code and minimize migration errors.