Updated on 2024-07-25 GMT+08:00

Specifications for Custom Images for Training Jobs

When you use a locally developed model and training script to create a custom image, ensure that the custom image complies with the specifications defined by ModelArts.

Specifications

  • Use Ubuntu 18.04 as the base for custom images to avoid version incompatibility.
  • Do not use a custom image larger than 15 GB. The size should also not exceed half of the container engine space of the resource pool. Otherwise, the training job takes longer to start.

    The container engine space of the ModelArts public resource pool is 50 GB. The container engine space of a dedicated resource pool is also 50 GB by default, but you can customize it when creating the dedicated resource pool.

  • The uid of the default user of a custom image must be 1000.
  • Do not install the GPU or Ascend driver in a custom image. When you select GPU resources to run training jobs, ModelArts automatically places the GPU driver in the /usr/local/nvidia directory in the training environment. When you select Ascend resources to run training jobs, ModelArts automatically places the Ascend driver in the /usr/local/Ascend/driver directory.
  • x86- or Arm-based custom images can run only with specifications corresponding to their architecture.
    • Run the following command to check the CPU architecture of a custom image:
      docker inspect {Custom image path} | grep Architecture
      The following is the command output for an Arm-based custom image:
      "Architecture": "arm64"
    • If the name of a specification contains Arm, the specification uses the Arm CPU architecture.
    • If the name of a specification does not contain Arm, the specification uses the x86 CPU architecture.
  • ModelArts does not support downloading open source installation packages. Install all dependency packages required by the training job in the custom image.
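
Several of the specifications above (image size, CPU architecture, and default user) can be checked locally before the image is uploaded. The following Python sketch is illustrative only and is not part of ModelArts. It assumes that docker image inspect reports the Size, Architecture, and Config.User fields, that 1 GB is counted as 1024^3 bytes, and that a specification name is matched against "Arm" case-insensitively. The function names check_image and inspect_image are hypothetical.

```python
import json
import subprocess

GIB = 1024 ** 3      # assumption: 1 GB is counted as 1024^3 bytes
MAX_IMAGE_GB = 15    # size limit stated in the specifications
REQUIRED_UID = 1000  # uid required for the default user


def check_image(info, engine_space_gb=50, spec_name=""):
    """Return a list of problems found in one `docker image inspect` record.

    info            -- dict for a single image, as parsed from the inspect JSON
    engine_space_gb -- container engine space of the resource pool (50 by default)
    spec_name       -- name of the training specification (hypothetical input)
    """
    problems = []

    # Size: at most 15 GB and at most half of the container engine space.
    limit_bytes = min(MAX_IMAGE_GB, engine_space_gb / 2) * GIB
    if info.get("Size", 0) > limit_bytes:
        problems.append("image is larger than the allowed size")

    # Architecture: Docker reports "arm64" for Arm and "amd64" for x86 images.
    image_is_arm = info.get("Architecture", "").startswith("arm")
    spec_is_arm = "arm" in spec_name.lower()
    if image_is_arm != spec_is_arm:
        problems.append("image architecture does not match the specification")

    # Default user: Config.User can be "", "1000", "1000:100", or a user name.
    user = (info.get("Config") or {}).get("User") or ""
    uid = user.split(":")[0]
    if uid == "":
        problems.append("no default user is set, so the image runs as root")
    elif uid.isdigit() and int(uid) != REQUIRED_UID:
        problems.append("default user uid is not 1000")
    # A user name cannot be resolved to a uid from the inspect output alone;
    # `docker run --rm IMAGE id -u` can confirm it in that case.

    return problems


def inspect_image(image):
    """Run `docker image inspect` and return the record for the image."""
    out = subprocess.run(
        ["docker", "image", "inspect", image],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)[0]
```

For example, check_image(inspect_image("{Custom image path}"), 50, "{Specification name}") returns an empty list when none of the three checks fails. The sketch does not verify the base OS version, the absence of GPU or Ascend drivers, or the installed dependency packages.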