Creating a Custom Image on ECS and Using It

Application Scenarios and Process

You can write a Dockerfile based on a preset base image or third-party image to customize your image on ECS. Then, register the image to create a new development environment based on your needs.

This section describes how to install PyTorch 1.8, FFmpeg 3, and GCC 8 on an Ubuntu image to create a new AI development environment.

The following figure shows the whole process.

Figure 1 Creating and debugging an image

Specifications for Custom Images

The base image for creating a custom image must meet either of the following conditions:

It is an open-source image from the official website of Ascend or Docker Hub and it meets the following OS constraints:
x86: Ubuntu 18.04 or Ubuntu 20.04

Arm: Euler 2.8.3 or Euler 2.10.7

There may be a compatibility issue for Ubuntu 20.04.6. Use an earlier version.
If an image error occurs due to unmet requirements, check the image specifications and rectify the fault by referring to Troubleshooting for Custom Images in Notebook Instances. If the fault persists, contact Huawei technical support.
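
The OS constraints above can be checked mechanically before a build. The following is a minimal sketch and not part of ModelArts: the function name `is_supported_base` and its three-argument form are assumptions made for illustration, and the accepted pairs are copied from the list above. It does not cover the Ubuntu 20.04.6 caveat, because a point release cannot be told apart from the 20.04 version string alone.

```shell
# Hypothetical helper: succeeds only for the architecture/OS pairs listed above.
# Arguments: architecture (x86 or Arm), OS name, OS version.
is_supported_base() {
    case "$1:$2:$3" in
        x86:Ubuntu:18.04|x86:Ubuntu:20.04) return 0 ;;
        Arm:Euler:2.8.3|Arm:Euler:2.10.7) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: an x86 Ubuntu 18.04 base image passes, Ubuntu 22.04 does not.
is_supported_base x86 Ubuntu 18.04 && echo "supported"
is_supported_base x86 Ubuntu 22.04 || echo "not supported"
```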

Procedure

Prepare a Linux environment. The following uses ECS as an example.
Create an image on ECS. The Dockerfile sample file is provided.
Upload the created image to SWR.
Register an SWR image on ModelArts.
Create a notebook instance and verify the new image.

Preparing a Docker Server and Configuring the Environment

Prepare a server with Docker enabled. If no such server is available, create an ECS, buy an EIP, and install the required software on it.

ModelArts provides Ubuntu scripts that make installing Docker easier.

The operations on the local Linux server are the same as those on the ECS. For details, see this case.

Log in to the ECS console and click Buy ECS. Select a public image (an Ubuntu 18.04 image is recommended) and set the system disk to 100 GiB. For details, see Purchasing and Logging In to a Linux ECS.
Figure 2 Selecting an image and a disk
Purchase an EIP and bind it to the ECS. For details, see Configure Network.
Configure the VM environment.
1. Run the following command on the Docker ECS to download the installation script:
```
wget https://cnnorth4-modelarts-sdk.obs.cn-north-4.myhuaweicloud.com/modelarts/custom-image-build/install_on_ubuntu1804.sh
```
  Only Ubuntu scripts are supported.
2. Run the following command on the Docker ECS to configure the environment:
```
bash install_on_ubuntu1804.sh
```
  Figure 3 Configured
  Then run the following command to reload /etc/profile in the current shell:
```
source /etc/profile
```
  The installation script is executed to:
  1. Install Docker.
  2. If the Docker ECS runs on GPUs, install nvidia-docker2 to mount the GPUs to the Docker container.
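
After the script finishes, it can be useful to confirm that the tools it is described as installing are actually on PATH. The sketch below is an illustration only: `check_tools` is a hypothetical helper, and the `nvidia-docker` command name in the usage comment assumes that the nvidia-docker2 package provides a wrapper with that name on your system.

```shell
# Hypothetical helper: report whether each named command is on PATH.
# Returns 0 if all are found, 1 if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the ECS (not run here):
#   check_tools docker                  # CPU-only server
#   check_tools docker nvidia-docker    # GPU server, assuming the wrapper exists
```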

Creating a Custom Image

This section describes how to edit a Dockerfile, use it to create an image, and use the created image to create a notebook instance. For details about the Dockerfile, see Dockerfile reference.

Querying Base Images (Skip This Step for Third-Party Images)
For details about ModelArts base images, see Preset Dedicated Images in Notebook Instances. Check the image URL in the corresponding section based on the engine type of the preset image.
Access SWR.
1. Log in to the SWR console. In the navigation pane on the left, choose Dashboard, and click Generate Login Command in the upper right corner. On the displayed page, copy the login command.
  Figure 4 Obtaining the login command
  - The validity period of the generated login command is 24 hours. To obtain a long-term valid login command, see Obtaining a Login Command with Long-Term Validity. After you obtain a long-term valid login command, your temporary login commands will still be valid as long as they are in their validity periods.
  - The domain name at the end of the login command is the image repository address. Record the address for later use.
2. Run the login command on the machine where the container engine is installed. The message "Login Succeeded" will be displayed upon a successful login.
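
Because the image repository address is the last field of the login command, it can be extracted instead of copied by hand. This is a minimal sketch: `registry_from_login` is a hypothetical helper, and the login command in the example uses placeholder credentials rather than the exact text the console generates.

```shell
# Hypothetical helper: print the last whitespace-separated field of a
# login command, which is the image repository address.
registry_from_login() {
    printf '%s\n' "$1" | awk '{print $NF}'
}

# Example with placeholder credentials.
registry_from_login "docker login -u <user> -p <key> swr.ap-southeast-1.myhuaweicloud.com"
```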

Pull a base image or third-party image. The following uses a third-party image as an example.

docker pull swr.ap-southeast-1.myhuaweicloud.com/notebook-xxx/ubuntu:18.04  # Replace with your organization name and image

Write a Dockerfile.
Run the vim command to create a Dockerfile. If a ModelArts base image is used, see Dockerfile on a ModelArts Base Image for details about the Dockerfile.

If a third-party image is used, add user ma-user whose UID is 1000 and user group ma-group whose GID is 100. For details, see Dockerfile on a Non-ModelArts Base Image.

In this example, PyTorch 1.8, FFmpeg 3, and GCC 8 are installed on an Ubuntu image to build an AI image.

Build an image.
Run the docker build command to build a new image from the Dockerfile. The descriptions of the command parameters are as follows:
- -t specifies the new image path, including the region, organization name, image name, and version. Set it to match your environment. Use a complete SWR address for debugging and registration.
- -f specifies the Dockerfile name. Set it to match your file.
- The period (.) at the end sets the build context to the current directory. Change it if your build context is elsewhere.
```
docker build -t swr.ap-southeast-1.myhuaweicloud.com/notebook-xxx/pytorch_1_8:v1 -f Dockerfile .
```
Figure 5 Image created
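
The value passed to -t is assembled from four parts: the image repository address, the organization name, the image name, and the version. A small sketch of that composition follows; `swr_image_ref` is a hypothetical helper, and the values are the sample ones used in this section.

```shell
# Hypothetical helper: build a complete SWR image path.
# Arguments: repository address, organization, image name, version.
swr_image_ref() {
    printf '%s/%s/%s:%s\n' "$1" "$2" "$3" "$4"
}

image=$(swr_image_ref swr.ap-southeast-1.myhuaweicloud.com notebook-xxx pytorch_1_8 v1)
echo "$image"

# The result can then be passed to the build command (not run here):
#   docker build -t "$image" -f Dockerfile .
```

Keeping the path in one variable lets the same value be reused for the build, push, and registration steps.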

Registering a New Image

After an image is debugged, register it with ModelArts image management so that the image can be used in ModelArts.

Upload the image to SWR.
Log in to SWR first. For details, see Logging in to SWR. Run the following command to push the image:
```
docker push swr.ap-southeast-1.myhuaweicloud.com/notebook-xxx/pytorch_1_8:v1
```
The image is then available on SWR.

Figure 6 Uploading the image to SWR
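
docker push sends an image to the registry named at the start of the image path, so a path that does not begin with the SWR address will not land in SWR. The sketch below checks this before pushing; `has_registry_prefix` is a hypothetical helper, not a Docker or SWR command.

```shell
# Hypothetical helper: succeed if the image path ($1) starts with the
# repository address ($2) followed by a slash.
has_registry_prefix() {
    case "$1" in
        "$2"/*) return 0 ;;
        *) return 1 ;;
    esac
}

image=swr.ap-southeast-1.myhuaweicloud.com/notebook-xxx/pytorch_1_8:v1
if has_registry_prefix "$image" swr.ap-southeast-1.myhuaweicloud.com; then
    echo "ok to push: $image"
fi
```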
Register an image.
Registering an image on the ModelArts console

Log in to the ModelArts console. In the navigation pane on the left, choose Image Management to access the image management page.
1. Click Register. Set SWR Source to the image pushed to SWR in step 1. Paste the complete SWR address or click to select a private image from SWR for registration.
2. Set Architecture and Type to match the image source.
If the architecture or type differs from that of the image source, the creation fails.
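
To find the architecture of a locally built image, you can read the Architecture field that Docker records for it. The sketch below maps Docker's values to the x86 and Arm terms used in this section; `arch_label` is a hypothetical helper, and the exact option labels shown on the console may differ from these terms.

```shell
# Hypothetical helper: translate Docker's architecture value into the
# terms used in this section.
arch_label() {
    case "$1" in
        amd64) echo "x86" ;;
        arm64) echo "Arm" ;;
        *) echo "unknown"; return 1 ;;
    esac
}

arch_label amd64

# Usage against a real image (not run here):
#   arch_label "$(docker image inspect --format '{{.Architecture}}' <image>)"
```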

Using a New Image to Create a Development Environment

After the image is registered, log in to the ModelArts console, go to the notebook tab, and select the image registered in Registering a New Image to create a development environment.
In the notebook list, click Open to start the created development environment.
Figure 7 Accessing a development environment
Open a terminal to check the conda environment. For more information about conda, see the official website.
Each kernel in the development environment is essentially a conda environment installed in /home/ma-user/anaconda3/. Run the /home/ma-user/anaconda3/bin/conda env list command to check the conda environment.
Figure 8 Checking the conda environment
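
The check can also be scripted by looking for the environment name in the first column of the conda env list output. This is a minimal sketch: `has_conda_env` is a hypothetical helper, and the sample output below only imitates the layout that conda prints.

```shell
# Hypothetical helper: read `conda env list` output on stdin and succeed
# if an environment with the given name appears in the first column.
has_conda_env() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example with imitation output.
printf '%s\n' \
    '# conda environments:' \
    '#' \
    'base                  *  /home/ma-user/anaconda3' \
    'pytorch_1_8              /home/ma-user/anaconda3/envs/pytorch_1_8' |
    has_conda_env pytorch_1_8 && echo "pytorch_1_8 is present"

# Usage in the notebook terminal (not run here):
#   /home/ma-user/anaconda3/bin/conda env list | has_conda_env pytorch_1_8
```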

Dockerfile on a ModelArts Base Image

Run the vim command to create a Dockerfile. If the base image is provided by ModelArts, the content of the Dockerfile is as follows:

FROM swr.ap-southeast-1.myhuaweicloud.com/atelier/notebook2.0-pytorch-1.4-kernel-cp37:3.3.3-release-v1-20220114

USER root
# section1: config apt source
RUN mv /etc/apt/sources.list /etc/apt/sources.list.bak && \
    echo -e "deb http://repo.huaweicloud.com/ubuntu/ bionic main restricted\ndeb http://repo.huaweicloud.com/ubuntu/ bionic-updates main restricted\ndeb http://repo.huaweicloud.com/ubuntu/ bionic universe\ndeb http://repo.huaweicloud.com/ubuntu/ bionic-updates universe\ndeb http://repo.huaweicloud.com/ubuntu/ bionic multiverse\ndeb http://repo.huaweicloud.com/ubuntu/ bionic-updates multiverse\ndeb http://repo.huaweicloud.com/ubuntu/ bionic-backports main restricted universe multiverse\ndeb http://repo.huaweicloud.com/ubuntu bionic-security main restricted\ndeb http://repo.huaweicloud.com/ubuntu bionic-security universe\ndeb http://repo.huaweicloud.com/ubuntu bionic-security multiverse" > /etc/apt/sources.list && \
    apt-get update
# section2: install ffmpeg and gcc
RUN apt-get -y install ffmpeg && \
    apt-get -y install gcc-8 g++-8 && \
    update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 80 --slave /usr/bin/g++ g++ /usr/bin/g++-8 && \
    rm $HOME/.pip/pip.conf
USER ma-user
# section3: configure conda source and pip source
RUN echo -e "channels:\n  - defaults\nshow_channel_urls: true\ndefault_channels:\n  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main\n  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r\n  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2\ncustom_channels:\n  conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud\n  msys2: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud\n  bioconda: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud\n  menpo: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud\n  pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud\n  pytorch-lts: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud\n  simpleitk: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud" > $HOME/.condarc && \
    echo -e "[global]\nindex-url = https://pypi.tuna.tsinghua.edu.cn/simple\n[install]\ntrusted-host = pypi.tuna.tsinghua.edu.cn" > $HOME/.pip/pip.conf
# section4: create a conda environment (only Python 3.7 is supported) and install PyTorch 1.8
RUN source /home/ma-user/anaconda3/bin/activate && \
    conda create -y --name pytorch_1_8 python=3.7 && \
    conda activate pytorch_1_8 && \
    pip install torch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 && \
    conda deactivate
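
The update-alternatives step in section2 is meant to make the plain gcc command resolve to GCC 8. One way to confirm this in a running container is to read the major version from the first line of gcc --version. The sketch below is an illustration: `gcc_major` is a hypothetical helper, and the sample version line is only an example of the format.

```shell
# Hypothetical helper: read `gcc --version` output on stdin and print the
# major version taken from the last field of the first line.
gcc_major() {
    awk 'NR == 1 { split($NF, v, "."); print v[1] }'
}

# Example with a sample version line.
echo "gcc (Ubuntu 8.4.0-1ubuntu1~18.04) 8.4.0" | gcc_major

# Usage inside the container (not run here):
#   gcc --version | gcc_major
```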

Dockerfile on a Non-ModelArts Base Image

If a third-party image is used, add user ma-user with UID 1000 and user group ma-group with GID 100 in the Dockerfile. If UID 1000 or GID 100 in the base image is already used by another user or user group, delete that user or user group first. The Dockerfile in this example already includes these steps, so you can use it directly.

You only need to set the user ma-user whose UID is 1000 and the user group ma-group whose GID is 100, and grant the read, write, and execute permissions on the target directory to user ma-user.
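
To see whether UID 1000 or GID 100 is already taken in a base image, look up the third field of /etc/passwd or /etc/group. The sketch below does the lookup on standard input; `name_for_id` is a hypothetical helper, and the sample passwd lines are made up for the example.

```shell
# Hypothetical helper: read passwd- or group-format lines on stdin and
# print the name whose numeric ID (third field) matches the argument.
name_for_id() {
    awk -F ':' -v id="$1" '$3 == id { print $1 }'
}

# Example with made-up passwd entries.
printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'ma-user:x:1000:100::/home/ma-user:/bin/bash' |
    name_for_id 1000

# Usage inside the base image (not run here):
#   name_for_id 1000 < /etc/passwd
#   name_for_id 100 < /etc/group
```

An empty result means the ID is free. Any name other than ma-user or ma-group is what the Dockerfile removes before creating the required user and group.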

Run the vim command to create a Dockerfile and use a third-party (non-ModelArts) image as the base image, for example, Ubuntu 18.04. The content of the Dockerfile is as follows:

# Replace it with the actual image version.
FROM ubuntu:18.04
# Set the user ma-user whose UID is 1000 and the user group ma-group whose GID is 100
USER root
RUN default_user=$(getent passwd 1000 | awk -F ':' '{print $1}') || echo "uid: 1000 does not exist" && \
    default_group=$(getent group 100 | awk -F ':' '{print $1}') || echo "gid: 100 does not exist" && \
    if [ ! -z ${default_user} ] && [ ${default_user} != "ma-user" ]; then \
        userdel -r ${default_user}; \
    fi && \
    if [ ! -z ${default_group} ] && [ ${default_group} != "ma-group" ]; then \
        groupdel -f ${default_group}; \
    fi && \
    groupadd -g 100 ma-group && useradd -d /home/ma-user -m -u 1000 -g 100 -s /bin/bash ma-user && \
    # Grant the read, write, and execute permissions on the target directory to the user ma-user.
    chmod -R 750 /home/ma-user

# Configure the APT source and install the ZIP and Wget tools (required for installing conda).
RUN mv /etc/apt/sources.list /etc/apt/sources.list.bak && \
    echo "deb http://repo.huaweicloud.com/ubuntu/ bionic main restricted\ndeb http://repo.huaweicloud.com/ubuntu/ bionic-updates main restricted\ndeb http://repo.huaweicloud.com/ubuntu/ bionic universe\ndeb http://repo.huaweicloud.com/ubuntu/ bionic-updates universe\ndeb http://repo.huaweicloud.com/ubuntu/ bionic multiverse\ndeb http://repo.huaweicloud.com/ubuntu/ bionic-updates multiverse\ndeb http://repo.huaweicloud.com/ubuntu/ bionic-backports main restricted universe multiverse\ndeb http://repo.huaweicloud.com/ubuntu bionic-security main restricted\ndeb http://repo.huaweicloud.com/ubuntu bionic-security universe\ndeb http://repo.huaweicloud.com/ubuntu bionic-security multiverse" > /etc/apt/sources.list && \
    apt-get update && \
    apt-get install -y zip wget

# Point /bin/sh at bash (required for creating the conda environment)
RUN rm /bin/sh && ln -s /bin/bash /bin/sh

# Switch to user ma-user, download Miniconda from the Tsinghua repository, and install it in /home/ma-user/anaconda3.
USER ma-user
RUN cd /home/ma-user/ && \
    wget --no-check-certificate https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-4.6.14-Linux-x86_64.sh && \
    bash Miniconda3-4.6.14-Linux-x86_64.sh -b -p /home/ma-user/anaconda3 && \
    rm -rf Miniconda3-4.6.14-Linux-x86_64.sh

#Configure the conda and pip sources
RUN mkdir -p /home/ma-user/.pip && \
    echo -e "channels:\n  - defaults\nshow_channel_urls: true\ndefault_channels:\n  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main\n  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r\n  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2" > /home/ma-user/.condarc && \
    echo -e "[global]\nindex-url = https://pypi.tuna.tsinghua.edu.cn/simple\n[install]\ntrusted-host = pypi.tuna.tsinghua.edu.cn" > /home/ma-user/.pip/pip.conf

#Create the conda environment and install the Python third-party package. The ipykernel package is mandatory for starting a kernel.
RUN source /home/ma-user/anaconda3/bin/activate && \
    conda create -y --name pytorch_1_8 python=3.7 && \
    conda activate pytorch_1_8 && \
    pip install torch==1.8.1 torchvision==0.9.1 && \
    pip install ipykernel==6.7.0 && \
    conda init bash && \
    conda deactivate 

# Install FFmpeg and GCC
USER root
RUN apt-get -y install ffmpeg && \
    apt-get -y install gcc-8 g++-8
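
The Dockerfile above writes multi-line files with echo, and whether \n is expanded depends on the shell behind /bin/sh. The APT source step uses plain echo while /bin/sh is still the image default, and the later steps use echo -e after /bin/sh has been pointed at bash. If you adapt these steps, printf is an alternative that expands escapes in its format string the same way in both shells. The sketch below shows the idea; `write_pip_conf` is a hypothetical helper, and it writes only a reduced pip.conf for illustration.

```shell
# Hypothetical helper: write a minimal pip.conf with printf instead of echo -e.
# Arguments: index URL, output file.
write_pip_conf() {
    printf '[global]\nindex-url = %s\n' "$1" > "$2"
}

# Example: write to a temporary file, show it, and clean up.
conf=$(mktemp)
write_pip_conf https://pypi.tuna.tsinghua.edu.cn/simple "$conf"
cat "$conf"
rm -f "$conf"
```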
