Obtaining the Sample Dataset of ModelArts
ModelArts provides multiple samples based on various AI engines for beginners. For details about the samples, see the ModelArts Best Practices. The dataset for each sample is stored in a public OBS bucket. Select the OBS path for your region to obtain the sample dataset.
For the storage location of each sample dataset, see Sample Dataset Storage Path. You can copy the sample dataset to your OBS bucket using one of the following methods, depending on the region where your OBS bucket resides. For details about the methods, see Figure 1. Method 2 or 3 is recommended for large datasets, and method 1 for small ones.
- Method 1: Download and Upload Files
- Method 2: Use the MoXing API to Copy a Dataset from the Public Bucket to Your OBS Bucket
- Method 3: Use the obsutil Tool of OBS to Copy Files
Sample Dataset Storage Path
Each sample dataset is stored in two formats, compressed and decompressed. Both formats contain the same data.
- Compressed file: A single archive that is convenient to download. After downloading it, decompress it and upload the contents to your OBS bucket before use.
- Decompressed file: You can copy the decompressed files directly from the public OBS bucket to your OBS bucket, that is, from one OBS bucket to another. This requires that your OBS bucket and the OBS bucket of the sample dataset be in the same region.
| Sample Name | Dataset Format | Region | OBS Path | Sample |
|---|---|---|---|---|
| Yunbao Detection | Compressed file | CN North-Beijing1 | https://modelarts-cnnorth1-market-dataset.obs.cn-north-1.myhuaweicloud.com/dataset-market/Yunbao-Data-Custom/archiver/Yunbao-Data-Custom.zip | HUAWEI CLOUD Mascot Detection (Using ExeML for Object Detection) |
| Yunbao Detection | Compressed file | CN North-Beijing4 | https://modelarts-cnnorth4-market-dataset.obs.cn-north-4.myhuaweicloud.com/dataset-market/Yunbao-Data-Custom/archiver/Yunbao-Data-Custom.zip | HUAWEI CLOUD Mascot Detection (Using ExeML for Object Detection) |
| Yunbao Detection | Decompressed file | CN North-Beijing1 | obs://modelarts-cnnorth1-market-dataset/dataset-market/Yunbao-Data-Custom/unarchiver | HUAWEI CLOUD Mascot Detection (Using ExeML for Object Detection) |
| Yunbao Detection | Decompressed file | CN North-Beijing4 | obs://modelarts-cnnorth4-market-dataset/dataset-market/Yunbao-Data-Custom/unarchiver | HUAWEI CLOUD Mascot Detection (Using ExeML for Object Detection) |
| Flower Recognition | Compressed file | CN North-Beijing1 | https://modelarts-cnnorth1-market-dataset.obs.cn-north-1.myhuaweicloud.com/dataset-market/Flowers-Data-Set/archiver/Flowers-Data-Set.zip | Flower Recognition (Using a Built-in Algorithm in Training Management for Image Classification) |
| Flower Recognition | Compressed file | CN North-Beijing4 | https://modelarts-cnnorth4-market-dataset.obs.cn-north-4.myhuaweicloud.com/dataset-market/Flowers-Data-Set/archiver/Flowers-Data-Set.zip | Flower Recognition (Using a Built-in Algorithm in Training Management for Image Classification) |
| Flower Recognition | Decompressed file | CN North-Beijing1 | obs://modelarts-cnnorth1-market-dataset/dataset-market/Flowers-Data-Set/unarchiver | Flower Recognition (Using a Built-in Algorithm in Training Management for Image Classification) |
| Flower Recognition | Decompressed file | CN North-Beijing4 | obs://modelarts-cnnorth4-market-dataset/dataset-market/Flowers-Data-Set/unarchiver | Flower Recognition (Using a Built-in Algorithm in Training Management for Image Classification) |
| Iceberg Detection | Compressed file | CN North-Beijing1 | https://modelarts-cnnorth1-market-dataset.obs.cn-north-1.myhuaweicloud.com/dataset-market/Iceberg-Data-Set/archiver/Iceberg-Data-Set.zip | Iceberg Detection (Using the MoXing Framework for Image Classification) |
| Iceberg Detection | Compressed file | CN North-Beijing4 | https://modelarts-cnnorth4-market-dataset.obs.cn-north-4.myhuaweicloud.com/dataset-market/Iceberg-Data-Set/archiver/Iceberg-Data-Set.zip | Iceberg Detection (Using the MoXing Framework for Image Classification) |
| Iceberg Detection | Decompressed file | CN North-Beijing1 | obs://modelarts-cnnorth1-market-dataset/dataset-market/Iceberg-Data-Set/unarchiver | Iceberg Detection (Using the MoXing Framework for Image Classification) |
| Iceberg Detection | Decompressed file | CN North-Beijing4 | obs://modelarts-cnnorth4-market-dataset/dataset-market/Iceberg-Data-Set/unarchiver | Iceberg Detection (Using the MoXing Framework for Image Classification) |
| Handwritten Digit Recognition | Compressed file | CN North-Beijing1 | https://modelarts-cnnorth1-market-dataset.obs.cn-north-1.myhuaweicloud.com/dataset-market/Mnist-Data-Set/archiver/Mnist-Data-Set.zip | Use MoXing to Develop Training Scripts for Handwritten Digit Recognition; Using a Notebook for Handwritten Digit Recognition; Using MXNet for Handwritten Digit Recognition |
| Handwritten Digit Recognition | Compressed file | CN North-Beijing4 | https://modelarts-cnnorth4-market-dataset.obs.cn-north-4.myhuaweicloud.com/dataset-market/Mnist-Data-Set/archiver/Mnist-Data-Set.zip | Use MoXing to Develop Training Scripts for Handwritten Digit Recognition; Using a Notebook for Handwritten Digit Recognition; Using MXNet for Handwritten Digit Recognition |
| Handwritten Digit Recognition | Decompressed file | CN North-Beijing1 | obs://modelarts-cnnorth1-market-dataset/dataset-market/Mnist-Data-Set/unarchiver | Use MoXing to Develop Training Scripts for Handwritten Digit Recognition; Using a Notebook for Handwritten Digit Recognition; Using MXNet for Handwritten Digit Recognition |
| Handwritten Digit Recognition | Decompressed file | CN North-Beijing4 | obs://modelarts-cnnorth4-market-dataset/dataset-market/Mnist-Data-Set/unarchiver | Use MoXing to Develop Training Scripts for Handwritten Digit Recognition; Using a Notebook for Handwritten Digit Recognition; Using MXNet for Handwritten Digit Recognition |
| Caltech Image Recognition | Compressed file | CN North-Beijing1 | https://modelarts-cnnorth1-market-dataset.obs.cn-north-1.myhuaweicloud.com/dataset-market/Caltech101-data-set/archiver/Caltech101-data-set.zip | |
| Caltech Image Recognition | Compressed file | CN North-Beijing4 | https://modelarts-cnnorth4-market-dataset.obs.cn-north-4.myhuaweicloud.com/dataset-market/Caltech101-data-set/archiver/Caltech101-data-set.zip | |
| Caltech Image Recognition | Decompressed file | CN North-Beijing1 | obs://modelarts-cnnorth1-market-dataset/dataset-market/Caltech101-data-set/unarchiver | |
| Caltech Image Recognition | Decompressed file | CN North-Beijing4 | obs://modelarts-cnnorth4-market-dataset/dataset-market/Caltech101-data-set/unarchiver | |
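
The paths listed for each sample follow a regular pattern, so they can be derived from the region and the dataset directory name. The following is a minimal sketch of that pattern. The function names and the `REGIONS` mapping are illustrative and not part of ModelArts, and the pattern is inferred only from the entries listed here, so confirm the result against the listed paths before relying on it.

```python
# Region name -> (bucket infix, OBS endpoint region), read off the listed paths.
# This mapping is an assumption inferred from the documented entries.
REGIONS = {
    "CN North-Beijing1": ("cnnorth1", "cn-north-1"),
    "CN North-Beijing4": ("cnnorth4", "cn-north-4"),
}


def compressed_url(region: str, dataset: str) -> str:
    """Return the HTTPS download link of the compressed (.zip) dataset."""
    infix, endpoint = REGIONS[region]
    return (
        f"https://modelarts-{infix}-market-dataset.obs.{endpoint}.myhuaweicloud.com"
        f"/dataset-market/{dataset}/archiver/{dataset}.zip"
    )


def decompressed_path(region: str, dataset: str) -> str:
    """Return the obs:// path of the decompressed dataset directory."""
    infix, _ = REGIONS[region]
    return f"obs://modelarts-{infix}-market-dataset/dataset-market/{dataset}/unarchiver"
```

For example, `decompressed_path("CN North-Beijing4", "Flowers-Data-Set")` yields the decompressed Flower Recognition path for the CN North-Beijing4 region.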
Method 1: Download and Upload Files
This method has no region restriction: you can download the dataset from an OBS bucket in any region. Downloading the compressed dataset is recommended because it is a single file. Download and upload speeds depend on your local network conditions.
- Select the storage path of the target sample dataset, preferably the compressed file, and click the link to download the dataset to your local PC.
  For example, if you click the link for the Yunbao-Data-Custom.zip dataset for Yunbao Detection in the CN North-Beijing1 region, the Yunbao-Data-Custom.zip file is downloaded to your local PC.
- Decompress the downloaded file and upload all folders of the dataset to an OBS directory.
  - Create an OBS bucket and a folder for storing the sample dataset.
    For example, create an OBS bucket named test-modelarts and a folder named dataset-yunbao.
  - Decompress the Yunbao-Data-Custom.zip file to the Yunbao-Data-Custom directory on your local PC.
  - Upload all files in the Yunbao-Data-Custom directory to the test-modelarts/dataset-yunbao directory on OBS. For details about how to upload files, see Uploading a File.
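
The download and decompression steps of method 1 can also be scripted. The following is a rough sketch that uses only the Python standard library. The function names are illustrative, and the upload to OBS is still done separately as described in Uploading a File.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_dataset(url: str, zip_path: str) -> str:
    """Download the compressed dataset from `url` and save it as `zip_path`."""
    urllib.request.urlretrieve(url, zip_path)
    return zip_path


def extract_dataset(zip_path: str, dest_dir: str) -> list:
    """Decompress the archive into `dest_dir` and return the archive member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Example usage (requires network access), with the Yunbao Detection link for CN North-Beijing1:
# url = ("https://modelarts-cnnorth1-market-dataset.obs.cn-north-1.myhuaweicloud.com"
#        "/dataset-market/Yunbao-Data-Custom/archiver/Yunbao-Data-Custom.zip")
# download_dataset(url, "Yunbao-Data-Custom.zip")
# extract_dataset("Yunbao-Data-Custom.zip", "Yunbao-Data-Custom")
```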
Method 2: Use the MoXing API to Copy a Dataset from the Public Bucket to Your OBS Bucket
This method requires that the sample dataset be in the same region as your OBS bucket and that you be familiar with notebook operations and ModelArts MoXing. It copies the sample dataset directly from the public bucket to your OBS bucket.
Obtain the OBS path (in OBS format, starting with obs://) of the desired decompressed dataset from Table 1, create a notebook instance in ModelArts, and copy the dataset to your OBS bucket as follows.
- Access the ModelArts management console, create a notebook instance, and create a file on the Jupyter page.
- Click the new file to access the development environment.
- Check whether the public bucket where the sample dataset resides is accessible.
For example, obtain the sample dataset of Yunbao Detection in the CN North-Beijing1 region from Table 1. The OBS path is obs://modelarts-cnnorth1-market-dataset/dataset-market/Yunbao-Data-Custom/unarchiver. Run the following command to check whether the public bucket is accessible:

    import moxing as mox
    mox.file.exists('obs://modelarts-cnnorth1-market-dataset/dataset-market/Yunbao-Data-Custom/unarchiver')

If True is returned, the public bucket is accessible.
- Check whether your OBS bucket can be accessed.
For example, create an OBS bucket named test-modelarts and a folder named dataset-yunbao. Run the following command to check whether your bucket is accessible:

    import moxing as mox
    mox.file.exists('obs://test-modelarts/dataset-yunbao')

If True is returned, your OBS bucket is accessible.
- Check whether you have the write permission on the OBS bucket.
For example, the path of the target OBS bucket is obs://test-modelarts/dataset-yunbao. Run the following command to check the permission:

    import moxing as mox
    # Write a test file and then delete it. Both calls succeed only if you have the write permission.
    mox.file.write('obs://test-modelarts/dataset-yunbao/obs_file.txt', 'Hello, OBS Bucket!')
    mox.file.remove('obs://test-modelarts/dataset-yunbao/obs_file.txt', recursive=False)

- Run the following command to copy the sample dataset from the public bucket to your OBS bucket:

    import moxing as mox
    mox.file.copy_parallel('obs://modelarts-cnnorth1-market-dataset/dataset-market/Yunbao-Data-Custom/unarchiver', 'obs://test-modelarts/dataset-yunbao')
    print('Copy procedure is completed')

When "Copy procedure is completed" and the execution time are displayed, the dataset has been copied. Information similar to the following is displayed:

    Copy procedure is completed
    CPU times: user 117 ms, sys: 92.3 ms, total: 209 ms
    Wall time: 58.3 s
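
The checks and the copy in method 2 can be combined into one helper. The sketch below is illustrative rather than an official ModelArts utility: the function name `copy_sample_dataset` is hypothetical, and it takes the file API as a parameter (in a notebook you would pass `mox.file`) so that it uses only the `exists`, `write`, `remove`, and `copy_parallel` calls shown above.

```python
def copy_sample_dataset(file_api, src: str, dst: str) -> None:
    """Verify access to both paths, verify write permission, then copy src to dst.

    `file_api` is expected to behave like `moxing.file`; passing it in is an
    assumption made here so the helper can be tried without ModelArts.
    """
    dst = dst.strip().rstrip("/")  # guard against stray spaces in the target path

    if not file_api.exists(src):
        raise RuntimeError(f"Public bucket path is not accessible: {src}")
    if not file_api.exists(dst):
        raise RuntimeError(f"Target path is not accessible: {dst}")

    # Write-permission check: create a small probe object and delete it again.
    probe = dst + "/obs_file.txt"
    file_api.write(probe, "Hello, OBS Bucket!")
    file_api.remove(probe, recursive=False)

    file_api.copy_parallel(src, dst)
    print("Copy procedure is completed")


# Example usage in a ModelArts notebook:
# import moxing as mox
# copy_sample_dataset(
#     mox.file,
#     "obs://modelarts-cnnorth1-market-dataset/dataset-market/Yunbao-Data-Custom/unarchiver",
#     "obs://test-modelarts/dataset-yunbao",
# )
```

Failing early with an explicit error makes it easier to tell an inaccessible bucket from a missing write permission before a long copy starts.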
Method 3: Use the obsutil Tool of OBS to Copy Files
The sample dataset must be in the same region as your OBS bucket. You can use the obsutil tool provided by OBS to copy the sample dataset. You are advised to obtain the OBS path (in OBS format) of the decompressed dataset file in Table 1 and copy the file to your OBS bucket by running the object copy command in Copying an Object.
For details about how to use obsutil, see obsutil in the Object Storage Service Tools Guide.
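
As an illustration of method 3, the sketch below builds and runs an `obsutil cp` command from Python. It assumes that obsutil is installed, is on the PATH, and has already been configured with your access keys. The helper names are hypothetical, and the `-r` (recursive) and `-f` (force) options should be verified against Copying an Object for your obsutil version.

```python
import subprocess


def obsutil_copy_command(src: str, dst: str) -> list:
    """Build an `obsutil cp` command that copies the src directory to dst."""
    # -r copies the directory recursively; -f runs without interactive confirmation.
    return ["obsutil", "cp", src, dst, "-r", "-f"]


def run_copy(src: str, dst: str) -> int:
    """Run the copy and return the exit code; raises CalledProcessError on failure."""
    return subprocess.run(obsutil_copy_command(src, dst), check=True).returncode


# Example usage (both buckets must be in the same region):
# run_copy(
#     "obs://modelarts-cnnorth1-market-dataset/dataset-market/Yunbao-Data-Custom/unarchiver",
#     "obs://test-modelarts/dataset-yunbao",
# )
```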
