Import Operation
After a dataset is created, you can synchronize data into it directly, or you can add more data by running an import. Data can be imported from an OBS directory or from a manifest file.
Prerequisites
- You have created a dataset.
- You have stored the data to be imported in OBS. If you import from a manifest file, the manifest file is also stored in OBS.
- The OBS buckets and ModelArts are in the same region.
Import Modes
There are two import modes: OBS path and Manifest file.
- OBS path: The data to be imported is already stored in an OBS directory. Select an OBS path that you can access, and make sure the directory structure under that path complies with the specifications. For details, see Specifications for Importing Data from an OBS Directory. Only the following types of dataset support the OBS path import mode: Image classification, Object detection, Text classification, Table, and Sound classification.
- Manifest file: The dataset is described by a file in the manifest format, and data is imported from that file. The manifest file defines the mapping between labeling objects and content, and it must already be uploaded to OBS. For details about the specifications of the manifest file, see Specifications for Importing the Manifest File.
Before importing an object detection dataset, ensure that the labeling range of the labeling file does not exceed the size of the original image. Otherwise, the import may fail.
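The range check described above can be sketched before uploading a labeling file. This is a minimal illustration, not the check ModelArts itself runs: the `Box` type, its pixel-coordinate fields, and both function names are hypothetical, and the sketch assumes you have already parsed the boxes and the image size out of your labeling file.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Hypothetical bounding box in pixel coordinates (not a ModelArts type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def box_within_image(box: Box, width: int, height: int) -> bool:
    """Return True if the box is non-empty and lies entirely inside the image."""
    return (
        0 <= box.xmin < box.xmax <= width
        and 0 <= box.ymin < box.ymax <= height
    )


def out_of_range_boxes(boxes: List[Box], width: int, height: int) -> List[Box]:
    """Collect the boxes that exceed the image size so they can be fixed first."""
    return [b for b in boxes if not box_within_image(b, width, height)]
```

If `out_of_range_boxes` returns a non-empty list for any image, correct those labels before starting the import.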
Dataset Type | Importing Data from an OBS Path | Importing Data from a Manifest File |
---|---|---|
Image classification | Supported. Follow the format specifications described in Image Classification. | Supported. Follow the format specifications described in Image Classification. |
Object detection | Supported. Follow the format specifications described in Object Detection. | Supported. Follow the format specifications described in Object Detection. |
Image segmentation | Supported. Follow the format specifications described in Image Segmentation. | Supported. Follow the format specifications described in Image Segmentation. |
Sound classification | Supported. Follow the format specifications described in Sound Classification. | Supported. Follow the format specifications described in Sound Classification. |
Speech labeling | N/A | Supported. Follow the format specifications described in Speech Labeling. |
Speech paragraph labeling | N/A | Supported. Follow the format specifications described in Speech Paragraph Labeling. |
Text classification | Supported. Follow the format specifications described in Text Classification. | Supported. Follow the format specifications described in Text Classification. |
Named entity recognition | N/A | Supported. Follow the format specifications described in Named Entity Recognition. |
Text triplet | N/A | Supported. Follow the format specifications described in Text Triplet. |
Table | Supported. You can also import data from DWS, DLI, or MRS. Follow the format specifications described in Table. | N/A |
Video | N/A | Supported. Follow the format specifications described in Video Labeling. |
Free format | N/A | N/A |
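For scripts that prepare data before an import, the support matrix above can be encoded as a simple lookup. The sketch below only restates the table; the constant and function names are illustrative and are not part of any ModelArts SDK.

```python
OBS_PATH = "OBS path"
MANIFEST = "Manifest file"

# Import modes per dataset type, transcribed from the table above.
SUPPORTED_IMPORT_MODES = {
    "Image classification": {OBS_PATH, MANIFEST},
    "Object detection": {OBS_PATH, MANIFEST},
    "Image segmentation": {OBS_PATH, MANIFEST},
    "Sound classification": {OBS_PATH, MANIFEST},
    "Speech labeling": {MANIFEST},
    "Speech paragraph labeling": {MANIFEST},
    "Text classification": {OBS_PATH, MANIFEST},
    "Named entity recognition": {MANIFEST},
    "Text triplet": {MANIFEST},
    "Table": {OBS_PATH},
    "Video": {MANIFEST},
    "Free format": set(),
}


def supports_import(dataset_type: str, mode: str) -> bool:
    """Return True if the table lists the mode as supported for the dataset type."""
    return mode in SUPPORTED_IMPORT_MODES.get(dataset_type, set())
```

For example, `supports_import("Table", MANIFEST)` evaluates to `False`, matching the N/A entry in the table.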
Importing Data from an OBS Path
The parameters on the GUI for data import vary according to the dataset type. The following uses a dataset of the image classification type as an example.
- Log in to the ModelArts management console. In the left navigation pane, choose Data Management > Datasets. The Datasets page is displayed.
- Locate the row that contains the desired dataset and choose More > Import in the Operation column.
Alternatively, you can click the dataset name to go to the Dashboard tab page of the dataset, and click Import in the upper right corner.
- In the Import dialog box, set Import Mode to OBS path and set OBS path to the path for storing data. Then click OK.
You can import a dataset of the table type from data sources such as OBS, Data Warehouse Service (DWS), Data Lake Insight (DLI), and MapReduce Service (MRS). The settings and data requirements for importing a dataset are the same as those for creating a dataset. For details about the parameters, see Table.
Figure 1 Importing the dataset from an OBS path
After the data import is successful, the data is automatically synchronized to the dataset. On the Datasets page, you can click the dataset name to view its details and label the data.
Importing Data from a Manifest File
The parameters on the GUI for data import vary according to the dataset type. The following uses a dataset of the object detection type as an example. Datasets of the table type cannot be imported from the manifest file.
- Log in to the ModelArts management console. In the left navigation pane, choose Data Management > Datasets. The Datasets page is displayed.
- Locate the row that contains the desired dataset and choose More > Import in the Operation column.
Alternatively, you can click the dataset name to go to the Dashboard tab page of the dataset, and click Import in the upper right corner.
- In the Import dialog box, set the parameters as follows and click OK.
- Import Mode: Select Manifest file.
- Manifest file: Select the OBS path for storing the manifest file.
- Import by Label: The system automatically obtains the labels of the dataset. You can click Add Label to add a label or click the deletion icon on the right to delete a label. This field is optional. After importing a dataset, you can add or delete labels during data labeling.
- Import labels: If this parameter is selected, the labels defined in the manifest file are imported to the ModelArts dataset.
- Import only hard examples: If this parameter is selected, only hard examples are imported from the manifest file. Hard examples are the samples whose hard attribute is true in the manifest file.
Figure 2 Importing the dataset
After the data import is successful, the data is automatically synchronized to the dataset. On the Datasets page, you can click the dataset name to go to the Dashboard tab page of the dataset, and click Label in the upper right corner. On the displayed dataset details page, you can view the data and label it.
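To preview which samples the Import only hard examples option would keep, a manifest can be filtered locally. The sketch below rests on assumptions that this page does not confirm: it treats the manifest as one JSON object per line and reads a top-level boolean field named `hard`. Both the layout and the field name are hypothetical, so check Specifications for Importing the Manifest File for the real structure and adjust `hard_key` accordingly.

```python
import json
from typing import Iterable, List


def select_hard_examples(manifest_lines: Iterable[str], hard_key: str = "hard") -> List[dict]:
    """Return the samples whose hard attribute is true.

    Assumes one JSON object per line and a top-level boolean field;
    both are illustrative assumptions, not the documented manifest format.
    """
    selected = []
    for line in manifest_lines:
        line = line.strip()
        if not line:
            continue  # Skip blank lines.
        sample = json.loads(line)
        if sample.get(hard_key) is True:
            selected.append(sample)
    return selected
```

Samples that lack the attribute, or have it set to false, are left out, which mirrors the behavior described for the option.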