Updated on 2024-10-29 GMT+08:00

Specifications for Importing a Manifest File

The manifest file defines the mapping between labeled objects and content. The manifest file import mode means that the manifest file is used for dataset import. The manifest file can be imported from OBS. When importing a manifest file from OBS, ensure that you have the permissions to access the directory where the manifest file is stored.

A manifest file must follow a strict specification, so for new data, importing directly from OBS is usually simpler. Manifest file import is generally used to migrate ModelArts data between regions or accounts. If you have labeled data with ModelArts in one region, you can obtain the manifest file of the published dataset from its output path. You can then use that manifest file to import the dataset into ModelArts in another region or under another account. The imported data carries its labeling information and does not need to be labeled again, which improves development efficiency.

The manifest file that contains information about the original file and labeling can be used in labeling, training, and inference scenarios. The manifest file that contains only information about the original file can be used in inference scenarios or used to generate an unlabeled dataset. The manifest file must meet the following requirements:

  • The manifest file uses the UTF-8 encoding format.
  • The manifest file uses the JSON Lines format (jsonlines.org). A line contains one JSON object.
    {"source": "/path/to/image1.jpg", "annotation": ... }
    {"source": "/path/to/image2.jpg", "annotation": ... }
    {"source": "/path/to/image3.jpg", "annotation": ... }

    In the preceding example, the manifest file contains multiple lines, and each line is one JSON object.

  • The manifest file can be generated by you, third-party tools, or ModelArts Data Labeling. The file name can be any valid file name. To facilitate the internal use of the ModelArts system, the file name generated by the ModelArts data labeling function consists of the following strings: DatasetName-VersionName.manifest. For example, animal-v201901231130304123.manifest.
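The encoding and JSON Lines requirements above can be sketched in Python as follows. This is a minimal illustration, not ModelArts code: the helper names `dump_manifest_lines` and `parse_manifest_lines` are hypothetical, and skipping blank lines is an assumption rather than a documented rule.

```python
import json
from typing import Dict, Iterator, List


def dump_manifest_lines(samples: List[Dict]) -> str:
    """Serialize samples as JSON Lines: one compact JSON object per line."""
    # ensure_ascii=False keeps non-ASCII text readable; write the result as UTF-8.
    return "\n".join(json.dumps(s, ensure_ascii=False) for s in samples) + "\n"


def parse_manifest_lines(text: str) -> Iterator[Dict]:
    """Yield one dict per non-empty line of manifest text."""
    for line_number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # assumption: blank lines are tolerated and skipped
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number} is not valid JSON: {exc}") from exc


samples = [
    {"source": "s3://path/to/image1.jpg"},
    {"source": "s3://path/to/image2.jpg"},
]
text = dump_manifest_lines(samples)
assert list(parse_manifest_lines(text)) == samples
```

When reading or writing an actual file, open it with `encoding="utf-8"` so that the encoding requirement is met regardless of the platform default.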

Image Classification

{
    "source":"s3://path/to/image1.jpg",
    "usage":"TRAIN",
    "hard":"true",
    "hard-coefficient":0.8,
    "id":"0162005993f8065ef47eefb59d1e4970",
    "annotation": [
        {
            "type": "modelarts/image_classification",
            "name": "cat",
            "property": {
                "color":"white",
                "kind":"Persian cat"            
            },
            "hard":"true",
            "hard-coefficient":0.8,
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"        
        },
        {
            "type": "modelarts/image_classification",
            "name":"animal",
            "annotated-by":"modelarts/active-learning",
            "confidence": 0.8,
            "creation-time":"2019-01-23 11:30:30"        
        }],
    "inference-loc":"/path/to/inference-output"
}
Table 1 Parameters

| Parameter | Mandatory | Description |
|---|---|---|
| source | Yes | URI of an object to be labeled. For details about data source types and examples, see Table 2. |
| usage | No | By default, the parameter value is left blank, in which case you decide how to use the object. Possible values: TRAIN (the object is used for training), EVAL (evaluation), TEST (testing), and INFERENCE (inference). |
| id | No | Sample ID exported from the system. You do not need to set this parameter when importing the sample. |
| annotation | No | If the parameter value is left blank, the object is not labeled. The value of annotation consists of an object list. For details about the parameters, see Table 3. |
| inference-loc | No | This parameter is available when the file is generated by the inference service, indicating the location of the inference result file. |

Table 2 Data source types

| Type | Example |
|---|---|
| OBS | "source":"s3://path-to-jpg" |
| Content | "source":"content://I love machine learning" |

Table 3 annotation objects

| Parameter | Mandatory | Description |
|---|---|---|
| type | Yes | Label type. Possible values: image_classification (image classification), text_classification (text classification), text_entity (named entity recognition), object_detection (object detection), audio_classification (sound classification), audio_content (speech labeling), and audio_segmentation (speech paragraph labeling). The examples in this section write each value with the modelarts/ prefix, for example, modelarts/image_classification. |
| name | Yes/No | Label name. This parameter is mandatory for the classification type but optional for other types. This example uses the image classification type. |
| id | Yes/No | Label ID. This parameter is mandatory for triplets but optional for other types. The entity label ID of a triplet is in E+number format, for example, E1 and E2. The relationship label ID of a triplet is in R+number format, for example, R1 and R2. |
| property | No | Labeling property. In this example, the cat has two properties: color and kind. |
| hard | No | Indicates whether the example is a hard example. True indicates a hard example, and False indicates that the example is not a hard example. |
| annotated-by | No | The default value is human, indicating manual labeling. |
| creation-time | No | Time when the labeling job was created. It is the time when labeling information was written, not the time when the manifest file was generated. |
| confidence | No | Confidence score of machine labeling. The value ranges from 0 to 1. |
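The mandatory and optional rules in Table 1 and Table 3 can be checked before import with a small script. The following is a sketch under stated assumptions: `check_sample` is a hypothetical helper, not part of any ModelArts SDK, and comparing usage case-insensitively is an assumption made because the examples in this section show both "TRAIN" and "train".

```python
ALLOWED_USAGE = {"TRAIN", "EVAL", "TEST", "INFERENCE"}


def check_sample(sample):
    """Return a list of problems found in one manifest record (empty if none)."""
    problems = []
    if not sample.get("source"):
        problems.append("source is mandatory")
    usage = sample.get("usage")
    # Assumption: usage is compared case-insensitively.
    if usage and usage.upper() not in ALLOWED_USAGE:
        problems.append(f"unexpected usage: {usage}")
    for index, label in enumerate(sample.get("annotation") or []):
        label_type = label.get("type", "")
        if not label_type:
            problems.append(f"annotation[{index}]: type is mandatory")
        # name is mandatory for the classification types only.
        if label_type.endswith("_classification") and not label.get("name"):
            problems.append(f"annotation[{index}]: name is mandatory for classification")
        confidence = label.get("confidence")
        if confidence is not None and not 0 <= confidence <= 1:
            problems.append(f"annotation[{index}]: confidence must be between 0 and 1")
    return problems


sample = {
    "source": "s3://path/to/image1.jpg",
    "usage": "TRAIN",
    "annotation": [
        {"type": "modelarts/image_classification", "name": "cat", "confidence": 0.8}
    ],
}
assert check_sample(sample) == []
```

Returning a list of problems instead of raising on the first one makes it easier to report every issue in a large manifest in a single pass.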

Image Segmentation

{
    "annotation": [{
        "annotation-format": "PASCAL VOC",
        "type": "modelarts/image_segmentation",
        "annotation-loc": "s3://path/to/annotation/image1.xml",
        "creation-time": "2020-12-16 21:36:27",
        "annotated-by": "human"
    }],
    "usage": "train",
    "source": "s3://path/to/image1.jpg",
    "id": "16d196c19bf61994d7deccafa435398c",
    "sample-type": 0
}
  • The parameters such as source, usage, and annotation are the same as those described in Image Classification. For details, see Table 1.
  • annotation-loc indicates the path for saving the label file. This parameter is mandatory for image segmentation and object detection but optional for other labeling types.
  • annotation-format indicates the format of the label file. This parameter is optional. The default value is PASCAL VOC. Only PASCAL VOC is supported.
  • sample-type indicates a sample format. Value 0 indicates image, 1 text, 2 audio, 4 table, and 6 video.
Table 4 PASCAL VOC format parameters

| Parameter | Mandatory | Description |
|---|---|---|
| folder | Yes | Directory where the data source is located |
| filename | Yes | Name of the file to be labeled |
| size | Yes | Image size in pixels. Child elements, all mandatory: width (image width), height (image height), and depth (number of image channels). |
| segmented | Yes | Segmented or not |
| mask_source | No | Segmentation mask path |
| object | Yes | Object detection information. One object element is generated for each labeled object. Child elements: name (type of the labeled content, mandatory); pose (shooting angle of the labeled content, mandatory); truncated (whether the labeled content is truncated, where 0 indicates not truncated, mandatory); occluded (whether the labeled content is occluded, where 0 indicates not occluded, mandatory); difficult (whether the labeled object is difficult to identify, where 0 indicates easy to identify, mandatory); confidence (confidence score of the labeled object, ranging from 0 to 1, optional); bndbox (bounding box type, mandatory; for the possible values, see Table 5); mask_color (label color, represented by the RGB value, mandatory). |
  • mask_color: label color, which is represented by the RGB value. This parameter is mandatory.
Table 5 Bounding box types

| Parameter | Shape | Labeling information |
|---|---|---|
| polygon | Polygon | Coordinates of points: <x1>100</x1> <y1>100</y1> <x2>200</x2> <y2>100</y2> <x3>250</x3> <y3>150</y3> <x4>200</x4> <y4>200</y4> <x5>100</x5> <y5>200</y5> <x6>50</x6> <y6>150</y6> <x7>100</x7> <y7>100</y7> |

Example:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<annotation>
    <folder>NA</folder>
    <filename>image_0006.jpg</filename>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>230</width>
        <height>300</height>
        <depth>3</depth>
    </size>
    <segmented>1</segmented>
    <mask_source>obs://xianao/out/dataset-8153-Jmf5ylLjRmSacj9KevS/annotation/V001/segmentationClassRaw/image_0006.png</mask_source>
    <object>
        <name>bike</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <mask_color>193,243,53</mask_color>
        <occluded>0</occluded>
        <polygon>
            <x1>71</x1>
            <y1>48</y1>
            <x2>75</x2>
            <y2>73</y2>
            <x3>49</x3>
            <y3>69</y3>
            <x4>68</x4>
            <y4>92</y4>
            <x5>90</x5>
            <y5>101</y5>
            <x6>45</x6>
            <y6>110</y6>
            <x7>71</x7>
            <y7>48</y7>
        </polygon>
    </object>
</annotation>
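A label file like the one above can be read with Python's standard `xml.etree.ElementTree` module. The sketch below is illustrative only: `read_polygons` is a hypothetical helper, the embedded XML is a shortened copy of the example, and the code assumes that polygon points are numbered consecutively as x1/y1, x2/y2, and so on.

```python
import xml.etree.ElementTree as ET

VOC_XML = """<annotation>
    <size><width>230</width><height>300</height><depth>3</depth></size>
    <object>
        <name>bike</name>
        <mask_color>193,243,53</mask_color>
        <polygon>
            <x1>71</x1><y1>48</y1>
            <x2>75</x2><y2>73</y2>
            <x3>49</x3><y3>69</y3>
        </polygon>
    </object>
</annotation>"""


def read_polygons(xml_text):
    """Return [(name, rgb, [(x, y), ...]), ...] for each object with a polygon."""
    root = ET.fromstring(xml_text)
    results = []
    for obj in root.iter("object"):
        polygon = obj.find("polygon")
        if polygon is None:
            continue
        coords = {child.tag: int(child.text) for child in polygon}
        count = len(coords) // 2  # each point contributes one x and one y element
        points = [(coords[f"x{i}"], coords[f"y{i}"]) for i in range(1, count + 1)]
        color_text = obj.findtext("mask_color")
        rgb = tuple(int(part) for part in color_text.split(",")) if color_text else None
        results.append((obj.findtext("name"), rgb, points))
    return results


polygons = read_polygons(VOC_XML)
assert polygons[0][0] == "bike"
```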

Text Classification

{
    "source": "content://I like this product",
    "id":"XGDVGS",
    "annotation": [
        {
            "type": "modelarts/text_classification",
            "name": "positive",
            "annotated-by": "human",
            "creation-time": "2019-01-23 11:30:30"        
        } ]
}

The text that follows the content:// prefix in source is the text to be labeled. It must be UTF-8 encoded and can contain Chinese characters. The other parameters are the same as those described in Image Classification. For details, see Table 1.

Named Entity Recognition

{
    "source":"content://Michael Jordan is the most famous basketball player in the world.",
    "usage":"TRAIN",
    "annotation":[
        {
            "type":"modelarts/text_entity",
            "name":"Person",
            "property":{
                "@modelarts:start_index":0,
                "@modelarts:end_index":14
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        },
        {
            "type":"modelarts/text_entity",
            "name":"Category",
            "property":{
                "@modelarts:start_index":34,
                "@modelarts:end_index":44
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        }
    ]
}

The parameters such as source, usage, and annotation are the same as those described in Image Classification. For details, see Table 1.

Table 6 describes the property parameters. For example, if you want to extract Michael from "source":"content://Michael Jordan", the value of start_index is 0 and that of end_index is 7.

Table 6 property parameters

| Parameter | Data type | Description |
|---|---|---|
| @modelarts:start_index | Integer | Start position of the text, counted from 0. The character at start_index is included. |
| @modelarts:end_index | Integer | End position of the text. The character at end_index is excluded. |
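Because start_index is inclusive and end_index is exclusive, the two values behave like a Python slice. The following sketch computes them for the example sentence. `entity_span` is a hypothetical helper, and it assumes that positions are counted in characters the way Python counts them, which has only been checked here against the ASCII examples in this section.

```python
def entity_span(text, entity):
    """Return the property dict for the first occurrence of entity in text."""
    start = text.find(entity)
    if start < 0:
        raise ValueError(f"{entity!r} does not occur in the text")
    return {
        "@modelarts:start_index": start,               # inclusive
        "@modelarts:end_index": start + len(entity),   # exclusive
    }


text = "Michael Jordan is the most famous basketball player in the world."
person = entity_span(text, "Michael Jordan")
category = entity_span(text, "basketball")

# The spans match the manifest example above, and slicing recovers the entity.
assert person == {"@modelarts:start_index": 0, "@modelarts:end_index": 14}
assert category == {"@modelarts:start_index": 34, "@modelarts:end_index": 44}
assert text[34:44] == "basketball"
```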

Text Triplet

{
    "source":"content://\"Three Body\" is a series of long science fiction novels created by Liu Cix.",
    "usage":"TRAIN",
    "annotation":[
        {
            "type":"modelarts/text_entity",
            "name":"Person",
            "id":"E1",
            "property":{
                "@modelarts:start_index":67,
                "@modelarts:end_index":74
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        },
        {
            "type":"modelarts/text_entity",
            "name":"Book",
            "id":"E2",
            "property":{
                "@modelarts:start_index":0,
                "@modelarts:end_index":12
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        },
        {
            "type":"modelarts/text_triplet",
            "name":"Author",
            "id":"R1",
            "property":{
                "@modelarts:from":"E1",
                "@modelarts:to":"E2"
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        },
        {
            "type":"modelarts/text_triplet",
            "name":"Works",
            "id":"R2",
            "property":{
                "@modelarts:from":"E2",
                "@modelarts:to":"E1"
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        }
    ]
}

The parameters such as source, usage, and annotation are the same as those described in Image Classification. For details, see Table 1.

Table 7 describes the property parameters. @modelarts:start_index and @modelarts:end_index have the same meaning as in named entity recognition. For example, when source is set to content://"Three Body" is a series of long science fiction novels created by Liu Cix., Liu Cix is a Person entity, "Three Body" is a Book entity, the person is the author of the book, and the book is a work of the person.

Table 7 property parameters

| Parameter | Data type | Description |
|---|---|---|
| @modelarts:start_index | Integer | Start position of the triplet entity, counted from 0. The character at start_index is included. |
| @modelarts:end_index | Integer | End position of the triplet entity. The character at end_index is excluded. |
| @modelarts:from | String | Start entity ID of the triplet relationship |
| @modelarts:to | String | Entity ID pointed to in the triplet relationship |
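To see how the entity IDs and the from/to references fit together, the sketch below resolves each relationship into a (from text, relationship name, to text) tuple. `resolve_triplets` is a hypothetical helper written for illustration, and the sample is a shortened version of the example above.

```python
def resolve_triplets(sample):
    """Return [(from_text, relation_name, to_text), ...] for one manifest record."""
    text = sample["source"][len("content://"):]
    entities = {}
    for label in sample["annotation"]:
        if label["type"] == "modelarts/text_entity":
            prop = label["property"]
            start, end = prop["@modelarts:start_index"], prop["@modelarts:end_index"]
            entities[label["id"]] = text[start:end]  # end_index is exclusive
    triplets = []
    for label in sample["annotation"]:
        if label["type"] == "modelarts/text_triplet":
            prop = label["property"]
            triplets.append(
                (entities[prop["@modelarts:from"]], label["name"], entities[prop["@modelarts:to"]])
            )
    return triplets


sample = {
    "source": 'content://"Three Body" is a series of long science fiction novels created by Liu Cix.',
    "annotation": [
        {"type": "modelarts/text_entity", "name": "Person", "id": "E1",
         "property": {"@modelarts:start_index": 67, "@modelarts:end_index": 74}},
        {"type": "modelarts/text_entity", "name": "Book", "id": "E2",
         "property": {"@modelarts:start_index": 0, "@modelarts:end_index": 12}},
        {"type": "modelarts/text_triplet", "name": "Author", "id": "R1",
         "property": {"@modelarts:from": "E1", "@modelarts:to": "E2"}},
    ],
}
assert resolve_triplets(sample) == [("Liu Cix", "Author", '"Three Body"')]
```

Note that the Book span covers the quotation marks around the title, because the indexes 0 and 12 in the example include them.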

Object Detection

{
    "source":"s3://path/to/image1.jpg",
    "usage":"TRAIN",
    "hard":"true",
    "hard-coefficient":0.8,
    "annotation": [
        {
            "type":"modelarts/object_detection",
            "annotation-loc": "s3://path/to/annotation1.xml",
            "annotation-format":"PASCAL VOC",
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"                
        }]
}
  • The parameters such as source, usage, and annotation are the same as those described in Image Classification. For details, see Table 1.
  • annotation-loc indicates the path for saving the label file. This parameter is mandatory for object detection and image segmentation but optional for other labeling types.
  • annotation-format indicates the format of the label file. This parameter is optional. The default value is PASCAL VOC. Only PASCAL VOC is supported.
Table 8 PASCAL VOC format parameters

| Parameter | Mandatory | Description |
|---|---|---|
| folder | Yes | Directory where the data source is located |
| filename | Yes | Name of the file to be labeled |
| size | Yes | Image size in pixels. Child elements, all mandatory: width (image width), height (image height), and depth (number of image channels). |
| segmented | Yes | Segmented or not |
| object | Yes | Object detection information. One object element is generated for each labeled object. Child elements: name (type of the labeled content, mandatory); pose (shooting angle of the labeled content, mandatory); truncated (whether the labeled content is truncated, where 0 indicates not truncated, mandatory); occluded (whether the labeled content is occluded, where 0 indicates not occluded, mandatory); difficult (whether the labeled object is difficult to identify, where 0 indicates easy to identify, mandatory); confidence (confidence score of the labeled object, ranging from 0 to 1, optional); bndbox (bounding box type, mandatory; for the possible values, see Table 9). |
Table 9 Bounding box types

| Parameter | Shape | Labeling information |
|---|---|---|
| point | Point | Coordinates of a point: <x>100</x> <y>100</y> |
| line | Line | Coordinates of points: <x1>100</x1> <y1>100</y1> <x2>200</x2> <y2>200</y2> |
| bndbox | Rectangle | Coordinates of the upper left and lower right points: <xmin>100</xmin> <ymin>100</ymin> <xmax>200</xmax> <ymax>200</ymax> |
| polygon | Polygon | Coordinates of points: <x1>100</x1> <y1>100</y1> <x2>200</x2> <y2>100</y2> <x3>250</x3> <y3>150</y3> <x4>200</x4> <y4>200</y4> <x5>100</x5> <y5>200</y5> <x6>50</x6> <y6>150</y6> |
| circle | Circle | Center coordinates and radius: <cx>100</cx> <cy>100</cy> <r>50</r> |

Example:
<annotation>
   <folder>test_data</folder>
   <filename>260730932.jpg</filename>
   <size>
       <width>767</width>
       <height>959</height>
       <depth>3</depth>
   </size>
   <segmented>0</segmented>
   <object>
       <name>point</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <point>
           <x1>456</x1>
           <y1>596</y1>
       </point>
   </object>
   <object>
       <name>line</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <line>
           <x1>133</x1>
           <y1>651</y1>
           <x2>229</x2>
           <y2>561</y2>
       </line>
   </object>
   <object>
       <name>bag</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <bndbox>
           <xmin>108</xmin>
           <ymin>101</ymin>
           <xmax>251</xmax>
           <ymax>238</ymax>
       </bndbox>
   </object>
   <object>
       <name>boots</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <hard-coefficient>0.8</hard-coefficient>
       <polygon>
           <x1>373</x1>
           <y1>264</y1>
           <x2>500</x2>
           <y2>198</y2>
           <x3>437</x3>
           <y3>76</y3>
           <x4>310</x4>
           <y4>142</y4>
       </polygon>
   </object>
   <object>
       <name>circle</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <circle>
           <cx>405</cx>
           <cy>170</cy>
           <r>100</r>
       </circle>
   </object>
</annotation>
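If you generate label files yourself, the mandatory elements in Table 8 can be written with Python's standard `xml.etree.ElementTree` module. The following is a minimal sketch for rectangle labels only. `build_voc_annotation` is a hypothetical helper, and the fixed values "Unspecified" and "0" are copied from the example above rather than taken from a specification.

```python
import xml.etree.ElementTree as ET


def build_voc_annotation(folder, filename, width, height, depth, boxes):
    """Build a PASCAL VOC label as a string.

    boxes is a list of (name, xmin, ymin, xmax, ymax) tuples.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "folder").text = folder
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", depth)):
        ET.SubElement(size, tag).text = str(value)
    ET.SubElement(root, "segmented").text = "0"
    for name, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        ET.SubElement(obj, "pose").text = "Unspecified"
        for tag in ("truncated", "occluded", "difficult"):
            ET.SubElement(obj, tag).text = "0"
        box = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmin", xmin), ("ymin", ymin), ("xmax", xmax), ("ymax", ymax)):
            ET.SubElement(box, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


xml_text = build_voc_annotation(
    "test_data", "260730932.jpg", 767, 959, 3, [("bag", 108, 101, 251, 238)]
)
assert ET.fromstring(xml_text).findtext("object/bndbox/xmax") == "251"
```

Building the tree with an XML library instead of string concatenation guarantees that every element is closed and that special characters in label names are escaped.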

Sound Classification

{
    "source":"s3://path/to/pets.wav",
    "annotation": [
        {
            "type": "modelarts/audio_classification",
            "name":"cat",    
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        } 
    ]
}

The parameters such as source, usage, and annotation are the same as those described in Image Classification. For details, see Table 1.

Speech Labeling

{
    "source":"s3://path/to/audio1.wav",
    "annotation":[
        {
            "type":"modelarts/audio_content",
            "property":{
                "@modelarts:content":"Today is a good day."
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        }
    ]
}
  • The parameters such as source, usage, and annotation are the same as those described in Image Classification. For details, see Table 1.
  • The @modelarts:content parameter in property indicates speech content. The data type is String.

Speech Paragraph Labeling

{
    "source":"s3://path/to/audio1.wav",
    "usage":"TRAIN",
    "annotation":[
        {
            "type":"modelarts/audio_segmentation",
            "property":{
                "@modelarts:start_time":"00:01:10.123",
                "@modelarts:end_time":"00:01:15.456",
                "@modelarts:source":"Tom",
                "@modelarts:content":"How are you?"
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        },
        {
            "type":"modelarts/audio_segmentation",
            "property":{
                "@modelarts:start_time":"00:01:22.754",
                "@modelarts:end_time":"00:01:24.145",
                "@modelarts:source":"Jerry",
                "@modelarts:content":"I'm fine, thank you."
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        }
    ]
}
  • The parameters such as source, usage, and annotation are the same as those described in Image Classification. For details, see Table 1.
  • Table 10 describes the property parameters.
    Table 10 property parameters

    | Parameter | Data type | Description |
    |---|---|---|
    | @modelarts:start_time | String | Start time of the sound. The format is hh:mm:ss.SSS, where hh indicates the hour, mm the minute, ss the second, and SSS the millisecond. |
    | @modelarts:end_time | String | End time of the sound. The format is hh:mm:ss.SSS, where hh indicates the hour, mm the minute, ss the second, and SSS the millisecond. |
    | @modelarts:source | String | Sound source |
    | @modelarts:content | String | Sound content |
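The hh:mm:ss.SSS strings can be converted to milliseconds to check a segment or compute its length. The sketch below is illustrative: `to_milliseconds` and `segment_duration_ms` are hypothetical helpers, and rejecting any string that does not match the format exactly is a choice made here, not a documented ModelArts behavior.

```python
import re

_TIME_PATTERN = re.compile(r"(\d{2}):(\d{2}):(\d{2})\.(\d{3})")


def to_milliseconds(timestamp):
    """Convert an hh:mm:ss.SSS string to a number of milliseconds."""
    match = _TIME_PATTERN.fullmatch(timestamp)
    if match is None:
        raise ValueError(f"expected hh:mm:ss.SSS, got {timestamp!r}")
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def segment_duration_ms(prop):
    """Return end_time minus start_time, in milliseconds, for one property object."""
    return to_milliseconds(prop["@modelarts:end_time"]) - to_milliseconds(
        prop["@modelarts:start_time"]
    )


tom = {
    "@modelarts:start_time": "00:01:10.123",
    "@modelarts:end_time": "00:01:15.456",
}
assert to_milliseconds(tom["@modelarts:start_time"]) == 70123
assert segment_duration_ms(tom) == 5333
```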

Video Labeling

{
	"annotation": [{
		"annotation-format": "PASCAL VOC",
		"type": "modelarts/object_detection",
		"annotation-loc": "s3://path/to/annotation1_t1.473722.xml",
		"creation-time": "2020-10-09 14:08:24",
		"annotated-by": "human"
	}],
	"usage": "train",
	"property": {
		"@modelarts:parent_duration": 8,
		"@modelarts:parent_source": "s3://path/to/annotation1.mp4",
		"@modelarts:time_in_video": 1.473722
	},
	"source": "s3://input/path/to/annotation1_t1.473722.jpg",
	"id": "43d88677c1e9a971eeb692a80534b5d5",
	"sample-type": 0
}
  • The parameters such as source, usage, and annotation are the same as those described in Image Classification. For details, see Table 1.
  • annotation-loc indicates the path for saving the label file. This parameter is mandatory for object detection but optional for other labeling types.
  • annotation-format indicates the format of the label file. This parameter is optional. The default value is PASCAL VOC. Only PASCAL VOC is supported.
  • sample-type indicates a sample format. Value 0 indicates image, 1 text, 2 audio, 4 table, and 6 video.
Table 11 property parameters

| Parameter | Data type | Description |
|---|---|---|
| @modelarts:parent_duration | Double | Duration of the labeled video, in seconds |
| @modelarts:time_in_video | Double | Timestamp of the labeled video frame, in seconds |
| @modelarts:parent_source | String | OBS path of the labeled video |
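The property object of a video frame can be assembled as in the following sketch. `frame_property` is a hypothetical helper, and treating a timestamp outside the video duration as an error is an assumption made for illustration, not a rule stated in this specification.

```python
def frame_property(parent_source, parent_duration, time_in_video):
    """Build the property object for one labeled video frame."""
    # Assumption: a frame timestamp must lie within the duration of its video.
    if not 0 <= time_in_video <= parent_duration:
        raise ValueError("time_in_video must lie within the video duration")
    return {
        "@modelarts:parent_duration": parent_duration,
        "@modelarts:parent_source": parent_source,
        "@modelarts:time_in_video": time_in_video,
    }


prop = frame_property("s3://path/to/annotation1.mp4", 8, 1.473722)
assert prop["@modelarts:time_in_video"] == 1.473722
```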

Table 12 PASCAL VOC format parameters

| Parameter | Mandatory | Description |
|---|---|---|
| folder | Yes | Directory where the data source is located |
| filename | Yes | Name of the file to be labeled |
| size | Yes | Image size in pixels. Child elements, all mandatory: width (image width), height (image height), and depth (number of image channels). |
| segmented | Yes | Segmented or not |
| object | Yes | Object detection information. One object element is generated for each labeled object. Child elements: name (type of the labeled content, mandatory); pose (shooting angle of the labeled content, mandatory); truncated (whether the labeled content is truncated, where 0 indicates not truncated, mandatory); occluded (whether the labeled content is occluded, where 0 indicates not occluded, mandatory); difficult (whether the labeled object is difficult to identify, where 0 indicates easy to identify, mandatory); confidence (confidence score of the labeled object, ranging from 0 to 1, optional); bndbox (bounding box type, mandatory; for the possible values, see Table 13). |
Table 13 Bounding box types

| Parameter | Shape | Labeling information |
|---|---|---|
| point | Point | Coordinates of a point: <x>100</x> <y>100</y> |
| line | Line | Coordinates of points: <x1>100</x1> <y1>100</y1> <x2>200</x2> <y2>200</y2> |
| bndbox | Rectangle | Coordinates of the upper left and lower right points: <xmin>100</xmin> <ymin>100</ymin> <xmax>200</xmax> <ymax>200</ymax> |
| polygon | Polygon | Coordinates of points: <x1>100</x1> <y1>100</y1> <x2>200</x2> <y2>100</y2> <x3>250</x3> <y3>150</y3> <x4>200</x4> <y4>200</y4> <x5>100</x5> <y5>200</y5> <x6>50</x6> <y6>150</y6> |
| circle | Circle | Center coordinates and radius: <cx>100</cx> <cy>100</cy> <r>50</r> |

Example:
<annotation>
   <folder>test_data</folder>
   <filename>260730932_t1.473722.jpg.jpg</filename>
   <size>
       <width>767</width>
       <height>959</height>
       <depth>3</depth>
   </size>
   <segmented>0</segmented>
   <object>
       <name>point</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <point>
           <x1>456</x1>
           <y1>596</y1>
       </point>
   </object>
   <object>
       <name>line</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <line>
           <x1>133</x1>
           <y1>651</y1>
           <x2>229</x2>
           <y2>561</y2>
       </line>
   </object>
   <object>
       <name>bag</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <bndbox>
           <xmin>108</xmin>
           <ymin>101</ymin>
           <xmax>251</xmax>
           <ymax>238</ymax>
       </bndbox>
   </object>
   <object>
       <name>boots</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <hard-coefficient>0.8</hard-coefficient>
       <polygon>
           <x1>373</x1>
           <y1>264</y1>
           <x2>500</x2>
           <y2>198</y2>
           <x3>437</x3>
           <y3>76</y3>
           <x4>310</x4>
           <y4>142</y4>
       </polygon>
   </object>
   <object>
       <name>circle</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <circle>
           <cx>405</cx>
           <cy>170</cy>
           <r>100</r>
       </circle>
   </object>
</annotation>