Updated on 2024-05-30 GMT+08:00

Obtaining Details About an Auto Labeling Task

Function

Obtain details about an auto labeling task. Auto labeling tasks include auto labeling and auto grouping tasks. You can specify the task_id parameter to query the details of a specific task.

Debugging

You can debug this API in API Explorer, which supports automatic authentication. API Explorer can automatically generate SDK code examples and provides a function for debugging them.

URI

GET /v2/{project_id}/datasets/{dataset_id}/tasks/{task_id}

Table 1 URI parameters

Parameter

Mandatory

Type

Description

dataset_id

Yes

String

Dataset ID

project_id

Yes

String

Project ID. For details, see Obtaining a Project ID and Name.

task_id

Yes

String

Task ID

Request Parameters

None

Response Parameters

Status code: 200

Table 2 Response body parameters

Parameter

Type

Description

code

String

Task running status code

config

SmartTaskConfig object

Task configuration

create_time

String

Task creation time

elapsed_time

Long

Execution time

error_code

String

Error code

error_detail

String

Error details

error_msg

String

Error message

message

String

Task running information.

progress

Float

Percentage of current task progress.

resource_id

String

Resource ID.

result

Result object

Task result.

status

Integer

Task status. Options:

  • -1: queuing

  • 0: initializing

  • 1: running

  • 2: failed

  • 3: succeeded

  • 4: stopping

  • 5: stopped

task_id

String

Task ID

task_name

String

Task name

export_type

Integer

Export type. Options:

  • 0: Export to OBS.

  • 1: Export to sample attributes.

Table 3 SmartTaskConfig

Parameter

Type

Description

algorithm_type

String

Algorithm type for auto labeling. Options:

  • fast: Only labeled samples are used for training.

  • accurate: Unlabeled samples are also used for semi-supervised training.

ambiguity

Boolean

Whether to perform clustering based on the image blurring degree.

annotation_output

String

Output path of the active learning labeling result

collect_rule

String

Sample collection rule. The default value is all, indicating full collection. Only all is available.

collect_sample

Boolean

Whether to enable sample collection. Options:

  • true: Sample collection is enabled. (Default)

  • false: Sample collection is disabled.

confidence_scope

String

Confidence range of key samples. The minimum and maximum values are separated by a hyphen (-). Example: 0.10-0.90.

description

String

Job description

engine_name

String

Engine name

export_format

Integer

Format of the exported directory. Options:

  • 1: tree structure, for example, rabbits/1.jpg, bees/2.jpg.

  • 2: tile structure, for example, 1.jpg, 1.txt; 2.jpg, 2.txt.

export_params

ExportParams object

Parameters of a dataset export task

flavor

Flavor object

Training resource flavor

image_brightness

Boolean

Whether to perform clustering based on the image brightness

image_colorfulness

Boolean

Whether to perform clustering based on the image color

inf_cluster_id

String

ID of a dedicated cluster. This parameter is left blank by default, indicating that a dedicated cluster is not used. When using a dedicated cluster to deploy services, ensure that the cluster status is normal. After this parameter is set, the network configuration of the cluster is used, and the vpc_id parameter does not take effect.

inf_config_list

Array of InfConfig objects

Configuration list required for running an inference job, which is optional and left blank by default

inf_output

String

Output path of inference in active learning

infer_result_output_dir

String

OBS directory for storing sample prediction results. This parameter is optional. The {service_id}-infer-result subdirectory in the output_dir directory is used by default.

key_sample_output

String

Output path of hard examples in active learning

log_url

String

OBS URL of the logs of a training job. By default, this parameter is left blank.

manifest_path

String

Path of the manifest file, which is used as the input for training and inference

model_id

String

Model ID

model_name

String

Model name

model_parameter

String

Model parameters

model_version

String

Model version

n_clusters

Integer

Number of clusters

name

String

Task name

output_dir

String

Sample output path. The format is as follows: Dataset output path/Dataset name-Dataset ID/annotation/auto-deploy/. Example: /test/work_1608083108676/dataset123-g6IO9qSu6hoxwCAirfm/annotation/auto-deploy/.

parameters

Array of TrainingParameter objects

Running parameters of a training job

pool_id

String

Resource pool ID

property

String

Attribute name

req_uri

String

Inference path of a batch job

result_type

Integer

Processing mode of auto grouping results. Options:

  • 0: The results are saved to OBS.

  • 1: The results are saved to samples.

samples

Array of SampleLabels objects

Labeling information for samples to be auto labeled

stop_time

Integer

Timeout interval, in minutes. The default value is 15 minutes. This parameter is used only in the scenario of auto labeling for videos.

time

String

Timestamp in active learning

train_data_path

String

Path for storing existing training datasets

train_url

String

OBS URL of the output file of a training job. By default, this parameter is left blank.

version_format

String

Format of a dataset version. Options:

  • Default

  • CarbonData (supported only by table datasets)

  • CSV

worker_server_num

Integer

Number of workers in a training job

Table 4 ExportParams

Parameter

Type

Description

clear_hard_property

Boolean

Whether to clear hard example attributes. Options:

  • true: Hard example attributes are cleared. (Default)

  • false: Hard example attributes are not cleared.

export_dataset_version_format

String

Format of the dataset version to be exported

export_dataset_version_name

String

Name of the dataset version to be exported

export_dest

String

Dataset export type. Options:

  • DIR: Export to OBS. (Default)

  • NEW_DATASET: Export to a new dataset.

export_new_dataset_name

String

Name of the new dataset to which data is exported

export_new_dataset_work_path

String

Working directory of the new dataset to which data is exported

ratio_sample_usage

Boolean

Whether to randomly allocate data to the training and validation datasets based on the specified ratio. Options:

  • true: The data is randomly allocated to the training and validation datasets.

  • false: The data is not randomly allocated to the training and validation datasets. (Default)

sample_state

String

Sample status. Options:

  • __ALL__: labeled

  • __NONE__: unlabeled

  • __UNCHECK__: to be accepted

  • __ACCEPTED__: accepted

  • __REJECTED__: rejected

  • __UNREVIEWED__: to be reviewed

  • __REVIEWED__: reviewed

  • __WORKFORCE_SAMPLED__: sampled

  • __WORKFORCE_SAMPLED_UNCHECK__: sampling pending check

  • __WORKFORCE_SAMPLED_CHECKED__: sampling checked

  • __WORKFORCE_SAMPLED_ACCEPTED__: sampling accepted

  • __WORKFORCE_SAMPLED_REJECTED__: sampling rejected

  • __AUTO_ANNOTATION__: to be confirmed

samples

Array of strings

ID list of exported samples

search_conditions

Array of SearchCondition objects

Search criteria for the export. Multiple search criteria are in the OR relationship.

train_sample_ratio

String

Split ratio of training and validation datasets for a specified version release. The default value is 1.00, indicating that all data is allocated to the training dataset.

Table 5 SearchCondition

Parameter

Type

Description

coefficient

String

Filter by difficulty coefficient

frame_in_video

Integer

A frame in the video

hard

String

Whether a sample is a hard example. Options:

  • 0: The sample is not a hard example.

  • 1: The sample is a hard example.

import_origin

String

Filter by data source

kvp

String

CT dosage. Samples are filtered by dosage.

label_list

SearchLabels object

Label search criteria

labeler

String

Annotator

metadata

SearchProp object

Search by sample attribute

parent_sample_id

String

Parent sample ID

sample_dir

String

Directory where samples are stored (the directory must end with a slash (/)). Only samples in the specified directory are searched for. Recursive search of directories is not supported.

sample_name

String

Search by sample name, including the file name extension

sample_time

String

When a sample is added to the dataset, an index is created based on the last modification time (accurate to day) of the sample on OBS. You can search for the sample based on the time. Options:

  • month: Search for samples added in the last 30 days.

  • day: Search for samples added from yesterday (one day before) to today.

  • yyyyMMdd-yyyyMMdd: Search for samples added in a specified period, in the format of Start date-End date, which can span at most 30 days. For example, 20190901-20190915 indicates that samples added from September 1 to September 15, 2019 are searched for.

score

String

Search by confidence

slice_thickness

String

DICOM layer thickness. Samples are filtered by layer thickness.

study_date

String

DICOM scanning time

time_in_video

String

A time point in the video

Table 6 SearchLabels

Parameter

Type

Description

labels

Array of SearchLabel objects

Label search criteria

op

String

If you want to search for multiple labels, op must be specified. If you search for only one label, op can be left blank. For a combined example, see the snippet after Table 7. Options:

  • OR: OR operation

  • AND: AND operation

Table 7 SearchLabel

Parameter

Type

Description

name

String

Label name

op

String

Operation type between multiple attributes. Options:

  • OR: OR operation

  • AND: AND operation

property

Map<String,Array<String>>

Label attribute, which is in the Object format and stores any key-value pairs. key indicates the attribute name, and value indicates the value list. If value is null, the search is not performed by value. Otherwise, the search value can be any value in the list.

type

Integer

Label type. Options:

  • 0: image classification

  • 1: object detection

  • 3: image segmentation

  • 100: text classification

  • 101: named entity recognition

  • 102: text triplet relationship

  • 103: text triplet entity

  • 200: sound classification

  • 201: speech content

  • 202: speech paragraph labeling

  • 600: video labeling
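
To make the nesting of SearchCondition, SearchLabels, and SearchLabel concrete, the following illustrative snippet shows a hypothetical search_conditions entry that matches samples labeled cat or dog (the label names and types are made up; only the structure follows Tables 5 to 7):

{
  "label_list" : {
    "op" : "OR",
    "labels" : [ {
      "name" : "cat",
      "type" : 0
    }, {
      "name" : "dog",
      "type" : 0
    } ]
  }
}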

Table 8 SearchProp

Parameter

Type

Description

op

String

Relationship between attribute values. Options:

  • AND: AND relationship

  • OR: OR relationship

props

Map<String,Array<String>>

Search criteria of an attribute. Multiple search criteria can be set.

Table 9 Flavor

Parameter

Type

Description

code

String

Attribute code of a resource specification, which is used when creating a task

Table 10 InfConfig

Parameter

Type

Description

envs

Map<String,String>

Environment variable key-value pair required for running a model. This parameter is optional. By default, it is left blank. To ensure data security, do not enter sensitive information in environment variables.

instance_count

Integer

Number of instances (compute nodes) for deploying a model

model_id

String

Model ID

specification

String

Resource specifications of real-time services. For details, see Deploying a Service.

weight

Integer

Traffic weight allocated to a model. This parameter is mandatory only when infer_type is set to real-time. The sum of the weights must be 100.

Table 11 TrainingParameter

Parameter

Type

Description

label

String

Parameter name

value

String

Parameter value

Table 12 Result

Parameter

Type

Description

annotated_sample_count

Integer

Number of labeled samples.

confidence_scope

String

Confidence. The value ranges from 0 to 1.

dataset_name

String

Dataset name, which can contain 1 to 100 characters. Only letters, digits, underscores (_), and hyphens (-) are allowed.

dataset_type

String

Dataset type. Options:

  • 0: image classification

  • 1: object detection

  • 3: image segmentation

  • 100: text classification

  • 101: named entity recognition

  • 102: text triplet

  • 200: sound classification

  • 201: speech content

  • 202: speech paragraph labeling

  • 400: table dataset

  • 600: video labeling

  • 900: free format

description

String

Result description

dlf_model_job_name

String

DLF model inference job name

dlf_service_job_name

String

DLF real-time service job name

dlf_train_job_name

String

DLF training job name

events

Array of Event objects

Event

hard_example_path

String

Path for storing hard examples

hard_select_tasks

Array of HardSelectTask objects

List of hard example selection tasks

manifest_path

String

Path for storing the manifest files

model_id

String

Model ID

model_name

String

Model name

model_version

String

Model version

samples

Array of SampleLabels objects

Inference result of the real-time video service.

service_id

String

Real-time service ID

service_name

String

Real-time service name

service_resource

String

ID of the real-time service bound to a user.

total_sample_count

Integer

Total number of samples

train_data_path

String

Path for storing training data

train_job_id

String

Training job ID

train_job_name

String

Training job name

unconfirmed_sample_count

Integer

Number of samples to be confirmed

version_id

String

Dataset version ID

version_name

String

Dataset version name

workspace_id

String

Workspace ID. If no workspace is created, the default value is 0. If a workspace is created and used, use the actual value.

Table 13 Event

Parameter

Type

Description

create_time

Long

Event creation time

description

String

Event description

elapsed_time

Long

Execution time

error_code

String

Error code

error_message

String

Error message

events

Array of Event objects

List of sub-events

level

Integer

Event severity.

name

String

Event name

ordinal

Integer

Sequence number.

parent_name

String

Parent event name.

status

String

Event status. Options:

  • waiting: The event is waiting to be processed.

  • running: The event is being processed.

  • failed: The event failed to be processed.

  • success: The event is successfully processed.

Table 14 HardSelectTask

Parameter

Type

Description

create_at

Long

Task creation time

dataset_id

String

Dataset ID

dataset_name

String

Dataset name

hard_select_task_id

String

ID of a hard example selection task

task_status

String

Task status

time

Long

Execution time

update_at

Long

Task update time

Table 15 SampleLabels

Parameter

Type

Description

labels

Array of SampleLabel objects

List of sample labels. If this parameter is left blank, all sample labels are deleted.

metadata

SampleMetadata object

Attribute key-value pair of the sample metadata

sample_id

String

Sample ID

sample_type

Integer

Sample type. Options:

  • 0: image

  • 1: text

  • 2: audio

  • 4: table

  • 6: video

  • 9: free format

sample_usage

String

Sample usage. Options:

  • TRAIN: training

  • EVAL: validation

  • TEST: test

  • INFERENCE: inference

source

String

Source address of sample data, which can be obtained by calling the sample list API.

worker_id

String

ID of a labeling team member

Table 16 SampleLabel

Parameter

Type

Description

annotated_by

String

Video labeling method, which is used to determine whether a video is labeled manually or automatically. Options:

  • human: manual labeling

  • auto: auto labeling

id

String

Label ID

name

String

Label name

property

SampleLabelProperty object

Attribute key-value pair of the sample label, such as the object shape and shape feature

score

Float

Confidence. The value ranges from 0 to 1.

type

Integer

Label type. Options:

  • 0: image classification

  • 1: object detection

  • 3: image segmentation

  • 100: text classification

  • 101: named entity recognition

  • 102: text triplet relationship

  • 103: text triplet entity

  • 200: sound classification

  • 201: speech content

  • 202: speech paragraph labeling

  • 600: video labeling

Table 17 SampleLabelProperty

Parameter

Type

Description

@modelarts:content

String

Speech text content, which is a default attribute dedicated to the speech label (including the speech content and speech start and end points)

@modelarts:end_index

Integer

End position of the text, which is a default attribute dedicated to the named entity label. The end position does not include the character corresponding to the value of end_index. Examples:

  • If the text is "Barack Hussein Obama II (born on August 4, 1961) is an attorney and politician.", the start_index and end_index of Barack Hussein Obama II are 0 and 23, respectively.

  • If the text is "Hope is the thing with feathers", start_index and end_index of Hope are 0 and 4, respectively.

@modelarts:end_time

String

Speech end time, which is a default attribute dedicated to the speech start/end point label, in the format of hh:mm:ss.SSS. (hh indicates hour; mm indicates minute; ss indicates second; and SSS indicates millisecond.)

@modelarts:feature

Object

Shape feature, which is a default attribute dedicated to the object detection label, with the type of List. The upper left corner of an image is used as the coordinate origin [0, 0]. Each coordinate point is represented by [x, y], where x indicates the horizontal coordinate and y indicates the vertical coordinate (both x and y are greater than or equal to 0). The format of each shape is as follows (see the example after this table):

  • bndbox: consists of two points, for example, [[0,10],[50,95]]. The upper left vertex of the rectangle is the first point, and the lower right vertex is the second point. That is, the x-coordinate of the first point must be less than the x-coordinate of the second point, and the y-coordinate of the first point must be less than the y-coordinate of the second point.

  • polygon: consists of multiple points that are connected in sequence to form a polygon, for example, [[0,100],[50,95],[10,60],[500,400]].

  • circle: consists of the center and radius, for example, [[100,100],[50]].

  • line: consists of two points, for example, [[0,100],[50,95]]. The first point is the start point, and the second point is the end point.

  • dashed: consists of two points, for example, [[0,100],[50,95]]. The first point is the start point, and the second point is the end point.

  • point: consists of one point, for example, [[0,100]].

  • polyline: consists of multiple points, for example, [[0,100],[50,95],[10,60],[500,400]].

@modelarts:from

String

Start entity ID of the triplet relationship label, which is a default attribute dedicated to the triplet relationship label

@modelarts:hard

String

Whether the sample is labeled as a hard example, which is a default attribute. Options:

  • 0/false: The label is not a hard example.

  • 1/true: The label is a hard example.

@modelarts:hard_coefficient

String

Coefficient of difficulty of each label level, which is a default attribute. The value ranges from 0 to 1.

@modelarts:hard_reasons

String

Reasons why the sample is a hard example, which is a default attribute. Multiple hard example reason IDs are separated by hyphens (-), for example, 3-20-21-19. Options:

  • 0: No object is identified.

  • 1: The confidence is low.

  • 2: The clustering result based on the training dataset is inconsistent with the prediction result.

  • 3: The prediction result is greatly different from the data of the same type in the training dataset.

  • 4: The prediction results of multiple consecutive similar images are inconsistent.

  • 5: There is a large offset between the image resolution and the feature distribution of the training dataset.

  • 6: There is a large offset between the aspect ratio of the image and the feature distribution of the training dataset.

  • 7: There is a large offset between the brightness of the image and the feature distribution of the training dataset.

  • 8: There is a large offset between the saturation of the image and the feature distribution of the training dataset.

  • 9: There is a large offset between the color richness of the image and the feature distribution of the training dataset.

  • 10: There is a large offset between the definition of the image and the feature distribution of the training dataset.

  • 11: There is a large offset between the number of frames of the image and the feature distribution of the training dataset.

  • 12: There is a large offset between the standard deviation of area of image frames and the feature distribution of the training dataset.

  • 13: There is a large offset between the aspect ratio of image frames and the feature distribution of the training dataset.

  • 14: There is a large offset between the area portion of image frames and the feature distribution of the training dataset.

  • 15: There is a large offset between the edge of image frames and the feature distribution of the training dataset.

  • 16: There is a large offset between the brightness of image frames and the feature distribution of the training dataset.

  • 17: There is a large offset between the definition of image frames and the feature distribution of the training dataset.

  • 18: There is a large offset between the stack of image frames and the feature distribution of the training dataset.

  • 19: The data augmentation result based on GaussianBlur is inconsistent with the prediction result of the original image.

  • 20: The data augmentation result based on fliplr is inconsistent with the prediction result of the original image.

  • 21: The data augmentation result based on Crop is inconsistent with the prediction result of the original image.

  • 22: The data augmentation result based on flipud is inconsistent with the prediction result of the original image.

  • 23: The data augmentation result based on scale is inconsistent with the prediction result of the original image.

  • 24: The data augmentation result based on translate is inconsistent with the prediction result of the original image.

  • 25: The data augmentation result based on shear is inconsistent with the prediction result of the original image.

  • 26: The data augmentation result based on superpixels is inconsistent with the prediction result of the original image.

  • 27: The data augmentation result based on sharpen is inconsistent with the prediction result of the original image.

  • 28: The data augmentation result based on add is inconsistent with the prediction result of the original image.

  • 29: The data augmentation result based on invert is inconsistent with the prediction result of the original image.

  • 30: The data is predicted to be abnormal.

@modelarts:shape

String

Object shape, which is a default attribute dedicated to the object detection label and is left empty by default. Options:

  • bndbox: rectangle

  • polygon: polygon

  • circle: circle

  • line: straight line

  • dashed: dashed line

  • point: point

  • polyline: polyline

@modelarts:source

String

Speech source, which is a default attribute dedicated to the speech start/end point label and can be set to a speaker or narrator

@modelarts:start_index

Integer

Start position of the text, which is a default attribute dedicated to the named entity label. The start value begins from 0, including the character corresponding to the value of start_index.

@modelarts:start_time

String

Speech start time, which is a default attribute dedicated to the speech start/end point label, in the format of hh:mm:ss.SSS. (hh indicates hour; mm indicates minute; ss indicates second; and SSS indicates millisecond.)

@modelarts:to

String

End entity ID of the triplet relationship label, which is a default attribute dedicated to the triplet relationship label
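
For illustration, an object detection label in the samples list might carry the shape and feature attributes as follows (the values are hypothetical; only the structure follows the formats in this table):

{
  "name" : "cat",
  "type" : 1,
  "score" : 0.92,
  "property" : {
    "@modelarts:shape" : "bndbox",
    "@modelarts:feature" : [ [ 10, 20 ], [ 200, 180 ] ]
  }
}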

Table 18 SampleMetadata

Parameter

Type

Description

@modelarts:import_origin

Integer

Sample source, which is a default attribute.

@modelarts:hard

Double

Whether the sample is labeled as a hard example, which is a default attribute. Options:

  • 0: The sample is not a hard example.

  • 1: The sample is a hard example.

@modelarts:hard_coefficient

Double

Coefficient of difficulty of each sample level, which is a default attribute. The value ranges from 0 to 1.

@modelarts:hard_reasons

Array of integers

ID of a hard example reason, which is a default attribute. Options:

  • 0: No object is identified.

  • 1: The confidence is low.

  • 2: The clustering result based on the training dataset is inconsistent with the prediction result.

  • 3: The prediction result is greatly different from the data of the same type in the training dataset.

  • 4: The prediction results of multiple consecutive similar images are inconsistent.

  • 5: There is a large offset between the image resolution and the feature distribution of the training dataset.

  • 6: There is a large offset between the aspect ratio of the image and the feature distribution of the training dataset.

  • 7: There is a large offset between the brightness of the image and the feature distribution of the training dataset.

  • 8: There is a large offset between the saturation of the image and the feature distribution of the training dataset.

  • 9: There is a large offset between the color richness of the image and the feature distribution of the training dataset.

  • 10: There is a large offset between the definition of the image and the feature distribution of the training dataset.

  • 11: There is a large offset between the number of frames of the image and the feature distribution of the training dataset.

  • 12: There is a large offset between the standard deviation of area of image frames and the feature distribution of the training dataset.

  • 13: There is a large offset between the aspect ratio of image frames and the feature distribution of the training dataset.

  • 14: There is a large offset between the area portion of image frames and the feature distribution of the training dataset.

  • 15: There is a large offset between the edge of image frames and the feature distribution of the training dataset.

  • 16: There is a large offset between the brightness of image frames and the feature distribution of the training dataset.

  • 17: There is a large offset between the definition of image frames and the feature distribution of the training dataset.

  • 18: There is a large offset between the stack of image frames and the feature distribution of the training dataset.

  • 19: The data augmentation result based on GaussianBlur is inconsistent with the prediction result of the original image.

  • 20: The data augmentation result based on fliplr is inconsistent with the prediction result of the original image.

  • 21: The data augmentation result based on Crop is inconsistent with the prediction result of the original image.

  • 22: The data augmentation result based on flipud is inconsistent with the prediction result of the original image.

  • 23: The data augmentation result based on scale is inconsistent with the prediction result of the original image.

  • 24: The data augmentation result based on translate is inconsistent with the prediction result of the original image.

  • 25: The data augmentation result based on shear is inconsistent with the prediction result of the original image.

  • 26: The data augmentation result based on superpixels is inconsistent with the prediction result of the original image.

  • 27: The data augmentation result based on sharpen is inconsistent with the prediction result of the original image.

  • 28: The data augmentation result based on add is inconsistent with the prediction result of the original image.

  • 29: The data augmentation result based on invert is inconsistent with the prediction result of the original image.

  • 30: The data is predicted to be abnormal.

@modelarts:size

Array of objects

Image size, including the width, height, and depth. The type is List. In the list, the first number indicates the width (pixels), the second number indicates the height (pixels), and the third number indicates the depth (the depth can be left blank; the default value is 3). For example, [100, 200, 3] and [100, 200] are both valid. Note: This parameter is mandatory only when the sample label list contains an object detection label.

Request Example

Send the following request to obtain details about an auto labeling task:

GET https://{endpoint}/v2/{project_id}/datasets/{dataset_id}/tasks/{task_id}
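
For reference, the following is a minimal Python sketch of the same request using the requests library. The endpoint, project ID, dataset ID, task ID, and IAM token are placeholders that you must replace with your own values.

# Minimal sketch: query an auto labeling task over the REST API.
# All values below are placeholders; authentication assumes a valid
# IAM token passed in the X-Auth-Token header.
import requests

endpoint = "https://modelarts.example.com"  # replace with your regional endpoint
project_id = "your-project-id"
dataset_id = "your-dataset-id"
task_id = "your-task-id"
token = "your-iam-token"

url = f"{endpoint}/v2/{project_id}/datasets/{dataset_id}/tasks/{task_id}"
response = requests.get(url, headers={"X-Auth-Token": token})
response.raise_for_status()  # raise an exception on 4xx/5xx responses
print(response.json())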

Response Example

Status code: 200

OK

{
  "resource_id" : "XGrRZuCV1qmMxnsmD5u",
  "create_time" : "2020-11-23 11:08:20",
  "progress" : 10.0,
  "status" : 1,
  "message" : "Start to export annotations. Export task id is jMZGm2SBp4Ymr2wrhAK",
  "code" : "ModelArts.4902",
  "elapsed_time" : 0,
  "result" : {
    "total_sample_count" : 49,
    "annotated_sample_count" : 30
  },
  "export_type" : 0,
  "config" : {
    "ambiguity" : false,
    "worker_server_num" : 0,
    "collect_sample" : false,
    "algorithm_type" : "fast",
    "image_brightness" : false,
    "image_colorfulness" : false
  }
}
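
Because the status field is returned as an integer, a small helper that maps it to a readable name and polls until the task reaches a terminal state can be convenient. The following is a minimal sketch under the same placeholder assumptions as the request example above:

# Minimal sketch: poll the task detail API until the task succeeds,
# fails, or is stopped (status codes 3, 2, and 5, per Table 2).
import time

import requests

STATUS_NAMES = {-1: "queuing", 0: "initializing", 1: "running",
                2: "failed", 3: "succeeded", 4: "stopping", 5: "stopped"}
TERMINAL_STATES = {2, 3, 5}

def wait_for_task(url: str, token: str, interval_s: float = 10.0) -> dict:
    while True:
        task = requests.get(url, headers={"X-Auth-Token": token}).json()
        name = STATUS_NAMES.get(task["status"], "unknown")
        print(f"status={name}, progress={task.get('progress', 0)}%")
        if task["status"] in TERMINAL_STATES:
            return task
        time.sleep(interval_s)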

Status Code

Status Code

Description

200

OK

401

Unauthorized

403

Forbidden

404

Not Found

Error Code

For details, see Error Codes.