Description
Data labeling is a key step in data engineering. It adds accurate labels to unlabeled datasets so that the data can provide effective supervision signals for model training.
The quality of labeled data directly affects the training results and accuracy of the model, so an efficient and accurate labeling process is critical.
The data labeling function allows you to create annotation tasks, label datasets (annotation tasks), review labeled datasets (review tasks), and manage annotation tasks (task management). The functions available to each role and the frontend pages displayed differ slightly. For details, see Table 1.
Table 1 Data labeling task permissions supported by different roles
| Role | Labeling Task Creation | Data Labeling | Labeling Review | Labeling Task Management |
|---|---|---|---|---|
| Super Admin | √ | √ | - | √ |
| Administrator | √ | √ | - | √ |
| Annotation administrator | √ | √ | - | √ |
| Annotation operator | - | √ | - | - |
| Annotation auditor | - | - | √ | - |
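As an illustration only, the permissions in Table 1 can be written as a lookup structure. This is a minimal sketch and not part of the product: the action keys (`create`, `label`, `review`, `manage`) and the function names are hypothetical names chosen for this example.

```python
# Table 1 expressed as a role-to-actions mapping.
# The action keys are hypothetical shorthand for the four table columns.
ROLE_PERMISSIONS = {
    "Super Admin": {"create", "label", "manage"},
    "Administrator": {"create", "label", "manage"},
    "Annotation administrator": {"create", "label", "manage"},
    "Annotation operator": {"label"},
    "Annotation auditor": {"review"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if Table 1 marks the action as supported for the role."""
    return action in ROLE_PERMISSIONS.get(role, set())


def roles_for(action: str) -> list[str]:
    """List the roles that Table 1 marks as supporting the action."""
    return [role for role, actions in ROLE_PERMISSIONS.items() if action in actions]
```

For example, `roles_for("review")` returns only the Annotation auditor role, which matches the Labeling Review column of Table 1.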
Currently, text, video, and image datasets can be labeled.
Labeling a Text Dataset
- On the Annotation Task page, click the current labeling task to execute the task.
If you have the Annotation operator role, you can click Annotation to execute the task.
- On the labeling page, label the data records one by one.
For example, to label a single-turn Q&A dataset, check each record to confirm that its question (Q) and answer (A) are correct. If the Q or A is incorrect, you can edit it.
- After labeling a data record, click Submit and continue with the remaining records.
- You can view the number of labeled data records and the labeling progress on the left of the page.
- To transfer a labeling task to another person, return to the Annotation Task page, click Transfer in the Operation column, and select the recipient and the number of data records to transfer.
- After all data is labeled, a message is displayed, indicating that the labeling task was completed successfully.
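The review, submit, and progress steps above can be pictured with a small data model. This is a minimal sketch under stated assumptions, not the platform's implementation: `QARecord`, `submit`, and `progress` are hypothetical names, and the sample records are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class QARecord:
    """One single-turn Q&A record awaiting review (hypothetical structure)."""
    question: str
    answer: str
    labeled: bool = False


def submit(record, question=None, answer=None):
    """Apply optional corrections to Q or A, then mark the record as labeled."""
    if question is not None:
        record.question = question
    if answer is not None:
        record.answer = answer
    record.labeled = True


def progress(records):
    """Return (labeled count, total count, labeled fraction)."""
    total = len(records)
    done = sum(r.labeled for r in records)
    return done, total, (done / total if total else 0.0)


records = [
    QARecord("What is 2 + 2?", "5"),          # incorrect answer, needs editing
    QARecord("What is the capital of France?", "Paris"),
]
submit(records[0], answer="4")                 # correct the answer, then submit
print(progress(records))                       # (1, 2, 0.5)
```

The tuple returned by `progress` corresponds to the labeled count and progress that the labeling page shows on the left.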
Labeling a Video Dataset
- On the Annotation Task page, click the current labeling task to execute the task.
If you have the Annotation operator role, you can click Annotation to execute the task.
- On the labeling page, label the data records one by one.
- After labeling a data record, click Submit and continue with the remaining records.
- You can view the number of labeled data records and the labeling progress on the left of the page.
- To transfer a labeling task to another person, return to the Annotation Task page, click Transfer in the Operation column, and select the recipient and the number of data records to transfer.
- If AI pre-labeling was enabled when the labeling task was created and the labeling requirement is set to Partial labeling, you can label some of the data and then click Submit All in the upper right corner to automatically label the remaining data.
- After all data is labeled, a message is displayed, indicating that the labeling task was completed successfully.
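The Partial labeling behavior described above can be sketched as follows: manually entered labels are kept, and every remaining record receives a pre-labeling result. This is an assumption about the general idea only. The function `submit_all` and the `pre_label` callable are hypothetical, and how the platform actually produces automatic labels is not described here.

```python
from typing import Callable, Optional


def submit_all(
    labels: list[Optional[str]],
    pre_label: Callable[[int], str],
) -> list[str]:
    """Keep manual labels and fill each unlabeled slot (None) via pre_label.

    `pre_label` is a stand-in for an AI pre-labeling step; it receives the
    record index and returns a label for that record.
    """
    return [
        label if label is not None else pre_label(index)
        for index, label in enumerate(labels)
    ]


# Two records labeled by hand, two left for automatic labeling.
manual = ["cat", None, "dog", None]
print(submit_all(manual, lambda index: f"auto-{index}"))
# ['cat', 'auto-1', 'dog', 'auto-3']
```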
Labeling an Image Dataset
- On the Annotation Task page, click the current labeling task to execute the task.
If you have the Annotation operator role, you can click Annotation to execute the task.
- On the labeling page, label the data records one by one.
- After labeling a data record, click Submit and continue with the remaining records.
- You can view the number of labeled data records and the labeling progress on the left of the page.
- To transfer a labeling task to another person, return to the Annotation Task page, click Transfer in the Operation column, and select the recipient and the number of data records to transfer.
- If AI pre-labeling was enabled when the labeling task was created and the labeling requirement is set to Partial labeling, you can label some of the data and then click Submit All in the upper right corner to automatically label the remaining data.
- After all data is labeled, a message is displayed, indicating that the labeling task was completed successfully.
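The transfer step, which takes a recipient and a number of data records, can be pictured as moving records between per-person queues. This is a minimal sketch, assuming each person holds a list of pending record IDs: the `transfer` function, the queue structure, and the user names are hypothetical and only illustrate the two inputs the Transfer operation asks for.

```python
def transfer(
    assignments: dict[str, list[int]],
    sender: str,
    receiver: str,
    count: int,
) -> None:
    """Move `count` pending record IDs from the sender's queue to the receiver's."""
    pending = assignments.get(sender, [])
    if count < 0 or count > len(pending):
        raise ValueError("count must be between 0 and the sender's pending records")
    # Split the sender's queue: the first `count` IDs go to the receiver.
    moved, assignments[sender] = pending[:count], pending[count:]
    assignments.setdefault(receiver, []).extend(moved)


queues = {"operator_a": [1, 2, 3, 4]}
transfer(queues, "operator_a", "operator_b", 3)
print(queues)  # {'operator_a': [4], 'operator_b': [1, 2, 3]}
```

Rejecting a count larger than the sender's pending records keeps the sketch from silently transferring fewer records than requested.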