Updated on 2024-04-30 GMT+08:00

Named Entity Recognition

Named entity recognition assigns labels to named entities in text, such as times and locations. Before labeling, pay attention to the following:

A label name of a named entity can contain a maximum of 1024 characters, including letters, digits, hyphens (-), underscores (_), and special characters.
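
As a minimal sketch, the length limit above could be checked before label names are prepared in bulk. The function name is_valid_label_name and the rejection of empty names are assumptions for illustration; only the 1024-character maximum comes from this page.

```python
MAX_LABEL_NAME_LENGTH = 1024  # maximum length stated on this page


def is_valid_label_name(name: str) -> bool:
    """Return True if the name is non-empty and at most 1024 characters.

    Rejecting empty names is an assumption; the page only states the maximum.
    No character filtering is done, because the page allows letters, digits,
    hyphens, underscores, and special characters.
    """
    return 0 < len(name) <= MAX_LABEL_NAME_LENGTH
```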

Data management is being upgraded. It is not visible to users who have never used it.

Starting Labeling

  1. Log in to the ModelArts management console. In the left navigation pane, choose Data Management > Label Data.
  2. In the labeling job list, select a labeling type from the All type drop-down list, and then click the job you want to work on. The job details page is displayed.
    Figure 1 Selecting a labeling type
  3. The job details page displays all data of the labeling job.

Synchronizing New Data

ModelArts automatically synchronizes data and labeling information from datasets to the labeling job.

To quickly obtain the latest data in the datasets, on the Unlabeled tab page of the labeling job details page, click Synchronize New Data.

Symptom:

After the labeled data is uploaded to OBS and synchronized, the data is displayed as unlabeled.

Possible causes:

Automatic encryption is enabled in the OBS bucket.

Solution:

Create an OBS bucket and upload data again, or disable bucket encryption and upload data again.

Labeling Text Files

The labeling job details page displays the Unlabeled and Labeled tabs. The Unlabeled tab page is displayed by default.

  1. On the Unlabeled tab page, the objects to be labeled are listed in the left pane. In the list, click the text object to be labeled, select the part of the text displayed under Label Set that you want to label, and then select a label in the Label Set area in the right pane.

    You can repeat this operation to select objects and add labels to the objects.

    Figure 2 Labeling for named entity recognition
  2. Click Save Current Page in the lower part of the page to complete the labeling.
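
Conceptually, each labeling action above pairs a selected span of text with a label. The sketch below shows one way to represent that result in Python. The EntitySpan class, the span_for helper, and the character-offset layout are hypothetical and are not the format ModelArts uses to store labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntitySpan:
    """A labeled span: the characters text[start:end] carry the given label."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


def span_for(text: str, phrase: str, label: str) -> EntitySpan:
    """Build a span covering the first occurrence of phrase in text."""
    start = text.index(phrase)  # raises ValueError if phrase is absent
    return EntitySpan(start, start + len(phrase), label)


def spans_overlap(a: EntitySpan, b: EntitySpan) -> bool:
    """Return True if the two spans share at least one character."""
    return a.start < b.end and b.start < a.end


text = "The meeting starts at 9:00 in Shenzhen."
spans = [
    span_for(text, "9:00", "time"),
    span_for(text, "Shenzhen", "location"),
]

# Recover the labeled substrings from the offsets.
labeled = [(text[s.start:s.end], s.label) for s in spans]
```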

Adding a Label

  • Adding labels on the Unlabeled tab page: Click the plus sign (+) next to Label Set. On the Add Label page that is displayed, add a label name, select a label color, and click OK.
    Figure 3 Adding a named entity label (1)
  • Adding labels on the Labeled tab page: Click the plus sign (+) next to Label Set. On the Add Label page that is displayed, add a label name, select a label color, and click OK.
    Figure 4 Adding a named entity label (2)
    Figure 5 Adding a named entity label

Viewing the Labeled Text

On the dataset details page, click the Labeled tab to view the list of the labeled text. You can also view all labels supported by the dataset in the All Labels area on the right.

Modifying Labeled Data

After labeling data, you can modify labeled data on the Labeled tab page.

On the labeling job details page, click the Labeled tab, and modify the text information in the label information area on the right.

  • Modifying based on texts

    On the labeling job details page, click the Labeled tab, and select the text to be modified from the text list.

    Manual deletion: In the text list, click the text. When the text background turns blue, the text is selected. On the right of the page, click the delete icon above a text label to delete the label.

  • Modifying based on labels

    On the labeling job details page, click the Labeled tab. The information about all labels is displayed on the right.

    • Batch modification: In the All Labels area, click the edit icon in the Operation column, enter the label name in the text box, select a label color, and click OK.
    • Batch deletion: In the All Labels area, click the deletion icon in the Operation column to delete the label. In the dialog box that is displayed, select Delete the label or Delete the label and objects with only the label, and click OK.
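
The two batch deletion options differ in what happens to text objects that carried only the deleted label. The sketch below models that difference on a plain dictionary. The function delete_label and the data layout are hypothetical, and the behavior is inferred from the option names rather than confirmed by this page.

```python
def delete_label(objects, label, delete_orphans=False):
    """Remove label from every object and return a new dict.

    objects: dict mapping an object name to its set of label names.
    delete_orphans=False models "Delete the label".
    delete_orphans=True models "Delete the label and objects with only the
    label": objects whose only label was `label` are dropped entirely.
    The input dict is not modified.
    """
    result = {}
    for name, labels in objects.items():
        if delete_orphans and labels == {label}:
            continue  # the object carried only this label, so drop it
        result[name] = labels - {label}
    return result


objects = {
    "a.txt": {"time"},
    "b.txt": {"time", "location"},
    "c.txt": {"location"},
}

kept = delete_label(objects, "time")
pruned = delete_label(objects, "time", delete_orphans=True)
```

In this model, the first option leaves a.txt in place with no labels, while the second option removes it.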

Adding a File

In addition to synchronized data, you can add data for labeling directly on the labeling job details page.

  1. On the labeling job details page, click the Unlabeled tab, and then click Add data in the upper left corner.
  2. Configure input data and click OK.
    For details about how to import data, see Importing Data.
    Figure 6 Importing data

Deleting a File

You can quickly delete the files you want to discard.

  • On the Unlabeled tab page, select the text to be deleted, and click Delete in the upper left corner to delete the text.
  • On the Labeled tab page, select the text to be deleted and click Delete. Alternatively, tick Select Current Page to select all text objects on the current page and click Delete in the upper left corner.

The background of the selected text is blue.

Managing Annotators

If team labeling is enabled for a labeling job, you can view its labeling details on the Annotator Management tab page. You can also add, modify, or delete annotators.

  1. Choose Data Management > Label Data. On the My Creations tab page, view the list of all labeling jobs.
  2. Locate the row that contains the target team labeling job. (The name of a team labeling job is followed by an icon.)
  3. Choose More > Annotator Management in the Operation column. Alternatively, click the job name to go to the job details page, and choose Team Labeling > Annotator Management in the upper right corner.
    Figure 7 Annotator management (1)
    Figure 8 Annotator management (2)
  • Adding an annotator

    Click Add Member, select a member name, and click OK.

    Click Send Email in the Operation column to send the labeling job to the annotator by email.

  • Modifying annotator information

    Click Modify in the Operation column to modify the role of the annotator.

  • Deleting an annotator

    Click Delete in the Operation column to delete the annotator.