Updated on 2023-09-06 GMT+08:00

Creating a Data Processing Task

You can create a data processing task to verify, cleanse, select, or augment existing data.

Prerequisites

  • Data has been prepared. Specifically, a dataset has been created or data has been uploaded to OBS.
  • The OBS directory you use and ModelArts are in the same region.

Procedure

  1. Log in to the ModelArts management console. In the left navigation pane, choose Data Management > Process Data. The Process Data page is displayed.
  2. On the Process Data page, click Create.
  3. On the page for creating a data processing task, set the required parameters.
    1. Set the basic information, including Name, Version, and Description. Specify Name and Description as required.
      The system generates the version number automatically, for example, V0001 and V0002. The version number cannot be changed.
      Figure 1 Basic information for creating a data processing task
    2. Set the scenario type. You can set the scenario type to image classification or object detection.
    3. Set the data processing type. Data processing types include data cleaning, data validation, data selection, and data augmentation.
      Set the operator parameters based on the data processing type. For details about the operator parameters, see Built-in Operators.
      Figure 2 Setting the scenario type and data processing type
    4. Set the input and output. Select Datasets or OBSCatalog based on where your data is stored. If you select Datasets, enter the dataset name and dataset version. If you select OBSCatalog, enter a valid OBS path.
      Figure 3 Input and output settings - Datasets
      Figure 4 Input and output settings - OBSCatalog
    5. After confirming that the parameter settings are correct, click Next.
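
Before clicking Next, it can help to review the settings from step 3 as a whole. The following Python sketch is an illustration only: it does not call ModelArts, and every class, function, and identifier in it (TaskSettings, DataSource, check_settings, and the lowercase type names) is hypothetical. It assumes that the input and the output are each set to either Datasets or OBSCatalog, and it only checks that the fields named in the procedure are present.

```python
from dataclasses import dataclass
from typing import List, Optional

# Options listed in the procedure. The identifiers are hypothetical labels
# chosen for this sketch; they are not ModelArts API values.
SCENARIO_TYPES = {"image_classification", "object_detection"}
PROCESSING_TYPES = {
    "data_cleaning",
    "data_validation",
    "data_selection",
    "data_augmentation",
}
SOURCE_TYPES = {"Datasets", "OBSCatalog"}


@dataclass
class DataSource:
    """One side (input or output) of the input and output settings."""
    source_type: str
    dataset_name: Optional[str] = None
    dataset_version: Optional[str] = None
    obs_path: Optional[str] = None


@dataclass
class TaskSettings:
    """Settings a user fills in on the task creation page."""
    name: str
    scenario_type: str
    processing_type: str
    input_source: DataSource
    output_source: DataSource
    description: str = ""


def check_source(label: str, source: DataSource) -> List[str]:
    """Return a list of problems found in one input or output setting."""
    problems = []
    if source.source_type not in SOURCE_TYPES:
        problems.append(f"{label}: unknown source type {source.source_type!r}")
    elif source.source_type == "Datasets":
        # Datasets requires both a dataset name and a dataset version.
        if not source.dataset_name:
            problems.append(f"{label}: dataset name is required")
        if not source.dataset_version:
            problems.append(f"{label}: dataset version is required")
    elif not source.obs_path:
        # OBSCatalog requires an OBS path; its format is not checked here.
        problems.append(f"{label}: OBS path is required")
    return problems


def check_settings(settings: TaskSettings) -> List[str]:
    """Return all problems found; an empty list means nothing is missing."""
    problems = []
    if not settings.name:
        problems.append("name is required")
    if settings.scenario_type not in SCENARIO_TYPES:
        problems.append(f"unknown scenario type {settings.scenario_type!r}")
    if settings.processing_type not in PROCESSING_TYPES:
        problems.append(f"unknown processing type {settings.processing_type!r}")
    problems += check_source("input", settings.input_source)
    problems += check_source("output", settings.output_source)
    return problems


if __name__ == "__main__":
    # Example values are placeholders, not real resources.
    settings = TaskSettings(
        name="demo-task",
        scenario_type="image_classification",
        processing_type="data_validation",
        input_source=DataSource("Datasets", dataset_name="demo", dataset_version="V001"),
        output_source=DataSource("OBSCatalog", obs_path="obs://example-bucket/output/"),
    )
    for problem in check_settings(settings):
        print(problem)
```

The check collects every problem instead of stopping at the first one, so all missing fields can be corrected in one pass through the page.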