Updated on 2025-07-02 GMT+08:00

Managing Processed Datasets

After you complete a data processing, data synthesis, data labeling, or data combination task for a dataset, click Generate in the corresponding task list to generate a processed dataset. The platform manages generated datasets centrally, and they can be used in subsequent publishing tasks.

The platform allows you to view the basic information and data lineage of processed datasets and to delete datasets that you no longer need. The procedure is as follows:

  1. Log in to ModelArts Studio Large Model Development Platform. In the My Spaces area, click the required workspace.
    Figure 1 My Spaces
  2. In the navigation pane, choose Data Engineering > Data Management > Datasets. Click the Processed Dataset tab.
  3. Click the dataset name to view the basic information, data preview, data lineage, and operation records of the processed dataset.
    • On the Basic Information tab page, you can view the data details, data source, and extended information. In the Extended Info area, you can set dataset properties as required, including the dataset property name, industry, language, and custom tag.
    • On the Data Preview tab page, you can view the processed data.
    • On the Data Lineage tab page, you can view the operations performed on the dataset, such as import and synthesis.
    • On the Operation Record tab page, you can view the operations performed on the dataset and their status.
  4. To delete a dataset that you no longer need, click Delete in the Operation column of the dataset list.
    • To restore a deleted dataset, click Show Deleted Data in the upper right corner. In the list of deleted datasets, restore the dataset.
    • To permanently delete a dataset, click the dataset name to go to the dataset details page, confirm the dataset content, and delete it permanently.

    Deleting a processed dataset is a high-risk operation. Before deleting a dataset, ensure that it is no longer used.
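
The deletion behavior in step 4 can be read as a small state model: a deleted dataset leaves the list but can still be restored until it is permanently deleted. The following is a minimal sketch of that model, not a platform API. The DatasetCatalog class, its method names, and the dataset name are hypothetical, and the sketch assumes that permanent deletion applies only to datasets that have already been deleted, which the procedure above does not state explicitly.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"    # listed on the Processed Dataset tab
    DELETED = "deleted"  # hidden from the list, but still restorable


class DatasetCatalog:
    """Hypothetical in-memory stand-in for the processed dataset list."""

    def __init__(self):
        self._states = {}

    def add(self, name):
        self._states[name] = State.ACTIVE

    def _require(self, name, expected):
        # Reject operations that do not match the dataset's current state.
        if self._states.get(name) is not expected:
            raise ValueError(f"{name!r} is not {expected.value}")

    def delete(self, name):
        """Soft delete: the dataset moves to the deleted list."""
        self._require(name, State.ACTIVE)
        self._states[name] = State.DELETED

    def restore(self, name):
        """Return a deleted dataset to the active list."""
        self._require(name, State.DELETED)
        self._states[name] = State.ACTIVE

    def purge(self, name):
        """Permanent delete (assumed to apply to deleted datasets only)."""
        self._require(name, State.DELETED)
        del self._states[name]

    def listed(self, state=State.ACTIVE):
        """Return the sorted names of datasets in the given state."""
        return sorted(n for n, s in self._states.items() if s is state)


catalog = DatasetCatalog()
catalog.add("qa_pairs_v1")      # hypothetical dataset name
catalog.delete("qa_pairs_v1")   # recoverable at this point
catalog.restore("qa_pairs_v1")  # back in the active list
```

In this model, purge is the only irreversible operation: after it runs, restore raises an error because the dataset no longer exists in either list.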