Updated on 2025-11-19 GMT+08:00

Importing Data

Before using ModelArts Studio, prepare the OBS buckets and resource pools required for subsequent model optimization, compression, and deployment tasks, and for storing model optimization and task logs.

  1. Prepare ModelArts Studio resources. For details, see Preparations.
  2. Prepare a training dataset.

    The format of the NLP fine-tuning dataset to be imported to the platform must meet the text dataset format requirements.

  3. Upload the dataset to an OBS bucket so that it can be imported from OBS to ModelArts Studio. For details about OBS, see Using OBS Console.
  4. Log in to ModelArts Studio and access the desired workspace.
  5. In the navigation pane, choose Data Engineering > Data Obtaining > import job list. On the displayed page, click Create Import Job in the upper right corner.
    Figure 1 Create Import Job

    The OBS bucket and ModelArts Studio must be in the same region. Otherwise, the OBS path cannot be selected.

  6. On the Create Import Job page, select the dataset type, file format, and import source.

    Set Import Source to OBS and click the icon for selecting a storage path. In the Storage Location dialog box, select the data to be imported and click OK.

  7. Enter the dataset name and description. Enter extended information if required.
    Extended Info includes Dataset Property and Dataset Copyright.
    • Dataset Property: You can add industry, language, and custom information to a dataset.
    • Dataset Copyright: Model training may use open-source datasets as well as datasets that users build themselves. The dataset copyright function records and manages the copyright information of a dataset, so that the data is used in compliance with laws and regulations and its source and copyright authorization are clear. With this information filled in, you can trace the source of the data and specify the restrictions and permissions that apply to its use, which protects data copyright and helps avoid copyright disputes.
  8. Click Create Now in the lower right corner of the page to return to the Import Task page. On the page that is displayed, you can view the task status of the dataset. If the task status is Succeeded, the data is successfully imported.
    If the task status is Failed, the import has failed. The possible causes are as follows:
    • The file name extension is incorrect. For example, if you create a dataset in CSV format, the file name extension must be .csv.
    • The file content fails verification. Check whether the format of the uploaded file is correct. You can download data samples on the Create Import Job page for comparison.
  9. To view the imported dataset, choose Data Engineering > Data Management > Datasets, and click the Original Dataset tab.
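
The two failure causes listed above (a wrong file name extension and content that fails verification) can often be caught locally before the file is uploaded to OBS. The following is a minimal sketch of such a pre-check, not the verification that ModelArts Studio itself performs. The function name precheck, the EXPECTED_EXTENSION mapping, and the "jsonl" entry are hypothetical and used only for illustration; only the CSV/.csv pairing comes from this page. The authoritative rules are the text dataset format requirements and the data samples available on the Create Import Job page.

```python
import csv
import json
from pathlib import Path

# Hypothetical mapping from the file format chosen for the import job to the
# expected file name extension. Only "csv" -> ".csv" is stated on this page;
# the "jsonl" entry is an assumed example.
EXPECTED_EXTENSION = {"csv": ".csv", "jsonl": ".jsonl"}


def precheck(path, file_format):
    """Return a list of problems found; an empty list means no problem was found."""
    expected = EXPECTED_EXTENSION.get(file_format)
    if expected is None:
        return [f"unknown format: {file_format}"]

    path = Path(path)
    problems = []

    # Cause 1: the file name extension does not match the chosen format.
    if path.suffix.lower() != expected:
        problems.append(f"extension is {path.suffix!r}, expected {expected!r}")

    # Cause 2: the content cannot be parsed as the chosen format.
    try:
        with path.open(encoding="utf-8", newline="") as handle:
            if file_format == "csv":
                # Collect the column count of every non-empty row.
                widths = {len(row) for row in csv.reader(handle) if row}
                if not widths:
                    problems.append("file has no data rows")
                elif len(widths) > 1:
                    problems.append("rows have differing column counts")
            else:
                # Treat each non-blank line as one JSON document.
                for number, line in enumerate(handle, start=1):
                    if not line.strip():
                        continue
                    try:
                        json.loads(line)
                    except json.JSONDecodeError:
                        problems.append(f"line {number} is not valid JSON")
                        break
    except (OSError, UnicodeDecodeError, csv.Error) as error:
        problems.append(f"cannot read file: {error}")

    return problems
```

For example, calling precheck("train.csv", "csv") on a well-formed local file returns an empty list, and any returned messages point to what to fix before creating the import job. A file that passes this sketch can still fail the platform's own verification, so compare it with the downloaded data samples as well.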