Using Data Engineering to Build a DeepSeek Model Dataset
Process of Building a DeepSeek Model Dataset
Table 1 describes how to use data engineering to build a dataset for a third-party model, such as DeepSeek, on ModelArts Studio.
| Process | Sub-process | Description | Operation Guide |
|---|---|---|---|
| Importing data to the Pangu platform | Creating an import task | Import data stored in OBS or local data into the platform for centralized management, facilitating subsequent processing or publishing. NOTE: When importing a dataset, set the dataset type to Single Round QA. | |
| Processing other datasets | Processing other datasets | Use custom processing operators to preprocess data, ensuring it meets the model training standards and service requirements. | |
| Publishing other datasets | Publishing other datasets | Data publishing refers to publishing a single dataset in a specific format as a published dataset for subsequent model training operations. | |
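
Before creating the import task, it can help to prepare and check the local file that will be uploaded as a Single Round QA dataset. The following is a minimal sketch, assuming the dataset is a JSONL file with one question-answer pair per line. The field names `context` and `target`, and the helper functions `to_single_round_records`, `write_jsonl`, and `validate_jsonl`, are illustrative assumptions and are not taken from this page; confirm the required field names and file format in the platform's dataset format requirements before importing.

```python
import json

# Assumed field names for a single-round QA record (hypothetical;
# verify against the platform's dataset format requirements).
QUESTION_KEY = "context"
ANSWER_KEY = "target"


def to_single_round_records(pairs):
    """Turn (question, answer) tuples into records, skipping empty entries."""
    records = []
    for question, answer in pairs:
        question, answer = question.strip(), answer.strip()
        if not question or not answer:
            continue  # an incomplete pair is not a usable QA sample
        records.append({QUESTION_KEY: question, ANSWER_KEY: answer})
    return records


def write_jsonl(records, path):
    """Write one JSON object per line, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def validate_jsonl(path):
    """Return the 1-based numbers of lines that are not valid QA records."""
    bad_lines = []
    with open(path, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                bad_lines.append(number)
                continue
            is_valid = (
                isinstance(record, dict)
                and set(record) == {QUESTION_KEY, ANSWER_KEY}
                and all(isinstance(v, str) and v.strip() for v in record.values())
            )
            if not is_valid:
                bad_lines.append(number)
    return bad_lines
```

Running `validate_jsonl` on the file before uploading it to OBS or importing it from local storage catches malformed lines early, so they do not have to be handled later in the processing step.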