What Are the Requirements for Training Data When You Create a Predictive Analytics Project in ExeML?
Updated on 2022-12-06 GMT+08:00
Requirements on Datasets
- File names in the dataset can contain only letters, digits, hyphens (-), and underscores (_), and the files must be in CSV format. Data files cannot be stored in the root directory of an OBS bucket. Store them in a folder in the bucket, for example, /obs-xxx/data/input.csv.
- Use newline characters (\n, also called LF) to separate lines and commas (,) to separate columns in the file content. The file content cannot include non-English characters (for example, Chinese characters). Column values cannot contain special characters such as commas, line breaks, or quotation marks. It is recommended that column values consist of only letters and digits.
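The naming and location rules above can be checked before uploading a file. The following is a minimal sketch, not an official ModelArts tool: the function name `check_object_path` is hypothetical, the path is assumed to have the form /bucket/folder/.../file.csv as in the example, and the lowercase `.csv` suffix is an assumption about how "CSV format" is checked.

```python
import re
from pathlib import PurePosixPath

# Letters, digits, hyphens, and underscores, followed by a .csv suffix.
# Assumption: the suffix is lowercase and the rule applies to the base name.
_NAME_RE = re.compile(r"[A-Za-z0-9_-]+\.csv")


def check_object_path(path: str) -> list[str]:
    """Return a list of problems with an OBS-style path (empty if none)."""
    problems = []
    # Drop the leading "/" so the parts are [bucket, folder..., file].
    parts = [p for p in PurePosixPath(path).parts if p != "/"]
    if len(parts) < 3:
        problems.append("file must be in a folder, not in the bucket root")
    if not parts or not _NAME_RE.fullmatch(parts[-1]):
        problems.append("file name must use only letters, digits, '-', '_' "
                        "and end in .csv")
    return problems
```

For example, `check_object_path("/obs-xxx/data/input.csv")` returns an empty list, while a file placed directly under the bucket is reported as being in the root directory.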
Requirements on Training Data
- Every row in the training data must have the same number of columns, and there must be at least 100 data records (records whose feature values differ are counted as different data records).
- The training columns cannot contain data in timestamp format (such as yy-mm-dd or yyyy-mm-dd).
- If a column has only one distinct value, the column is considered invalid. Ensure that the label column contains at least two distinct values and has no missing data.
The label column is the training target specified in a training task. It is the output (prediction item) for the model trained using the dataset.
- In addition to the label column, the dataset must contain at least two valid feature columns. Ensure that each feature column contains at least two distinct values and that less than 10% of its data is missing.
- The CSV file cannot contain a header row. Otherwise, the training will fail.
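The content requirements above can be pre-checked locally before a project is created. The following is a minimal sketch under stated assumptions, not the validation that ExeML itself performs: the function name `check_training_csv` and its `label_index` parameter are hypothetical, an empty field is treated as missing data, "non-English" is approximated as non-ASCII, both single and double quotation marks are rejected, and the 100-record rule is read as 100 distinct rows. A header row cannot be detected reliably from the content alone, so it is not checked here.

```python
import re

# Matches yy-mm-dd and yyyy-mm-dd style values.
_DATE_RE = re.compile(r"\d{2}(\d{2})?-\d{1,2}-\d{1,2}")


def check_training_csv(text: str, label_index: int = -1) -> list[str]:
    """Return a list of problems found in CSV text (empty if none)."""
    problems = []
    if not text.isascii():
        problems.append("content contains non-ASCII characters")
    if '"' in text or "'" in text:
        problems.append("content contains quotation marks")
    if "\r" in text:
        problems.append("lines must be separated by LF only")

    # Split on LF and commas only, as the format allows no quoting.
    rows = [line.split(",") for line in text.split("\n") if line != ""]
    if not rows:
        return problems + ["file has no data records"]
    if len({len(r) for r in rows}) != 1:
        return problems + ["rows have different numbers of columns"]
    if len({tuple(r) for r in rows}) < 100:
        problems.append("fewer than 100 distinct data records")

    columns = list(zip(*rows))
    label = label_index % len(columns)
    valid_features = 0
    for i, col in enumerate(columns):
        present = [v.strip() for v in col if v.strip() != ""]
        missing_ratio = 1 - len(present) / len(col)
        distinct = len(set(present))
        if any(_DATE_RE.fullmatch(v) for v in present):
            problems.append(f"column {i} contains timestamp-like values")
        if i == label:
            if distinct < 2 or missing_ratio > 0:
                problems.append("label column needs at least two values "
                                "and no missing data")
        elif distinct >= 2 and missing_ratio < 0.10:
            valid_features += 1
    if valid_features < 2:
        problems.append("fewer than two valid feature columns")
    return problems
```

By default the last column is treated as the label column. Pass a different `label_index` if the training target is stored elsewhere.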