Updated on 2024-05-07 GMT+08:00

Introduction to Training Job Logs

Overview

Training logs record the runtime process and exception information of training jobs, which makes them the main source of detail for troubleshooting. Anything your code writes to standard output or standard error appears in the training logs. If a ModelArts training job fails or behaves unexpectedly, check the logs first. In most cases, the error information reported there is enough to locate the issue.
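
Because the training logs capture standard output and standard error, a training script only needs to write to those two streams for its messages to be collected. The following is a minimal sketch, assuming a Python training script and using only the Python standard library. The helper name build_logger and the logger name "train" are hypothetical and are not part of any ModelArts API.

```python
import logging
import sys


def build_logger(name: str = "train") -> logging.Logger:
    """Return a logger that writes INFO messages to stdout and
    WARNING-and-above messages to stderr.

    Hypothetical helper for illustration; not a ModelArts API.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate lines if called more than once
    logger.propagate = False  # do not also emit through the root logger

    fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")

    # Messages below WARNING go to standard output.
    out_handler = logging.StreamHandler(sys.stdout)
    out_handler.setFormatter(fmt)
    out_handler.addFilter(lambda record: record.levelno < logging.WARNING)

    # WARNING and above go to standard error.
    err_handler = logging.StreamHandler(sys.stderr)
    err_handler.setFormatter(fmt)
    err_handler.setLevel(logging.WARNING)

    logger.addHandler(out_handler)
    logger.addHandler(err_handler)
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info("epoch 1 finished")  # written to stdout
    try:
        1 / 0
    except ZeroDivisionError:
        # Written to stderr, including the full traceback.
        log.exception("training step failed")
```

Logging the traceback with log.exception, rather than swallowing the error, keeps the exception information in the output streams, where the training logs can pick it up.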

Retention Period

Logs are classified into the following types based on the retention period:

  • Real-time logs: These logs are generated while a training job is running. You can view them on the ModelArts training job details page.
  • Historical logs: After a training job is completed, you can view its historical logs on the ModelArts training job details page. ModelArts automatically stores the logs for 30 days.
  • Permanent logs: These logs are dumped to your OBS bucket. When creating a training job, you can enable persistent log saving and set a job log path for dumping.
    Figure 1 Enabling Persistent Log Saving

Real-time logs and historical logs are identical in content.

Related Chapters