Querying the Running Metrics of a Specified Task in a Training Job
Function
This API is used to query the running metrics of a specified task in a training job on ModelArts.
Use this API when you need to view the performance metrics of a specific task in a training job. Before calling it, obtain the training job ID and task ID, and make sure you have permission to view running metrics. On success, the platform returns the performance metrics of the task. The API returns an error message if the training job ID or task ID does not exist, if no metrics have been generated for the task, or if you do not have the required permission.
Debugging
You can debug this API through automatic authentication in API Explorer or use the SDK sample code generated by API Explorer.
URI
GET /v2/{project_id}/training-jobs/{training_job_id}/metrics/{task_id}
Parameter | Mandatory | Type | Description |
---|---|---|---|
project_id | Yes | String | Definition: Project ID. For details, see Obtaining a Project ID and Name. Constraints: The value can contain 1 to 64 characters. Letters, digits, and hyphens (-) are allowed. Range: N/A. Default Value: N/A. |
training_job_id | Yes | String | Definition: ID of a training job. Constraints: For details, see Querying a Training Job List. Range: N/A. Default Value: N/A. |
task_id | Yes | String | Definition: Name of a task in a training job. You can obtain the value from the status.tasks field in the training job details. Constraints: For a single-node job, the task name is worker-0 by default. For a multi-node job, the task names are worker-0, worker-1, and so on. Range: N/A. Default Value: N/A. |
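The path parameter constraints above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the ModelArts SDK: the function name `build_metrics_path` is hypothetical, and the `worker-<index>` pattern for task names is an assumption drawn from the task_id description.

```python
import re

# Constraint from the parameter table: 1 to 64 characters; letters, digits, hyphens.
_PROJECT_ID_RE = re.compile(r"[A-Za-z0-9-]{1,64}")
# Assumption: task names follow the worker-<index> pattern (worker-0, worker-1, ...).
_TASK_ID_RE = re.compile(r"worker-\d+")


def build_metrics_path(project_id: str, training_job_id: str, task_id: str) -> str:
    """Validate the path parameters and return the request path (hypothetical helper)."""
    if not _PROJECT_ID_RE.fullmatch(project_id):
        raise ValueError("project_id must be 1 to 64 letters, digits, or hyphens")
    if not training_job_id:
        raise ValueError("training_job_id must not be empty")
    if not _TASK_ID_RE.fullmatch(task_id):
        raise ValueError("task_id is expected to look like worker-0, worker-1, ...")
    return f"/v2/{project_id}/training-jobs/{training_job_id}/metrics/{task_id}"
```

Validating locally only catches malformed values early. Whether the training job and task actually exist can only be confirmed by the service.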
Request Parameters
None
Response Parameters
Status code: 200
Parameter | Type | Description |
---|---|---|
metrics | Array of MetricObject objects | Definition: Running metrics. |
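MetricObject
Parameter | Type | Description |
---|---|---|
metric | String | Definition: Name of the running metric. Range: The example response on this page includes cpuUsage, memUsage, gpuUtil, gpuMemUsage, npuUtil, and npuMemUsage. |
value | Array of doubles | Definition: Values of the running metric. An average value is collected every minute. |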
Example Requests
The following shows how to query the running metrics of the worker-0 task of the training job whose UUID is 2cd88daa-31a4-40a8-a58f-d186b0e93e4f.
GET https://endpoint/v2/{project_id}/training-jobs/2cd88daa-31a4-40a8-a58f-d186b0e93e4f/metrics/worker-0
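The same request can be sketched in Python with the standard library only. This is an illustration under stated assumptions, not the SDK code generated by API Explorer: the helper names are hypothetical, the endpoint and token values are placeholders you must supply, and passing the token in an `X-Auth-Token` header is an assumption about the authentication method, which this page does not specify.

```python
import json
import urllib.request


def build_metrics_request(endpoint: str, project_id: str, training_job_id: str,
                          task_id: str, token: str) -> urllib.request.Request:
    """Build the GET request for the running metrics of one task (hypothetical helper)."""
    url = (f"https://{endpoint}/v2/{project_id}"
           f"/training-jobs/{training_job_id}/metrics/{task_id}")
    # Assumption: token-based authentication through the X-Auth-Token header.
    headers = {"X-Auth-Token": token, "Content-Type": "application/json"}
    return urllib.request.Request(url, method="GET", headers=headers)


def fetch_metrics(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body. Raises HTTPError on non-2xx codes."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Building the request separately from sending it makes it possible to inspect the URL and headers without touching the network.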
Example Responses
Status code: 200
ok
{
  "metrics" : [
    { "metric" : "cpuUsage", "value" : [ -1, -1, 2.43, 4.524, 6.714, 12.422, 9.214, 5.36, 7.5, 10.088, 8.975, 11.423, 11.548, 14.563, 16.833 ] },
    { "metric" : "memUsage", "value" : [ -1, -1, 0.04, 0.521, 1.652, 4.252, 6.433, 7.384, 7.982, 8.718, 9.365, 9.881, 10.192, 9.994, 9.005 ] },
    { "metric" : "gpuUtil", "value" : [ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 ] },
    { "metric" : "gpuMemUsage", "value" : [ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 ] },
    { "metric" : "npuUtil", "value" : [ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 ] },
    { "metric" : "npuMemUsage", "value" : [ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 ] }
  ]
}
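A response of this shape can be reduced to a per-metric summary. The sketch below assumes that a value of -1 means no sample was collected for that minute. That reading is inferred from the example response and is not stated on this page. The function name `summarize_metrics` and the shortened `SAMPLE` payload are hypothetical.

```python
import json
from statistics import mean

# Shortened payload in the same shape as the example response.
SAMPLE = """
{"metrics": [
  {"metric": "cpuUsage", "value": [-1, -1, 2.43, 4.524]},
  {"metric": "gpuUtil", "value": [-1, -1, -1, -1]}
]}
"""


def summarize_metrics(payload: dict, missing: float = -1) -> dict:
    """Return sample count, mean, and max per metric, skipping placeholder values."""
    summary = {}
    for item in payload.get("metrics", []):
        # Assumption: `missing` (-1) marks minutes without a collected sample.
        samples = [v for v in item["value"] if v != missing]
        if samples:
            summary[item["metric"]] = {
                "samples": len(samples),
                "mean": mean(samples),
                "max": max(samples),
            }
        else:
            summary[item["metric"]] = None  # no usable data for this metric
    return summary


if __name__ == "__main__":
    print(summarize_metrics(json.loads(SAMPLE)))
```

Metrics whose values are all placeholders, such as the GPU and NPU series in the example response, map to `None` rather than a misleading average of -1.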
Status Codes
Status Code | Description |
---|---|
200 | ok |
Error Codes
See Error Codes.