Querying the Running Metrics of a Specified Task in a Training Job
Function
This API is used to query the running metrics of a specified task in a training job on ModelArts.
Use this API to view the performance metrics of a specific task in a training job. Before calling it, obtain the training job ID and the task ID, and make sure you have permission to view running metrics. On success, the platform returns the performance metrics of the task. The API returns an error message if the training job ID or task ID does not exist, if no metrics have been generated for the task, or if you do not have the required permission.
Debugging
You can debug this API in API Explorer, which supports automatic authentication, or use the SDK sample code that API Explorer generates. The corresponding CLI example is hcloud ModelArts ShowTrainingJobMetrics.
Authorization Information
Each account has all the permissions required to call all APIs, but IAM users must be assigned the required permissions.
- If you are using role/policy-based authorization, see Permissions Policies and Supported Actions for details on the required permissions.
- If you are using identity policy-based authorization, the following identity policy-based permissions are required.
| Action | Access Level | Resource Type (*: required) | Condition Key | Alias | Dependencies |
|---|---|---|---|---|---|
| modelarts:trainJob:get | Read | trainJob * | g:ResourceTag/<tag-key> | - | - |
| | | - | modelarts:poolType, modelarts:poolId | - | - |
URI
GET /v2/{project_id}/training-jobs/{training_job_id}/metrics/{task_id}
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Definition: Project ID. For details, see Obtaining a Project ID and Name. Constraints: The value can contain 1 to 64 characters. Letters, digits, and hyphens (-) are allowed. Range: N/A. Default Value: N/A. |
| training_job_id | Yes | String | Definition: Training job ID. For details, see Obtaining Training Jobs. Constraints: N/A. Range: N/A. Default Value: N/A. |
| task_id | Yes | String | Definition: Name of a task in the training job. You can obtain the value from the status.tasks field in the training job details. Constraints: For a single-node job, the task name is worker-0 by default. For a multi-node job, the task names are worker-0, worker-1, and so on. Range: N/A. Default Value: N/A. |
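The path parameters above can be assembled as in the following minimal sketch. The helper name `build_metrics_path` is hypothetical and not part of any ModelArts SDK. The only check taken from this page is the project_id constraint (1 to 64 letters, digits, or hyphens); the non-empty check on the other two parameters is an added assumption.

```python
import re
from urllib.parse import quote

# Constraint from the project_id row: 1 to 64 letters, digits, or hyphens.
_PROJECT_ID_RE = re.compile(r"[A-Za-z0-9-]{1,64}")


def build_metrics_path(project_id: str, training_job_id: str, task_id: str) -> str:
    """Return the request path for the running metrics of one task.

    Hypothetical helper for illustration only.
    """
    if not _PROJECT_ID_RE.fullmatch(project_id):
        raise ValueError("project_id must be 1 to 64 letters, digits, or hyphens")
    # Assumption: the page lists no constraints for these two, so only
    # reject empty values.
    if not training_job_id or not task_id:
        raise ValueError("training_job_id and task_id must be non-empty")
    # Percent-encode each segment so unexpected characters cannot alter the path.
    job = quote(training_job_id, safe="")
    task = quote(task_id, safe="")
    return f"/v2/{project_id}/training-jobs/{job}/metrics/{task}"
```

For a multi-node job, the same helper can be called once per task name (worker-0, worker-1, and so on) taken from status.tasks.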
Request Parameters
None
Response Parameters
Status code: 200
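| Parameter | Type | Description |
|---|---|---|
| metrics | Array of MetricObject objects | Definition: Running metrics. |

Each MetricObject contains the following fields:

| Parameter | Type | Description |
|---|---|---|
| metric | String | Definition: Name of the running metric. Range: The example response on this page includes cpuUsage, memUsage, gpuUtil, gpuMemUsage, npuUtil, and npuMemUsage. |
| value | Array of doubles | Definition: Values of the running metric. One average value is collected per minute. |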
Example Requests
The following example queries the running metrics of the worker-0 task of the training job whose UUID is 2cd88daa-31a4-40a8-a58f-d186b0e93e4f.
GET https://endpoint/v2/{project_id}/training-jobs/2cd88daa-31a4-40a8-a58f-d186b0e93e4f/metrics/worker-0
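The same request could be prepared from Python roughly as follows, using only the standard library. This is a sketch under assumptions: the function name `make_metrics_request` is hypothetical, and token authentication through an `X-Auth-Token` header is assumed because this page does not describe request headers. Replace the endpoint, project ID, and token with your own values.

```python
import urllib.request


def make_metrics_request(endpoint: str, project_id: str,
                         training_job_id: str, task_id: str,
                         token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for one task's running metrics."""
    url = (f"https://{endpoint}/v2/{project_id}"
           f"/training-jobs/{training_job_id}/metrics/{task_id}")
    # Assumption: the token is passed in the X-Auth-Token header.
    headers = {"X-Auth-Token": token}
    return urllib.request.Request(url, method="GET", headers=headers)


# Sending the request (requires network access and valid credentials):
#
#   import json
#   req = make_metrics_request("endpoint", "my-project-id",
#                              "2cd88daa-31a4-40a8-a58f-d186b0e93e4f",
#                              "worker-0", "my-token")
#   with urllib.request.urlopen(req) as resp:
#       body = json.load(resp)
```

Building the request separately from sending it keeps the URL and headers easy to inspect before any network call is made.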
Example Responses
Status code: 200
ok
{
"metrics" : [ {
"metric" : "cpuUsage",
"value" : [ -1, -1, 2.43, 4.524, 6.714, 12.422, 9.214, 5.36, 7.5, 10.088, 8.975, 11.423, 11.548, 14.563, 16.833 ]
}, {
"metric" : "memUsage",
"value" : [ -1, -1, 0.04, 0.521, 1.652, 4.252, 6.433, 7.384, 7.982, 8.718, 9.365, 9.881, 10.192, 9.994, 9.005 ]
}, {
"metric" : "gpuUtil",
"value" : [ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 ]
}, {
"metric" : "gpuMemUsage",
"value" : [ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 ]
}, {
"metric" : "npuUtil",
"value" : [ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 ]
}, {
"metric" : "npuMemUsage",
"value" : [ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 ]
} ]
}
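A response like the one above can be reduced to per-metric statistics with a short script. The following is a minimal sketch: `summarize_metrics` is a hypothetical helper, and it assumes that a value of -1 marks a minute for which no sample is available. This page does not state what -1 means, so verify that assumption before relying on the results.

```python
import json
from statistics import mean

# Assumption: -1 means "no sample for this minute". Not confirmed by this page.
NO_DATA = -1


def summarize_metrics(body: str) -> dict:
    """Return sample count, mean, and max per metric, skipping NO_DATA entries."""
    payload = json.loads(body)
    summary = {}
    for item in payload.get("metrics", []):
        samples = [v for v in item["value"] if v != NO_DATA]
        summary[item["metric"]] = {
            "samples": len(samples),
            "mean": mean(samples) if samples else None,
            "max": max(samples) if samples else None,
        }
    return summary
```

Because one average value is collected per minute, the sample count also indicates roughly how many minutes of data each metric covers.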
Status Codes
| Status Code | Description |
|---|---|
| 200 | ok |
Error Codes
See Error Codes.