Querying the Details About a Training Job Version

Updated on 2024-06-13 GMT+08:00

Function

This API is used to obtain the details about a specified version of a training job based on the job ID and version ID.

URI

GET /v1/{project_id}/training-jobs/{job_id}/versions/{version_id}

Table 1 describes the required parameters.

Table 1 Parameters

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| project_id | Yes | String | Project ID. For details about how to obtain a project ID, see Obtaining a Project ID and Name. |
| job_id | Yes | Long | ID of a training job |
| version_id | Yes | Long | Version ID of a training job |

Request Body

None

Response Body

Table 2 describes the response parameters.

Table 2 Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| is_success | Boolean | Whether the request is successful |
| job_id | Long | ID of a training job |
| job_name | String | Name of a training job |
| job_desc | String | Description of a training job |
| version_id | Long | Version ID of a training job |
| version_name | String | Version name of a training job |
| pre_version_id | Long | ID of the previous version of a training job |
| engine_type | Integer | Engine type of a training job. The mapping between engine_type and engine_name is as follows: 1: TensorFlow; 2: MXNet; 3: Ray; 4: Caffe; 5: Spark_MLlib; 9: XGBoost-Sklearn; 10: PyTorch; 12: Horovod |
| engine_name | String | Name of the engine selected for a training job. Currently, the following engines are supported: Ascend-Powered-Engine, Caffe, Horovod, MXNet, PyTorch, Ray, Spark_MLlib, TensorFlow, XGBoost-Sklearn, MindSpore-GPU |
| engine_id | Long | ID of the engine selected for a training job |
| engine_version | String | Version of the engine selected for a training job |
| status | Integer | Status of a training job. For details about the job statuses, see Job Statuses. |
| app_url | String | Code directory of a training job |
| boot_file_url | String | Boot file of a training job |
| create_time | Long | Time when a training job is created |
| parameter | Array<Object> | Running parameters of a training job. This parameter is a container environment variable when a training job uses a custom image. For details, see Table 3. |
| duration | Long | Training job running duration, in milliseconds |
| spec_id | Long | ID of the resource specifications selected for a training job |
| core | String | Number of cores of the resource specifications |
| cpu | String | CPU memory of the resource specifications |
| gpu_num | Integer | Number of GPUs of the resource specifications |
| gpu_type | String | GPU type of the resource specifications |
| worker_server_num | Integer | Number of workers in a training job |
| data_url | String | Dataset of a training job |
| train_url | String | OBS path of the training job output file |
| log_url | String | OBS URL of the logs of a training job. By default, this parameter is left blank. Example value: /usr/train/ |
| dataset_version_id | String | Dataset version ID of a training job |
| dataset_id | String | Dataset ID of a training job |
| data_source | Array<Object> | Dataset of a training job. For details, see Table 4. |
| model_id | Long | Model ID of a training job |
| model_metric_list | String | Model metrics of a training job. For details, see Table 5. |
| system_metric_list | Object | System monitoring metrics of a training job. For details, see Table 6. |
| user_image_url | String | SWR URL of a custom image used by a training job |
| user_command | String | Boot command used to start the container of a custom image of a training job |
| resource_id | String | Charged resource ID of a training job |
| dataset_name | String | Name of the dataset of a training job |
| spec_code | String | Resource specifications selected for a training job |
| start_time | Long | Training start time |
| volumes | Array<Object> | Storage volume that can be used by a training job. For details, see Table 11. |
| dataset_version_name | String | Name of the dataset version of a training job |
| pool_name | String | Name of a resource pool |
| pool_id | String | ID of a resource pool |
| nas_mount_path | String | Local mount path of SFS Turbo (NAS). Example value: /home/work/nas |
| nas_share_addr | String | Shared path of SFS Turbo (NAS). Example value: 192.168.8.150:/ |
| nas_type | String | Only NFS is supported. Example value: nfs |
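
The engine_type mapping and the time-related fields in Table 2 lend themselves to small client-side helpers. The following is a minimal sketch, not part of the API: the helper names are hypothetical, and it assumes that create_time and start_time are Unix epoch timestamps in milliseconds, which the sample values suggest but the table does not state.

```python
from datetime import datetime, timezone

# Mapping copied from the engine_type description in Table 2.
ENGINE_NAMES = {
    1: "TensorFlow",
    2: "MXNet",
    3: "Ray",
    4: "Caffe",
    5: "Spark_MLlib",
    9: "XGBoost-Sklearn",
    10: "PyTorch",
    12: "Horovod",
}


def engine_name_for(engine_type):
    """Return the engine name for an engine_type, or None if it is not listed."""
    return ENGINE_NAMES.get(engine_type)


def ms_to_utc(timestamp_ms):
    """Convert epoch milliseconds (assumed unit) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


def duration_seconds(duration_ms):
    """Convert the duration field (milliseconds, per Table 2) to seconds."""
    return duration_ms / 1000
```

Engine names that have no engine_type in the table, such as Ascend-Powered-Engine, are best read from engine_name directly.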

Table 3 parameter parameters

| Parameter | Type | Description |
| --- | --- | --- |
| label | String | Parameter name |
| value | String | Parameter value |

Table 4 data_source parameters

| Parameter | Type | Description |
| --- | --- | --- |
| dataset_id | String | Dataset ID of a training job |
| dataset_version | String | Dataset version ID of a training job |
| type | String | Dataset type. obs: Data from OBS is used. dataset: Data from a specified dataset is used. |
| data_url | String | OBS bucket path |

Table 5 model_metric_list parameters

| Parameter | Type | Description |
| --- | --- | --- |
| metric | JSON Array | Validation metrics of a classification of a training job. For details, see Table 7. |
| total_metric | JSON | Overall validation parameters of a training job. For details, see Table 9. |

Table 6 system_metric_list parameters

| Parameter | Type | Description |
| --- | --- | --- |
| cpuUsage | Array | CPU usage of a training job |
| memUsage | Array | Memory usage of a training job |
| gpuUtil | Array | GPU usage of a training job |

Table 7 metric parameters

| Parameter | Type | Description |
| --- | --- | --- |
| metric_values | JSON | Validation metrics of a classification of a training job. For details, see Table 8. |
| reserved_data | JSON | Reserved parameter |
| metric_meta | JSON | Classification of a training job, including the classification ID and name |

Table 8 metric_values parameters

| Parameter | Type | Description |
| --- | --- | --- |
| recall | Float | Recall of a classification of a training job |
| precision | Float | Precision of a classification of a training job |
| accuracy | Float | Accuracy of a classification of a training job |

Table 9 total_metric parameters

| Parameter | Type | Description |
| --- | --- | --- |
| total_metric_meta | JSON | Reserved parameter |
| total_reserved_data | JSON | Reserved parameter |
| total_metric_values | JSON | Overall validation metrics of a training job. For details, see Table 10. |

Table 10 total_metric_values parameters

| Parameter | Type | Description |
| --- | --- | --- |
| f1_score | Float | F1 score of a training job |
| recall | Float | Total recall of a training job |
| precision | Float | Total precision of a training job |
| accuracy | Float | Total accuracy of a training job |
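
Table 2 lists model_metric_list as a String, and the sample response shows that the string carries JSON with the structure described in Tables 5 and 7 to 10. A client therefore has to decode it a second time. The sketch below is one way to do that. The function name is hypothetical, and the class_id and class_name keys inside metric_meta are taken from the sample response rather than from a table.

```python
import json


def decode_model_metrics(raw):
    """Decode the model_metric_list string into (per_class, total) metrics.

    per_class is a list of flat dicts, one per classification (Tables 7 and 8).
    total is the total_metric_values object (Table 10), or {} if absent.
    """
    if not raw:
        return [], {}
    data = json.loads(raw)
    per_class = []
    for item in data.get("metric", []):
        meta = item.get("metric_meta", {})
        values = item.get("metric_values", {})
        per_class.append({
            "class_id": meta.get("class_id"),      # key name seen in the sample
            "class_name": meta.get("class_name"),  # key name seen in the sample
            "recall": values.get("recall"),
            "precision": values.get("precision"),
            "accuracy": values.get("accuracy"),
        })
    total = data.get("total_metric", {}).get("total_metric_values", {})
    return per_class, total
```

An empty or missing string yields empty results instead of raising an error, which keeps callers simple for jobs that have not produced metrics.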

Table 11 volumes parameters

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| nfs | No | Object | Storage volume of the shared file system type. Only training jobs running in a resource pool connected to a shared file system network support such storage volumes. For details, see Table 12. |
| host_path | No | Object | Storage volume of the host file system type. Only training jobs running in a dedicated resource pool support such storage volumes. For details, see Table 13. |

Table 12 nfs parameters

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| id | Yes | String | ID of an SFS Turbo file system |
| src_path | Yes | String | Path to an SFS Turbo file system |
| dest_path | Yes | String | Local path to a training job |
| read_only | No | Boolean | Whether dest_path is read-only. true: read-only permission. false: read/write permission. The default value is false. |

Table 13 host_path parameters

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| src_path | Yes | String | Local path to a host |
| dest_path | Yes | String | Local path to a training job |
| read_only | No | Boolean | Whether dest_path is read-only. true: read-only permission. false: read/write permission. The default value is false. |
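
Each entry of volumes holds either an nfs object (Table 12) or a host_path object (Table 13). As a rough illustration of how a client might flatten the two shapes into one list of mounts, consider the following sketch. The function name and the tuple layout are hypothetical. The default of False for read_only follows the tables above.

```python
def describe_volumes(volumes):
    """Flatten a volumes array into (kind, src_path, dest_path, read_only) tuples."""
    mounts = []
    for volume in volumes or []:
        for kind in ("nfs", "host_path"):
            spec = volume.get(kind)
            if spec is None:
                continue
            mounts.append((
                kind,
                spec["src_path"],
                spec["dest_path"],
                spec.get("read_only", False),  # optional field, default false
            ))
    return mounts
```

Passing None or an empty list returns an empty list, so the helper can be called on responses that carry no volumes.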

Sample Request

The following shows how to obtain the details about the job whose job_id is 10 and version_id is 10.

GET    https://endpoint/v1/{project_id}/training-jobs/10/versions/10
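
The request above can be assembled in Python roughly as follows. This is a sketch under stated assumptions rather than an official client: the function names are hypothetical, "https://endpoint" is the placeholder used in the sample request, and passing the credential in an X-Auth-Token header is an assumption, because this page does not describe authentication.

```python
import urllib.request


def build_version_url(endpoint, project_id, job_id, version_id):
    """Build the URI GET /v1/{project_id}/training-jobs/{job_id}/versions/{version_id}."""
    base = endpoint.rstrip("/")
    # job_id and version_id are of the Long type (Table 1), so force integers.
    return (
        f"{base}/v1/{project_id}/training-jobs/"
        f"{int(job_id)}/versions/{int(version_id)}"
    )


def build_request(endpoint, project_id, job_id, version_id, token):
    """Return an unsent GET request; there is no request body for this API."""
    url = build_version_url(endpoint, project_id, job_id, version_id)
    headers = {
        "X-Auth-Token": token,  # assumption: token-based authentication
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")
```

The returned object can be passed to urllib.request.urlopen to send the request. Building it separately keeps the URL logic easy to check without network access.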

Sample Response

  • Successful response
    {
        "is_success": true,
        "job_id": 10,
        "job_name": "TestModelArtsJob",
        "job_desc": "TestModelArtsJob desc",
        "version_id": 10,
        "version_name": "jobVersion",
        "pre_version_id": 5,
        "engine_type": 1,
        "engine_name": "TensorFlow",
        "engine_id": 1,
        "engine_version": "TF-1.4.0-python2.7",
        "status": 10,
        "app_url": "/usr/app/",
        "boot_file_url": "/usr/app/boot.py",
        "create_time": 1524189990635,
        "parameter": [
            {
                "label": "learning_rate",
                "value": 0.01
            }
        ],
        "duration": 532003,
        "spec_id": 1,
        "core": 2,
        "cpu": 8,
        "gpu_num": 2,
        "gpu_type": "Pnt1",
        "worker_server_num": 1,
        "data_url": "/usr/data/",
        "train_url": "/usr/train/",
        "log_url": "/usr/log/",
        "dataset_version_id": "2ff0d6ba-c480-45ae-be41-09a8369bfc90",
        "dataset_id": "38277e62-9e59-48f4-8d89-c8cf41622c24",
        "data_source": [
            {
                "type": "obs",
                "data_url": "/qianjiajun-test/minst/data/"
            }
        ],
        "user_image_url": "100.125.5.235:20202/jobmng/custom-cpu-base:1.0",
        "user_command": "bash -x /home/work/run_train.sh python /home/work/user-job-dir/app/mnist/mnist_softmax.py --data_url /home/work/user-job-dir/app/mnist_data",
        "model_id": 1,
        "model_metric_list": "{\"metric\":[{\"metric_values\":{\"recall\":0.005833,\"precision\":0.000178,\"accuracy\":0.000937},\"reserved_data\":{},\"metric_meta\":{\"class_name\":0,\"class_id\":0}}],\"total_metric\":{\"total_metric_meta\":{},\"total_reserved_data\":{},\"total_metric_values\":{\"recall\":0.005833,\"id\":0,\"precision\":0.000178,\"accuracy\":0.000937}}}",
        "system_metric_list": {
            "cpuUsage": [
                "0",
                "3.10",
                "5.76",
                "0",
                "0",
                "0",
                "0"
            ],
            "memUsage": [
                "0",
                "0.77",
                "2.09",
                "0",
                "0",
                "0",
                "0"
            ],
            "gpuUtil": [
                "0",
                "0.25",
                "0.88",
                "0",
                "0",
                "0",
                "0"
            ]
        },
        "dataset_name": "dataset-test",
        "dataset_version_name": "dataset-version-test",
        "spec_code": "xxxxxxxx",
        "start_time": 1563172362000,
        "volumes": [
            {
                "nfs": {
                    "id": "43b37236-9afa-4855-8174-32254b9562e7",
                    "src_path": "192.168.8.150:/",
                    "dest_path": "/home/work/nas",
                    "read_only": false
                }
            },
            {
                "host_path": {
                    "src_path": "/root/work",
                    "dest_path": "/home/mind",
                    "read_only": false
                }
            }
        ],
        "pool_id": "pool9928813f",
        "pool_name": "pnt1",
        "nas_mount_path": "/home/work/nas",
        "nas_share_addr": "192.168.8.150:/",
        "nas_type": "nfs"
    }
  • Failed response
    {
        "is_success": false,
        "error_message": "Error string",
        "error_code": "ModelArts.0105"
    }
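
Both sample responses carry is_success, so a client can branch on that field before reading anything else. The sketch below shows one possible way to do this. The exception class and function names are hypothetical. The to_floats helper assumes that the system_metric_list arrays contain numeric strings, as in the successful sample.

```python
import json


class TrainingJobError(Exception):
    """Raised when a response reports is_success as false."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_job_version(body):
    """Parse a response body and return the payload, or raise on failure."""
    payload = json.loads(body)
    if not payload.get("is_success", False):
        raise TrainingJobError(
            payload.get("error_code"), payload.get("error_message")
        )
    return payload


def to_floats(series):
    """Convert a usage array such as cpuUsage from strings to floats."""
    return [float(value) for value in series]
```

For example, to_floats(payload["system_metric_list"]["cpuUsage"]) turns the string samples into numbers that can be averaged or plotted.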

Status Code

For details about the status code, see Status Code.
