Updated on 2023-12-14 GMT+08:00

Querying the Details About a Training Job Configuration

Function

This API is used to obtain the details about a specified training job configuration.

URI

GET /v1/{project_id}/training-job-configs/{config_name}

Table 1 describes the required parameters.
Table 1 URI parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Project ID. For details about how to obtain a project ID, see Obtaining a Project ID and Name. |
| config_name | Yes | String | Name of a training job configuration |

Table 2 Query parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| config_type | No | String | Configuration type to be queried. Options: custom (query the custom configurations; this is the default value) and sample (query the sample configurations). |
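
As a rough sketch of how the URI parameters in Table 1 and the query parameter in Table 2 combine, the following Python helper assembles the request URL. The function name build_config_url and the endpoint value are illustrative assumptions and are not part of the API.

```python
from urllib.parse import quote, urlencode

# Values accepted by the config_type query parameter (Table 2).
VALID_CONFIG_TYPES = ("custom", "sample")


def build_config_url(endpoint, project_id, config_name, config_type=None):
    """Assemble the GET URL for one training job configuration.

    build_config_url is a hypothetical helper; `endpoint` stands for the
    host of your region (for example "https://endpoint").
    """
    if config_type is not None and config_type not in VALID_CONFIG_TYPES:
        raise ValueError(f"config_type must be one of {VALID_CONFIG_TYPES}")

    # Percent-encode path segments so unusual names cannot break the URL.
    path = "/v1/{}/training-job-configs/{}".format(
        quote(project_id, safe=""), quote(config_name, safe="")
    )
    url = endpoint.rstrip("/") + path

    # config_type is optional; the service treats it as "custom" when omitted.
    if config_type is not None:
        url += "?" + urlencode({"config_type": config_type})
    return url
```

Calling build_config_url("https://endpoint", project_id, "config123", "sample") would request the sample configuration of that name rather than a custom one.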

Request Body

None

Response Body

Table 3 describes the response parameters.
Table 3 Parameters

| Parameter | Type | Description |
|---|---|---|
| is_success | Boolean | Whether the request is successful |
| error_message | String | Error message of a failed API call. This parameter is not included when the API call succeeds. |
| error_code | String | Error code of a failed API call. For details, see Error Codes. This parameter is not included when the API call succeeds. |
| config_name | String | Name of a training job configuration |
| config_desc | String | Description of a training job configuration |
| worker_server_num | Integer | Number of workers in a training job |
| app_url | String | Code directory of a training job |
| boot_file_url | String | Boot file of a training job |
| model_id | Long | Model ID of a training job |
| parameter | JSON Array | Running parameters of a training job. It is a collection of label-value pairs. This parameter is a container environment variable when a training job uses a custom image. For details, see Table 8. |
| spec_id | Long | ID of the resource specifications selected for a training job |
| data_url | String | OBS path of the dataset of a training job |
| dataset_id | String | Dataset ID of a training job |
| dataset_version_id | String | Dataset version ID of a training job |
| data_source | JSON Array | Dataset of a training job. For details, see Table 4. |
| engine_type | Integer | Engine type of a training job |
| engine_name | String | Name of the engine selected for a training job |
| engine_id | Long | ID of the engine selected for a training job |
| engine_version | String | Version of the engine selected for a training job |
| train_url | String | OBS URL of the output file of a training job. By default, this parameter is left blank. Example value: /usr/train/ |
| log_url | String | OBS URL of the logs of a training job. By default, this parameter is left blank. Example value: /usr/train/ |
| user_image_url | String | SWR URL of a custom image used by a training job |
| user_command | String | Boot command used to start the container of a custom image of a training job |
| spec_code | String | Resource specifications selected for a training job |
| gpu_type | String | GPU type of the resource specifications |
| create_time | Long | Time when a training job parameter configuration is created |
| cpu | String | CPU memory of the resource specifications |
| gpu_num | Integer | Number of GPUs of the resource specifications |
| core | String | Number of cores of the resource specifications |
| dataset_name | String | Dataset name of a training job |
| dataset_version_name | String | Dataset version name of a training job |
| pool_id | String | ID of a resource pool |
| pool_name | String | Name of a resource pool |
| volumes | JSON Array | Storage volume that can be used by a training job. For details, see Table 5. |
| nas_mount_path | String | Local mount path of SFS Turbo (NAS). Example value: /home/work/nas |
| nas_share_addr | String | Shared path of SFS Turbo (NAS). Example value: 192.168.8.150:/ |
| nas_type | String | Only NFS is supported. Example value: nfs |

Table 4 data_source parameters

| Parameter | Type | Description |
|---|---|---|
| dataset_id | String | Dataset ID of a training job |
| dataset_version | String | Dataset version ID of a training job |
| type | String | Dataset type. Options: obs (data from OBS is used) and dataset (data from a specified dataset is used). |
| data_url | String | OBS bucket path |
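
One way to consume data_source entries is to branch on the type field. The following is a minimal sketch; describe_data_source is a hypothetical helper, and which fields accompany each type is an assumption inferred from the field names and the sample response, not something Table 4 states.

```python
def describe_data_source(entry):
    """Return a short description of one data_source entry (Table 4).

    Hypothetical helper. It assumes "obs" entries carry data_url and
    "dataset" entries carry dataset_id and dataset_version.
    """
    source_type = entry.get("type")
    if source_type == "obs":
        return "OBS path: {}".format(entry.get("data_url"))
    if source_type == "dataset":
        return "dataset {} (version {})".format(
            entry.get("dataset_id"), entry.get("dataset_version")
        )
    raise ValueError("unknown data_source type: {!r}".format(source_type))


# Entry shaped like the one in the sample response.
sources = [{"type": "obs", "data_url": "/test/minst/data/"}]
descriptions = [describe_data_source(s) for s in sources]
```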

Table 5 volumes parameters

| Parameter | Type | Description |
|---|---|---|
| nfs | Object | Storage volume of the shared file system type. Only the training jobs running in a resource pool with the shared file system network connected support such storage volumes. For details, see Table 6. |
| host_path | Object | Storage volume of the host file system type. Only training jobs running in a dedicated resource pool support such storage volumes. For details, see Table 7. |
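
To illustrate how the volumes array nests, the sketch below flattens it into one tuple per mount, using the src_path, dest_path, and read_only fields that Tables 6 and 7 list for nfs and host_path objects. list_mounts is a hypothetical helper, and it assumes each array element holds exactly one of the two keys, as in the sample response.

```python
def list_mounts(volumes):
    """Flatten the volumes array into (kind, source, destination, mode) tuples.

    list_mounts is a hypothetical helper. Each volume object is assumed to
    hold one of the keys "nfs" or "host_path".
    """
    mounts = []
    for volume in volumes:
        for kind in ("nfs", "host_path"):
            spec = volume.get(kind)
            if spec is None:
                continue
            # read_only defaults to false when the field is absent.
            mode = "ro" if spec.get("read_only", False) else "rw"
            mounts.append((kind, spec["src_path"], spec["dest_path"], mode))
    return mounts
```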

Table 6 nfs parameters

| Parameter | Type | Description |
|---|---|---|
| id | String | ID of an SFS Turbo file system |
| src_path | String | Address of an SFS Turbo file system |
| dest_path | String | Local path to a training job |
| read_only | Boolean | Whether dest_path is read-only. Options: true (read-only permission) and false (read/write permission; this is the default value). |

Table 7 host_path parameters

| Parameter | Type | Description |
|---|---|---|
| src_path | String | Local path to a host |
| dest_path | String | Local path to a training job |
| read_only | Boolean | Whether dest_path is read-only. Options: true (read-only permission) and false (read/write permission; this is the default value). |

Table 8 parameter parameters

| Parameter | Type | Description |
|---|---|---|
| label | String | Parameter name |
| value | String | Parameter value |
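
Because parameter is an array of label-value pairs, client code often wants it as a lookup table. A minimal sketch follows; parameters_to_dict is a hypothetical helper. It converts every value with str() because the sample response shows a numeric value even though Table 8 lists the type as String.

```python
def parameters_to_dict(parameter):
    """Convert the label-value array from Table 8 into a plain dict.

    parameters_to_dict is a hypothetical helper. Values are normalized to
    strings so callers see one type regardless of how the value was encoded.
    """
    return {item["label"]: str(item["value"]) for item in parameter}


# Entry shaped like the one in the sample response.
params = parameters_to_dict([{"label": "learning_rate", "value": 0.01}])
```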

Sample Request

The following shows how to obtain the details about the job configuration named config123.

GET https://endpoint/v1/{project_id}/training-job-configs/config123
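
The same request could be built in Python roughly as follows. This is a sketch under stated assumptions: make_request is a hypothetical helper, the project ID and token values are placeholders, and passing the token in an X-Auth-Token header is an assumption about the authentication scheme that this page does not confirm.

```python
import urllib.request


def make_request(endpoint, project_id, config_name, token):
    """Build (but do not send) the GET request for a job configuration.

    make_request is a hypothetical helper. The X-Auth-Token header is an
    assumption; check the authentication section of the API reference.
    """
    url = "{}/v1/{}/training-job-configs/{}".format(
        endpoint.rstrip("/"), project_id, config_name
    )
    headers = {"X-Auth-Token": token}
    return urllib.request.Request(url, headers=headers, method="GET")


request = make_request("https://endpoint", "my-project-id", "config123", "my-token")
# urllib.request.urlopen(request) would send it.
```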

Sample Response

  • Successful response
    {
        "spec_code": "modelarts.vm.gpu.v100",
        "user_image_url": "100.125.5.235:20202/jobmng/custom-cpu-base:1.0",
        "user_command": "bash -x /home/work/run_train.sh python /home/work/user-job-dir/app/mnist/mnist_softmax.py --data_url /home/work/user-job-dir/app/mnist_data",
        "gpu_type": "nvidia-v100",
        "dataset_version_id": "2ff0d6ba-c480-45ae-be41-09a8369bfc90",
        "engine_name": "TensorFlow",
        "is_success": true,
        "nas_mount_path": "/home/work/nas",
        "worker_server_num": 1,
        "nas_share_addr": "192.168.8.150:/",
        "train_url": "/test/minst/train_out/out1/",
        "nas_type": "nfs",
        "spec_id": 4,
        "parameter": [
            {
                "label": "learning_rate",
                "value": 0.01
            }
        ],
        "log_url": "/usr/log/",
        "config_name": "config123",
        "app_url": "/usr/app/",
        "create_time": 1559045426000,
        "dataset_id": "38277e62-9e59-48f4-8d89-c8cf41622c24",
        "volumes": [
            {
                "nfs": {
                    "id": "43b37236-9afa-4855-8174-32254b9562e7",
                    "src_path": "192.168.8.150:/",
                    "dest_path": "/home/work/nas",
                    "read_only": false
                }
            },
            {
                "host_path": {
                    "src_path": "/root/work",
                    "dest_path": "/home/mind",
                    "read_only": false
                }
            }
        ],
        "cpu": "64",
        "model_id": 4,
        "boot_file_url": "/usr/app/boot.py",
        "dataset_name": "dataset-test",
        "pool_id": "pool9928813f",
        "config_desc": "This is a config desc test",
        "gpu_num": 1,
        "data_source": [
            {
                "type": "obs",
                "data_url": "/test/minst/data/"
            }
        ],
        "pool_name": "p100",
        "dataset_version_name": "dataset-version-test",
        "core": "8",
        "engine_type": 1,
        "engine_id": 3,
        "engine_version": "TF-1.8.0-python2.7",
        "data_url": "/test/minst/data/"
    }
  • Failed response
    {
        "is_success": false,
        "error_message": "Error string",
        "error_code": "ModelArts.0105"
    }
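
Since both outcomes share the is_success field, a client can branch on it before reading anything else. The sketch below shows one way to do that; ConfigQueryError and parse_config_response are hypothetical names and are not part of any SDK.

```python
import json


class ConfigQueryError(Exception):
    """Raised when is_success is false (hypothetical exception type)."""

    def __init__(self, code, message):
        super().__init__("{}: {}".format(code, message))
        self.code = code
        self.message = message


def parse_config_response(body):
    """Decode a response body and return the configuration as a dict.

    parse_config_response is a hypothetical helper.
    """
    data = json.loads(body)
    if not data.get("is_success", False):
        # error_code and error_message are only present on failure.
        raise ConfigQueryError(data.get("error_code"), data.get("error_message"))
    return data
```

With the failed response shown above, parse_config_response raises ConfigQueryError carrying the code ModelArts.0105; with the successful one, it returns the decoded dictionary.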

Status Code

For details about the status code, see Table 1.