Updated on 2023-12-14 GMT+08:00

Creating a Version of a Training Job

Function

This API is used to create a version of a training job.

Calling this API is an asynchronous operation. The job status can be obtained by calling the APIs described in Querying a Training Job List and Querying the Details About a Training Job Version.
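
Because the call is asynchronous, a client typically polls one of the query APIs until the job version reaches a final status. The following is a minimal sketch of such a loop, not an official client. The names wait_for_status, fetch_status, and is_terminal are hypothetical: fetch_status stands for whatever code calls a query API and returns the status field, and is_terminal encodes the final statuses listed in Job Statuses.

```python
import time


def wait_for_status(fetch_status, is_terminal, interval_s=10.0, max_polls=60,
                    sleep=time.sleep):
    """Poll fetch_status() until is_terminal(status) is true, then return it.

    fetch_status and is_terminal are supplied by the caller (hypothetical
    helpers); this function only implements the polling loop.
    """
    for attempt in range(max_polls):
        status = fetch_status()
        if is_terminal(status):
            return status
        # Wait between polls, but not after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError("training job version did not reach a terminal status")
```

The sleep argument is injectable so that the loop can be exercised in tests without real delays.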

URI

POST /v1/{project_id}/training-jobs/{job_id}/versions

Table 1 describes the required parameters.
Table 1 Parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Project ID. For details about how to obtain a project ID, see Obtaining a Project ID and Name. |
| job_id | Yes | Long | ID of a training job |

Request Body

Table 2 describes the request parameters.
Table 2 Request parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| job_desc | No | String | Description of a training job. The value must contain 0 to 256 characters. By default, this parameter is left blank. |
| config | Yes | Object | Parameters for creating a training job. For details, see Table 3. |

Table 3 config parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| worker_server_num | Yes | Integer | Number of workers in a training job. Obtain the maximum value by calling the API described in Querying Job Resource Specifications. |
| app_url | Yes | String | Code directory of a training job, for example, /usr/app/. This parameter must be used together with boot_file_url. If model_id is set, app_url, boot_file_url, and engine_id do not need to be set. |
| boot_file_url | Yes | String | Boot file of a training job, which must be stored in the code directory. Example value: /usr/app/boot.py. This parameter must be used together with app_url. If model_id is set, app_url, boot_file_url, and engine_id do not need to be set. |
| parameter | No | Array<Object> | Running parameters of a training job, specified as a collection of label-value pairs. For details, see Table 5 and the sample request. When a training job uses a custom image, these parameters are container environment variables. |
| data_url | No | String | OBS URL of the dataset required by a training job. By default, this parameter is left blank. Example value: /usr/data/. This parameter cannot be used together with data_source, or with dataset_id and dataset_version_id. However, one of these options must be specified. |
| dataset_id | No | String | Dataset ID of a training job. This parameter must be used together with dataset_version_id, but cannot be used together with data_url or data_source. |
| dataset_version_id | No | String | Dataset version ID of a training job. This parameter must be used together with dataset_id, but cannot be used together with data_url or data_source. |
| data_source | No | JSON Array | Dataset of a training job. This parameter cannot be used together with data_url, dataset_id, or dataset_version_id. For details, see Table 4. |
| spec_id | Yes | Long | ID of the resource specifications selected for a training job. Obtain the ID by calling the API described in Querying Job Resource Specifications. This parameter is mandatory when you create a job in a public resource pool, and it cannot be used together with pool_id. |
| pool_id | Yes | String | ID of a dedicated resource pool. To obtain the ID, log in to the ModelArts management console, choose Dedicated Resource Pools in the navigation pane on the left, and view the resource pool ID in the dedicated resource pool list. This parameter is mandatory when you create a job in a dedicated resource pool, and it cannot be used together with spec_id. |
| engine_id | Yes | Long | ID of the engine selected for a training job. The default value is 1. Obtain the ID by calling the API described in Querying Job Engine Specifications. If model_id is set, app_url, boot_file_url, and engine_id do not need to be set. |
| model_id | Yes | Long | ID of the built-in model of a training job. Obtain the ID by calling the API described in Querying a Built-in Algorithm. If model_id is set, app_url, boot_file_url, and engine_id do not need to be set. |
| train_url | Yes | String | OBS URL of the output file of a training job. By default, this parameter is left blank. Example value: /bucket/trainUrl/ |
| log_url | No | String | OBS URL of the logs of a training job. By default, this parameter is left blank. Example value: /usr/train/ |
| pre_version_id | Yes | Long | ID of the previous version of a training job. Use a version_id value returned by the API described in Obtaining Training Job Versions. |
| user_image_url | No | String | SWR URL of a custom image used by a training job. Example value: 100.125.5.235:20202/jobmng/custom-cpu-base:1.0 |
| user_command | No | String | Boot command used to start the container of a custom image of a training job. The format is bash /home/work/run_train.sh python /home/work/user-job-dir/app/train.py {python_file_parameter}. |
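
The pairing and exclusivity rules for the code and data parameters in Table 3 can be checked on the client side before the request is sent. The following is a minimal sketch under stated assumptions, not part of the API: check_config is a hypothetical helper, it treats a parameter as "set" when its key is present in the config object, and it covers only the app_url/boot_file_url pairing and the data input rules (the model_id exemption is not modeled).

```python
def check_config(config: dict) -> list:
    """Return human-readable rule violations found in a config object."""
    problems = []

    # app_url and boot_file_url must be used together.
    if ("app_url" in config) != ("boot_file_url" in config):
        problems.append("app_url and boot_file_url must be set together")

    # dataset_id and dataset_version_id must be used together.
    if ("dataset_id" in config) != ("dataset_version_id" in config):
        problems.append("dataset_id and dataset_version_id must be set together")

    # Exactly one data input: data_url, data_source, or the dataset ID pair.
    data_inputs = [
        "data_url" in config,
        "data_source" in config,
        "dataset_id" in config or "dataset_version_id" in config,
    ]
    if sum(data_inputs) != 1:
        problems.append(
            "set exactly one of data_url, data_source, "
            "or dataset_id with dataset_version_id"
        )
    return problems


config = {
    "app_url": "/usr/app/",
    "boot_file_url": "/usr/app/boot.py",
    "data_url": "/usr/data/",
}
print(check_config(config))
```

An empty list means that none of the checked rules is violated. The server remains the authority on whether a request is valid.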

Table 4 data_source parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| dataset_id | No | String | Dataset ID of a training job. This parameter must be used together with dataset_version, but cannot be used together with data_url. |
| dataset_version | No | String | Dataset version ID of a training job. This parameter must be used together with dataset_id, but cannot be used together with data_url. |
| type | No | String | Dataset type. The value can be obs or dataset. obs and dataset cannot be used at the same time. |
| data_url | No | String | OBS bucket path. This parameter cannot be used together with dataset_id or dataset_version. |
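
Note that inside data_source the version field is named dataset_version, whereas the top-level config field is dataset_version_id. The following sketch builds the two kinds of data_source entries. The helper names are hypothetical, and the pairing of type obs with data_url and type dataset with dataset_id and dataset_version is an assumption inferred from Table 4 rather than something the table states explicitly.

```python
def obs_data_source(data_url: str) -> dict:
    """Entry that reads training data from an OBS path (assumed pairing)."""
    return {"type": "obs", "data_url": data_url}


def dataset_data_source(dataset_id: str, dataset_version: str) -> dict:
    """Entry that reads training data from a dataset version (assumed pairing)."""
    return {
        "type": "dataset",
        "dataset_id": dataset_id,
        "dataset_version": dataset_version,
    }


# data_source is an array, so the entry is wrapped in a list.
config_fragment = {
    "data_source": [
        dataset_data_source(
            "38277e62-9e59-48f4-8d89-c8cf41622c24",
            "2ff0d6ba-c480-45ae-be41-09a8369bfc90",
        )
    ]
}
print(config_fragment)
```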

Table 5 parameter parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| label | No | String | Parameter name |
| value | No | String | Parameter value |
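
Both label and value are strings, so numeric running parameters have to be converted before they are placed in the request. The following sketch converts a plain mapping into the array expected by parameter. The function name to_parameter_list is hypothetical, and the use of str() for conversion is an assumption about how values should be rendered.

```python
def to_parameter_list(params: dict) -> list:
    """Convert {"name": value} pairs into a list of label-value objects."""
    return [
        {"label": str(name), "value": str(value)}
        for name, value in params.items()
    ]


print(to_parameter_list({"learning_rate": 0.01, "batch_size": 32}))
```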

Response Body

Table 6 describes the response parameters.
Table 6 Response parameters

| Parameter | Type | Description |
|---|---|---|
| is_success | Boolean | Whether the request is successful |
| error_message | String | Error message of a failed API call. This parameter is not included when the API call succeeds. |
| error_code | String | Error code of a failed API call. For details, see Error Codes. This parameter is not included when the API call succeeds. |
| job_id | Long | ID of a training job |
| job_name | String | Name of a training job |
| status | Integer | Status of a training job. For details about the job statuses, see Job Statuses. |
| create_time | Long | Timestamp when a training job is created |
| version_id | Long | Version ID of a training job |
| version_name | String | Version name of a training job |

Sample Request

The following shows how to create a version of the training job whose job_id is 10, based on the previous version whose ID (pre_version_id) is 20.
POST https://endpoint/v1/{project_id}/training-jobs/10/versions
{
    "job_desc": "This is a ModelArts job",
    "config": {
        "worker_server_num": 1,
        "app_url": "/usr/app/",
        "boot_file_url": "/usr/app/boot.py",
        "parameter": [
            {
                "label": "learning_rate",
                "value": "0.01"
            },
            {
                "label": "batch_size",
                "value": "32"
            }
        ],
        "dataset_id": "38277e62-9e59-48f4-8d89-c8cf41622c24",
        "dataset_version_id": "2ff0d6ba-c480-45ae-be41-09a8369bfc90",
        "spec_id": 1,
        "engine_id": 1,
        "train_url": "/usr/train/",
        "log_url": "/usr/log/",
        "pre_version_id": 20,
        "model_id": 1,
        "pool_id": "test-pool"
    }
}
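
The same request can be assembled with the Python standard library. This is a minimal sketch, not an official SDK call: build_create_version_request is a hypothetical helper, the endpoint, project ID, and token values are placeholders, and the X-Auth-Token header is an assumption about the authentication scheme in use. The body is a shortened version of the sample above.

```python
import json
import urllib.request


def build_create_version_request(endpoint, project_id, job_id, body, token):
    """Build (but do not send) the POST request that creates a job version."""
    url = f"{endpoint.rstrip('/')}/v1/{project_id}/training-jobs/{job_id}/versions"
    headers = {
        "Content-Type": "application/json",
        # Assumption: token-based authentication through this header.
        "X-Auth-Token": token,
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


body = {
    "job_desc": "This is a ModelArts job",
    "config": {
        "worker_server_num": 1,
        "app_url": "/usr/app/",
        "boot_file_url": "/usr/app/boot.py",
        "dataset_id": "38277e62-9e59-48f4-8d89-c8cf41622c24",
        "dataset_version_id": "2ff0d6ba-c480-45ae-be41-09a8369bfc90",
        "spec_id": 1,
        "engine_id": 1,
        "train_url": "/usr/train/",
        "log_url": "/usr/log/",
        "pre_version_id": 20,
    },
}

# Placeholder endpoint, project ID, and token.
request = build_create_version_request(
    "https://endpoint", "my-project-id", 10, body, "my-token"
)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send the request. The sketch stops before that step so that it runs without network access.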

Sample Response

  • Successful response
    {
        "is_success": true,
        "job_id": 10,
        "job_name": "TestModelArtsJob",
        "status": 1,
        "create_time": 1524189990635,
        "version_id": 10,
        "version_name": "V0001"
    }
  • Failed response
    {
        "is_success": false,
        "error_message": "Error string",
        "error_code": "ModelArts.0105"
    }
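
A client can branch on is_success to separate the two response shapes. The following sketch parses a response body and raises an exception that carries error_code and error_message when the call failed. CreateVersionError and parse_create_version_response are hypothetical names, not part of any SDK.

```python
import json


class CreateVersionError(Exception):
    """Raised when the response reports is_success = false."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_create_version_response(text: str) -> tuple:
    """Return (job_id, version_id) on success; raise CreateVersionError otherwise."""
    payload = json.loads(text)
    if not payload.get("is_success", False):
        raise CreateVersionError(
            payload.get("error_code"), payload.get("error_message")
        )
    return payload["job_id"], payload["version_id"]


ok = '{"is_success": true, "job_id": 10, "version_id": 10, "version_name": "V0001"}'
print(parse_create_version_response(ok))
```

The returned version_id can then be used with the query APIs to track the status of the new version.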

Status Code

For details about status codes, see Status Code.