Submitting Streaming Training Jobs

Function

This API is used to submit streaming training jobs.

URI

POST /v1/{project_id}/stream-etl-job

Table 1 describes the URI parameters.

**Table 1** URI parameters
Parameter	Mandatory	Type	Description
project_id	Yes	String	Project ID, which is used for resource isolation. For details about how to obtain the project ID, see Obtaining a Project ID.
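
As a minimal sketch of how the URI above could be assembled, the Python helper below substitutes project_id into the path. The endpoint host and the project ID shown are placeholders, and build_submit_url is a hypothetical helper name, not part of RES.

```python
from urllib.parse import quote


def build_submit_url(endpoint: str, project_id: str) -> str:
    """Return the full URL for POST /v1/{project_id}/stream-etl-job."""
    if not project_id:
        raise ValueError("project_id is mandatory")
    # Percent-encode the project ID so it is safe to place in a path segment.
    path = f"/v1/{quote(project_id, safe='')}/stream-etl-job"
    return endpoint.rstrip("/") + path


# Placeholder endpoint and project ID, for illustration only.
print(build_submit_url("https://res.example.com", "my-project-id"))
```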

Request

Table 2 describes the request parameters.

**Table 2** Request parameters
Parameter	Mandatory	Type	Description
workspace_id	No	String	Workspace ID. The default value is 0.
job_name	Yes	String	Training job name. The value can contain a maximum of 20 characters.
job_description	No	String	Training job description. The value can contain a maximum of 256 characters.
nearline_platform	Yes	JSON	Offline computing platform. For details, see Table 3.
strategy	Yes	JSON	Strategy information. For details, see Table 5.

**Table 3** **nearline_platform** parameters
Parameter	Mandatory	Type	Description
platform	Yes	String	Platform name. The value can contain a maximum of 64 characters. Currently, only DLI is supported.
platform_parameter	Yes	JSON	Platform parameter. For details, see Table 4.
computing_resource	No	String	Resource specifications required for the normal running of the DLI jobs.
config_load_path	Yes	String	OBS path that stores the files generated by the selected configurations.

**Table 4** **platform_parameter** parameters
Parameter	Mandatory	Type	Description
cluster_name	Yes	String	Cluster name.
cluster_id	No	String	Cluster ID.

**Table 5** **strategy** parameters
Parameter	Mandatory	Type	Description
strategy_type	Yes	String	Strategy type. The only valid value is nearline.
name	Yes	String	Strategy alias. The value can contain a maximum of 60 characters.
algorithm_type	Yes	String	Algorithm type. The only valid value is NEARLINE_ONLINE_TRAINING.
parameter	Yes	JSON	Algorithm parameter. For details, see Table 6.

**Table 6** **parameter** parameters
Parameter	Mandatory	Type	Description
data_source	Yes	JSON	Data source parameter. For details, see Table 7. The standard recommendation data supported by the real-time streaming nearline job comes from List of User Behaviors.
data_source_config	Yes	JSON	Data source configuration. For details, see Table 10.
algorithm_config	Yes	JSON	Algorithm configuration. For details, see Table 11.

**Table 7** **data_source** parameters
Parameter	Mandatory	Type	Description
platform	Yes	String	Platform name. Currently, only DIS is supported. The data required by the real-time nearline jobs is added to the DIS platform where RES reads the data for nearline computing tasks.
in_stream_conf	Yes	JSON	Input stream configuration. For details, see Table 8.
out_stream_conf	Yes	JSON	Output stream configuration. For details, see Table 9.

**Table 8** **in_stream_conf** parameters
Parameter	Mandatory	Type	Description
stream_name	No	String	Name of the DIS stream. The stream is used to receive nearline behavior data.
starting_offsets	Yes	String	Start position for reading DIS data. LATEST: Latest data is read first. EARLIEST: Earliest data is read first.

**Table 9** **out_stream_conf** parameters
Parameter	Mandatory	Type	Description
stream_name	No	String	Name of the DIS stream. The stream is used to store the ranking preprocessing data generated by the calculation of behavior data and profile libraries for model training. Data in the stream is intermediate data generated by streaming training jobs. You only need to specify the stream name and do not need to send or obtain data from the stream.
starting_offsets	Yes	String	Start position for reading DIS data. LATEST indicates that the latest data is read first.

**Table 10** **data_source_config** parameters
Parameter	Mandatory	Type	Description
interval	Yes	Integer	Time interval for the running of nearline jobs, in seconds. For example, the value 10 indicates that the nearline strategy performs the computing tasks every 10 seconds, including stream data reading and processing.

**Table 11** **algorithm_config** parameters
Parameter	Mandatory	Type	Description
online_job_uuid	Yes	String	UUID of the associated online service.
flow_name	Yes	String	Name of an online process of the associated online service. The behavior parameters, model file path, and data preprocessing information required by the streaming training job are obtained from the online process.
online_training_config	Yes	JSON	Online training configuration. For details, see Table 12.
bad_record_log	No	String	Path to access the error data log. Folders that house the error data are placed in the path.

**Table 12** **online_training_config** parameters
Parameter	Mandatory	Type	Description
spec_id	Yes	Integer	Resource specification ID of a ranking job. Before using ModelArts, query the access keys by referring to Querying the Access Keys of ModelArts and associate the access keys with ModelArts by referring to Associating the AK/SK with ModelArts. Then, obtain the value returned by the spec_id parameter by referring to Querying the Compute Node Specifications of ModelArts.
optimize_parameters	Yes	JSON	Optimizer parameters. For details, see Table 13.
update_interval	Yes	Integer	Interval for updating the ranking model, in minutes. For example, the value 10 indicates that the ranking model is saved to OBS every 10 minutes.

**Table 13** **optimize_parameters** parameters
Parameter	Mandatory	Type	Description
type	Yes	String	Optimizer type. The only valid value is ftrl.
initial_accumulator_value	Yes	Double	Parameter that dynamically adjusts the learning step. The value must be greater than 0 and cannot exceed 1. The default value is 0.1.
lambda1	Yes	Double	L1 regularization coefficient. It is applied to the L1 norm of the model to limit the model values and prevent overfitting. The value ranges from 0 to 1. The default value is 0.
lambda2	Yes	Double	L2 regularization coefficient. It is applied to the L2 norm of the model to limit the model values and prevent overfitting. The value ranges from 0 to 1. The default value is 0.
learning_rate	Yes	Double	Hyperparameter that controls the step size of the optimizer in the optimization direction. The value must be greater than 0 and cannot exceed 1. The default value is 0.1.
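
The nesting described in Tables 2 to 13 can be hard to follow from the tables alone. The Python sketch below assembles a request body that follows that structure and checks the documented limits on job_name and the optimizer values. The function names are hypothetical, all argument values are placeholders, the intervals of 10 are only illustrative, and treating the upper bound 1 as included in the ranges is an assumption.

```python
import json


def build_optimize_parameters(initial_accumulator_value=0.1, lambda1=0.0,
                              lambda2=0.0, learning_rate=0.1):
    """Build the Table 13 object, checking the documented value ranges."""
    # Documented range excludes 0; including 1 is an assumption.
    for name, value in (("initial_accumulator_value", initial_accumulator_value),
                        ("learning_rate", learning_rate)):
        if not 0 < value <= 1:
            raise ValueError(f"{name} must be greater than 0 and at most 1")
    # Documented range is 0 to 1; including both ends is an assumption.
    for name, value in (("lambda1", lambda1), ("lambda2", lambda2)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    return {
        "type": "ftrl",
        "initial_accumulator_value": initial_accumulator_value,
        "lambda1": lambda1,
        "lambda2": lambda2,
        "learning_rate": learning_rate,
    }


def build_request_body(job_name, cluster_name, config_load_path, strategy_name,
                       in_stream, out_stream, online_job_uuid, flow_name,
                       spec_id, interval=10, update_interval=10):
    """Assemble the mandatory fields of Tables 2 to 12 into one dict."""
    if len(job_name) > 20:
        raise ValueError("job_name can contain a maximum of 20 characters")
    return {
        "job_name": job_name,
        "nearline_platform": {
            "platform": "DLI",
            "platform_parameter": {"cluster_name": cluster_name},
            "config_load_path": config_load_path,
        },
        "strategy": {
            "strategy_type": "nearline",
            "name": strategy_name,
            "algorithm_type": "NEARLINE_ONLINE_TRAINING",
            "parameter": {
                "data_source": {
                    "platform": "DIS",
                    "in_stream_conf": {"stream_name": in_stream,
                                       "starting_offsets": "LATEST"},
                    "out_stream_conf": {"stream_name": out_stream,
                                        "starting_offsets": "LATEST"},
                },
                "data_source_config": {"interval": interval},
                "algorithm_config": {
                    "online_job_uuid": online_job_uuid,
                    "flow_name": flow_name,
                    "online_training_config": {
                        "spec_id": spec_id,
                        "optimize_parameters": build_optimize_parameters(),
                        "update_interval": update_interval,
                    },
                },
            },
        },
    }


# Every value below is a placeholder, not a real resource.
body = build_request_body(
    job_name="stream-train-demo", cluster_name="dli-demo",
    config_load_path="<OBS path for storing the configuration files>",
    strategy_name="demo-strategy", in_stream="<input DIS stream>",
    out_stream="<output DIS stream>", online_job_uuid="<online service UUID>",
    flow_name="<online process name>", spec_id=1)
print(json.dumps(body, indent=2))
```

Validating the ranges on the client side is optional; it only surfaces obvious mistakes before the request is sent.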

Response

Table 14 describes the response parameters.

**Table 14** Response parameters
Parameter	Mandatory	Type	Description
is_success	Yes	Boolean	Whether the request is successful
nearline_uuid	Yes	String	Candidate set ID
job_id	Yes	String	Job ID

Example

Example request

{
	"job_name": "Nearline-update",
	"job_description": "",
	"nearline_platform": {
		"platform": "DLI",
		"platform_parameter": {
			"cluster_name": "dli-1"
		},
		"config_load_path": "<OBS path for storing the configuration files>",
		"computing_resource": ""
	},
	"storage": {
		"user_profile_storage": {
			"platform": "CloudTable",
			"platform_parameter": {
				"cluster_id": "96219587-3bb2-4eed-a8d0-0cda6dc50223",
				"cluster_name": "cloudtable-62d2",
				"table_name": "write-profile-user"
			}
		},
		"item_profile_storage": {
			"platform": "CloudTable",
			"platform_parameter": {
				"cluster_id": "96219587-3bb2-4eed-a8d0-0cda6dc50223",
				"cluster_name": "cloudtable-62d2",
				"table_name": "write-profile-item"
			}
		},
		"filter_set_storage": {
			"platform": "CloudTable",
			"platform_parameter": {
				"cluster_id": "96219587-3bb2-4eed-a8d0-0cda6dc50223",
				"cluster_name": "cloudtable-62d2",
				"table_name": "write-profile-filter"
			}
		}
	},
	"strategy": {
		"name": "Update user profiles based on behavior data",
		"algorithm_type": "NEARLINE_UPDATE_USER_PORTRAIT",
		"strategy_type": "nearline",
		"parameter": {
			"data_source_config": {
				"behavior_type": ["view", "click", "collect", "uncollect", "search_click", "comment", "share", "like", "dislike", "grade", "consume", "use"],
				"interval": 10
			},
			"data_source": {
				"platform": "DIS",
				"platform_parameter": {
					"stream_name": "dis-evan",
					"starting_offsets": "latest"
				}
			},
			"algorithm_config": {
				"update_context": true,
				"update_item_hotvalue_flag": true,
				"filter_history_flag": true,
				"max_history_num": 100,
				"result_path": "<Path for storing the real-time sample data>",
				"global_features_information_path": "<Path for storing the global configuration tables>",
				"bad_record_log":"<Path for storing exception data logs>"
			}
		}
	}
}

Example of a successful response

{
    "is_success": true,
    "job_id": "cdf49df766f2499586685b08212fd03f",
    "nearline_uuid": "61496485f0ba4a77b02b4f66f3c11078"
}

Example of a failed response

{
    "is_success": false,
    "error_code": "res.1008",
    "error_msg": "The request parameter(job_name) is null."
}
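
A caller can tell the two response shapes apart by checking is_success. The Python sketch below does that for the example responses above. SubmitJobError and parse_submit_response are hypothetical names, and the sketch assumes that a failed response always carries error_code and error_msg as in the example.

```python
import json
from typing import Tuple


class SubmitJobError(Exception):
    """Raised when the response reports is_success = false."""

    def __init__(self, error_code: str, error_msg: str):
        super().__init__(f"{error_code}: {error_msg}")
        self.error_code = error_code
        self.error_msg = error_msg


def parse_submit_response(text: str) -> Tuple[str, str]:
    """Return (job_id, nearline_uuid) on success, raise SubmitJobError otherwise."""
    data = json.loads(text)
    if data.get("is_success"):
        return data["job_id"], data["nearline_uuid"]
    # Missing error fields fall back to empty strings (an assumption).
    raise SubmitJobError(data.get("error_code", ""), data.get("error_msg", ""))


failed = ('{"is_success": false, "error_code": "res.1008", '
          '"error_msg": "The request parameter(job_name) is null."}')
try:
    parse_submit_response(failed)
except SubmitJobError as err:
    print(err.error_code)
```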

Status Code

For details about status codes, see Status Codes.
