Submitting Combined Jobs
Function
This API is used to submit combined jobs for offline computing tasks and generate candidate sets using the selected strategies.
URI
POST /v1/{project_id}/training
Table 1 URI parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Project ID, which is used for resource isolation. For details about how to obtain the project ID, see Obtaining a Project ID. |
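As a minimal sketch, the request URL can be assembled from the URI template above. The endpoint host and the project ID below are placeholders that do not come from this document; substitute the values for your own region and project.

```python
from urllib.parse import quote


def training_url(endpoint: str, project_id: str) -> str:
    """Return the URL for POST /v1/{project_id}/training."""
    if not project_id:
        raise ValueError("project_id is mandatory")
    # Percent-encode the ID so that it stays a single path segment.
    return f"{endpoint.rstrip('/')}/v1/{quote(project_id, safe='')}/training"


# Hypothetical endpoint and project ID, for illustration only.
url = training_url("https://res.example.com/", "0123456789abcdef")
print(url)
```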
Request
Table 2 Request parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| workspace_id | No | String | Workspace ID. The default value is 0. |
| job_name | Yes | String | Training job name. The value can contain a maximum of 20 characters. Only digits, letters, underscores (_), and hyphens (-) are allowed. |
| job_description | No | String | Training job description. The value can contain a maximum of 256 characters. |
| offline_platform | Yes | List | Offline computing platform. For details, see Table 3. |
| data_source | Yes | List | Data source. For details, see Table 5. |
| storage | Yes | List | Storage information. For details, see Table 8. |
| algorithm_setting | Yes | JSON | Algorithm configuration. For details, see Table 10. |
| filter_rules | No | JSON | List of filter rules. For details, see Table 12. |
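The length and character limits on job_name and job_description can be checked on the client before the request is sent. The sketch below is one possible check, not the service's own validation. It assumes that "letters" means ASCII letters; the helper name job_field_errors is hypothetical.

```python
import re

# Assumption: "letters" means ASCII letters only.
JOB_NAME_PATTERN = re.compile(r"[0-9A-Za-z_-]{1,20}")


def job_field_errors(job_name: str, job_description: str = "") -> list:
    """Return a list of problems found in job_name and job_description."""
    errors = []
    if not JOB_NAME_PATTERN.fullmatch(job_name):
        errors.append("job_name: 1 to 20 digits, letters, underscores, or hyphens")
    if len(job_description) > 256:
        errors.append("job_description: at most 256 characters")
    return errors


print(job_field_errors("yyn-test", "yyn-test"))  # values from the example request
print(job_field_errors("name with spaces"))      # spaces are not allowed
```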
Table 3 offline_platform parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| platform | Yes | String | Platform name. The value can be DLI. |
| platform_parameter | Yes | JSON | Platform parameter. For details, see Table 4. |
| computing_resource | No | String | Resource specifications required for the normal running of the DLI jobs. |
| config_load_path | Yes | String | Path to access the configuration items |
Table 4 platform_parameter parameters (offline platform)
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| cluster_name | Yes | String | Cluster name. The value can contain a maximum of 64 characters. |
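A short sketch of how one offline_platform entry might be assembled from the parameters above. The helper name offline_platform_entry is hypothetical; the sample values are taken from the example request in this document.

```python
def offline_platform_entry(cluster_name, config_load_path, computing_resource=None):
    """Build one element of the offline_platform list."""
    if not 0 < len(cluster_name) <= 64:
        raise ValueError("cluster_name must contain 1 to 64 characters")
    entry = {
        "platform": "DLI",
        "platform_parameter": {"cluster_name": cluster_name},
        "config_load_path": config_load_path,
    }
    # computing_resource is optional, so it is sent only when provided.
    if computing_resource is not None:
        entry["computing_resource"] = computing_resource
    return entry


offline_platform = [
    offline_platform_entry("res_one", "<Path for loading configurations>")
]
print(offline_platform)
```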
Table 5 data_source parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| offline | Yes | List | Offline data source. For details, see Table 6. |
Table 6 offline parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| table_type_id | Yes | String | General data template, for example, USER_META, ITEM_META, or USER_BEHAVIOR (the values used in the example request). For details about the data format, see Offline Data Sources. |
| data_source_url | Yes | String | Data source path. The value can contain a maximum of 1000 characters. |
| data_format | Yes | String | Data format. The options are csv, parquet, json, or orc. |
| data_param | No | JSON | Data parameter. For details, see Table 7. This parameter is mandatory when the data format is csv and optional for other data formats. |
| start_time | No | String | Start time for collecting the general source data, for example, 2018-01-01. |
| end_time | No | String | End time for collecting the general source data, for example, 2018-02-01. |
Table 7 data_param parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| header | Yes | String | Whether to display the table header. true indicates that the table header is displayed; false indicates that the table header is not displayed. |
| delimiter | Yes | String | Delimiter. The value can contain a maximum of 10 characters. |
| quote | Yes | String | Quotation character. The value can contain a maximum of 10 characters. |
| escape | Yes | String | Escape character. The value can contain a maximum of 10 characters. |
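The rule that data_param is mandatory for csv and optional for the other formats can be expressed as a small builder. This is a sketch under the constraints listed in the tables above; the helper name offline_source is hypothetical, and the sample values mirror the example request.

```python
DATA_FORMATS = {"csv", "parquet", "json", "orc"}
DATA_PARAM_KEYS = {"header", "delimiter", "quote", "escape"}


def offline_source(table_type_id, data_source_url, data_format, data_param=None):
    """Build one element of the offline list inside data_source."""
    if data_format not in DATA_FORMATS:
        raise ValueError(f"unsupported data_format: {data_format}")
    if len(data_source_url) > 1000:
        raise ValueError("data_source_url must not exceed 1000 characters")
    if data_format == "csv" and data_param is None:
        raise ValueError("data_param is mandatory when data_format is csv")
    entry = {
        "table_type_id": table_type_id,
        "data_format": data_format,
        "data_source_url": data_source_url,
    }
    if data_param is not None:
        # All four keys are listed as mandatory within data_param.
        missing = DATA_PARAM_KEYS - set(data_param)
        if missing:
            raise ValueError(f"data_param is missing: {sorted(missing)}")
        entry["data_param"] = data_param
    return entry


# header is passed as the string "false", as in the example request.
csv_param = {"header": "false", "delimiter": ",", "quote": '"', "escape": "\\"}
user_meta = offline_source(
    "USER_META", "<OBS path for storing the data sources>", "csv", csv_param
)
print(user_meta)
```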
Table 8 storage parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| platform | Yes | String | Platform name. Currently, only CloudTable is supported. |
| platform_parameter | Yes | JSON | Storage platform parameter. For details, see Table 9. |
Table 9 platform_parameter parameters (storage)
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| cluster_id | Yes | String | Cluster ID |
| table_name | Yes | String | Table name. The value can contain a maximum of 64 characters. |
| cluster_name | No | String | Cluster name |
| data_version | No | String | Data version. The options are V1 and V2. |
| region_info | No | JSON | Pre-partition information. Set this parameter only when the data version is V2. No pre-partition information is needed when the data version is V1. For details, see Table 17. |
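The following sketch builds one storage entry and attaches region_info only for data version V2. The helper name storage_entry is hypothetical, and rejecting region_info for any other data version is a client-side choice made for this example, not documented service behavior.

```python
def storage_entry(cluster_id, table_name, cluster_name=None,
                  data_version=None, region_info=None):
    """Build one element of the storage list."""
    if not 0 < len(table_name) <= 64:
        raise ValueError("table_name must contain 1 to 64 characters")
    if data_version not in (None, "V1", "V2"):
        raise ValueError("data_version must be V1 or V2")
    # Assumption: treat region_info as an error unless the data version is V2.
    if region_info is not None and data_version != "V2":
        raise ValueError("region_info applies only to data version V2")
    params = {"cluster_id": cluster_id, "table_name": table_name}
    optional = {
        "cluster_name": cluster_name,
        "data_version": data_version,
        "region_info": region_info,
    }
    # Send optional parameters only when they were provided.
    params.update({k: v for k, v in optional.items() if v is not None})
    return {"platform": "CloudTable", "platform_parameter": params}


storage = [
    storage_entry(
        "cca518b4-a9fb-4dbf-80bb-d6838cbdcc87",
        "yyn-555",
        cluster_name="cloudtable-ccb1-sec",
        data_version="V2",
        region_info={"region_num": 8},
    )
]
print(storage)
```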
Table 10 algorithm_setting parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| start_time | No | Long | Start time of data training, expressed in the form of a timestamp in milliseconds. |
| end_time | No | Long | End time of data training, expressed in the form of a timestamp in milliseconds. |
| strategy | Yes | List | Strategy set. For details, see Table 11. |
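Because start_time and end_time are timestamps in milliseconds, a calendar date has to be converted before it is placed in algorithm_setting. The sketch below uses integer arithmetic to avoid floating-point rounding. The UTC+08:00 offset is an assumption made for illustration; with it, midnight on 2018-12-01 and on 2018-12-05 yields the two values used in the example request.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the epoch."""
    # Subtracting from an aware EPOCH raises TypeError for naive datetimes,
    # which prevents silently using the local time zone.
    return (moment - EPOCH) // timedelta(milliseconds=1)


# Assumed offset for illustration: UTC+08:00.
UTC_PLUS_8 = timezone(timedelta(hours=8))
start_time = to_epoch_millis(datetime(2018, 12, 1, tzinfo=UTC_PLUS_8))
end_time = to_epoch_millis(datetime(2018, 12, 5, tzinfo=UTC_PLUS_8))
print(start_time, end_time)
```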
Table 11 strategy parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| strategy_type | Yes | String | Strategy type, for example, recall or sorting (the values used in the example request). |
| name | Yes | String | Strategy alias. The value can contain a maximum of 60 characters. |
| algorithm_type | Yes | String | Algorithm type |
| parameter | Yes | JSON | Algorithm parameter (JSON format). NOTE: Because this API submits a combined job, the parameters vary according to the selected strategies. |
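A strategy entry combines the four parameters above. The sketch below assembles the ItemCF strategy from the example request and wraps it in algorithm_setting. The helper name strategy_entry is hypothetical, and the contents of parameter are copied from the example rather than derived from a schema.

```python
def strategy_entry(name, strategy_type, algorithm_type, parameter):
    """Build one element of the strategy list in algorithm_setting."""
    if not 0 < len(name) <= 60:
        raise ValueError("name must contain 1 to 60 characters")
    return {
        "name": name,
        "strategy_type": strategy_type,
        "algorithm_type": algorithm_type,
        "parameter": parameter,
    }


item_cf = strategy_entry(
    "ItemCF Recommendation",
    "recall",
    "ItemCF",
    {
        "data_source_config": {
            "retain_days": 30,
            "behavior_weights": [{"behavior_type": "view", "weight": 1}],
        },
        "algorithm_config": {"similar_metric": "cosine"},
        "candidate_set_config": {"max_recommended_num": 1000},
    },
)
algorithm_setting = {"strategy": [item_cf]}
print(algorithm_setting)
```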
Table 12 filter_rules parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| behavior_rules | No | List | Filter rule configuration for user behaviors. For details, see Table 13. |
| blacklist | No | String | Blacklisting rule configuration |
| whitelist | No | String | Whitelisting rule configuration |
| etl_uuid | No | String | UUID generated by extracting user and item features in Feature Engineering, used for configuring attribute filter rules. |
Table 13 behavior_rules parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| behavior_type | Yes | String | Behavior type, for example, collect (the value used in the example request). |
| interval | Yes | Integer | Elapsed time (days). The value ranges from 1 to 10,000. |
| frequency | Yes | Integer | Frequency. The value ranges from 1 to 10,000. |
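The ranges for interval and frequency can be enforced when a behavior rule is built. This is a minimal sketch; the helper name behavior_rule is hypothetical, and the sample rule matches the one in the example request.

```python
def behavior_rule(behavior_type, interval, frequency):
    """Build one element of behavior_rules, checking the documented ranges."""
    for label, value in (("interval", interval), ("frequency", frequency)):
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{label} must be an integer")
        if not 1 <= value <= 10_000:
            raise ValueError(f"{label} must be between 1 and 10,000")
    return {
        "behavior_type": behavior_type,
        "interval": interval,
        "frequency": frequency,
    }


filter_rules = {"behavior_rules": [behavior_rule("collect", 7, 5)]}
print(filter_rules)
```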
Response
Table 14 Response parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| is_success | Yes | Boolean | Whether the request is successful |
| strategies | Yes | List | Returned strategy result. For details, see Table 15. |
| job_id | Yes | String | Job ID |
| filter_uuid | Yes | String | UUID generated by using filter rules |
Table 15 strategies parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| strategy_type | Yes | String | Strategy type, for example, recall or sorting (the values used in the example response). |
| name | Yes | String | Strategy alias |
| algorithm_type | Yes | String | Algorithm type |
| parameter | Yes | JSON | Algorithm parameter. For details, see Parameters for Supported Strategies. |
| candidate_set | Yes | List | Set of candidates. For details, see Table 16. |
Table 16 candidate_set parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| uuid | Yes | String | Candidate set ID |
| description | Yes | String | Candidate set description |
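Each returned strategy carries its own candidate_set list, so the candidate set IDs have to be collected per strategy. The sketch below walks a response body that has already been parsed into a dictionary. The function name candidate_set_ids is hypothetical, and the sample body is a trimmed copy of the successful response in this document.

```python
def candidate_set_ids(response: dict) -> list:
    """Return (strategy name, candidate set uuid) pairs from a response body."""
    pairs = []
    for strategy in response.get("strategies", []):
        for candidate in strategy.get("candidate_set", []):
            pairs.append((strategy["name"], candidate["uuid"]))
    return pairs


# Trimmed copy of the successful response example.
response = {
    "is_success": True,
    "strategies": [
        {
            "name": "ItemCF Recommendation",
            "strategy_type": "recall",
            "algorithm_type": "ItemCF",
            "candidate_set": [
                {
                    "uuid": "958d09223b2e4175b2740f8f782cc5fc",
                    "description": "[ItemCF Recommendation] User-item list "
                    "candidate sets generated by the ItemCF algorithm",
                }
            ],
        }
    ],
    "filter_uuid": "857578fafa4746dd873722d661725154",
    "job_id": "f171a66489904462bad0a89d9b7483de",
}
print(candidate_set_ids(response))
```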
Table 17 region_info parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| region_num | Yes | Integer | Number of pre-partitions. Eight pre-partitions are recommended by default. |
| index_region_num | No | Integer | Number of pre-partitions in an index table. This parameter is required only for the Initial User Profile-Item Profile-Standard Wide Table Generation operator in a feature engineering project. For other offline operators, this parameter is not required because no index table is generated. |
Example
- Example request
{
  "job_name": "yyn-test",
  "job_description": "yyn-test",
  "data_source": [
    {
      "offline": [
        {
          "table_type_id": "USER_META",
          "data_format": "csv",
          "data_param": { "header": "false", "delimiter": ",", "quote": "\"", "escape": "\\" },
          "data_source_url": "<OBS path for storing the data sources>"
        },
        {
          "table_type_id": "ITEM_META",
          "data_format": "csv",
          "data_param": { "header": "false", "delimiter": ",", "quote": "\"", "escape": "\\" },
          "data_source_url": "<OBS path for storing the data sources>"
        },
        {
          "table_type_id": "USER_BEHAVIOR",
          "data_format": "csv",
          "data_param": { "header": "false", "delimiter": ",", "quote": "\"", "escape": "\\" },
          "data_source_url": "<OBS path for storing the data sources>"
        }
      ]
    }
  ],
  "offline_platform": [
    {
      "platform": "DLI",
      "platform_parameter": { "cluster_name": "res_one" },
      "config_load_path": "<Path for loading configurations>"
    }
  ],
  "storage": [
    {
      "platform": "CloudTable",
      "platform_parameter": {
        "cluster_id": "cca518b4-a9fb-4dbf-80bb-d6838cbdcc87",
        "cluster_name": "cloudtable-ccb1-sec",
        "table_name": "yyn-555"
      }
    }
  ],
  "algorithm_setting": {
    "strategy": [
      {
        "name": "Recommendation Based on Specific Behavior Popularity by default",
        "algorithm_type": "SpecificBehavior",
        "strategy_type": "recall",
        "parameter": {
          "data_source_config": {
            "retain_days": 30,
            "behavior_type": "collect",
            "start_time": 1543593600000,
            "end_time": 1543939200000
          },
          "algorithm_config": {},
          "candidate_set_config": { "is_recommended_by_category": false }
        }
      },
      {
        "name": "ItemCF Recommendation",
        "algorithm_type": "ItemCF",
        "strategy_type": "recall",
        "parameter": {
          "data_source_config": {
            "retain_days": 30,
            "behavior_weights": [ { "behavior_type": "view", "weight": 1 } ]
          },
          "algorithm_config": { "similar_metric": "cosine" },
          "candidate_set_config": { "max_recommended_num": 1000 }
        }
      },
      {
        "name": "Business Rule - Historical Behavior-based Recommendation",
        "algorithm_type": "HistoryBehaviorMemory",
        "strategy_type": "recall",
        "parameter": {
          "data_source_config": { "retain_days": 30 },
          "algorithm_config": {
            "history_behavior_memories": [ { "behavior_type": "view", "least_intension": 1 } ]
          },
          "candidate_set_config": {}
        }
      },
      {
        "name": "Field-aware factorization machine",
        "strategy_type": "sorting",
        "algorithm_type": "FFM",
        "parameter": {
          "algorithm_parameters": {
            "max_iterations": 50,
            "early_stop_iterations": 5,
            "fields_feature_size_path": "<OBS path for storing data>",
            "algorithm_specify_parameters": { "latent_vector_length": 10 },
            "regular_parameters": { "l2_regularization": 0, "regular_loss_compute_mode": "full" },
            "initial_parameters": { "initial_method": "normal", "mean_value": 0, "standard_deviation": 0.001 },
            "optimize_parameters": { "type": "grad", "learning_rate": 0.001 }
          },
          "algorithm_type": "FFM",
          "spec_id": 1,
          "run_path": "<Root path for storing training results>",
          "training_data_path": "<OBS path for storing the training data>",
          "test_data_path": "<OBS path for storing the test data>"
        }
      }
    ],
    "start_time": 1543593600000,
    "end_time": 1543939200000
  },
  "filter_rules": {
    "behavior_rules": [ { "behavior_type": "collect", "interval": 7, "frequency": 5 } ],
    "blacklist": "<Path for storing the blacklists>",
    "whitelist": "<Path for storing the whitelists>"
  }
}
- Example of a successful response
{
  "is_success": true,
  "strategies": [
    {
      "name": "Recommendation Based on Specific Behavior Popularity by default",
      "strategy_type": "recall",
      "algorithm_type": "SpecificBehavior",
      "parameter": {
        "data_source_config": {
          "retain_days": 30,
          "behavior_type": "collect",
          "start_time": 1543593600000,
          "end_time": 1543939200000
        },
        "algorithm_config": {},
        "candidate_set_config": { "is_recommended_by_category": false }
      },
      "candidate_set": [
        {
          "uuid": "bb45ef1d31a7488584724f58d468d9ae",
          "description": "[Recommendation Based on Specific Behavior Popularity by default] Candidate sets generated by the recommendation algorithms of specific behavior popularity"
        }
      ],
      "strategy_id": 0
    },
    {
      "name": "ItemCF Recommendation",
      "strategy_type": "recall",
      "algorithm_type": "ItemCF",
      "parameter": {
        "data_source_config": {
          "retain_days": 30,
          "behavior_weights": [ { "behavior_type": "view", "weight": 1 } ]
        },
        "algorithm_config": { "similar_metric": "cosine" },
        "candidate_set_config": { "max_recommended_num": 1000 }
      },
      "candidate_set": [
        {
          "uuid": "958d09223b2e4175b2740f8f782cc5fc",
          "description": "[ItemCF Recommendation] User-item list candidate sets generated by the ItemCF algorithm"
        }
      ],
      "strategy_id": 1
    },
    {
      "name": "Business Rule - Historical Behavior-based Recommendation",
      "strategy_type": "recall",
      "algorithm_type": "HistoryBehaviorMemory",
      "parameter": {
        "data_source_config": { "retain_days": 30 },
        "algorithm_config": {
          "history_behavior_memories": [ { "behavior_type": "view", "least_intension": 1 } ]
        },
        "candidate_set_config": {}
      },
      "candidate_set": [
        {
          "uuid": "1b5301f0c7804e28b66eb46c92249ed2",
          "description": "[Business Rule - Historical Behavior-based Recommendation] User-item list candidate sets generated by CustomRule"
        }
      ],
      "strategy_id": 2
    },
    {
      "name": "Field-aware factorization machine",
      "strategy_type": "sorting",
      "algorithm_type": "FFM",
      "parameter": {
        "algorithm_parameters": {
          "row_features_size": "6",
          "algorithm_specify_parameters": { "latent_vector_length": 10 },
          "initial_parameters": { "initial_method": "normal", "mean_value": -0.001, "standard_deviation": 0.001 },
          "optimize_parameters": { "type": "grad", "learning_rate": 0.1, "log_loss_reduce_method": "mean" },
          "regular_parameters": { "l2_loss_weight_lambda": 0.001 },
          "loss_mode": { "l2_loss_mode": "full" },
          "fields_feature_size_path": "<Data storage path>"
        },
        "algorithm_type": "FFM",
        "spec_id": 1,
        "name": "Field-aware factorization machine",
        "run_path": "<Root path for storing the training sets>",
        "training_data_path": "<OBS path for storing the training data>",
        "test_data_path": "<OBS path for storing the test data>"
      },
      "candidate_set": [
        {
          "uuid": "4aa9f06d24254fedbe462bfbfb879e63",
          "description": "Field-aware factorization machine"
        }
      ],
      "strategy_id": 0
    }
  ],
  "filter_uuid": "857578fafa4746dd873722d661725154",
  "job_id": "f171a66489904462bad0a89d9b7483de"
}
- Example of a failed response
{ "is_success": false, "error_code": "res.2013", "error_msg": "There dataSource is empty or less than five." }
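A caller typically branches on is_success before reading job_id. The sketch below shows one way to do that for an already parsed response body. The exception class and function name are hypothetical; the error_code and error_msg fields are those shown in the failed response example.

```python
class JobSubmissionError(Exception):
    """Raised when the response reports is_success as false."""

    def __init__(self, error_code, error_msg):
        super().__init__(f"{error_code}: {error_msg}")
        self.error_code = error_code
        self.error_msg = error_msg


def job_id_or_raise(body: dict) -> str:
    """Return job_id from a successful response, or raise with the error details."""
    if not body.get("is_success"):
        raise JobSubmissionError(body.get("error_code"), body.get("error_msg"))
    return body["job_id"]


# Body copied from the failed response example.
failed = {
    "is_success": False,
    "error_code": "res.2013",
    "error_msg": "There dataSource is empty or less than five.",
}
try:
    job_id_or_raise(failed)
except JobSubmissionError as err:
    print("submission failed:", err)
```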
Status Code
For details about status codes, see Status Codes.