Creating a Cluster and Submitting a Job
Function
This API is used to create an MRS cluster, submit a job, and, if delete_when_no_steps is set to true, terminate the cluster after the job is complete. This API is supported in MRS 1.8.9 or later. Before calling this API, prepare the following resources and information:
- Create or query a VPC and subnet.
- Create or query a key pair through ECS.
- Obtain the region information by referring to Endpoints.
- Obtain the MRS version and the components supported by the MRS version by referring to Obtaining the MRS Cluster Information.
Constraints
None
Debugging
You can debug this API through automatic authentication in API Explorer. API Explorer can automatically generate sample SDK code and lets you debug it.
URI
POST /v2/{project_id}/run-job-flow
Table 1 URI parameter
Parameter | Mandatory | Type | Description |
---|---|---|---|
project_id | Yes | String | The project ID. For details about how to obtain the project ID, see Obtaining a Project ID. |
Request Parameters
Table 2 Request body parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
is_dec_project | No | Boolean | Whether the cluster is dedicated to DeC. The default value is false. |
cluster_version | Yes | String | The cluster version. Example: MRS 3.1.0. |
cluster_name | Yes | String | The cluster name, which must be unique. The name must be 1 to 64 characters long and can contain only letters, numbers, hyphens (-), and underscores (_). |
cluster_type | Yes | String | The cluster type. The options are as follows: |
charge_info | No | ChargeInfo object | The billing type. For details, see Table 7. |
region | Yes | String | Information about the region where the cluster is located. For details, see Endpoints. |
vpc_name | Yes | String | The name of the VPC where the subnet is located. Obtain the VPC name by performing the following operations on the VPC management console: |
subnet_id | No | String | The subnet ID. Obtain the subnet ID by performing the following operations on the VPC management console: |
subnet_name | Yes | String | The subnet name. Obtain the subnet name by performing the following operations on the VPC management console: |
components | Yes | String | A comma-separated list of component names. For details about the supported components, see "Components Supported by MRS" in Obtaining the MRS Cluster Information. |
external_datasources | No | Array of ClusterDataConnectorMap objects | When deploying components such as Hive and Ranger, you can associate data connections and store metadata in the associated databases. For details about the parameters, see Table 3. |
availability_zone | Yes | String | The AZ name. Multi-AZ clusters are not supported. For details about AZs, see Endpoints. |
security_groups_id | No | String | The ID of the security group configured for the cluster. |
auto_create_default_security_group | No | Boolean | Whether to create the default security group for the MRS cluster. The default value is false. If this parameter is set to true, the default security group is created for the cluster regardless of whether security_groups_id is specified. |
safe_mode | Yes | String | The running mode of an MRS cluster. The options are as follows: |
manager_admin_password | Yes | String | The password of the MRS Manager administrator. The password must meet the following requirements: |
login_mode | Yes | String | The node login mode. |
node_root_password | No | String | The password of user root for logging in to a cluster node. The password must meet the following complexity requirements: |
node_keypair_name | No | String | The name of a key pair. You can use a key pair to log in to a cluster node. |
enterprise_project_id | No | String | The enterprise project ID. When you create a cluster, associate the enterprise project ID with the cluster. The default value is 0, indicating the default enterprise project. To obtain the enterprise project ID, see the id value in the enterprise_project field data structure table in "Querying the Enterprise Project List" in Enterprise Management API Reference. |
eip_address | No | String | The EIP bound to an MRS cluster, which can be used to access MRS Manager. The EIP must have been created and must be in the same region as the cluster. |
eip_id | No | String | The ID of the bound EIP. This parameter is mandatory when eip_address is configured. To obtain the EIP ID, log in to the VPC console, choose Network > Elastic IP and Bandwidth > Elastic IP, click the EIP to be bound, and obtain the ID in the Basic Information area. |
mrs_ecs_default_agency | No | String | The name of the agency bound to a cluster node by default. The value is fixed to MRS_ECS_DEFAULT_AGENCY. An agency allows ECS or BMS to manage MRS resources. You can configure an agency of the ECS type to automatically obtain the AK/SK to access OBS. The MRS_ECS_DEFAULT_AGENCY agency has the OBS OperateAccess permission and the CES FullAccess (for users who have enabled fine-grained policies), CES Administrator, and KMS Administrator permissions in the region where the cluster is located. |
template_id | No | String | The template used for node deployment when the cluster type is CUSTOM. |
tags | No | Array of Tag objects | The cluster tags. One cluster can have a maximum of 10 tags. The tag name (key) must be unique. For details about the parameters, see Table 4. |
log_collection | No | Integer | Whether to collect logs when cluster creation fails. The default value is 1, indicating that OBS buckets will be created and used only to collect logs that record MRS cluster creation failures. Enumerated values: |
node_groups | Yes | Array of NodeGroupV2 objects | Information about the node groups that form the cluster. For details about the parameters, see Table 5. |
bootstrap_scripts | No | Array of BootstrapScript objects | The bootstrap action scripts. For details about the parameters, see Table 13. |
log_uri | No | String | The OBS path to which cluster logs are dumped. After the log dump function is enabled, read and write permissions on the OBS path are required to upload logs. Configure the default agency MRS_ECS_DEFAULT_AGENCY or a custom agency with read and write permissions on the OBS path. For details, see Configuring a Storage-Compute Decoupled Cluster (Agency). This parameter is available only for cluster versions that support dumping cluster logs to OBS. |
component_configs | No | Array of ComponentConfig objects | The custom configuration of cluster components. This parameter applies only to cluster versions that support creating a cluster with custom component configurations. For details about this parameter, see Table 14. |
delete_when_no_steps | No | Boolean | Whether to automatically terminate the cluster after the job is complete. The default value is false. |
steps | Yes | Array of StepConfig objects | The job list. For details about this parameter, see Table 16. |
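Some of the request parameter rules above can be checked on the client before the request is sent. The following Python sketch is illustrative only: check_request_body is a hypothetical helper, not part of any MRS SDK, and it assumes that "letters" means ASCII letters.

```python
import re

# Parameters marked "Yes" in the Mandatory column of the request parameter table.
REQUIRED_FIELDS = (
    "cluster_version", "cluster_name", "cluster_type", "region", "vpc_name",
    "subnet_name", "components", "availability_zone", "safe_mode",
    "manager_admin_password", "login_mode", "node_groups", "steps",
)

# Assumption: "letters" means ASCII letters.
CLUSTER_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def check_request_body(body: dict) -> list:
    """Return a list of problems found in a run-job-flow request body."""
    problems = [f"missing mandatory parameter: {name}"
                for name in REQUIRED_FIELDS if name not in body]

    name = body.get("cluster_name")
    if name is not None and not CLUSTER_NAME_RE.fullmatch(str(name)):
        problems.append("cluster_name must be 1 to 64 letters, numbers, hyphens, or underscores")

    # eip_id is mandatory when eip_address is configured.
    if body.get("eip_address") and not body.get("eip_id"):
        problems.append("eip_id is mandatory when eip_address is configured")

    # A cluster can have at most 10 tags, and tag keys must be unique.
    tags = body.get("tags", [])
    keys = [tag.get("key") for tag in tags]
    if len(tags) > 10:
        problems.append("a cluster can have at most 10 tags")
    if len(keys) != len(set(keys)):
        problems.append("tag keys must be unique")

    return problems
```

An empty list means that none of these particular checks failed; it does not guarantee that the service will accept the request.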
Table 3 ClusterDataConnectorMap
Parameter | Mandatory | Type | Description |
---|---|---|---|
map_id | No | Integer | The data connection association ID |
connector_id | No | String | The data connection ID |
component_name | No | String | The component name |
role_type | No | String | The component role type. The options are as follows: |
source_type | No | String | The data connection type. The options are as follows: |
cluster_id | No | String | The ID of the associated cluster |
status | No | Integer | The data connection status. The options are as follows: |
Table 4 Tag
Parameter | Mandatory | Type | Description |
---|---|---|---|
key | Yes | String | The tag key |
value | Yes | String | The tag value |
Table 5 NodeGroupV2
Parameter | Mandatory | Type | Description |
---|---|---|---|
group_name | Yes | String | The node group name. The value can contain a maximum of 64 characters, including letters, numbers, and underscores (_). The rules for configuring node groups are as follows: |
node_num | Yes | Integer | The number of nodes. The value ranges from 0 to 500. The maximum number of core and task nodes is 500. |
node_size | Yes | String | The instance specifications of a node, for example, c3.4xlarge.2.linux.bigdata. For details about instance specifications, see ECS Specifications Used by MRS and BMS Specifications Used by MRS. You are advised to obtain the specifications supported by the corresponding version in the corresponding region from the cluster creation page on the MRS console. |
root_volume | No | Volume object | The system disk information of the node. This parameter is optional for some VMs or the system disk of the BMS and mandatory in other cases. For details about this parameter, see Table 6. |
data_volume | No | Volume object | The data disk information. This parameter is mandatory when data_volume_count is not 0. For details about this parameter, see Table 6. |
data_volume_count | No | Integer | The number of data disks on a node. The value ranges from 0 to 20. |
charge_info | No | ChargeInfo object | The billing type of a node group. The billing types of master and core node groups are the same as those of the cluster. The billing type of the task node group can be different. For details about this parameter, see Table 7. |
auto_scaling_policy | No | AutoScalingPolicy object | The auto scaling rule information. For details about this parameter, see Table 8. |
assigned_roles | No | Array of strings | This parameter is mandatory when the cluster type is CUSTOM. You can specify the roles deployed in a node group. This parameter is a string array. Each string represents a role expression. Role expression definition: |
Table 6 Volume
Parameter | Mandatory | Type | Description |
---|---|---|---|
type | Yes | String | The disk type. The options are as follows: |
size | Yes | Integer | The data disk size, in GB. The value ranges from 10 to 32768. |
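The numeric limits for a node group and its data disk can be combined into one client-side check. This is a minimal sketch under stated assumptions: check_node_group is a hypothetical helper, and it applies the 10 to 32768 GB range only to data_volume, because the source states that range for the data disk size.

```python
def check_node_group(group: dict) -> list:
    """Return a list of problems found in one node_groups entry."""
    problems = []

    node_num = group.get("node_num")
    if not isinstance(node_num, int) or not 0 <= node_num <= 500:
        problems.append("node_num must be an integer from 0 to 500")

    count = group.get("data_volume_count")
    if count is not None and not 0 <= count <= 20:
        problems.append("data_volume_count must be from 0 to 20")

    # data_volume is mandatory when data_volume_count is not 0.
    data_volume = group.get("data_volume")
    if count and data_volume is None:
        problems.append("data_volume is mandatory when data_volume_count is not 0")

    if data_volume is not None and not 10 <= data_volume.get("size", 0) <= 32768:
        problems.append("data_volume.size must be from 10 to 32768 GB")

    return problems
```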
Table 7 ChargeInfo
Parameter | Mandatory | Type | Description |
---|---|---|---|
charge_mode | Yes | String | The billing mode. The options are as follows: |
period_type | No | String | Subscription period. The value can be: |
period_num | No | Integer | Number of periods. This parameter is valid and mandatory only when charge_mode is set to prePaid. |
is_auto_pay | No | Boolean | Whether the order will be automatically paid. This parameter is available for yearly/monthly mode. By default, the automatic payment is disabled. Available values are as follows: |
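The dependency between charge_mode and period_num can be expressed as a small check. The sketch below is illustrative: check_charge_info is a hypothetical helper, and the only mode values it relies on are prePaid (named in the table) and postPaid (used in the example request on this page).

```python
def check_charge_info(charge_info: dict) -> list:
    """Return a list of problems found in a ChargeInfo object."""
    problems = []
    mode = charge_info.get("charge_mode")
    if mode is None:
        problems.append("charge_mode is mandatory")

    # period_num is valid and mandatory only when charge_mode is prePaid.
    has_period = "period_num" in charge_info
    if mode == "prePaid" and not has_period:
        problems.append("period_num is mandatory when charge_mode is prePaid")
    if mode != "prePaid" and has_period:
        problems.append("period_num is valid only when charge_mode is prePaid")

    return problems
```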
Table 8 AutoScalingPolicy
Parameter | Mandatory | Type | Description |
---|---|---|---|
auto_scaling_enable | Yes | Boolean | Whether to enable the auto scaling policy. |
min_capacity | Yes | Integer | The minimum number of nodes reserved in the node group. Value range: [0, 500] |
max_capacity | Yes | Integer | The maximum number of nodes in the node group. Value range: [0, 500] |
resources_plans | No | Array of ResourcesPlan objects | The resource plan list. If this parameter is left blank, the resource plan is disabled. When auto_scaling_enable is set to true, either this parameter or rules must be configured. For details about this parameter, see Table 9. |
rules | No | Array of Rule objects | The list of auto scaling rules. When auto_scaling_enable is set to true, either this parameter or resources_plans must be configured. For details about this parameter, see Table 10. |
exec_scripts | No | Array of ScaleScript objects | The list of custom scaling automation scripts. If this parameter is left blank, the automation script is disabled. For details about this parameter, see Table 12. |
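The rule that an enabled policy needs either resources_plans or rules is easy to miss. The following sketch shows one way to check it together with the capacity ranges; check_auto_scaling_policy is a hypothetical helper name.

```python
def check_auto_scaling_policy(policy: dict) -> list:
    """Return a list of problems found in an AutoScalingPolicy object."""
    problems = []

    for field in ("min_capacity", "max_capacity"):
        value = policy.get(field)
        if not isinstance(value, int) or not 0 <= value <= 500:
            problems.append(f"{field} must be an integer in [0, 500]")

    # When auto scaling is enabled, resources_plans or rules must be configured.
    has_plans_or_rules = bool(policy.get("resources_plans") or policy.get("rules"))
    if policy.get("auto_scaling_enable") and not has_plans_or_rules:
        problems.append("configure resources_plans or rules when auto_scaling_enable is true")

    return problems
```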
Table 9 ResourcesPlan
Parameter | Mandatory | Type | Description |
---|---|---|---|
period_type | Yes | String | The cycle type of a resource plan. Currently, only the following cycle type is supported: daily |
start_time | Yes | String | The start time of a resource plan. The value is in the format of hour:minute and ranges from 00:00 to 23:59. |
end_time | Yes | String | The end time of a resource plan. The value is in the same format as that of start_time. The interval between end_time and start_time must be greater than or equal to 30 minutes. |
min_capacity | Yes | Integer | The minimum number of reserved nodes in a node group in a resource plan. Value range: [0, 500] |
max_capacity | Yes | Integer | The maximum number of reserved nodes in a node group in a resource plan. Value range: [0, 500] |
effective_days | No | Array of strings | The days on which a resource plan takes effect. If this parameter is left blank, the resource plan takes effect every day. The options are as follows: MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, and SUNDAY |
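The time constraints of a resource plan can be verified by converting both times to minutes. This is a sketch, assuming that end_time falls later on the same day as start_time (the source does not say whether a plan may cross midnight); to_minutes and check_resources_plan are hypothetical helper names.

```python
from datetime import datetime

VALID_DAYS = {"MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
              "FRIDAY", "SATURDAY", "SUNDAY"}


def to_minutes(hhmm: str) -> int:
    """Convert an hour:minute string (00:00 to 23:59) to minutes since midnight."""
    parsed = datetime.strptime(hhmm, "%H:%M")  # raises ValueError on bad input
    return parsed.hour * 60 + parsed.minute


def check_resources_plan(plan: dict) -> list:
    """Return a list of problems found in a ResourcesPlan object."""
    problems = []
    if plan.get("period_type") != "daily":
        problems.append("period_type must be daily")

    # Assumption: end_time is later than start_time on the same day.
    gap = to_minutes(plan["end_time"]) - to_minutes(plan["start_time"])
    if gap < 30:
        problems.append("end_time must be at least 30 minutes after start_time")

    unknown = set(plan.get("effective_days", [])) - VALID_DAYS
    if unknown:
        problems.append(f"unknown effective_days: {sorted(unknown)}")

    return problems
```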
Table 10 Rule
Parameter | Mandatory | Type | Description |
---|---|---|---|
name | Yes | String | The name of an auto scaling rule. The name must be 1 to 64 characters long and can contain only letters, numbers, hyphens (-), and underscores (_). Rule names must be unique in a node group. |
description | No | String | The description of an auto scaling rule. It contains a maximum of 1,024 characters. |
adjustment_type | Yes | String | The adjustment type of an auto scaling rule. The options are as follows: |
cool_down_minutes | Yes | Integer | The cluster cooldown time, in minutes, after an auto scaling rule is triggered. No auto scaling operation is performed during this period. The value ranges from 0 to 10080 (10,080 minutes is one week). |
scaling_adjustment | Yes | Integer | The number of cluster nodes that can be adjusted at a time. Value range: [1, 100] |
trigger | Yes | Trigger object | The condition for triggering a rule. For details about this parameter, see Table 11. |
Table 11 Trigger
Parameter | Mandatory | Type | Description |
---|---|---|---|
metric_name | Yes | String | The metric name. This triggering condition makes a judgment according to the value of the metric. A metric name contains a maximum of 64 characters. |
metric_value | Yes | String | The metric threshold to trigger a rule. The value must be an integer or a number with two decimal places. |
comparison_operator | No | String | The metric judgment logic operator. The options are as follows: |
evaluation_periods | Yes | Integer | The number of consecutive five-minute periods, during which a metric threshold is reached. The value ranges from 1 to 288. |
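The numeric ranges for an auto scaling rule and its trigger can be checked as sketched below. The helper names are hypothetical, and the pattern for metric_value is an assumption: it accepts a non-negative integer or a number with exactly two decimal places, which is one reading of the description above.

```python
import re

# Assumption: a non-negative integer, or a number with exactly two decimal places.
METRIC_VALUE_RE = re.compile(r"\d+(\.\d{2})?")


def evaluation_window_minutes(evaluation_periods: int) -> int:
    """Each evaluation period lasts five minutes."""
    return evaluation_periods * 5


def check_rule(rule: dict) -> list:
    """Return a list of problems found in a Rule object and its trigger."""
    problems = []
    if not 0 <= rule["cool_down_minutes"] <= 10080:
        problems.append("cool_down_minutes must be from 0 to 10080")
    if not 1 <= rule["scaling_adjustment"] <= 100:
        problems.append("scaling_adjustment must be in [1, 100]")

    trigger = rule["trigger"]
    if len(trigger["metric_name"]) > 64:
        problems.append("metric_name can contain at most 64 characters")
    if not METRIC_VALUE_RE.fullmatch(trigger["metric_value"]):
        problems.append("metric_value must be an integer or have two decimal places")
    if not 1 <= trigger["evaluation_periods"] <= 288:
        problems.append("evaluation_periods must be from 1 to 288")

    return problems
```

The upper bound of 288 periods corresponds to 1,440 minutes, that is, 24 hours of consecutive five-minute periods.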
Table 12 ScaleScript
Parameter | Mandatory | Type | Description |
---|---|---|---|
name | Yes | String | The name of a custom automation script. Names must be unique in a cluster. The value must be 1 to 64 characters long, can contain only numbers, letters, spaces, hyphens (-), and underscores (_), and cannot start with a space. |
uri | Yes | String | The path of a custom automation script. Set this parameter to an OBS bucket path or a local VM path. |
parameters | No | String | Parameters of a custom automation script. Multiple parameters are separated by spaces. The following predefined system parameters can be passed: |
nodes | Yes | Array of strings | The name of the node group where the custom automation script is executed. |
active_master | No | Boolean | Whether the custom automation script runs only on the active master node. The default value is false, indicating that the custom automation script can run on all master nodes. |
fail_action | Yes | String | Whether to continue executing subsequent scripts and creating a cluster after the custom automation script fails to be executed. Notes: |
action_stage | Yes | String | The time when a script is executed. Enumerated values: |
Table 13 BootstrapScript
Parameter | Mandatory | Type | Description |
---|---|---|---|
name | Yes | String | The name of a bootstrap action script, which must be unique in a cluster. The value must be 1 to 64 characters long, can contain only numbers, letters, spaces, hyphens (-), and underscores (_), and cannot start with a space. |
uri | Yes | String | The path of a bootstrap action script. Set this parameter to an OBS bucket path or a local VM path. OBS bucket path: Enter a script path, for example, the path of the public sample script provided by MRS: obs://bootstrap/presto/presto-install.sh. If dualroles is installed, the parameter of the presto-install.sh script is dualroles. If worker is installed, the parameter of the presto-install.sh script is worker. Based on typical Presto usage, you are advised to install dualroles on the active master nodes and worker on the core nodes. Local VM path: Enter a script path. The script path must start with a slash (/) and end with .sh. |
parameters | No | String | The bootstrap action script parameters. |
nodes | Yes | Array of strings | The name of the node group where the bootstrap action script is executed. |
active_master | No | Boolean | Whether the bootstrap action script runs only on active master nodes. The default value is false, indicating that the bootstrap action script can run on all master nodes. |
fail_action | Yes | String | Whether to continue executing subsequent scripts and creating a cluster after the bootstrap action script fails to execute. The default value is errorout, indicating that the action is stopped. Note: You are advised to set this parameter to continue in the commissioning phase so that the cluster can continue to be installed and started regardless of whether the bootstrap action is successful. Enumerated values: |
before_component_start | No | Boolean | The time when the bootstrap action script is executed. Currently, the following two options are available: Before component start and After component start. The default value is false, indicating that the bootstrap action script is executed after the component is started. |
start_time | No | Long | The execution time of one bootstrap action script. |
state | No | String | The running status of one bootstrap action script. The options are as follows: |
action_stages | No | Array of strings | Select the time when the bootstrap action script is executed. |
Table 14 ComponentConfig
Parameter | Mandatory | Type | Description |
---|---|---|---|
component_name | Yes | String | The component name |
configs | No | Array of Config objects | The component configuration item list. For details about this parameter, see Table 15. |
Table 15 Config
Parameter | Mandatory | Type | Description |
---|---|---|---|
key | Yes | String | The configuration name. Only the configuration names displayed on the MRS component configuration page are supported. |
value | Yes | String | The configuration value |
config_file_name | Yes | String | The configuration file name. Only the file names displayed on the MRS component configuration page are supported. |
Table 16 StepConfig
Parameter | Mandatory | Type | Description |
---|---|---|---|
job_execution | Yes | JobExecution object | The job parameter. For details about this parameter, see Table 17. |
Table 17 JobExecution
Parameter | Mandatory | Type | Description |
---|---|---|---|
job_type | Yes | String | The job type. The options are as follows: |
job_name | Yes | String | The job name. The value can contain 1 to 64 characters. Only letters, numbers, hyphens (-), and underscores (_) are allowed. Identical job names are allowed but not recommended. |
arguments | No | Array of strings | The key parameter for program execution. The parameter is specified by the function of the user's program. MRS is only responsible for loading the parameter. The value can contain a maximum of 150,000 characters. Special characters (;|&>'<$!"\) are not allowed. This parameter can be left blank. Notes: |
properties | No | Map<String,String> | The program system parameter. The value can contain a maximum of 2,048 characters. Special characters (;|&>'<$!\\) are not allowed. This parameter can be left blank. |
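Job arguments that contain a forbidden character are a common cause of rejected requests, so it can help to scan them before submission. The following sketch makes two assumptions that the source does not confirm: the 150,000-character limit applies to the combined length of all arguments, and "letters" in job_name means ASCII letters. check_job_execution is a hypothetical helper name.

```python
import re

# Characters listed as not allowed in arguments: ; | & > ' < $ ! " \
FORBIDDEN_ARGUMENT_CHARS = set(";|&>'<$!\"\\")
JOB_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")  # assumption: ASCII letters
MAX_ARGUMENTS_LENGTH = 150_000  # assumption: combined length of all arguments


def check_job_execution(job: dict) -> list:
    """Return a list of problems found in a JobExecution object."""
    problems = [f"missing mandatory parameter: {field}"
                for field in ("job_type", "job_name") if field not in job]

    name = job.get("job_name")
    if name is not None and not JOB_NAME_RE.fullmatch(name):
        problems.append("job_name must be 1 to 64 letters, numbers, hyphens, or underscores")

    arguments = job.get("arguments", [])
    if sum(len(argument) for argument in arguments) > MAX_ARGUMENTS_LENGTH:
        problems.append("arguments exceed 150,000 characters")
    for argument in arguments:
        bad = sorted(FORBIDDEN_ARGUMENT_CHARS & set(argument))
        if bad:
            problems.append(f"argument {argument!r} contains forbidden characters: {''.join(bad)}")

    return problems
```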
Response Parameters
Status code: 200
Table 18 Response body parameter
Parameter | Type | Description |
---|---|---|
cluster_id | String | The cluster ID, which is returned by the system after the cluster is created. |
Example Request
Create a custom MRS 3.2.0-LTS.1 cluster in which the management and control roles are deployed on the same nodes, and then submit a DistCp job followed by a HiveScript job.
POST /v2/{project_id}/run-job-flow
{
"cluster_version" : "MRS 3.2.0-LTS.1",
"cluster_name" : "mrs_heshe_dm",
"cluster_type" : "CUSTOM",
"charge_info" : {
"charge_mode" : "postPaid"
},
"region" : "",
"availability_zone" : "",
"vpc_name" : "vpc-37cd",
"subnet_id" : "1f8c5ca6-1f66-4096-bb00-baf175954f6e",
"subnet_name" : "subnet",
"components" : "Hadoop,Spark2x,HBase,Hive,Hue,Loader,Kafka,Storm,Flume,Flink,Oozie,Ranger,Tez",
"safe_mode" : "KERBEROS",
"manager_admin_password" : "your password",
"login_mode" : "PASSWORD",
"node_root_password" : "your password",
"mrs_ecs_default_agency" : "MRS_ECS_DEFAULT_AGENCY",
"template_id" : "mgmt_control_combined_v2",
"log_collection" : 1,
"tags" : [ {
"key" : "tag1",
"value" : "111"
}, {
"key" : "tag2",
"value" : "222"
} ],
"node_groups" : [ {
"group_name" : "master_node_default_group",
"node_num" : 3,
"node_size" : "Sit3.4xlarge.4.linux.bigdata",
"root_volume" : {
"type" : "SAS",
"size" : 480
},
"data_volume" : {
"type" : "SAS",
"size" : 600
},
"data_volume_count" : 1,
"assigned_roles" : [ "OMSServer:1,2", "SlapdServer:1,2", "KerberosServer:1,2", "KerberosAdmin:1,2", "quorumpeer:1,2,3", "NameNode:2,3", "Zkfc:2,3", "JournalNode:1,2,3", "ResourceManager:2,3", "JobHistoryServer:2,3", "DBServer:1,3", "Hue:1,3", "LoaderServer:1,3", "MetaStore:1,2,3", "WebHCat:1,2,3", "HiveServer:1,2,3", "HMaster:2,3", "MonitorServer:1,2", "Nimbus:1,2", "UI:1,2", "JDBCServer2x:1,2,3", "JobHistory2x:2,3", "SparkResource2x:1,2,3", "oozie:2,3", "LoadBalancer:2,3", "TezUI:1,3", "TimelineServer:3", "RangerAdmin:1,2", "UserSync:2", "TagSync:2", "KerberosClient", "SlapdClient", "meta", "HSConsole:2,3", "FlinkResource:1,2,3", "DataNode:1,2,3", "NodeManager:1,2,3", "IndexServer2x:1,2", "ThriftServer:1,2,3", "RegionServer:1,2,3", "ThriftServer1:1,2,3", "RESTServer:1,2,3", "Broker:1,2,3", "Supervisor:1,2,3", "Logviewer:1,2,3", "Flume:1,2,3", "HSBroker:1,2,3" ]
}, {
"group_name" : "node_group_1",
"node_num" : 3,
"node_size" : "Sit3.4xlarge.4.linux.bigdata",
"root_volume" : {
"type" : "SAS",
"size" : 480
},
"data_volume" : {
"type" : "SAS",
"size" : 600
},
"data_volume_count" : 1,
"assigned_roles" : [ "DataNode", "NodeManager", "RegionServer", "Flume:1", "Broker", "Supervisor", "Logviewer", "HBaseIndexer", "KerberosClient", "SlapdClient", "meta", "HSBroker:1,2", "ThriftServer", "ThriftServer1", "RESTServer", "FlinkResource" ]
}, {
"group_name" : "node_group_2",
"node_num" : 1,
"node_size" : "Sit3.4xlarge.4.linux.bigdata",
"root_volume" : {
"type" : "SAS",
"size" : 480
},
"data_volume" : {
"type" : "SAS",
"size" : 600
},
"data_volume_count" : 1,
"assigned_roles" : [ "NodeManager", "KerberosClient", "SlapdClient", "meta", "FlinkResource" ]
} ],
"log_uri" : "obs://bucketTest/logs",
"delete_when_no_steps" : true,
"steps" : [ {
"job_execution" : {
"job_name" : "import_file",
"job_type" : "DistCp",
"arguments" : [ "obs://test/test.sql", "/user/hive/input" ]
}
}, {
"job_execution" : {
"job_name" : "hive_test",
"job_type" : "HiveScript",
"arguments" : [ "obs://test/hive/sql/HiveScript.sql" ]
}
} ]
}
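A request like the one above could be sent from Python with the standard library, as in the following sketch. The endpoint value and the use of an X-Auth-Token header are assumptions (this page does not describe authentication), and build_run_job_flow_request is a hypothetical helper, not an MRS SDK function.

```python
import json
import urllib.request


def build_run_job_flow_request(endpoint: str, project_id: str,
                               token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v2/{project_id}/run-job-flow request."""
    url = f"{endpoint.rstrip('/')}/v2/{project_id}/run-job-flow"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # assumption: token-based authentication
        },
        method="POST",
    )


# Sending the request (not executed here, because it needs real credentials):
# request = build_run_job_flow_request(endpoint, project_id, token, body)
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["cluster_id"])
```

Keeping request construction separate from sending makes it possible to inspect the URL and body before any cluster is created.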
Example Response
Status code: 200
Example successful response
{ "cluster_id" : "da1592c2-bb7e-468d-9ac9-83246e95447a" }
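The cluster ID can be read from the response body as shown in this minimal sketch; extract_cluster_id is a hypothetical helper name.

```python
import json


def extract_cluster_id(response_text: str) -> str:
    """Return cluster_id from a successful run-job-flow response body."""
    payload = json.loads(response_text)
    cluster_id = payload.get("cluster_id")
    if not cluster_id:
        raise ValueError("response does not contain cluster_id")
    return cluster_id
```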
Status Codes
For details, see Status Codes.
Error Codes
See Error Codes.