Creating a Batch Processing Job
Function
This API is used to create a batch processing job in a queue.
URI
- URI format
- Parameter description
Table 1 URI parameter

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Project ID, which is used for resource isolation. For details about how to obtain its value, see Obtaining a Project ID. |
Request Parameters
Table 2 Request parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| file | Yes | String | Name of the package that is of the JAR or pyFile type and has been uploaded to the DLI resource management system. You can also specify an OBS path, for example, obs://Bucket name/Package name. |
| className | Yes | String | Java/Spark main class of the batch processing job. |
| queue | No | String | Queue name. Set this parameter to the name of the created DLI queue. The queue must be of the general-purpose type. |
| cluster_name | No | String | Queue name. Set this parameter to the name of the created DLI queue. NOTE: You are advised to use the queue parameter instead. The queue and cluster_name parameters cannot coexist. |
| args | No | Array of Strings | Input parameters of the main class, that is, application parameters. |
| sc_type | No | String | Compute resource type. Currently, resource types A, B, and C are available. If this parameter is not specified, the minimum configuration (type A) is used. For details about resource types, see Table 3. |
| jars | No | Array of Strings | Name of the package that is of the JAR type and has been uploaded to the DLI resource management system. You can also specify an OBS path, for example, obs://Bucket name/Package name. |
| pyFiles | No | Array of Strings | Name of the package that is of the PyFile type and has been uploaded to the DLI resource management system. You can also specify an OBS path, for example, obs://Bucket name/Package name. |
| files | No | Array of Strings | Name of the package that is of the file type and has been uploaded to the DLI resource management system. You can also specify an OBS path, for example, obs://Bucket name/Package name. |
| modules | No | Array of Strings | Name of the dependent system resource module. You can view the module name using the API for Querying Resource Packages in a Group (Discarded). DLI provides dependencies for executing datasource jobs; each datasource service has a corresponding dependency module. |
| resources | No | Array of objects | JSON object list, including the name and type of each package that has been uploaded to the queue. For details, see Table 4. |
| groups | No | Array of objects | JSON object list, including the package group resources. For details about the format, see the request example. If the type of a name listed in resources cannot be verified, the package with that name must exist in one of these groups. For details, see Table 5. |
| conf | No | Object | Batch configuration item. For details, see Spark Configuration. |
| name | No | String | Batch processing task name. The value contains a maximum of 128 characters. |
| driverMemory | No | String | Driver memory of the Spark application, for example, 2 GB or 2048 MB. This configuration item replaces the default value selected by sc_type. The unit must be provided; otherwise, the startup fails. |
| driverCores | No | Integer | Number of CPU cores of the Spark application driver. This configuration item replaces the default value selected by sc_type. |
| executorMemory | No | String | Executor memory of the Spark application, for example, 2 GB or 2048 MB. This configuration item replaces the default value selected by sc_type. The unit must be provided; otherwise, the startup fails. |
| executorCores | No | Integer | Number of CPU cores of each Executor in the Spark application. This configuration item replaces the default value selected by sc_type. |
| numExecutors | No | Integer | Number of Executors in a Spark application. This configuration item replaces the default value selected by sc_type. |
| obs_bucket | No | String | OBS bucket for storing Spark jobs. Set this parameter when you need to save jobs. |
| auto_recovery | No | Boolean | Whether to enable the retry function. If enabled, Spark jobs are automatically retried after an exception occurs. The default value is false. |
| max_retry_times | No | Integer | Maximum number of retries. The maximum value is 100, and the default value is 20. |
| feature | No | String | Job feature, that is, the type of the Spark image used by the job. |
| spark_version | No | String | Version of the Spark component. |
| image | No | String | Custom image, in the format Organization name/Image name:Image version. This parameter is valid only when feature is set to custom. Use it together with the feature parameter to run the job on a user-defined Spark image. For details about how to use custom images, see the Data Lake Insight User Guide. |
| catalog_name | No | String | To access metadata, set this parameter to dli. |
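Per Table 2, only file and className are mandatory. A minimal sketch of a request body, written as a Python dict standing in for the JSON payload (the values reuse the example package from this page):

```python
# Minimal body: only "file" and "className" are mandatory per Table 2.
# All other parameters fall back to their documented defaults.
minimal_payload = {
    "file": "batchTest/spark-examples_2.11-2.1.0.luxor.jar",
    "className": "org.apache.spark.examples.SparkPi",
}
```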
Table 3 Resource types

| Resource Type | Physical Resource | driverCores | executorCores | driverMemory | executorMemory | numExecutors |
|---|---|---|---|---|---|---|
| A | 8 vCPUs, 32-GB memory | 2 | 1 | 7 GB | 4 GB | 6 |
| B | 16 vCPUs, 64-GB memory | 2 | 2 | 7 GB | 8 GB | 7 |
| C | 32 vCPUs, 128-GB memory | 4 | 2 | 15 GB | 8 GB | 14 |
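The driverCores, executorCores, driverMemory, executorMemory, and numExecutors parameters each replace the corresponding default selected by sc_type. As a sketch, a body that starts from type B but overrides the executor memory might look as follows (the queue name is a placeholder, not from this page):

```python
# Sketch of a request body starting from resource type B with one override.
# "testQueue" is a placeholder queue name, not from this page.
payload = {
    "file": "batchTest/spark-examples_2.11-2.1.0.luxor.jar",
    "className": "org.apache.spark.examples.SparkPi",
    "queue": "testQueue",
    "sc_type": "B",            # type B defaults per Table 3: 2/2 cores, 7 GB/8 GB memory, 7 executors
    "executorMemory": "16GB",  # replaces type B's 8 GB default; the unit is mandatory
}
```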
Table 4 resources parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| name | No | String | Resource name. You can also specify an OBS path, for example, obs://Bucket name/Package name. |
| type | No | String | Resource type. |
Table 5 groups parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| name | No | String | User group name. |
| resources | No | Array of objects | User group resources. For details, see Table 4. |
Response Parameters
Table 6 Response parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| id | No | String | ID of a batch processing job. |
| appId | No | String | Back-end application ID of a batch processing job. |
| name | No | String | Batch processing task name. The value contains a maximum of 128 characters. |
| owner | No | String | Owner of a batch processing job. |
| proxyUser | No | String | Proxy user (resource tenant) to which a batch processing job belongs. |
| state | No | String | Status of a batch processing job. For details, see Table 7. |
| kind | No | String | Type of a batch processing job. Only Spark parameters are supported. |
| log | No | Array of Strings | Last 10 log records of the current batch processing job. |
| sc_type | No | String | Type of computing resource. If the computing resource type is customized, the value CUSTOMIZED is returned. |
| cluster_name | No | String | Queue where a batch processing job is located. |
| queue | Yes | String | Queue name. Set this parameter to the name of the created DLI queue. |
| image | No | String | Custom image, in the format Organization name/Image name:Image version. This parameter is valid only when feature is set to custom. Use it together with the feature parameter to run the job on a user-defined Spark image. For details about how to use custom images, see the Data Lake Insight User Guide. |
| create_time | No | Long | Time when a batch processing job is created. The timestamp is expressed in milliseconds. |
| update_time | No | Long | Time when a batch processing job is updated. The timestamp is expressed in milliseconds. |
| duration | No | Long | Job running duration, in milliseconds. |
Table 7 Batch processing job states

| State | Type | Description |
|---|---|---|
| starting | String | The batch processing job is being started. |
| running | String | The batch processing job is executing a task. |
| dead | String | The batch processing job has exited. |
| success | String | The batch processing job is successfully executed. |
| recovering | String | The batch processing job is being restored. |
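Reading the table above, starting, running, and recovering describe a job still in progress, while success and dead describe a job that has finished. A small illustrative helper built on that interpretation (the function name and the terminal/transient split are this sketch's own, not part of the API):

```python
# Illustrative helper based on Table 7; "dead" and "success" are the
# states in which the job has exited or completed.
TERMINAL_STATES = {"success", "dead"}

def is_finished(state: str) -> bool:
    """Return True once a batch processing job has reached a terminal state."""
    return state in TERMINAL_STATES

print(is_finished("starting"))  # False
print(is_finished("dead"))      # True
```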
Example Request
Create a Spark job. Set the Spark main class of the job to org.apache.spark.examples.SparkPi, specify the program package as batchTest/spark-examples_2.11-2.1.0.luxor.jar, and load a program package of the jar type and a resource package of the files type.
{ "file": "batchTest/spark-examples_2.11-2.1.0.luxor.jar", "className": "org.apache.spark.examples.SparkPi", "sc_type": "A", "jars": ["demo-1.0.0.jar"], "files": ["count.txt"], "resources":[ {"name": "groupTest/testJar.jar", "type": "jar"}, {"name": "kafka-clients-0.10.0.0.jar", "type": "jar"}], "groups": [ {"name": "groupTestJar", "resources": [{"name": "testJar.jar", "type": "jar"}, {"name": "testJar1.jar", "type": "jar"}]}, {"name": "batchTest", "resources": [{"name": "luxor.jar", "type": "jar"}]}], "queue": " test", "name": "TestDemo4", "feature": "basic", "spark_version": "2.3.2" }
The batchTest/spark-examples_2.11-2.1.0.luxor.jar file has been uploaded through the API described in Uploading a Package Group (Discarded).
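For illustration, a minimal Python sketch of sending this request follows. The request path and the X-Auth-Token header are assumptions (inferred from the project_id URI parameter and common token-based authentication), not confirmed by this page:

```python
import requests  # third-party HTTP client

# Placeholder values; replace with your own. The path below is an assumption
# inferred from the project_id URI parameter, not confirmed by this page.
endpoint = "https://dli.example-region.example.com"
project_id = "your-project-id"
token = "your-iam-token"

payload = {
    "file": "batchTest/spark-examples_2.11-2.1.0.luxor.jar",
    "className": "org.apache.spark.examples.SparkPi",
    "sc_type": "A",
    "queue": "test",
    "name": "TestDemo4",
}

resp = requests.post(
    f"{endpoint}/v2.0/{project_id}/batches",  # assumed URI format
    json=payload,
    headers={"X-Auth-Token": token},          # assumed auth header
)
resp.raise_for_status()
job = resp.json()
print(job["id"], job["state"])  # e.g. "07a3e4e6-..." "starting"
```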
Example Response
{ "id": "07a3e4e6-9a28-4e92-8d3f-9c538621a166", "appId": "", "name": "", "owner": "test1", "proxyUser": "", "state": "starting", "kind": "", "log": [], "sc_type": "CUSTOMIZED", "cluster_name": "aaa", "queue": "aaa", "create_time": 1607589874156, "update_time": 1607589874156 }
Status Codes
Table 8 describes the status code.
Error Codes
If an error occurs when this API is invoked, the system does not return a result similar to the preceding example, but returns an error code and error information instead. For details, see Error Codes.