Creating and Submitting a Spark Job
Scenario Description
This section describes how to create and submit Spark jobs using APIs. For details on how to call APIs, see Calling APIs.
Constraints
- The first job submitted to a newly created queue takes 6 to 10 minutes to start.
Involved APIs
- Creating a Queue: Create a queue.
- Uploading a Package Group: Upload the resource package required by the Spark job.
- Querying Resource Packages in a Group: Check whether the uploaded resource package is correct.
- Creating a Batch Processing Job: Create and submit a Spark batch processing job.
- Querying a Batch Job Status: View the status of a batch processing job.
- Querying Batch Job Logs: View batch processing job logs.
Procedure
- Create a common queue. For details, see Creating a Queue.
- Upload a package group.
- API
URI format: POST /v2.0/{project_id}/resources
- Obtain the value of {project_id} from Obtaining a Project ID.
- For details about the request parameters, see Uploading a Package Group.
- Request example
- Description: Upload resources in the GATK group to the project whose ID is 48cc2c48765f481480c7db940d6409d1.
- Example URL: POST https://{endpoint}/v2.0/48cc2c48765f481480c7db940d6409d1/resources
- Body:
{
  "paths": [
    "https://test.obs.xxx.com/txr_test/jars/spark-sdv-app.jar"
  ],
  "kind": "jar",
  "group": "gatk",
  "is_async": "true"
}
- Example response
{
  "group_name": "gatk",
  "status": "READY",
  "resources": [
    "spark-sdv-app.jar",
    "wordcount",
    "wordcount.py"
  ],
  "details": [
    {
      "create_time": 0,
      "update_time": 0,
      "resource_type": "jar",
      "resource_name": "spark-sdv-app.jar",
      "status": "READY",
      "underlying_name": "987e208d-d46e-4475-a8c0-a62f0275750b_spark-sdv-app.jar"
    },
    {
      "create_time": 0,
      "update_time": 0,
      "resource_type": "jar",
      "resource_name": "wordcount",
      "status": "READY",
      "underlying_name": "987e208d-d46e-4475-a8c0-a62f0275750b_wordcount"
    },
    {
      "create_time": 0,
      "update_time": 0,
      "resource_type": "jar",
      "resource_name": "wordcount.py",
      "status": "READY",
      "underlying_name": "987e208d-d46e-4475-a8c0-a62f0275750b_wordcount.py"
    }
  ],
  "create_time": 1551334579654,
  "update_time": 1551345369070
}
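The upload request above can be sketched in Python as follows. This is a minimal illustration, not an official client: the function name build_upload_request is hypothetical, the endpoint and token values are placeholders, and sending the token in an X-Auth-Token header is an assumption about the authentication scheme (see Calling APIs for the method that applies to your account). Optional body fields such as is_async are left out. The request is only built here, not sent.

```python
import json
import urllib.request


def build_upload_request(endpoint, project_id, token, paths, kind, group):
    """Build (but do not send) a POST /v2.0/{project_id}/resources request."""
    url = f"https://{endpoint}/v2.0/{project_id}/resources"
    body = {"paths": paths, "kind": kind, "group": group}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: token-based authentication via this header.
            "X-Auth-Token": token,
        },
        method="POST",
    )


upload_request = build_upload_request(
    endpoint="example.invalid",  # placeholder for {endpoint}
    project_id="48cc2c48765f481480c7db940d6409d1",
    token="dummy-token",  # placeholder token
    paths=["https://test.obs.xxx.com/txr_test/jars/spark-sdv-app.jar"],
    kind="jar",
    group="gatk",
)
print(upload_request.get_method(), upload_request.full_url)
```

To send the request, pass it to urllib.request.urlopen and decode the JSON response body.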
- View resource packages in a group.
- API
URI format: GET /v2.0/{project_id}/resources/{resource_name}
- Obtain the value of {project_id} from Obtaining a Project ID.
- For details about the query parameters, see Querying Resource Packages in a Group.
- Request example
- Description: Query the resource package named luxor-router-1.1.1.jar in the GATK group under the project whose ID is 48cc2c48765f481480c7db940d6409d1.
- Example URL: GET https://{endpoint}/v2.0/48cc2c48765f481480c7db940d6409d1/resources/luxor-router-1.1.1.jar?group=gatk
- Body:
{}
- Example response
{
  "create_time": 1522055409139,
  "update_time": 1522228350501,
  "resource_type": "jar",
  "resource_name": "luxor-router-1.1.1.jar",
  "status": "uploading",
  "underlying_name": "7885d26e-c532-40f3-a755-c82c442f19b8_luxor-router-1.1.1.jar",
  "owner": "****"
}
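The query URL and a readiness check on the response can be sketched as follows. The helper names resource_query_url and is_ready are hypothetical. Treating READY as the only status that allows job submission is an assumption based on the example responses in this section; the full list of status values is not given here.

```python
import urllib.parse


def resource_query_url(endpoint, project_id, resource_name, group):
    """Build the URL for GET /v2.0/{project_id}/resources/{resource_name}."""
    name = urllib.parse.quote(resource_name, safe="")
    query = urllib.parse.urlencode({"group": group})
    return f"https://{endpoint}/v2.0/{project_id}/resources/{name}?{query}"


def is_ready(resource):
    """Return True if the resource reports the READY status.

    Assumption: READY (compared case-insensitively) means the package
    has finished uploading and can be used by a job.
    """
    return str(resource.get("status", "")).upper() == "READY"


url = resource_query_url(
    "example.invalid",  # placeholder for {endpoint}
    "48cc2c48765f481480c7db940d6409d1",
    "luxor-router-1.1.1.jar",
    "gatk",
)
print(url)
print(is_ready({"resource_name": "luxor-router-1.1.1.jar", "status": "uploading"}))
```

With the example response above, is_ready returns False because the package is still uploading.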
- Create and submit a Spark batch processing job.
- API
URI format: POST /v2.0/{project_id}/batches
- Obtain the value of {project_id} from Obtaining a Project ID.
- For details about the request parameters, see Creating a Batch Processing Job.
- Request example
- Description: In the 48cc2c48765f481480c7db940d6409d1 project, create a batch processing job that runs the org.apache.spark.examples.SparkPi class in the cce_general queue.
- Example URL: POST https://{endpoint}/v2.0/48cc2c48765f481480c7db940d6409d1/batches
- Body:
{
  "sc_type": "A",
  "jars": [
    "spark-examples_2.11-2.1.0.luxor.jar"
  ],
  "driverMemory": "1G",
  "driverCores": 1,
  "executorMemory": "1G",
  "executorCores": 1,
  "numExecutors": 1,
  "queue": "cce_general",
  "file": "spark-examples_2.11-2.1.0.luxor.jar",
  "className": "org.apache.spark.examples.SparkPi",
  "minRecoveryDelayTime": 10000,
  "maxRetryTimes": 20
}
- Example response
{
  "id": "07a3e4e6-9a28-4e92-8d3f-9c538621a166",
  "appId": "",
  "name": "",
  "owner": "test1",
  "proxyUser": "",
  "state": "starting",
  "kind": "",
  "log": [],
  "sc_type": "CUSTOMIZED",
  "cluster_name": "aaa",
  "queue": "aaa",
  "create_time": 1607589874156,
  "update_time": 1607589874156
}
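The request body for this step can be assembled programmatically, as in the sketch below. The function name build_batch_body and its default values are hypothetical; the field names are copied from the request example above. Which fields are mandatory is not stated here, so check Creating a Batch Processing Job before dropping any of them.

```python
import json


def build_batch_body(file, class_name, queue, jars=None, sc_type="A",
                     driver_memory="1G", driver_cores=1,
                     executor_memory="1G", executor_cores=1,
                     num_executors=1):
    """Return the JSON body for POST /v2.0/{project_id}/batches as a dict."""
    return {
        "sc_type": sc_type,
        "jars": list(jars) if jars is not None else [file],
        "driverMemory": driver_memory,
        "driverCores": driver_cores,
        "executorMemory": executor_memory,
        "executorCores": executor_cores,
        "numExecutors": num_executors,
        "queue": queue,
        "file": file,
        "className": class_name,
    }


batch_body = build_batch_body(
    file="spark-examples_2.11-2.1.0.luxor.jar",
    class_name="org.apache.spark.examples.SparkPi",
    queue="cce_general",
)
print(json.dumps(batch_body, indent=2))
```

The id field of the response identifies the job in the status and log queries that follow.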
- Query a batch job status.
- API
URI format: GET /v2.0/{project_id}/batches/{batch_id}/state
- Obtain the value of {project_id} from Obtaining a Project ID.
- For details about the query parameters, see Querying a Batch Job Status.
- Request example
- Description: Query the status of the batch processing job whose ID is 0a324461-d9d9-45da-a52a-3b3c7a3d809e in the project whose ID is 48cc2c48765f481480c7db940d6409d1.
- Example URL: GET https://{endpoint}/v2.0/48cc2c48765f481480c7db940d6409d1/batches/0a324461-d9d9-45da-a52a-3b3c7a3d809e/state
- Body:
{}
- Example response
{ "id":"0a324461-d9d9-45da-a52a-3b3c7a3d809e", "state":"Success" }
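Because a job on a new queue can take several minutes to start, a client typically polls this API until the job finishes. The loop below is a sketch under stated assumptions: wait_for_batch is a hypothetical helper, fetch_state stands for any callable that performs the GET request and returns the decoded response, and the set of terminal states ("success" and "dead") is an assumption, since only "starting" and "Success" appear in this section.

```python
import time

# Assumption: states after which the job no longer changes.
TERMINAL_STATES = {"success", "dead"}


def wait_for_batch(fetch_state, interval_s=10.0, timeout_s=1800.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_state() until the job reaches a terminal state.

    fetch_state must return a dict shaped like the example response,
    for instance {"id": "...", "state": "Success"}.
    Returns the final state in lowercase, or raises TimeoutError.
    """
    deadline = clock() + timeout_s
    while True:
        state = str(fetch_state()["state"]).lower()
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(f"job still in state {state!r} after {timeout_s} s")
        sleep(interval_s)


# Simulated responses stand in for real API calls.
simulated = iter([
    {"id": "0a324461-d9d9-45da-a52a-3b3c7a3d809e", "state": "starting"},
    {"id": "0a324461-d9d9-45da-a52a-3b3c7a3d809e", "state": "Success"},
])
final_state = wait_for_batch(lambda: next(simulated), sleep=lambda s: None)
print(final_state)
```

The default timeout of 30 minutes leaves room for the 6 to 10 minute startup noted under Constraints; adjust it to the expected job duration.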
- Query batch job logs.
- API
URI format: GET /v2.0/{project_id}/batches/{batch_id}/log
- Obtain the value of {project_id} from Obtaining a Project ID.
- For details about the query parameters, see Querying Batch Job Logs.
- Request example
- Description: Query the background logs of the batch processing job whose ID is 0a324461-d9d9-45da-a52a-3b3c7a3d809e in the project whose ID is 48cc2c48765f481480c7db940d6409d1.
- Example URL: GET https://{endpoint}/v2.0/48cc2c48765f481480c7db940d6409d1/batches/0a324461-d9d9-45da-a52a-3b3c7a3d809e/log
- Body:
{}
- Example response
{ "id": "0a324461-d9d9-45da-a52a-3b3c7a3d809e", "from": 0, "total": 3, "log": [ "Detailed information about job logs" ] }
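The log query can be sketched in the same way. Both helpers, batch_log_url and log_text, are hypothetical. The optional query parameters used in the second call (from and size) are assumptions made for illustration; confirm the supported parameters in Querying Batch Job Logs.

```python
import urllib.parse


def batch_log_url(endpoint, project_id, batch_id, query=None):
    """Build the URL for GET /v2.0/{project_id}/batches/{batch_id}/log."""
    base = f"https://{endpoint}/v2.0/{project_id}/batches/{batch_id}/log"
    if not query:
        return base
    return f"{base}?{urllib.parse.urlencode(query)}"


def log_text(response):
    """Join the entries of the response's "log" list into one string."""
    return "\n".join(response.get("log", []))


log_url = batch_log_url(
    "example.invalid",  # placeholder for {endpoint}
    "48cc2c48765f481480c7db940d6409d1",
    "0a324461-d9d9-45da-a52a-3b3c7a3d809e",
)
# Assumption: "from" and "size" are accepted as paging parameters.
paged_url = batch_log_url(
    "example.invalid",
    "48cc2c48765f481480c7db940d6409d1",
    "0a324461-d9d9-45da-a52a-3b3c7a3d809e",
    query={"from": 0, "size": 100},
)
print(log_url)
print(log_text({"from": 0, "total": 3, "log": ["Detailed information about job logs"]}))
```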