Updated on 2023-07-19 GMT+08:00

SDKs Related to Spark Jobs

For details about the dependencies and complete sample code, see Instructions.

Submitting Batch Jobs

DLI provides an API for submitting batch processing jobs. The following sample code submits a job, polls its status until the job finishes, and then prints the driver log:

import time

# DliException is provided by the DLI Python SDK; see Instructions for the import.


def submit_spark_batch_job(dli_client, batch_queue_name, batch_job_info):
    # Submit the job to the specified queue.
    try:
        batch_job = dli_client.submit_spark_batch_job(batch_queue_name, batch_job_info)
    except DliException as e:
        print(e)
        return

    print(batch_job.job_id)
    # Poll every 3 seconds until the job reaches a terminal state.
    while True:
        time.sleep(3)
        job_status = batch_job.get_job_status()
        print('Job status: {0}'.format(job_status))
        if job_status in ('dead', 'success'):
            break

    # Fetch and print the driver log.
    logs = batch_job.get_driver_log(500)
    for log_line in logs:
        print(log_line)
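
The polling loop in the sample above runs until the job reports 'dead' or 'success', so it never returns if the job stays in another state. The following is a minimal sketch of the same loop with an upper time limit. The helper name wait_for_batch_job and its poll_interval and timeout parameters are hypothetical and not part of the DLI SDK. The only SDK call it relies on is get_job_status(), as used in the sample above.

```python
import time

# Terminal states taken from the sample above; any other status is
# treated as "still running".
TERMINAL_STATES = ('dead', 'success')


def wait_for_batch_job(batch_job, poll_interval=3.0, timeout=600.0):
    """Poll batch_job.get_job_status() until the job finishes or time runs out.

    Hypothetical helper, not part of the DLI SDK. Returns the terminal
    status, or raises TimeoutError if none is reached within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        job_status = batch_job.get_job_status()
        if job_status in TERMINAL_STATES:
            return job_status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                'Job still in state {0!r} after {1} seconds'.format(job_status, timeout))
        time.sleep(poll_interval)
```

A caller could replace the while loop in the sample with job_status = wait_for_batch_job(batch_job) and handle TimeoutError separately from DliException.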

Canceling a Batch Processing Job

DLI provides an API for canceling batch processing jobs. A job that has already completed or failed cannot be canceled. The example code is as follows:

def del_spark_batch(dli_client, batch_id): 
    try: 
        resp = dli_client.del_spark_batch_job(batch_id) 
        print(resp.msg) 
    except DliException as e: 
        print(e) 
        return
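
Because a completed or failed job cannot be canceled, a caller may want to check the job status before sending the cancel request. The following is a minimal sketch of that check, not the SDK's own behavior. The helper name cancel_if_running is hypothetical. It assumes that the batch job object returned by submit_spark_batch_job is still available, and that its job_id is the identifier del_spark_batch_job expects as batch_id.

```python
# Statuses that the submission sample treats as finished.
FINISHED_STATES = ('dead', 'success')


def cancel_if_running(dli_client, batch_job):
    """Send a cancel request unless the job has already finished.

    Hypothetical helper, not part of the DLI SDK. Returns True if a cancel
    request was sent and False if the job had already finished.
    """
    job_status = batch_job.get_job_status()
    if job_status in FINISHED_STATES:
        print('Job {0} already finished with status {1}; nothing to cancel'.format(
            batch_job.job_id, job_status))
        return False

    # Assumption: job_id is the value del_spark_batch_job expects as batch_id.
    resp = dli_client.del_spark_batch_job(batch_job.job_id)
    print(resp.msg)
    return True
```

The job can still finish between the status check and the cancel request, so the call should remain wrapped in the same try/except DliException handling shown in the sample above.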

Deleting Batch Processing Jobs

DLI provides an API for deleting batch processing jobs. The following sample code calls the API to delete a batch processing job:
def del_spark_batch(dli_client, batch_id):
    try:
        resp = dli_client.del_spark_batch_job(batch_id)
        print(resp.msg)
    except DliException as e:
        print(e)
        return