
Updated on 2026-07-16 GMT+08:00

Submitting a SQL Job (Recommended)

Function

This API is used to submit jobs to a queue using SQL statements.

The supported job types are DDL, DCL, IMPORT, QUERY, and INSERT. The IMPORT function is the same as that described in Importing Data (Deprecated); only the implementation method differs.

Additionally, you can use other APIs to query and manage jobs.

This API is synchronous if job_type in the response message is DCL.

Authorization

Each account has all the permissions required to call all APIs, but IAM users must be assigned the required permissions.

If you are using role/policy-based authorization, see the required permissions in Introduction.

If you are using identity policy-based authorization, the following identity policy-based permissions are required.

Action	Access Level	Resource Type (*: required)	Condition Key	Alias	Dependency
dli:queue:submitJob	Write	queue *	-	dli:queue:scaleQueue -	-
-	g:EnterpriseProjectId g:ResourceTag/<tag-key>	-

URI

URI format
POST /v1.0/{project_id}/jobs/submit-job

Parameter descriptions

**Table 1** URI parameter
Parameter	Mandatory	Type	Description
project_id	Yes	String	Definition Project ID, which is used for resource isolation. For how to obtain a project ID, see Obtaining a Project ID. Example: 48cc2c48765f481480c7db940d6409d1 Constraints N/A Range The value can contain up to 64 characters. Only letters and digits are allowed. Default Value N/A
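
As an illustration of the URI format and the project_id range in Table 1, the following minimal Python sketch builds the request path. The helper name submit_job_path is hypothetical and not part of the API; the check only mirrors the documented range (up to 64 characters, letters and digits), assuming ASCII letters.

```python
import re

# Range from Table 1: up to 64 characters, letters and digits only.
# Assumption: "letters" means ASCII letters.
_PROJECT_ID_RE = re.compile(r"[A-Za-z0-9]{1,64}")


def submit_job_path(project_id: str) -> str:
    """Return the path for POST /v1.0/{project_id}/jobs/submit-job."""
    if not _PROJECT_ID_RE.fullmatch(project_id):
        raise ValueError("project_id must be 1 to 64 letters or digits")
    return f"/v1.0/{project_id}/jobs/submit-job"


print(submit_job_path("48cc2c48765f481480c7db940d6409d1"))
```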

Request Parameters

**Table 2** Request parameters
Parameter	Mandatory	Type	Description
sql	Yes	String	Definition SQL statement to execute. Constraints N/A Range N/A Default Value N/A
currentdb	No	String	Definition Database where the SQL statement is executed. This parameter does not need to be configured during database creation. Constraints N/A Range The value cannot contain only digits or start with an underscore (_). Only digits, letters, and underscores (_) are allowed. Default Value N/A
current_catalog	No	String	Definition Default catalog of the table where the job is to be submitted. If not specified, the DLI catalog is used by default. Constraints N/A Range The value cannot contain only digits or start with an underscore (_). Only digits, letters, and underscores (_) are allowed. Default Value dli: The DLI catalog is used by default.
queue_name	No	String	Definition Name of the queue to which the job is to be submitted. Constraints N/A Range The name cannot contain only digits or start with an underscore (_). Only digits, letters, and underscores (_) are allowed. Default Value N/A
conf	No	Array of strings	Definition Configuration parameter for the job, which is in the key-value pair format. Constraints N/A Range For details about the supported configuration items, see Table 3. Default Value N/A
tags	No	Array of objects	Definition Job tag. For details, see Table 4. Constraints N/A Range N/A Default Value N/A
engine_type	No	String	Definition Type of the engine that executes jobs. Constraints N/A Range Only the Spark engine (spark) is currently supported. For details about the engine types and descriptions, see DLI Overview. Default Value spark
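
The naming rules in Table 2 for currentdb, current_catalog, and queue_name can be checked on the client before a job is submitted. The sketch below is one possible way to do that; the function names is_valid_name and build_body are hypothetical, and the check assumes that "letters" means ASCII letters.

```python
import re

# Only digits, letters, and underscores are allowed, and the value
# must not start with an underscore (assumption: ASCII letters only).
_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_]*")


def is_valid_name(name: str) -> bool:
    """Apply the Range rule shared by currentdb, current_catalog, and queue_name."""
    return _NAME_RE.fullmatch(name) is not None and not name.isdigit()


def build_body(sql, currentdb=None, queue_name=None, current_catalog=None):
    """Assemble a request body, including only the optional fields provided."""
    if not sql:
        raise ValueError("sql is mandatory")
    body = {"sql": sql}
    optional = {
        "currentdb": currentdb,
        "queue_name": queue_name,
        "current_catalog": current_catalog,
    }
    for field, value in optional.items():
        if value is None:
            continue  # optional parameter not supplied
        if not is_valid_name(value):
            raise ValueError(f"invalid value for {field}: {value!r}")
        body[field] = value
    return body


print(build_body("desc table1", currentdb="db1", queue_name="default"))
```

Omitted optional parameters are left out of the body rather than sent as null, so the service-side defaults described in Table 2 would apply.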

**Table 3** Configuration parameters description
Parameter	Description
spark.sql.files.maxRecordsPerFile	Definition Maximum number of records to be written into a single file. If the value is zero or negative, there is no limit. Constraints N/A Range N/A Default Value 0
spark.sql.autoBroadcastJoinThreshold	Definition Maximum size, in bytes, of a table that is broadcast to all worker nodes when a join is executed. You can set this parameter to -1 to disable broadcasting. Constraints Currently, statistics are supported only for Hive metastore tables on which the ANALYZE TABLE COMPUTE STATISTICS noscan command has been run, and for file-based data source tables whose statistics are calculated directly from the data files. Range N/A Default Value 209715200
spark.sql.shuffle.partitions	Definition Default number of partitions used when shuffling data for joins or aggregations. Constraints N/A Range N/A Default Value 200
spark.sql.dynamicPartitionOverwrite.enabled	Definition Whether DLI deletes all partitions that meet the conditions before overwriting the partitions. Constraints N/A Range When set to false, DLI deletes all partitions that meet the conditions before overwriting them. For example, if you set this parameter to false and use INSERT OVERWRITE to write partition 2021-02 to a partitioned table that already has the 2021-01 partition, the 2021-01 partition is deleted. When set to true, DLI does not delete partitions in advance and overwrites only the partitions that receive data during runtime. Default Value false
spark.sql.files.maxPartitionBytes	Definition Maximum number of bytes to be packed into a single partition when a file is read. Constraints N/A Range N/A Default Value 134217728
spark.sql.badRecordsPath	Definition Path of bad records. Constraints N/A Range N/A Default Value N/A
spark.sql.legacy.correlated.scalar.query.enabled	Definition Controls the behavior of correlated subqueries. Constraints N/A Range If set to true: When there is no duplicate data in a subquery, executing a correlated subquery does not require deduplication from the subquery's result. If there is duplicate data in a subquery, executing a correlated subquery will result in an error. To resolve this, the subquery's result must be deduplicated using functions such as max() or min(). If set to false: Regardless of whether there is duplicate data in a subquery, executing a correlated subquery requires deduplicating the subquery's result using functions such as max() or min(). Otherwise, an error will occur. Default Value false
dli.jobs.sql.resubmit.enable	Definition Whether Spark SQL jobs are resubmitted in the event of driver failure or queue restart. Constraints If set to true, data consistency issues may occur for non-idempotent operations such as INSERT (for example, insert into, load data, update). If the driver fails and the job is retried, data that was already inserted before the driver failure may be written again. Range false: Disables job retry. No commands are resubmitted, and the job is marked as failed once the driver fails. true: Enables job retry. All types of jobs are resubmitted in the event of driver failure. Default Value null
spark.sql.optimizer.dynamicPartitionPruning.enabled	Definition Whether to enable dynamic partition pruning. Dynamic partition pruning can reduce the amount of data that needs to be scanned and improve query performance when SQL queries are executed. Constraints N/A Range When set to true, dynamic partition pruning is enabled. During a query, SQL automatically detects and skips partitions that do not meet the WHERE clause conditions. This is useful for tables that have a large number of partitions. If SQL queries contain a large number of nested left join operations and the table has a large number of dynamic partitions, data parsing may consume a large amount of memory. As a result, the driver node may run out of memory and trigger frequent Full GCs. To avoid such issues, you can disable dynamic partition pruning by setting this parameter to false. However, disabling this optimization may reduce query performance, because Spark then no longer automatically prunes the partitions that do not meet the requirements. Default Value true
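
The conf parameter is an array of strings in key-value format. A minimal sketch of producing that array from a Python mapping follows. The helper name to_conf_entries is hypothetical, and the "key = value" spacing simply copies the request example on this page; whether other spacing is accepted is not stated here.

```python
def to_conf_entries(settings):
    """Convert a mapping of configuration items into "key = value" strings."""
    entries = []
    for key, value in settings.items():
        # bool is checked first so that True/False become true/false,
        # matching the lowercase values used in Table 3.
        if isinstance(value, bool):
            value = "true" if value else "false"
        entries.append(f"{key} = {value}")
    return entries


conf = to_conf_entries({
    "spark.sql.shuffle.partitions": 200,
    "spark.sql.dynamicPartitionOverwrite.enabled": True,
})
print(conf)
```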

**Table 4** tags parameters
Parameter	Mandatory	Type	Description
key	Yes	String	Definition Tag key. Constraints N/A Range A tag key can contain up to 128 characters, cannot start or end with a space, and cannot start with _sys_. Only letters, digits, spaces, and the following special characters are allowed: _.:+-@ Default Value N/A
value	Yes	String	Definition Tag value. Constraints N/A Range A tag value can contain a maximum of 255 characters. Only letters, digits, spaces, and the following special characters (_.:+-@) are allowed. Default Value N/A
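
The tag rules in Table 4 can also be checked before a request is sent. The following sketch is an assumption-laden illustration: the function names are hypothetical, "letters" is taken to mean ASCII letters, and an empty tag value is treated as acceptable because the table does not state a minimum length.

```python
import re

# Letters, digits, spaces, and the special characters _ . : + - @
# (assumption: ASCII letters only).
_TAG_CHARS_RE = re.compile(r"[A-Za-z0-9 _.:+\-@]*")


def is_valid_tag_key(key: str) -> bool:
    """Up to 128 characters, no leading or trailing space, no _sys_ prefix."""
    return (
        0 < len(key) <= 128
        and _TAG_CHARS_RE.fullmatch(key) is not None
        and not key.startswith(" ")
        and not key.endswith(" ")
        and not key.startswith("_sys_")
    )


def is_valid_tag_value(value: str) -> bool:
    """Up to 255 characters from the allowed character set."""
    return len(value) <= 255 and _TAG_CHARS_RE.fullmatch(value) is not None


print(is_valid_tag_key("workspace"), is_valid_tag_value("space1"))
```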

Response Parameters

**Table 5** Response parameters
Parameter	Mandatory	Type	Description
is_success	Yes	Boolean	Definition Whether the request is successfully executed. true indicates that the request is successfully executed. Range N/A
message	Yes	String	Definition System prompt. If the execution succeeds, this parameter may be left blank. Range N/A
job_id	Yes	String	Definition ID of a job returned after a job is generated and submitted using SQL statements. The job ID can be used to query the job status and results. Range N/A
job_type	Yes	String	Definition Job type. Range DDL, DCL, IMPORT, EXPORT, QUERY, and INSERT. DDL: jobs that create, modify, and delete metadata. DCL: jobs that grant and revoke permissions. When job_type is DCL, the operation is synchronous. IMPORT: jobs that import external data into the database. EXPORT: jobs that export data to an external database. QUERY: jobs that run query statements. INSERT: jobs that add new data to tables.
schema	No	Array of Map	Definition If the statement type is DDL, the column names and types of the DDL result are displayed. Range N/A
rows	No	Array of objects	Definition When the statement type is DDL and dli.sql.sqlasync.enabled is set to false, the execution results are returned directly. However, only a maximum of 1,000 rows can be returned. If there are more than 1,000 rows, obtain the results asynchronously. That is, when submitting the job, set xxxx to true, and then obtain the results from the job bucket configured by DLI. The path of the results on the job bucket can be obtained from the result_path in the return value of the ShowSqlJobStatus API. The full data of the results will be automatically exported to the job bucket. Range N/A
job_mode	No	String	Definition Job execution mode. Options: async: asynchronous sync: synchronous Range N/A
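
A caller typically needs to decide whether the results arrived inline or must be fetched later by job_id. The sketch below shows one way to make that decision from the fields in Table 5. The function name inline_rows is hypothetical, and treating job_mode equal to sync together with a rows field as the signal for inline results is an assumption based on the descriptions above, not a documented guarantee.

```python
def inline_rows(response):
    """Return inline result rows, or None if results must be fetched by job_id."""
    if not response.get("is_success"):
        # message may be blank on success, but should explain a failure.
        raise RuntimeError(response.get("message") or "job submission failed")
    if response.get("job_mode") == "sync" and "rows" in response:
        return response["rows"]  # at most 1,000 rows are returned directly
    return None  # query the job status and results later using job_id


sync_response = {
    "is_success": True,
    "message": "",
    "job_id": "8ecb0777-9c70-4529-9935-29ea0946039c",
    "job_type": "DDL",
    "job_mode": "sync",
    "rows": [{"col_name": "c1", "data_type": "int", "comment": None}],
}
print(inline_rows(sync_response))
```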

Example Request

Submit a SQL job. The job execution database and queue are db1 and default, respectively. Then, add the tags workspace=space1 and jobName=name1 for the job.

{
    "currentdb": "db1",
    "sql": "desc table1",
    "queue_name": "default",
    "conf": [
        "spark.sql.shuffle.partitions = 200"
    ],
    "tags": [
        {
            "key": "workspace",
            "value": "space1"
        },
        {
            "key": "jobName",
            "value": "name1"
        }
    ]
}
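
The request above could be prepared with the Python standard library as sketched below. Several details are assumptions that do not come from this page: the endpoint host is a placeholder, the X-Auth-Token header and its value stand in for whatever authentication your account uses, and the helper name build_submit_request is hypothetical. The sketch only builds the request object and does not send it.

```python
import json
import urllib.request


def build_submit_request(endpoint, project_id, token, body):
    """Build (but do not send) a POST request for the submit-job URI."""
    url = f"{endpoint}/v1.0/{project_id}/jobs/submit-job"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # assumption: token-based authentication
        },
    )


body = {
    "currentdb": "db1",
    "sql": "desc table1",
    "queue_name": "default",
    "conf": ["spark.sql.shuffle.partitions = 200"],
    "tags": [
        {"key": "workspace", "value": "space1"},
        {"key": "jobName", "value": "name1"},
    ],
}
request = build_submit_request(
    "https://dli.example.com",  # placeholder endpoint
    "48cc2c48765f481480c7db940d6409d1",
    "hypothetical-token",
    body,
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it, provided the endpoint and credentials are replaced with real values.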

Example Response

{
  "is_success": true,
  "message": "",
  "job_id": "8ecb0777-9c70-4529-9935-29ea0946039c",
  "job_type": "DDL",
  "job_mode": "sync",
  "schema": [
    {
      "col_name": "string"
    },
    {
      "data_type": "string"
    },
    {
      "comment": "string"
    }
  ],
  "rows": [
    {
      "col_name": "c1",
      "data_type": "int",
      "comment": null
    },
    {
      "col_name": "c2",
      "data_type": "string",
      "comment": null
    }
  ]
}
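
In the response above, schema lists one single-key map per column and rows holds one object per result row. A minimal sketch of combining the two into ordered tuples follows. The function names are hypothetical, and the single-key layout of schema entries is inferred from this example only.

```python
def column_names(schema):
    """Collect column names in order from the single-key schema entries."""
    return [name for entry in schema for name in entry]


def rows_as_tuples(response):
    """Return each row as a tuple ordered by the schema's columns."""
    columns = column_names(response.get("schema", []))
    return [
        tuple(row.get(column) for column in columns)
        for row in response.get("rows", [])
    ]


example_response = {
    "schema": [
        {"col_name": "string"},
        {"data_type": "string"},
        {"comment": "string"},
    ],
    "rows": [
        {"col_name": "c1", "data_type": "int", "comment": None},
        {"col_name": "c2", "data_type": "string", "comment": None},
    ],
}
print(rows_as_tuples(example_response))
```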

Status Codes

Table 6 describes status codes.

**Table 6** Status codes
Status Code	Description
200	Submitted successfully.
400	Request error.
500	Internal server error.

Error Codes

If an error occurs when this API is called, the system returns an error code and error message instead of a result like the preceding example. For details, see Error Codes.
