Updated on 2024-03-25 GMT+08:00

Creating a SQL Job

Function

This API is used to create a Flink streaming SQL job.

URI

  • URI format

    POST /v1.0/{project_id}/streaming/sql-jobs

  • Parameter description
    Table 1 URI parameter

    Parameter

    Mandatory

    Type

    Description

    project_id

    Yes

    String

    Project ID, which is used for resource isolation. For details about how to obtain its value, see Obtaining a Project ID.
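As a hedged sketch, the request URL can be assembled from the project ID like this (the endpoint host below is a placeholder, not a real DLI address):

```python
# Placeholder host: replace with the DLI endpoint of your region.
DLI_ENDPOINT = "https://dli.example.com"

def sql_job_url(project_id: str) -> str:
    """Build the POST URL for creating a Flink streaming SQL job."""
    return f"{DLI_ENDPOINT}/v1.0/{project_id}/streaming/sql-jobs"
```

For example, sql_job_url("my-project-id") yields .../v1.0/my-project-id/streaming/sql-jobs.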

Request

Table 2 Request parameters

Parameter

Mandatory

Type

Description

name

Yes

String

Name of the job. The value can contain 1 to 57 characters.

desc

No

String

Job description. Length range: 0 to 512 characters.

template_id

No

Integer

Template ID.

If both template_id and sql_body are specified, sql_body is used. If template_id is specified but sql_body is not, the SQL statement of the specified template is used as sql_body.

queue_name

No

String

Name of a queue. The value can contain 0 to 128 characters.

sql_body

No

String

Stream SQL statement, which must include at least the following three parts: source, query, and sink. The value can contain a maximum of 1,024 x 1,024 characters.

run_mode

No

String

Job running mode. The options are as follows:

  • shared_cluster: The job runs on a shared cluster.
  • exclusive_cluster: The job runs on an exclusive cluster.
  • edge_node: The job runs on an edge node.

The default value is shared_cluster.

cu_number

No

Integer

Number of CUs selected for a job. The default value is 2.

Sum of the numbers of compute CUs and JobManager CUs of DLI. CU is also the billing unit of DLI. One CU equals one vCPU and 4 GB of memory. The value is the number of CUs required for job running and cannot exceed the number of CUs in the bound queue. For details about how to set the number of JobManager CUs, see manager_cu_number.

parallel_number

No

Integer

Parallelism set by a user for the job. The default value is 1.

This is the number of Flink SQL tasks that run at the same time. Properly increasing the parallelism improves the overall computing capability of the job, but the thread-switching overhead caused by additional threads must be considered. The value cannot be greater than four times the number of compute units (cu_number minus manager_cu_number).

For details about how to set the number of JobManager CUs, see manager_cu_number.

checkpoint_enabled

No

Boolean

Whether to enable the automatic job snapshot function.

  • true: enables the automatic job snapshot function.
  • false: disables the automatic job snapshot function.

The default value is false.

checkpoint_mode

No

Integer

Snapshot mode. There are two options:

  • 1: ExactlyOnce, indicates that data is processed only once.
  • 2: AtLeastOnce, indicates that data is processed at least once.

The default value is 1.

checkpoint_interval

No

Integer

Snapshot interval. The unit is second. The default value is 10.

obs_bucket

No

String

OBS path where users are authorized to save snapshots. This parameter is valid only when checkpoint_enabled is set to true.

OBS path where users are authorized to save job logs. This parameter is valid only when log_enabled is set to true.

log_enabled

No

Boolean

Whether to enable the function of uploading job logs to users' OBS buckets. The default value is false.

smn_topic

No

String

SMN topic. If a job fails, the system will send a message to users subscribed to the SMN topic.

restart_when_exception

No

Boolean

Whether to enable the function of automatically restarting a job upon job exceptions. The default value is false.

idle_state_retention

No

Integer

Retention time of the idle state. The unit is second. The default value is 3600.

job_type

No

String

Job type. This parameter can be set to flink_sql_job or flink_opensource_sql_job.

  • If run_mode is set to exclusive_cluster, job_type must be set to flink_sql_job or flink_opensource_sql_job.
  • The default value is flink_sql_job.

dirty_data_strategy

No

String

Dirty data policy of a job.

  • 0: Ignore
  • 1: Trigger a job exception
  • 2:obsDir: Save. obsDir specifies the path for storing dirty data.

The default value is 0.

udf_jar_url

No

String

Name of the resource package that has been uploaded to the DLI resource management system. The UDF Jar file of the SQL job is specified by this parameter.

manager_cu_number

No

Integer

Number of CUs in the JobManager selected for a job. The default value is 1.

tm_cus

No

Integer

Number of CUs for each TaskManager. The default value is 1.

tm_slot_num

No

Integer

Number of slots in each TaskManager. The default value is (parallel_number*tm_cus)/(cu_number-manager_cu_number).

resume_checkpoint

No

Boolean

Whether to restore the job from the latest checkpoint when it is automatically restarted upon an exception.

resume_max_num

No

Integer

Maximum number of retries upon exceptions, in times per hour. The value can be -1 or greater than 0. The default value is -1, indicating that the number of retries is unlimited.

tags

No

Array of Objects

Tags of a Flink SQL job. For details, see Table 3.

runtime_config

No

String

Custom optimization parameters applied when the Flink job is running.

flink_version

No

String

Flink version. Currently, only 1.10 and 1.12 are supported.
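The CU arithmetic described above (the parallel_number upper bound and the tm_slot_num default) can be sketched with hypothetical helper functions; the formulas are taken verbatim from the parameter descriptions:

```python
def max_parallel_number(cu_number: int, manager_cu_number: int) -> int:
    """parallel_number cannot exceed four times the compute units,
    that is, cu_number minus manager_cu_number."""
    return 4 * (cu_number - manager_cu_number)

def default_tm_slot_num(parallel_number: int, tm_cus: int,
                        cu_number: int, manager_cu_number: int) -> int:
    """Default tm_slot_num: (parallel_number * tm_cus) / (cu_number - manager_cu_number)."""
    return (parallel_number * tm_cus) // (cu_number - manager_cu_number)
```

With the documented defaults (cu_number=2, manager_cu_number=1, parallel_number=1, tm_cus=1), the parallelism limit is 4 and the default slot count is 1.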

Table 3 tags parameters

Parameter

Mandatory

Type

Description

key

Yes

String

Tag key.

NOTE:

A tag key can contain a maximum of 128 characters, including letters, numbers, spaces, and special characters (_.:=+-@), but cannot start or end with a space or start with _sys_.

value

Yes

String

Tag value.

NOTE:

A tag value can contain a maximum of 255 characters. Only letters, digits, spaces, and special characters (_.:=+-@) are allowed. The value cannot start or end with a space.
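The tag constraints in Table 3 can be expressed as a small validator. This is a sketch under the assumption that "letters" means ASCII letters; the function names are illustrative, not part of the API:

```python
import re

# Allowed characters: letters, digits, spaces, and _.:=+-@
_TAG_CHARS = re.compile(r"[A-Za-z0-9 _.:=+\-@]+\Z")

def is_valid_tag_key(key: str) -> bool:
    """Max 128 chars, allowed charset, no leading or trailing space,
    must not start with _sys_."""
    return (0 < len(key) <= 128
            and key == key.strip()
            and not key.startswith("_sys_")
            and _TAG_CHARS.match(key) is not None)

def is_valid_tag_value(value: str) -> bool:
    """Max 255 chars, allowed charset, no leading or trailing space."""
    return (len(value) <= 255
            and value == value.strip()
            and _TAG_CHARS.match(value) is not None)
```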

Response

Table 4 Response parameters

Parameter

Mandatory

Type

Description

is_success

No

Boolean

Indicates whether the request is successfully executed. The value true indicates success.

message

No

String

Message content.

job

No

Object

Information about the job status. For details, see Table 5.

Table 5 job parameters

Parameter

Mandatory

Type

Description

job_id

Yes

Long

Job ID.

status_name

No

String

Name of job status. For details, see the description of the status field in Querying Job Details.

status_desc

No

String

Status description. Causes and suggestions for the abnormal status.

Example Request

Use the template whose ID is 100000 to create a Flink SQL job named myjob. The job runs on an exclusive cluster bound to the testQueue queue.

{
    "name": "myjob",
    "desc": "This is a job used for counting characters.",
    "template_id": 100000,
    "queue_name": "testQueue",
    "sql_body": "select * from source_table",
    "run_mode": "exclusive_cluster",
    "cu_number": 2,
    "parallel_number": 1,
    "checkpoint_enabled": false,
    "checkpoint_mode": "exactly_once",
    "checkpoint_interval": 0,
    "obs_bucket": "my_obs_bucket",
    "log_enabled": false,
    "restart_when_exception": false,
    "idle_state_retention": 3600,
    "job_type": "flink_sql_job",
    "dirty_data_strategy": "0",
    "udf_jar_url": "group/test.jar"
}
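A minimal client-side sketch of assembling such a request body (hypothetical helper; it enforces only the documented name and desc length constraints and passes everything else through):

```python
import json

def build_sql_job_body(name: str, **optional) -> str:
    """Serialize a create-SQL-job request body. Only name is mandatory;
    all other parameters are passed through as-is."""
    if not 1 <= len(name) <= 57:
        raise ValueError("name must contain 1 to 57 characters")
    desc = optional.get("desc", "")
    if len(desc) > 512:
        raise ValueError("desc must not exceed 512 characters")
    return json.dumps({"name": name, **optional})
```

For example, build_sql_job_body("myjob", template_id=100000, queue_name="testQueue") produces a body like the example above.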

Example Response

{
    "is_success": "true",
    "message": "A DLI job is created successfully.",
    "job": {
        "job_id": 148,
        "status_name": "job_init",
        "status_desc": ""
    }
}
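Handling such a response might look like the following sketch (is_success is documented as a Boolean; extract_job_id is an illustrative name, not part of the API):

```python
import json

def extract_job_id(response_text: str) -> int:
    """Return job.job_id on success; raise with the server message otherwise."""
    resp = json.loads(response_text)
    if not resp.get("is_success"):
        raise RuntimeError(resp.get("message", "job creation failed"))
    return resp["job"]["job_id"]
```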

Status Codes

Table 6 describes status codes.

Table 6 Status codes

Status Code

Description

200

The job is created successfully.

400

The input parameter is invalid.

Error Codes

If an error occurs when this API is invoked, the system does not return the result similar to the preceding example, but returns the error code and error information. For details, see Error Codes.