DLI Monitoring Metrics

Description

This section describes metrics reported by DLI to Cloud Eye as well as their namespaces and dimensions. You can use the management console or APIs provided by Cloud Eye to query the metrics of the monitored object and alarms generated for DLI.

Namespace

SYS.DLI

Metrics

Table 1 DLI monitoring metrics

| Metric ID | Name | Description | Value Range | Monitored Object and Dimension | Monitoring Period (Raw Data) |
|---|---|---|---|---|---|
| queue_job_launching_num | Number of Jobs Being Submitted | Number of jobs in the Submitting state in the user queue. | ≥0 | Monitored object: queue; Dimension: queue_id | 5 minutes |
| queue_job_running_num | Number of Running Jobs | Number of running jobs in the user queue. | ≥0 | Monitored object: queue; Dimension: queue_id | 5 minutes |
| queue_job_succeed_num | Number of Finished Jobs | Number of completed jobs in the user queue. | ≥0 | Monitored object: queue; Dimension: queue_id | 5 minutes |
| queue_job_failed_num | Number of Failed Jobs | Number of failed jobs in the user queue. | ≥0 | Monitored object: queue; Dimension: queue_id | 5 minutes |
| queue_job_cancelled_num | Number of Canceled Jobs | Number of canceled jobs in the user queue. | ≥0 | Monitored object: queue; Dimension: queue_id | 5 minutes |
| queue_cpu_usage | Queue CPU Usage | CPU usage of the user queue. | 0–100 | Monitored object: queue; Dimension: queue_id | 5 minutes |
| queue_disk_usage | Queue Disk Usage | Disk usage of the user queue. | 0–100 | Monitored object: queue; Dimension: queue_id | 5 minutes |
| queue_mem_usage | Queue Memory Usage | Memory usage of the user queue. | 0–100 | Monitored object: queue; Dimension: queue_id | 5 minutes |
| flink_read_records_per_second | Flink Job Data Read Rate | Data input rate of a Flink job, for monitoring and debugging. | ≥0 | Monitored object: Flink job; Dimension: flink_job_id | 10 seconds |
| flink_write_records_per_second | Flink Job Data Write Rate | Data output rate of a Flink job, for monitoring and debugging. | ≥0 | Monitored object: Flink job; Dimension: flink_job_id | 10 seconds |
| flink_read_records_total | Flink Job Total Data Read | Total number of input data records of a Flink job, for monitoring and debugging. | ≥0 | Monitored object: Flink job; Dimension: flink_job_id | 10 seconds |
| flink_write_records_total | Flink Job Total Data Write | Total number of output data records of a Flink job, for monitoring and debugging. | ≥0 | Monitored object: Flink job; Dimension: flink_job_id | 10 seconds |
| flink_read_bytes_per_second | Flink Job Byte Read Rate | Number of input bytes per second of a Flink job. | ≥0 | Monitored object: Flink job; Dimension: flink_job_id | 10 seconds |
| flink_write_bytes_per_second | Flink Job Byte Write Rate | Number of output bytes per second of a Flink job. | ≥0 | Monitored object: Flink job; Dimension: flink_job_id | 10 seconds |
| flink_read_bytes_total | Flink Job Total Read Byte | Total number of input bytes of a Flink job. | ≥0 | Monitored object: Flink job; Dimension: flink_job_id | 10 seconds |
| flink_write_bytes_total | Flink Job Total Write Byte | Total number of output bytes of a Flink job. | ≥0 | Monitored object: Flink job; Dimension: flink_job_id | 10 seconds |
| flink_cpu_usage | Flink Job CPU Usage | CPU usage of a Flink job. | 0–100 | Monitored object: Flink job; Dimension: flink_job_id | 10 seconds |
| flink_mem_usage | Flink Job Memory Usage | Memory usage of a Flink job. | 0–100 | Monitored object: Flink job; Dimension: flink_job_id | 10 seconds |
| flink_max_op_latency | Flink Job Max Operator Latency | Maximum operator latency of a Flink job, in milliseconds. | ≥0 | Monitored object: Flink job; Dimension: flink_job_id | 10 seconds |
| flink_max_op_backpressure_level | Flink Job Maximum Operator Backpressure | Maximum operator backpressure value of a Flink job. A larger value indicates more severe backpressure. 0: OK; 50: low; 100: high | 0–100 | Monitored object: Flink job; Dimension: flink_job_id | 10 seconds |
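The three documented values of flink_max_op_backpressure_level can be turned into readable labels when post-processing samples. The following is a minimal sketch, not part of DLI or Cloud Eye: the names BACKPRESSURE_LEVELS and backpressure_label are hypothetical, and because the table describes only 0, 50, and 100, any other value is reported as "unknown" rather than guessed.

```python
# Documented values of flink_max_op_backpressure_level (Table 1).
BACKPRESSURE_LEVELS = {0: "OK", 50: "low", 100: "high"}


def backpressure_label(value):
    """Return the documented label for one backpressure sample.

    Hypothetical helper: values other than 0, 50 and 100 are not
    described in Table 1, so they are reported as "unknown".
    """
    return BACKPRESSURE_LEVELS.get(value, "unknown")


if __name__ == "__main__":
    for sample in (0, 50, 100, 75):
        print(sample, backpressure_label(sample))
```

The lookup also accepts floating-point samples such as 50.0, because equal integers and floats map to the same dictionary key in Python.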

Dimensions

Table 2 Dimensions

| Key | Value |
|---|---|
| queue_id | Queue |
| flink_job_id | Flink job |
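The namespace, metric IDs, and dimension keys above are the inputs needed when querying DLI metrics through the Cloud Eye API. The following sketch only assembles a query string and sends no request. The parameter names (namespace, metric_name, dim.0, from, to, period, filter), the millisecond timestamps, and the default period and aggregation are assumptions about the Cloud Eye metric data query and should be verified against the Cloud Eye API reference. The functions dimension_key and build_metric_query and the resource ID "q-123" are hypothetical.

```python
from urllib.parse import urlencode

NAMESPACE = "SYS.DLI"  # DLI namespace, as stated in this section


def dimension_key(metric_name):
    """Pick the dimension key (Table 2) from the metric ID prefix (Table 1)."""
    if metric_name.startswith("queue_"):
        return "queue_id"
    if metric_name.startswith("flink_"):
        return "flink_job_id"
    raise ValueError(f"not a DLI metric ID from Table 1: {metric_name}")


def build_metric_query(metric_name, resource_id, start_ms, end_ms,
                       period=300, stat="average"):
    """Build a URL query string for one DLI metric.

    Assumed, not confirmed by this section: the parameter names, the
    "key,value" dimension format, and timestamps in milliseconds.
    """
    params = {
        "namespace": NAMESPACE,
        "metric_name": metric_name,
        "dim.0": f"{dimension_key(metric_name)},{resource_id}",
        "from": start_ms,
        "to": end_ms,
        "period": period,
        "filter": stat,
    }
    return urlencode(params)


if __name__ == "__main__":
    # "q-123" is a placeholder queue ID used only for illustration.
    print(build_metric_query("queue_cpu_usage", "q-123", 0, 3_600_000))
```

Deriving the dimension key from the metric ID prefix works here because every queue metric in Table 1 starts with queue_ and every Flink job metric starts with flink_.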