DLI Monitoring Metrics
Description
This section describes the metrics that DLI reports to Cloud Eye, together with their namespace and dimensions. You can use the Cloud Eye management console or the Cloud Eye APIs to query these metrics and the alarms generated for DLI.
Namespace
SYS.DLI
Metric
| Metric ID | Name | Description | Value Range | Monitored Object and Dimension | Monitoring Period (Raw Data) |
|---|---|---|---|---|---|
| queue_job_launching_num | Number of Jobs Being Submitted | Displays the number of jobs in the Submitting state in the user queue. | ≥0 | Monitored object: queue Dimension: queue_id | 5 minutes |
| queue_job_running_num | Number of Running Jobs | Displays the number of running jobs in the user queue. | ≥0 | Monitored object: queue Dimension: queue_id | 5 minutes |
| queue_job_succeed_num | Number of Finished Jobs | Displays the number of completed jobs in the user queue. | ≥0 | Monitored object: queue Dimension: queue_id | 5 minutes |
| queue_job_failed_num | Number of Failed Jobs | Displays the number of failed jobs in the user queue. | ≥0 | Monitored object: queue Dimension: queue_id | 5 minutes |
| queue_job_cancelled_num | Number of Canceled Jobs | Displays the number of canceled jobs in the user queue. | ≥0 | Monitored object: queue Dimension: queue_id | 5 minutes |
| queue_cpu_usage | Queue CPU Usage | Displays the CPU usage of user queues. | 0–100 | Monitored object: queue Dimension: queue_id | 5 minutes |
| queue_disk_usage | Queue Disk Usage | Displays the disk usage of user queues. | 0–100 | Monitored object: queue Dimension: queue_id | 5 minutes |
| queue_mem_usage | Queue Memory Usage | Displays the memory usage of user queues. | 0–100 | Monitored object: queue Dimension: queue_id | 5 minutes |
| flink_read_records_per_second | Flink Job Data Read Rate | Displays the data input rate of a Flink job for monitoring and debugging. | ≥0 | Monitored object: Flink job Dimension: flink_job_id | 10 seconds |
| flink_write_records_per_second | Flink Job Data Write Rate | Displays the data output rate of a Flink job for monitoring and debugging. | ≥0 | Monitored object: Flink job Dimension: flink_job_id | 10 seconds |
| flink_read_records_total | Flink Job Total Data Read | Displays the total number of input data records of a Flink job for monitoring and debugging. | ≥0 | Monitored object: Flink job Dimension: flink_job_id | 10 seconds |
| flink_write_records_total | Flink Job Total Data Write | Displays the total number of output data records of a Flink job for monitoring and debugging. | ≥0 | Monitored object: Flink job Dimension: flink_job_id | 10 seconds |
| flink_read_bytes_per_second | Flink Job Byte Read Rate | Displays the number of input bytes per second of a Flink job. | ≥0 | Monitored object: Flink job Dimension: flink_job_id | 10 seconds |
| flink_write_bytes_per_second | Flink Job Byte Write Rate | Displays the number of output bytes per second of a Flink job. | ≥0 | Monitored object: Flink job Dimension: flink_job_id | 10 seconds |
| flink_read_bytes_total | Flink Job Total Read Byte | Displays the total number of input bytes of a Flink job. | ≥0 | Monitored object: Flink job Dimension: flink_job_id | 10 seconds |
| flink_write_bytes_total | Flink Job Total Write Byte | Displays the total number of output bytes of a Flink job. | ≥0 | Monitored object: Flink job Dimension: flink_job_id | 10 seconds |
| flink_cpu_usage | Flink Job CPU Usage | Displays the CPU usage of Flink jobs. | 0–100 | Monitored object: Flink job Dimension: flink_job_id | 10 seconds |
| flink_mem_usage | Flink Job Memory Usage | Displays the memory usage of Flink jobs. | 0–100 | Monitored object: Flink job Dimension: flink_job_id | 10 seconds |
| flink_max_op_latency | Flink Job Max Operator Latency | Displays the maximum operator latency of a Flink job, in milliseconds. | ≥0 | Monitored object: Flink job Dimension: flink_job_id | 10 seconds |
| flink_max_op_backpressure_level | Flink Job Maximum Operator Backpressure | Displays the maximum operator backpressure value of a Flink job. A larger value indicates more severe backpressure. 0: OK; 50: low; 100: high | 0–100 | Monitored object: Flink job Dimension: flink_job_id | 10 seconds |
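The backpressure levels listed for flink_max_op_backpressure_level can be turned into readable labels when processing metric samples. The following is a minimal sketch, not part of DLI or Cloud Eye. The function name `backpressure_label` is hypothetical, and the handling of values between the three documented levels (each is assigned to the next documented level up) is an assumption, because the table defines only 0, 50, and 100.

```python
def backpressure_label(level: float) -> str:
    """Map a flink_max_op_backpressure_level sample to a label.

    Documented levels: 0 = OK, 50 = low, 100 = high.
    Assumption: values between documented levels are assigned to the
    next documented level up (for example, 30 is treated as "low").
    """
    if not 0 <= level <= 100:
        raise ValueError(f"backpressure level out of range 0-100: {level}")
    if level == 0:
        return "OK"
    if level <= 50:
        return "low"
    return "high"


if __name__ == "__main__":
    for sample in (0, 50, 100):
        print(sample, backpressure_label(sample))
```

A label of this kind may be easier to use in alarm messages or dashboards than the raw numeric value.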
Dimension
| Key | Value |
|---|---|
| queue_id | Queue |
| flink_job_id | Flink job |
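The namespace, a metric ID, and a dimension key/value pair together identify one time series in Cloud Eye. The sketch below only builds a query URL and sends no request. It assumes the Cloud Eye metric data query takes the form `GET /V1.0/{project_id}/metric-data` with `namespace`, `metric_name`, `dim.0`, `from`, `to`, `period`, and `filter` parameters; verify the path and parameters against the Cloud Eye API reference before use. The endpoint, project ID, and queue value in the usage example are placeholders, and `build_metric_query` is a hypothetical helper name.

```python
from urllib.parse import urlencode

NAMESPACE = "SYS.DLI"  # namespace given in this section


def build_metric_query(endpoint: str, project_id: str, metric_name: str,
                       dim_key: str, dim_value: str,
                       start_ms: int, end_ms: int,
                       period: int = 300, stat: str = "average") -> str:
    """Build a metric-data query URL (assumed Cloud Eye request format).

    dim_key is queue_id or flink_job_id, as listed in the Dimension table.
    start_ms and end_ms are timestamps in milliseconds.
    """
    params = [
        ("namespace", NAMESPACE),
        ("metric_name", metric_name),
        ("dim.0", f"{dim_key},{dim_value}"),  # assumed "key,value" format
        ("from", start_ms),
        ("to", end_ms),
        ("period", period),   # assumed aggregation period in seconds
        ("filter", stat),     # assumed aggregation method
    ]
    # Keep the comma in the dimension parameter unescaped.
    query = urlencode(params, safe=",")
    return f"{endpoint}/V1.0/{project_id}/metric-data?{query}"


if __name__ == "__main__":
    # Placeholder endpoint, project ID, and queue value for illustration only.
    print(build_metric_query("https://ces.example.com", "my-project",
                             "queue_cpu_usage", "queue_id", "q-123",
                             1000, 2000))
```

For Flink job metrics, pass `flink_job_id` as the dimension key and a metric ID such as `flink_cpu_usage`.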