Updated on 2024-08-20 GMT+08:00

Viewing Monitoring Metrics

Description

This section describes the metrics that DLI reports to Cloud Eye, along with their namespace and dimensions. You can use the Cloud Eye management console or the Cloud Eye APIs to query the metrics of monitored objects and the alarms generated for DLI.

Namespace

SYS.DLI

Metric

Table 1 DLI metrics

| Metric ID | Name | Description | Value Range | Monitored Object | Monitoring Period (Raw Data) |
| --- | --- | --- | --- | --- | --- |
| queue_cu_num | Queue CU Usage | Displays the number of CUs applied for by the user queue. | ≥ 0 | Queues | 5 minutes |
| queue_job_launching_num | Number of Jobs Being Submitted | Displays the number of jobs in the Submitting state in the user queue. | ≥ 0 | Queues | 5 minutes |
| queue_job_running_num | Number of Running Jobs | Displays the number of running jobs in the user queue. | ≥ 0 | Queues | 5 minutes |
| queue_job_succeed_num | Number of Finished Jobs | Displays the number of completed jobs in the user queue. | ≥ 0 | Queues | 5 minutes |
| queue_job_failed_num | Failed Jobs | Displays the number of failed jobs in the user queue. | ≥ 0 | Queues | 5 minutes |
| queue_job_cancelled_num | Number of Canceled Jobs | Displays the number of canceled jobs in the user queue. | ≥ 0 | Queues | 5 minutes |
| queue_alloc_cu_num | Allocated CUs (queue) | Displays the CU allocation for user queues. | ≥ 0 | Queues | 5 minutes |
| queue_min_cu_num | Minimum CUs for Queue | Displays the minimum number of CUs for a user queue. | ≥ 0 | Queues | 5 minutes |
| queue_max_cu_num | Maximum CUs for Queue | Displays the maximum number of CUs for a user queue. | ≥ 0 | Queues | 5 minutes |
| queue_priority | Queue Priority | Displays the priority of a user queue. | 1–100 | Queues | 5 minutes |
| queue_cpu_usage | Queue CPU Usage | Displays the CPU usage of user queues. | 0–100 | Queues | 5 minutes |
| queue_disk_usage | Queue Disk Usage | Displays the disk usage of user queues. | 0–100 | Queues | 5 minutes |
| queue_disk_used | Max Disk Usage | Displays the maximum disk usage of user queues. | 0–100 | Queues | 5 minutes |
| queue_mem_usage | Queue Memory Usage | Displays the memory usage of user queues. | 0–100 | Queues | 5 minutes |
| queue_mem_used | Used Memory | Displays the memory usage rate of the user queues. | ≥ 0 | Queues | 5 minutes |
| flink_read_records_per_second | Flink Job Data Read Rate | Displays the data input rate of a Flink job for monitoring and debugging. | ≥ 0 | Flink jobs | 10 seconds |
| flink_write_records_per_second | Flink Job Data Write Rate | Displays the data output rate of a Flink job for monitoring and debugging. | ≥ 0 | Flink jobs | 10 seconds |
| flink_read_records_total | Flink Job Total Data Read | Displays the total number of input data records of a Flink job for monitoring and debugging. | ≥ 0 | Flink jobs | 10 seconds |
| flink_write_records_total | Flink Job Total Data Write | Displays the total number of output data records of a Flink job for monitoring and debugging. | ≥ 0 | Flink jobs | 10 seconds |
| flink_read_bytes_per_second | Flink Job Byte Read Rate | Displays the number of input bytes per second of a Flink job. | ≥ 0 | Flink jobs | 10 seconds |
| flink_write_bytes_per_second | Flink Job Byte Write Rate | Displays the number of output bytes per second of a Flink job. | ≥ 0 | Flink jobs | 10 seconds |
| flink_read_bytes_total | Flink Job Total Read Byte | Displays the total number of input bytes of a Flink job. | ≥ 0 | Flink jobs | 10 seconds |
| flink_write_bytes_total | Flink Job Total Write Byte | Displays the total number of output bytes of a Flink job. | ≥ 0 | Flink jobs | 10 seconds |
| flink_cpu_usage | Flink Job CPU Usage | Displays the CPU usage of Flink jobs. | 0–100 | Flink jobs | 10 seconds |
| flink_mem_usage | Flink Job Memory Usage | Displays the memory usage of Flink jobs. | 0–100 | Flink jobs | 10 seconds |
| flink_max_op_latency | Flink Job Max Operator Latency | Displays the maximum operator latency of a Flink job, in ms. | ≥ 0 | Flink jobs | 10 seconds |
| flink_max_op_backpressure_level | Flink Job Maximum Operator Backpressure | Displays the maximum operator backpressure value of a Flink job. A larger value indicates more severe backpressure. 0: OK; 50: low; 100: high. | 0–100 | Flink jobs | 10 seconds |
| elastic_resource_pool_cpu_usage | CPU Usage of Elastic Resource Pool | Displays the CPU usage of elastic resource pools. | 0–100 | Elastic resource pools | 5 minutes |
| elastic_resource_pool_mem_usage | Memory Usage of Elastic Resource Pool | Displays the memory usage of elastic resource pools. | 0–100 | Elastic resource pools | 5 minutes |
| elastic_resource_pool_disk_usage | Disk Usage of Elastic Resource Pool | Displays the disk usage of elastic resource pools. | 0–100 | Elastic resource pools | 5 minutes |
| elastic_resource_pool_disk_max_usage | Maximum Disk Usage of Elastic Resource Pool | Displays the maximum disk usage of elastic resource pools. | 0–100 | Elastic resource pools | 5 minutes |
| elastic_resource_pool_cu_num | CU Usage of Elastic Resource Pool | Displays the CU usage of elastic resource pools. | ≥ 0 | Elastic resource pools | 5 minutes |
| elastic_resource_pool_alloc_cu_num | Allocated CUs of Elastic Resource Pool | Displays the CU allocation of elastic resource pools. | ≥ 0 | Elastic resource pools | 5 minutes |
| elastic_resource_pool_min_cu_num | Minimum CUs of Elastic Resource Pool | Displays the minimum number of CUs of elastic resource pools. | ≥ 0 | Elastic resource pools | 5 minutes |
| elastic_resource_pool_max_cu_num | Maximum CUs of Elastic Resource Pool | Displays the maximum number of CUs of elastic resource pools. | ≥ 0 | Elastic resource pools | 5 minutes |
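
The description of flink_max_op_backpressure_level lists three values (0, 50, and 100) and a label for each. As a minimal sketch, the helper below turns a reported value into that label. The function name backpressure_label and the "unknown" fallback for any other value are illustrative assumptions, not part of DLI or Cloud Eye.

```python
# Labels taken from the flink_max_op_backpressure_level row of Table 1.
BACKPRESSURE_LABELS = {0: "OK", 50: "low", 100: "high"}


def backpressure_label(value: float) -> str:
    """Map a flink_max_op_backpressure_level value to its documented label.

    Values other than 0, 50, and 100 are not described in the table, so this
    sketch returns "unknown" for them (an assumption made for illustration).
    """
    for level, label in BACKPRESSURE_LABELS.items():
        if value == level:
            return label
    return "unknown"


if __name__ == "__main__":
    for sample in (0, 50, 100):
        print(sample, backpressure_label(sample))
```

A helper like this could sit in front of an alerting rule so that notifications carry the label rather than the raw number.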

Dimension

Table 2 Dimension

| Key | Value |
| --- | --- |
| queue_id | Queue |
| flink_job_id | Flink job |
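
The namespace, a metric ID from Table 1, and a dimension from Table 2 together identify one metric series when you query through the Cloud Eye APIs. The sketch below only assembles a request path and query string; it sends nothing. The path /V1.0/{project_id}/metric-data and the parameter names (namespace, metric_name, dim.0, from, to, period, filter) reflect one reading of the Cloud Eye metric data query API and are assumptions here, so check them against the Cloud Eye API reference before use. The function names, the default period and filter values, and the sample IDs are hypothetical.

```python
from urllib.parse import urlencode

NAMESPACE = "SYS.DLI"  # from the Namespace section

# Dimension keys from Table 2, selected by metric ID prefix.
DIMENSION_BY_PREFIX = {
    "queue_": "queue_id",
    "flink_": "flink_job_id",
}


def dimension_key_for(metric_id: str) -> str:
    """Return the Table 2 dimension key that matches a metric ID."""
    for prefix, key in DIMENSION_BY_PREFIX.items():
        if metric_id.startswith(prefix):
            return key
    # Table 2 lists no dimension for other prefixes, so do not guess one.
    raise ValueError(f"no dimension listed in Table 2 for {metric_id!r}")


def build_metric_query(project_id, metric_id, resource_id,
                       start_ms, end_ms, period=300, stat="average"):
    """Build a request path and query string (assumed API shape, not verified)."""
    params = [
        ("namespace", NAMESPACE),
        ("metric_name", metric_id),
        ("dim.0", f"{dimension_key_for(metric_id)},{resource_id}"),
        ("from", start_ms),   # window start, assumed to be epoch milliseconds
        ("to", end_ms),       # window end, assumed to be epoch milliseconds
        ("period", period),   # placeholder aggregation period
        ("filter", stat),     # placeholder aggregation method
    ]
    return f"/V1.0/{project_id}/metric-data?{urlencode(params)}"


if __name__ == "__main__":
    # "my-project" and "my-queue-id" are placeholder values.
    print(build_metric_query("my-project", "queue_cu_num", "my-queue-id",
                             1700000000000, 1700003600000))
```

Keeping the dimension lookup separate from the URL builder means a metric whose dimension is not listed in Table 2 fails early with a clear error instead of producing a malformed query.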

Viewing DLI Monitoring Metrics on Cloud Eye

  1. Search for Cloud Eye on the management console.
  2. In the navigation pane on the left of the Cloud Eye console, click Cloud Service Monitoring > Data Lake Insight.
  3. Select a queue to view its monitoring metrics.