Updated on 2024-05-27 GMT+08:00

Basic Metrics: Node Exporter Metrics

This section describes the types, names, and meanings of metrics reported by Node Exporter to AOM.

Table 1 Metrics of containers running in CCE or on-premises Kubernetes clusters

Job

Metric

Description

node-exporter

node_filesystem_size_bytes

Total size of a file system

node_filesystem_readonly

Whether a file system is mounted read-only (1: read-only; 0: writable)

node_filesystem_free_bytes

Remaining space of a file system

node_filesystem_avail_bytes

File system space that is available to non-root users

node_cpu_seconds_total

Seconds each CPU spent doing each type of work

node_network_receive_bytes_total

Total amount of received data

node_network_receive_errs_total

Cumulative number of errors encountered during reception

node_network_transmit_bytes_total

Total amount of transmitted data

node_network_receive_packets_total

Cumulative number of packets received

node_network_transmit_drop_total

Cumulative number of dropped packets during transmission

node_network_transmit_errs_total

Cumulative number of errors encountered during transmission

node_network_up

NIC status (1: up; 0: not up)

node_network_transmit_packets_total

Cumulative number of packets transmitted

node_network_receive_drop_total

Cumulative number of packets dropped during reception

go_gc_duration_seconds

This value is obtained by calling the debug.ReadGCStats() function. Before the call, the PauseQuantiles field of the GCStats structure is set to a slice of length 5, so the function returns five GC pause time quantiles (minimum, 25%, 50%, 75%, and maximum). The Prometheus Go client then creates a summary metric from these quantiles, NumGC, and PauseTotal.

node_load5

5-minute system load average

node_filefd_allocated

Allocated file descriptors

node_exporter_build_info

Node exporter build information

node_disk_written_bytes_total

Number of bytes that are written

node_disk_writes_completed_total

Number of writes completed

node_disk_write_time_seconds_total

Number of seconds spent by all writes

node_nf_conntrack_entries

Number of currently allocated flow entries for connection tracking

node_nf_conntrack_entries_limit

Maximum size of a connection tracking table

node_processes_max_processes

Maximum number of PIDs allowed on a node

node_processes_pids

Number of PIDs

node_sockstat_TCP_alloc

Number of allocated TCP sockets

node_sockstat_TCP_inuse

Number of TCP sockets in use

node_sockstat_TCP_tw

Number of TCP sockets in the TIME_WAIT state

node_timex_offset_seconds

Time offset between the local system clock and the reference clock, in seconds

node_timex_sync_status

Synchronization status of the node clock (1: synchronized to a reliable server; 0: not synchronized)

node_uname_info

Labeled system information as provided by the uname system call

node_vmstat_pgfault

Cumulative number of page faults, read from /proc/vmstat

node_vmstat_pgmajfault

Cumulative number of major page faults, read from /proc/vmstat

node_vmstat_pgpgin

Cumulative amount of data paged in from block devices to main memory, read from /proc/vmstat

node_vmstat_pgpgout

Cumulative amount of data paged out from main memory to block devices, read from /proc/vmstat

node_disk_reads_completed_total

Number of reads completed

node_disk_read_time_seconds_total

Number of seconds spent by all reads

process_cpu_seconds_total

Total CPU time, in seconds, that the Go process has consumed on the OS. The value is derived from utime (the number of ticks the process has executed in user mode) and stime (the number of ticks the process has executed in kernel mode, for example, during system calls). Both are counted in jiffies, where one jiffy is the interval between two system timer interrupts. Formula: process_cpu_seconds_total = (utime + stime)/USER_HZ

node_disk_read_bytes_total

Number of bytes that are read

node_disk_io_time_weighted_seconds_total

The weighted number of seconds spent doing I/Os

node_disk_io_time_seconds_total

Total seconds spent doing I/Os

node_disk_io_now

Number of I/Os in progress

node_context_switches_total

Number of context switches

node_boot_time_seconds

Node boot time, as a Unix timestamp in seconds

process_resident_memory_bytes

Resident set size (RSS), which is the memory actually used by a process. It includes the shared memory, but does not include the allocated but unused memory or swapped-out memory.

node_intr_total

Total number of interrupts serviced

node_load1

1-minute system load average

go_goroutines

This value is obtained by calling runtime.NumGoroutine(), which calculates the number of goroutines from the sched scheduler structure and the global allglen variable. Because fields in the sched structure may change concurrently, the calculated value is checked: if it is less than 1, 1 is returned.

scrape_duration_seconds

Time spent on collecting information about the scrape target

node_load15

15-minute system load average

scrape_samples_post_metric_relabeling

Number of remaining samples after metrics are relabeled

node_netstat_Tcp_PassiveOpens

Number of TCP connections that directly change from the LISTEN state to the SYN-RCVD state

scrape_samples_scraped

Number of samples scraped

node_netstat_Tcp_CurrEstab

Number of TCP connections in the ESTABLISHED or CLOSE-WAIT state

scrape_series_added

Approximate number of new series generated in a scrape

node_netstat_Tcp_ActiveOpens

Number of TCP connections that directly change from the CLOSED state to the SYN-SENT state

node_memory_MemTotal_bytes

Total memory of a node

node_memory_MemFree_bytes

Free memory of a node

node_memory_MemAvailable_bytes

Available memory of a node

node_memory_Cached_bytes

Memory for the node page cache

up

Scrape target status (1: the last scrape succeeded; 0: the last scrape failed)

node_memory_Buffers_bytes

Memory used for buffers on a node
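
Two calculations come up repeatedly when reading the metrics in Table 1: converting CPU ticks to seconds with the process_cpu_seconds_total formula, and turning a cumulative counter (such as node_network_receive_bytes_total) into a per-second rate. The following Python sketch illustrates both. The function names are hypothetical and are not part of Node Exporter or AOM. The counter-reset handling assumes that a counter restarts from 0, which may not match what your query engine does.

```python
from typing import Tuple


def cpu_seconds(utime_ticks: int, stime_ticks: int, user_hz: int) -> float:
    """Apply process_cpu_seconds_total = (utime + stime) / USER_HZ."""
    if user_hz <= 0:
        raise ValueError("user_hz must be positive")
    return (utime_ticks + stime_ticks) / user_hz


def parse_stat_ticks(stat_line: str) -> Tuple[int, int]:
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # Field 2 (comm) may contain spaces or parentheses, so split after the
    # last ')' instead of splitting the whole line on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so fields 14 (utime) and 15 (stime)
    # are at indexes 11 and 12.
    return int(rest[11]), int(rest[12])


def per_second_rate(prev_value: float, curr_value: float,
                    interval_seconds: float) -> float:
    """Average per-second increase of a cumulative counter between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if curr_value < prev_value:
        # Assumption: the counter was reset (for example, after a reboot)
        # and restarted from 0, so the current value is the whole increase.
        return curr_value / interval_seconds
    return (curr_value - prev_value) / interval_seconds


def read_self_cpu_seconds() -> float:
    """Linux only: CPU seconds consumed so far by the current process."""
    import os
    user_hz = os.sysconf("SC_CLK_TCK")  # ticks per second (USER_HZ)
    with open("/proc/self/stat") as stat_file:
        utime, stime = parse_stat_ticks(stat_file.read())
    return cpu_seconds(utime, stime, user_hz)
```

For example, with utime = 150, stime = 50, and USER_HZ = 100, cpu_seconds returns 2.0 seconds. Two samples of a byte counter taken 60 seconds apart with values 1000 and 1600 give a rate of 10 bytes per second.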