Pod Resource Monitoring Metric

CCI supports basic monitoring of pod resources with multiple metrics, such as metrics for CPU, memory, disk, and network.

Pods have built-in system agents that expose pod and container monitoring metrics over HTTP by default. Reserve 30 MB for the agent that is integrated into a pod.

Resource Metrics

Basic monitoring metrics cover CPU, file system and disk I/O, memory, network, process, and GPU usage. For details, see Table 1.

**Table 1** Resource metrics
Category	Metric	Description
CPU	container_cpu_system_seconds_total	Cumulative system CPU time consumed (unit: second)
	container_cpu_usage_seconds_total	Cumulative time that the container consumed on all CPU cores (unit: second)
	container_cpu_user_seconds_total	Cumulative user CPU time consumed (unit: second)
	container_cpu_cfs_periods_total	Number of elapsed enforcement period intervals
	container_cpu_cfs_throttled_periods_total	Number of throttled period intervals
	container_cpu_cfs_throttled_seconds_total	Total time duration the container has been throttled (unit: second)
File system and disk I/O	container_fs_inodes_free	Number of available inodes in the file system
	container_fs_usage_bytes	File system usage (unit: byte)
	container_fs_inodes_total	Total number of inodes in the file system
	container_fs_io_current	Number of I/Os currently in progress in the disk or file system
	container_fs_io_time_seconds_total	Cumulative seconds spent on doing I/Os by the disk or file system
	container_fs_io_time_weighted_seconds_total	Cumulative weighted I/O time of the disk or file system
	container_fs_limit_bytes	Total disk or file system capacity that can be consumed by the container (unit: byte)
	container_fs_reads_bytes_total	Cumulative amount of disk or file system data read by the container (unit: byte)
	container_fs_read_seconds_total	Cumulative count of seconds the container spent on reading disk or file system data
	container_fs_reads_merged_total	Cumulative count of merged disk or file system reads made by the container
	container_fs_reads_total	Cumulative count of disk or file system reads completed by the container
	container_fs_sector_reads_total	Cumulative count of sector reads completed by the container in the disk or file system
	container_fs_sector_writes_total	Cumulative count of sector writes completed by the container to the disk or file system
	container_fs_writes_bytes_total	Total amount of data written by the container to the disk or file system (unit: byte)
	container_fs_write_seconds_total	Cumulative count of seconds the container spent on writing data to the disk or file system
	container_fs_writes_merged_total	Cumulative count of merged container writes to the disk or file system
	container_fs_writes_total	Cumulative count of completed container writes to the disk or file system
	container_blkio_device_usage_total	Blkio device usage (unit: byte)
Memory	container_memory_failures_total	Cumulative count of container memory allocation failures
	container_memory_failcnt	Number of times memory usage hit the limit
	container_memory_cache	Total page cache memory of the container (unit: byte)
	container_memory_mapped_file	Size of memory mapped files (unit: byte)
	container_memory_max_usage_bytes	Maximum memory usage recorded for the container (unit: byte)
	container_memory_rss	Size of the resident memory set for the container (unit: byte)
	container_memory_swap	Container swap usage (unit: byte)
	container_memory_usage_bytes	Current memory usage of the container (unit: byte)
	container_memory_working_set_bytes	Memory usage of the working set of the container (unit: byte)
Network	container_network_receive_bytes_total	Total volume of data received by the container network (unit: byte)
	container_network_receive_errors_total	Cumulative count of errors encountered during reception
	container_network_receive_packets_dropped_total	Cumulative count of packets dropped during reception
	container_network_receive_packets_total	Cumulative count of packets received
	container_network_transmit_bytes_total	Total volume of data transmitted on the container network (unit: byte)
	container_network_transmit_errors_total	Cumulative count of errors encountered during transmission
	container_network_transmit_packets_dropped_total	Cumulative count of packets dropped during transmission
	container_network_transmit_packets_total	Cumulative count of packets transmitted
Process	container_processes	Number of processes running inside the container
	container_sockets	Number of open sockets for the container
	container_file_descriptors	Number of open file descriptors for the container
	container_threads	Number of threads running inside the container
	container_threads_max	Maximum number of threads allowed inside the container
	container_ulimits_soft	Soft ulimit values of process 1 in the container. A value of -1 means unlimited, except for priority and nice.
	container_spec_cpu_period	CPU period of the container
	container_spec_cpu_shares	CPU share of the container
	container_spec_memory_limit_bytes	Memory limit for the container (unit: byte)
	container_spec_memory_reservation_limit_bytes	Memory reservation limit for the container (unit: byte)
	container_spec_memory_swap_limit_bytes	Memory swap limit for the container (unit: byte)
	container_start_time_seconds	Start time of the container, expressed in seconds since the Unix epoch
	container_last_seen	Last time a container was seen by the exporter
GPU	container_accelerator_memory_used_bytes	GPU accelerator memory that is being used by the container (unit: byte)
	container_accelerator_memory_total_bytes	Total available GPU accelerator memory (unit: byte)
	container_accelerator_duty_cycle	Percentage of time when the GPU accelerator is actually running

There are 59 monitoring metrics in total, consistent with those provided by cAdvisor.

For details about the metrics, see the cAdvisor documentation.
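Most of the metrics in Table 1 are cumulative counters, so they are usually read as rates or ratios rather than raw values. The following is a minimal sketch of a Prometheus rule file that does this, assuming a Prometheus-compatible server already collects the metrics. The group name, rule names, and the 90% threshold are hypothetical and are not defined by CCI.

```yaml
# Hypothetical Prometheus rule file. Names and thresholds are illustrative only.
groups:
  - name: cci-pod-resource-examples
    rules:
      # CPU cores consumed per second, averaged over the last 5 minutes.
      - record: example:container_cpu_usage_cores:rate5m
        expr: rate(container_cpu_usage_seconds_total[5m])
      # Fraction of CFS enforcement periods in which the container was throttled.
      - record: example:container_cpu_throttled_ratio:rate5m
        expr: >-
          rate(container_cpu_cfs_throttled_periods_total[5m])
          / rate(container_cpu_cfs_periods_total[5m])
      # Fires when the working set stays above 90% of the memory limit.
      # The "> 0" filter skips containers that have no memory limit set.
      - alert: ExampleContainerMemoryNearLimit
        expr: >-
          container_memory_working_set_bytes
          / (container_spec_memory_limit_bytes > 0) > 0.9
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Working set above 90% of the memory limit"
```

A throttling ratio that stays well above zero suggests the CPU limit is too low for the workload.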

Basic Configuration

The following example shows how to configure pod resource monitoring metrics, including enabling or disabling the metrics at the pod level and customizing the listening port.

kind: Deployment
apiVersion: apps/v1
metadata:
  name: nginx-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-exporter
  template:
    metadata:
      labels:
        app: nginx-exporter
      annotations:
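        # Enable the built-in pod exporter (default: "true").
        monitoring.cci.io/enable-pod-metrics: "true"
        # Port that the pod exporter listens on (default: "19100").
        monitoring.cci.io/metrics-port: "19100"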
    spec:
      containers:
        - name: container-0
          image: 'nginx:alpine'
          resources:
            limits:
              cpu: 1000m
              memory: 2048Mi
            requests:
              cpu: 1000m
              memory: 2048Mi
      imagePullSecrets:
        - name: imagepull-secret

**Table 2** Parameter description
Annotation	Function	Available Value	Default Value
monitoring.cci.io/enable-pod-metrics	Whether to enable the monitoring metrics	true, false (case insensitive)	true
monitoring.cci.io/metrics-port	Listening port of the pod exporter	Valid ports (1 to 65535)	19100
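The two annotations in Table 2 can be varied independently. The fragments below are a sketch of the pod template metadata only, not complete manifests. Port 9200 is an arbitrary example value chosen for illustration.

```yaml
# Variant 1: keep metrics enabled but move the exporter to another port.
# 9200 is an arbitrary example; any valid port from 1 to 65535 is accepted.
metadata:
  annotations:
    monitoring.cci.io/enable-pod-metrics: "true"
    monitoring.cci.io/metrics-port: "9200"
---
# Variant 2: turn the pod exporter off for this workload.
metadata:
  annotations:
    monitoring.cci.io/enable-pod-metrics: "false"
```

Annotation values are quoted because Kubernetes requires annotation values to be strings. An unquoted true or 9200 would be parsed as a boolean or an integer and rejected.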

Advanced Configuration

Creating a Secret

A secret is a resource object for encrypted storage of sensitive data. You can store authentication information such as passwords and tokens, as well as certificates and private keys, in a secret.

The secret defined in the following example contains three key-value pairs.

apiVersion: v1
kind: Secret
metadata:
  name: cert
type: Opaque
data:
  # Values under data must be Base64-encoded.
  ca.crt: ...
  server.crt: ...
  server.key: ...

Configuring a TLS Certificate

You can use annotations to specify the TLS certificate, private key, and CA certificate of the exporter server for encrypted communication, and mount the certificate secret as a volume to associate it with the pod. Example:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: nginx-tls
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-tls
  template:
    metadata:
      labels:
        app: nginx-tls
      annotations:
        monitoring.cci.io/enable-pod-metrics: "true"
        monitoring.cci.io/metrics-port: "19100"
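        # References use the form <volume name>/<key or path>; "cert" is the volume defined under spec.volumes.
        monitoring.cci.io/metrics-tls-cert-reference: cert/server.crt
        monitoring.cci.io/metrics-tls-key-reference: cert/server.key
        monitoring.cci.io/metrics-tls-ca-reference: cert/ca.crt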
        sandbox-volume.openvessel.io/volume-names: cert
    spec:
      volumes:
        - name: cert
          secret:
            secretName: cert
            defaultMode: 384  # decimal 384 = octal 0600 (owner read/write only)
      containers:
        - name: container-0
          image: 'nginx:alpine'
          resources:
            limits:
              cpu: 1000m
              memory: 2048Mi
            requests:
              cpu: 1000m
              memory: 2048Mi
          volumeMounts:
            - name: cert
              mountPath: /tmp/secret0
      imagePullSecrets:
        - name: imagepull-secret

**Table 3** TLS certificate parameters
Annotation	Function	Available Value	Default Value
monitoring.cci.io/metrics-tls-cert-reference	TLS certificate volume reference	${volume-name}/${volume-keyOrPath} (Volume/Path)	None (HTTP is used.)
monitoring.cci.io/metrics-tls-key-reference	TLS private key volume reference	${volume-name}/${volume-keyOrPath}	None (HTTP is used.)
monitoring.cci.io/metrics-tls-ca-reference	TLS CA volume reference	${volume-name}/${volume-keyOrPath}	None (HTTP is used.)
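Once the TLS annotations are set, a collector must reach the exporter over HTTPS instead of HTTP. The following is a minimal sketch of a Prometheus scrape job for this, not a CCI-specific configuration. The job name, target address, file paths, server name, and the /metrics path are all assumptions. Whether the exporter requires a client certificate is not stated here, so the client certificate lines may be unnecessary.

```yaml
# Hypothetical Prometheus scrape job for a TLS-enabled pod exporter.
scrape_configs:
  - job_name: cci-pod-exporter-tls
    scheme: https
    metrics_path: /metrics                    # assumed path; confirm for your environment
    tls_config:
      # CA certificate used to verify server.crt presented by the exporter.
      ca_file: /etc/prometheus/certs/ca.crt
      # Client certificate and key; only needed if the exporter verifies clients (assumption).
      cert_file: /etc/prometheus/certs/client.crt
      key_file: /etc/prometheus/certs/client.key
      # Must match a name in server.crt when scraping by IP address.
      server_name: exporter.example.com
    static_configs:
      - targets:
          - "192.0.2.10:19100"                # placeholder pod IP and the default exporter port
```

If none of the TLS annotations are set, the exporter serves plain HTTP. In that case the same job would use scheme: http and omit tls_config.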