Updated on 2023-09-26 GMT+08:00

Pod Resource Monitoring Metric

CCI supports basic monitoring of pod resources with multiple metrics, such as metrics for CPU, memory, disk, and network.

Each pod has a built-in system agent that exposes pod and container monitoring metrics over HTTP by default. Reserve 30 MB for the agent that is integrated into the pod.

Resource Metrics

Basic monitoring metrics cover CPU, memory, file system and disk I/O, network, process, and GPU usage. For details, see Table 1.

Table 1 Resource metrics

| Category | Metric | Description |
| --- | --- | --- |
| CPU | container_cpu_system_seconds_total | Cumulative system CPU time consumed (unit: second) |
| | container_cpu_usage_seconds_total | Cumulative time that the container consumed on all CPU cores (unit: second) |
| | container_cpu_user_seconds_total | Cumulative user CPU time consumed (unit: second) |
| | container_cpu_cfs_periods_total | Number of elapsed enforcement period intervals |
| | container_cpu_cfs_throttled_periods_total | Number of throttled period intervals |
| | container_cpu_cfs_throttled_seconds_total | Total time duration the container has been throttled (unit: second) |
| File system and disk I/O | container_fs_inodes_free | Number of available inodes in the file system |
| | container_fs_usage_bytes | File system usage (unit: byte) |
| | container_fs_inodes_total | Total number of inodes in the file system |
| | container_fs_io_current | Number of I/Os currently in progress in the disk or file system |
| | container_fs_io_time_seconds_total | Cumulative seconds spent on doing I/Os by the disk or file system |
| | container_fs_io_time_weighted_seconds_total | Cumulative weighted I/O time of the disk or file system |
| | container_fs_limit_bytes | Total disk or file system capacity that can be consumed by the container (unit: byte) |
| | container_fs_reads_bytes_total | Cumulative amount of disk or file system data read by the container (unit: byte) |
| | container_fs_read_seconds_total | Cumulative count of seconds the container spent on reading disk or file system data |
| | container_fs_reads_merged_total | Cumulative count of merged disk or file system reads made by the container |
| | container_fs_reads_total | Cumulative count of disk or file system reads completed by the container |
| | container_fs_sector_reads_total | Cumulative count of sector reads completed by the container in the disk or file system |
| | container_fs_sector_writes_total | Cumulative count of sector writes completed by the container to the disk or file system |
| | container_fs_writes_bytes_total | Total amount of data written by the container to the disk or file system (unit: byte) |
| | container_fs_write_seconds_total | Cumulative count of seconds the container spent on writing data to the disk or file system |
| | container_fs_writes_merged_total | Cumulative count of merged container writes to the disk or file system |
| | container_fs_writes_total | Cumulative count of completed container writes to the disk or file system |
| | container_blkio_device_usage_total | Blkio device usage (unit: byte) |
| Memory | container_memory_failures_total | Cumulative count of container memory allocation failures |
| | container_memory_failcnt | Number of times that memory usage hit the limit |
| | container_memory_cache | Total page cache memory of the container (unit: byte) |
| | container_memory_mapped_file | Size of memory mapped files (unit: byte) |
| | container_memory_max_usage_bytes | Maximum memory usage recorded for the container (unit: byte) |
| | container_memory_rss | Size of the resident memory set for the container (unit: byte) |
| | container_memory_swap | Container swap usage (unit: byte) |
| | container_memory_usage_bytes | Current memory usage of the container (unit: byte) |
| | container_memory_working_set_bytes | Memory usage of the working set of the container (unit: byte) |
| Network | container_network_receive_bytes_total | Total volume of data received by the container network (unit: byte) |
| | container_network_receive_errors_total | Cumulative count of errors encountered during reception |
| | container_network_receive_packets_dropped_total | Cumulative count of packets dropped during reception |
| | container_network_receive_packets_total | Cumulative count of packets received |
| | container_network_transmit_bytes_total | Total volume of data transmitted on the container network (unit: byte) |
| | container_network_transmit_errors_total | Cumulative count of errors encountered during transmission |
| | container_network_transmit_packets_dropped_total | Cumulative count of packets dropped during transmission |
| | container_network_transmit_packets_total | Cumulative count of packets transmitted |
| Process | container_processes | Number of processes running inside the container |
| | container_sockets | Number of open sockets for the container |
| | container_file_descriptors | Number of open file descriptors for the container |
| | container_threads | Number of threads running inside the container |
| | container_threads_max | Maximum number of threads allowed inside the container |
| | container_ulimits_soft | Soft ulimit value of process 1 in the container. A value of -1 means unlimited, except for priority and nice. |
| | container_spec_cpu_period | CPU period of the container |
| | container_spec_cpu_shares | CPU share of the container |
| | container_spec_memory_limit_bytes | Memory limit for the container |
| | container_spec_memory_reservation_limit_bytes | Memory reservation limit for the container |
| | container_spec_memory_swap_limit_bytes | Memory swap limit for the container |
| | container_start_time_seconds | Start time of the container (unit: second, counted from the Unix epoch) |
| | container_last_seen | Last time a container was seen by the exporter |
| GPU | container_accelerator_memory_used_bytes | GPU accelerator memory that is being used by the container (unit: byte) |
| | container_accelerator_memory_total_bytes | Total available GPU accelerator memory (unit: byte) |
| | container_accelerator_duty_cycle | Percentage of time when the GPU accelerator is actually running |

CCI provides 59 monitoring metrics in total, the same as those provided by cAdvisor.

For details about the metrics, see the cAdvisor documentation.
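
Most metrics whose names end in _total are cumulative counters, so a usage figure for an interval comes from the difference between two scrapes, not from a single reading. The following is a minimal Python sketch of that calculation. The Sample type, the function names, and the sample readings are hypothetical, and treating a decreasing counter as a restart from zero is an assumption, not documented CCI behavior.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One scraped value of a cumulative metric."""
    timestamp: float  # scrape time, in seconds
    value: float      # counter value at that time


def per_second_rate(prev: Sample, curr: Sample) -> float:
    """Average per-second increase of a counter between two scrapes."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    delta = curr.value - prev.value
    if delta < 0:
        # Assumption: the counter restarted from zero, for example after
        # the container was restarted.
        delta = curr.value
    return delta / elapsed


def throttled_fraction(periods_delta: float, throttled_delta: float) -> float:
    """Share of CFS enforcement periods in which the container was throttled."""
    if periods_delta <= 0:
        return 0.0
    return throttled_delta / periods_delta


# Hypothetical readings of container_cpu_usage_seconds_total taken 30 s apart.
cores_used = per_second_rate(Sample(0.0, 10.0), Sample(30.0, 25.0))
print(f"average CPU cores in use: {cores_used}")
```

Applied to container_cpu_usage_seconds_total, the rate is the average number of CPU cores in use. Applied to the increases of container_cpu_cfs_throttled_periods_total and container_cpu_cfs_periods_total, throttled_fraction indicates how often the CPU limit was hit.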

Basic Configuration

The following example describes how to configure pod resource monitoring metrics, including enabling or disabling pod-level features and customizing ports.

kind: Deployment
apiVersion: apps/v1
metadata:
  name: nginx-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-exporter
  template:
    metadata:
      labels:
        app: nginx-exporter
      annotations:
        monitoring.cci.io/enable-pod-metrics: "true"
        monitoring.cci.io/metrics-port: "19100"
    spec:
      containers:
        - name: container-0
          image: 'nginx:alpine'
          resources:
            limits:
              cpu: 1000m
              memory: 2048Mi
            requests:
              cpu: 1000m
              memory: 2048Mi
      imagePullSecrets:
        - name: imagepull-secret
Table 2 Parameter description

| Annotation | Function | Available Value | Default Value |
| --- | --- | --- | --- |
| monitoring.cci.io/enable-pod-metrics | Whether to enable the monitoring metrics | true, false (case insensitive) | true |
| monitoring.cci.io/metrics-port | Listening port of the pod exporter | Valid ports (1 to 65535) | 19100 |
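
The value rules in Table 2 can be checked before a workload is submitted. The following Python sketch applies the defaults and accepted values from the table to a dictionary of pod annotations. The function name is hypothetical, and how CCI itself reacts to an invalid value is not described here, so raising an error is only an assumption made for the sketch.

```python
ENABLE_KEY = "monitoring.cci.io/enable-pod-metrics"
PORT_KEY = "monitoring.cci.io/metrics-port"


def parse_monitoring_annotations(annotations: dict) -> tuple[bool, int]:
    """Return (enabled, port), falling back to the defaults in Table 2."""
    # "true" and "false" are accepted in any letter case; the default is "true".
    raw_enabled = annotations.get(ENABLE_KEY, "true").lower()
    if raw_enabled not in ("true", "false"):
        raise ValueError(f"{ENABLE_KEY} must be true or false")

    # The port must be an integer from 1 to 65535; the default is 19100.
    raw_port = annotations.get(PORT_KEY, "19100")
    if not raw_port.isdigit() or not 1 <= int(raw_port) <= 65535:
        raise ValueError(f"{PORT_KEY} must be a port from 1 to 65535")

    return raw_enabled == "true", int(raw_port)


# A pod template without monitoring annotations gets the defaults.
print(parse_monitoring_annotations({}))
```

Annotation values are strings in Kubernetes, which is why the port is quoted in the Deployment example and parsed from text here.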

Advanced Configuration

Creating a Secret

A secret is a resource object for encrypted storage of sensitive data, such as authentication information, passwords, tokens, certificates, and private keys.

The secret defined in the following example contains three key-value pairs.

apiVersion: v1
kind: Secret
metadata:
  name: cert
type: Opaque
data:
  ca.crt: ...
  server.crt: ...
  server.key: ...
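
Kubernetes expects every value under the data field of a secret to be Base64-encoded. The following Python sketch shows one way to produce such values from raw file contents. The function name is hypothetical, and the placeholder bytes stand in for real certificate and key files.

```python
import base64


def encode_secret_data(files: dict) -> dict:
    """Base64-encode raw file contents for the data field of a Secret."""
    return {
        name: base64.b64encode(content).decode("ascii")
        for name, content in files.items()
    }


# Placeholder bytes only. In practice the content would be read from the
# certificate and key files, for example with pathlib.Path(...).read_bytes().
encoded = encode_secret_data({"ca.crt": b"placeholder CA certificate"})
print(encoded["ca.crt"])
```

Each encoded string is what replaces the corresponding elided value in the manifest.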

Configuring a TLS Certificate

You can configure annotations to specify the TLS certificate suite of the exporter server for encrypted communication and use the file mounting mode to associate the certificate secret. Example:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: nginx-tls
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-tls
  template:
    metadata:
      labels:
        app: nginx-tls
      annotations:
        monitoring.cci.io/enable-pod-metrics: "true"
        monitoring.cci.io/metrics-port: "19100"
        monitoring.cci.io/metrics-tls-cert-reference: cert/server.crt
        monitoring.cci.io/metrics-tls-key-reference: cert/server.key
        monitoring.cci.io/metrics-tls-ca-reference: cert/ca.crt
        sandbox-volume.openvessel.io/volume-names: cert
    spec:
      volumes:
        - name: cert
          secret:
            secretName: cert
            defaultMode: 384  # decimal form of octal 0600 (owner read/write only)
      containers:
        - name: container-0
          image: 'nginx:alpine'
          resources:
            limits:
              cpu: 1000m
              memory: 2048Mi
            requests:
              cpu: 1000m
              memory: 2048Mi
          volumeMounts:
            - name: cert
              mountPath: /tmp/secret0
      imagePullSecrets:
        - name: imagepull-secret
Table 3 TLS certificate parameters

| Annotation | Function | Available Value | Default Value |
| --- | --- | --- | --- |
| monitoring.cci.io/metrics-tls-cert-reference | TLS certificate volume reference | ${volume-name}/${volume-keyOrPath} (Volume/Path) | None (HTTP is used.) |
| monitoring.cci.io/metrics-tls-key-reference | TLS private key volume reference | ${volume-name}/${volume-keyOrPath} | None (HTTP is used.) |
| monitoring.cci.io/metrics-tls-ca-reference | TLS CA volume reference | ${volume-name}/${volume-keyOrPath} | None (HTTP is used.) |

Each value consists of the name of the storage volume, followed by the key or path within that volume where the TLS certificate, private key, or CA file is located.
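
The reference format can be checked against the volumes declared in the pod before deployment. The following Python sketch splits a reference at the first slash, which assumes the volume name itself contains no slash (Kubernetes volume names cannot), and reports annotations that point to an undeclared volume. The function names are hypothetical, and the check is illustrative and not part of CCI.

```python
TLS_REFERENCE_KEYS = (
    "monitoring.cci.io/metrics-tls-cert-reference",
    "monitoring.cci.io/metrics-tls-key-reference",
    "monitoring.cci.io/metrics-tls-ca-reference",
)


def split_reference(reference: str) -> tuple[str, str]:
    """Split "volume-name/key-or-path" at the first slash."""
    volume_name, separator, key_or_path = reference.partition("/")
    if not separator or not volume_name or not key_or_path:
        raise ValueError(f"expected <volume-name>/<key-or-path>, got {reference!r}")
    return volume_name, key_or_path


def undeclared_volumes(annotations: dict, volume_names: set) -> list:
    """Return the TLS annotations whose volume is missing from the pod's volumes."""
    problems = []
    for key in TLS_REFERENCE_KEYS:
        if key in annotations:
            volume_name, _ = split_reference(annotations[key])
            if volume_name not in volume_names:
                problems.append(key)
    return problems


# Values taken from the nginx-tls example: the volume is named "cert".
print(split_reference("cert/server.crt"))
```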

Obtaining Resource Monitoring Metrics

After configuring the preceding monitoring attributes, run the following command in a VPC that can access the pod to obtain the pod monitoring data:

curl $podIP:$port/metrics

$podIP indicates the IP address of the pod, and $port indicates the listening port. Example: curl 192.168.XXX.XXX:19100/metrics
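
The same data can be collected from a script. The following Python sketch fetches the endpoint over plain HTTP and parses the response. It assumes that no TLS annotations are configured and that the response uses the Prometheus text exposition format, as cAdvisor output does. The function names are hypothetical, and the parser is deliberately simple: it keeps the label set as raw text.

```python
import re
import urllib.request

# One sample line: metric name, optional {labels}, value, optional timestamp.
_SAMPLE_LINE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)(?:\s+(-?\d+))?$'
)


def fetch_metrics(pod_ip: str, port: int = 19100, timeout: float = 5.0) -> str:
    """Download the raw metrics text from the pod exporter over HTTP."""
    url = f"http://{pod_ip}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text: str) -> list:
    """Return (name, labels_text, value) tuples, skipping comment lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a "# HELP" / "# TYPE" comment
        match = _SAMPLE_LINE.match(line)
        if match is None:
            continue  # ignore lines this simple parser does not understand
        name, labels, value = match.group(1), match.group(2) or "", match.group(3)
        samples.append((name, labels, float(value)))
    return samples
```

For example, parse_metrics(fetch_metrics(pod_ip)) run from a host in a VPC that can access the pod returns one tuple per sample, which can then be filtered by metric name.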