Pod Resource Metrics
CCI supports pod resource monitoring with multiple metrics, such as metrics for CPU, memory, disk, and network.
Each pod has a built-in system agent that, by default, exposes pod and container metrics over HTTP. Reserve 30 MB for the agent that is integrated into each pod.
Resource Metrics
The basic metrics cover CPU, memory, disk, and network, along with process, GPU, and volume metrics. The following table describes each metric.
| Category | Metric | Description |
|---|---|---|
| CPU metrics | container_cpu_system_seconds_total | Cumulative system CPU time consumed (unit: second) |
| | container_cpu_usage_seconds_total | Cumulative CPU time that the container consumed on all CPU cores (unit: second) |
| | container_cpu_user_seconds_total | Cumulative user CPU time consumed (unit: second) |
| | container_cpu_cfs_periods_total | Number of elapsed enforcement period intervals |
| | container_cpu_cfs_throttled_periods_total | Number of throttled period intervals |
| | container_cpu_cfs_throttled_seconds_total | Total time that a container has been throttled (unit: second) |
| File system or disk I/O metrics | container_fs_inodes_free | Number of available inodes in a file system |
| | container_fs_usage_bytes | File system usage (unit: byte) |
| | container_fs_inodes_total | Total number of inodes in a file system |
| | container_fs_io_current | Number of I/Os currently in progress in a disk or file system |
| | container_fs_io_time_seconds_total | Cumulative seconds spent on I/Os by a disk or file system |
| | container_fs_io_time_weighted_seconds_total | Cumulative weighted I/O time of a disk or file system |
| | container_fs_limit_bytes | Total disk or file system capacity that can be consumed by a container (unit: byte) |
| | container_fs_reads_bytes_total | Cumulative amount of disk or file system data read by a container (unit: byte) |
| | container_fs_read_seconds_total | Cumulative seconds that a container spent reading disk or file system data |
| | container_fs_reads_merged_total | Cumulative count of merged disk or file system reads made by a container |
| | container_fs_reads_total | Cumulative count of disk or file system reads completed by a container |
| | container_fs_sector_reads_total | Cumulative count of sector reads completed by a container in a disk or file system |
| | container_fs_sector_writes_total | Cumulative count of sector writes completed by a container to a disk or file system |
| | container_fs_writes_bytes_total | Cumulative amount of data written by a container to a disk or file system (unit: byte) |
| | container_fs_write_seconds_total | Cumulative seconds that a container spent writing data to a disk or file system |
| | container_fs_writes_merged_total | Cumulative count of merged container writes to a disk or file system |
| | container_fs_writes_total | Cumulative count of completed container writes to a disk or file system |
| | container_blkio_device_usage_total | Blkio device usage (unit: byte) |
| Memory metrics | container_memory_failures_total | Cumulative count of container memory allocation failures |
| | container_memory_failcnt | Number of times that memory usage hit the limit |
| | container_memory_cache | Total page cache memory of a container (unit: byte) |
| | container_memory_mapped_file | Size of memory mapped files (unit: byte) |
| | container_memory_max_usage_bytes | Maximum memory usage recorded for a container (unit: byte) |
| | container_memory_rss | Size of the resident memory set for a container (unit: byte) |
| | container_memory_swap | Container swap usage (unit: byte) |
| | container_memory_usage_bytes | Current memory usage of a container (unit: byte) |
| | container_memory_working_set_bytes | Memory usage of the working set of a container (unit: byte) |
| Network metrics | container_network_receive_bytes_total | Total volume of data received by the container network (unit: byte) |
| | container_network_receive_errors_total | Cumulative count of errors encountered during reception |
| | container_network_receive_packets_dropped_total | Cumulative count of packets dropped during reception |
| | container_network_receive_packets_total | Cumulative count of packets received |
| | container_network_transmit_bytes_total | Total volume of data transmitted on a container network (unit: byte) |
| | container_network_transmit_errors_total | Cumulative count of errors encountered during transmission |
| | container_network_transmit_packets_dropped_total | Cumulative count of packets dropped during transmission |
| | container_network_transmit_packets_total | Cumulative count of packets transmitted |
| Process metrics | container_processes | Number of processes running inside a container |
| | container_sockets | Number of open sockets for a container |
| | container_file_descriptors | Number of open file descriptors for a container |
| | container_threads | Number of threads running inside a container |
| | container_threads_max | Maximum number of threads allowed inside a container |
| | container_ulimits_soft | Soft ulimit value of process 1 in a container. A value of -1 means unlimited, except for priority and nice. |
| | container_spec_cpu_period | CPU period of a container |
| | container_spec_cpu_shares | CPU share of a container |
| | container_spec_memory_limit_bytes | Memory limit for a container |
| | container_spec_memory_reservation_limit_bytes | Memory reservation limit for a container |
| | container_spec_memory_swap_limit_bytes | Memory swap limit for a container |
| | container_start_time_seconds | Start time of a container, in seconds since the Unix epoch |
| | container_last_seen | Last time a container was seen by the exporter |
| GPU metrics | container_accelerator_memory_used_bytes | GPU accelerator memory that is being used by the container (unit: byte) |
| | container_accelerator_memory_total_bytes | Total available GPU accelerator memory (unit: byte) |
| | container_accelerator_duty_cycle | Percentage of time when the GPU accelerator is actually running |
| Volume metrics | volume_stats_capacity_bytes | Total volume capacity (unit: byte) |
| | volume_stats_available_bytes | Available volume capacity (unit: byte) |
| | volume_stats_used_bytes | Used volume capacity (unit: byte) |
| | volume_stats_inodes | Total number of inodes in a volume |
| | volume_stats_inodes_free | Available inodes in a volume |
| | volume_stats_inodes_used | Used inodes in a volume |
Volume metrics are custom metrics. They are available for inline EVS volumes or disk-backed emptyDir volumes.
Apart from the volume metrics, all metrics are the same as those provided by cAdvisor. For details about these metrics, see the cAdvisor documentation.
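Most of the `_total` metrics above are cumulative counters, so a single reading is rarely useful on its own. The following Python sketch shows one common way to turn two samples into a rate, for example average CPU cores used from `container_cpu_usage_seconds_total`, or the share of throttled periods from the two `container_cpu_cfs_*_periods_total` counters. The function names and the sample values are hypothetical, and the counter-reset handling is a common convention rather than documented CCI behavior.

```python
def counter_rate(prev_value, curr_value, interval_seconds):
    """Per-second increase of a cumulative counter between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_value - prev_value
    if delta < 0:
        # Assumption: a decreasing counter means the container restarted,
        # so the current value is treated as the whole increase.
        delta = curr_value
    return delta / interval_seconds


def throttled_ratio(periods_delta, throttled_periods_delta):
    """Fraction of CFS enforcement periods in which the container was throttled."""
    if periods_delta <= 0:
        return 0.0
    return throttled_periods_delta / periods_delta


# Hypothetical samples of container_cpu_usage_seconds_total taken 60 s apart.
cpu_cores_used = counter_rate(120.0, 150.0, 60.0)

# Hypothetical increases of container_cpu_cfs_periods_total and
# container_cpu_cfs_throttled_periods_total over the same window.
ratio = throttled_ratio(600, 150)

print(cpu_cores_used, ratio)
```

With these sample values the container averaged half a CPU core and was throttled in a quarter of its enforcement periods.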
Basic Configuration
The following example describes how to configure pod resource monitoring metrics, including enabling or disabling pod-level features and customizing ports.
    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: nginx-exporter
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx-exporter
      template:
        metadata:
          labels:
            app: nginx-exporter
          annotations:
            monitoring.cci.io/enable-pod-metrics: "true"
            monitoring.cci.io/metrics-port: "19100"
        spec:
          containers:
            - name: container-0
              image: 'nginx:alpine'
              resources:
                limits:
                  cpu: 1000m
                  memory: 2048Mi
                requests:
                  cpu: 1000m
                  memory: 2048Mi
          imagePullSecrets:
            - name: imagepull-secret
| Annotation | Function | Available Value | Default Value |
|---|---|---|---|
| monitoring.cci.io/enable-pod-metrics | Whether to enable the monitoring metrics | true or false (case insensitive) | true |
| monitoring.cci.io/metrics-port | Listening port of the pod exporter | Valid ports (1 to 65535) | 19100 |
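Because annotation values are strings, it can help to check them before applying a manifest. The following Python sketch validates the two annotations against the value ranges and defaults listed in the table. The function names are hypothetical, and the checks only mirror the table; they are not CCI's own validation logic.

```python
def parse_enable_pod_metrics(value="true"):
    """Interpret monitoring.cci.io/enable-pod-metrics (case insensitive)."""
    normalized = value.lower()
    if normalized not in ("true", "false"):
        raise ValueError(f"expected true or false, got {value!r}")
    return normalized == "true"


def parse_metrics_port(value="19100"):
    """Interpret monitoring.cci.io/metrics-port as a port from 1 to 65535."""
    port = int(value)  # raises ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


# Annotation values as they appear in the example Deployment.
annotations = {
    "monitoring.cci.io/enable-pod-metrics": "true",
    "monitoring.cci.io/metrics-port": "19100",
}
enabled = parse_enable_pod_metrics(annotations["monitoring.cci.io/enable-pod-metrics"])
port = parse_metrics_port(annotations["monitoring.cci.io/metrics-port"])
print(enabled, port)
```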
Advanced Configuration
Creating a Secret
A secret is a resource object for encrypted storage. You can use a secret to hold sensitive data such as authentication information, certificates, private keys, passwords, and tokens.
The secret defined in the following example contains three key-value pairs.
    apiVersion: v1
    kind: Secret
    metadata:
      name: cert
    type: Opaque
    data:
      ca.crt: ...
      server.crt: ...
      server.key: ...
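In a standard Kubernetes Secret manifest, each value under `data` is the base64 encoding of the raw content. Assuming CCI follows the same convention, the following Python sketch shows how the three values could be produced. The helper names and the placeholder byte strings are hypothetical; in practice the bytes would be read from the actual certificate and key files.

```python
import base64


def encode_secret_value(raw):
    """Base64-encode raw bytes for use under `data` in a Secret manifest."""
    return base64.b64encode(raw).decode("ascii")


def build_secret_data(files):
    """Map each key name to the base64 encoding of its content."""
    return {key: encode_secret_value(content) for key, content in files.items()}


# Placeholder contents only; real values come from the CA certificate,
# server certificate, and server private key files.
secret_data = build_secret_data({
    "ca.crt": b"example-ca",
    "server.crt": b"example-cert",
    "server.key": b"example-key",
})

for key, value in secret_data.items():
    print(f"{key}: {value}")
```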
Configuring a TLS Certificate
You can use annotations to specify the TLS certificate, private key, and CA file that the exporter server uses for encrypted communication. Associate the certificate secret with the pod by mounting it as a volume. Example:
    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: nginx-tls
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx-tls
      template:
        metadata:
          labels:
            app: nginx-tls
          annotations:
            monitoring.cci.io/enable-pod-metrics: "true"
            monitoring.cci.io/metrics-port: "19100"
            monitoring.cci.io/metrics-tls-cert-reference: cert/server.crt
            monitoring.cci.io/metrics-tls-key-reference: cert/server.key
            monitoring.cci.io/metrics-tls-ca-reference: cert/ca.crt
            sandbox-volume.openvessel.io/volume-names: cert
        spec:
          volumes:
            - name: cert
              secret:
                secretName: cert
                defaultMode: 384
          containers:
            - name: container-0
              image: 'nginx:alpine'
              resources:
                limits:
                  cpu: 1000m
                  memory: 2048Mi
                requests:
                  cpu: 1000m
                  memory: 2048Mi
              volumeMounts:
                - name: cert
                  mountPath: /tmp/secret0
          imagePullSecrets:
            - name: imagepull-secret
| Annotation | Function | Available Value | Default Value |
|---|---|---|---|
| monitoring.cci.io/metrics-tls-cert-reference | TLS certificate volume reference | ${volume-name}/${volume-keyOrPath} (Volume/Path) | None (HTTP is used.) |
| monitoring.cci.io/metrics-tls-key-reference | TLS private key volume reference | ${volume-name}/${volume-keyOrPath} | None (HTTP is used.) |
| monitoring.cci.io/metrics-tls-ca-reference | TLS CA volume reference | ${volume-name}/${volume-keyOrPath} | None (HTTP is used.) |
Each value consists of the name of the storage volume, followed by the key or path within that volume where the TLS certificate, private key, or CA file is located.
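To make the `${volume-name}/${volume-keyOrPath}` format concrete, the following Python sketch splits a reference into its two parts. It assumes that the volume name is everything before the first slash and that the remainder is the key or path; that split rule and the function name are assumptions for illustration, not documented CCI behavior.

```python
def parse_volume_reference(reference):
    """Split a reference such as cert/server.crt into (volume name, key or path)."""
    # Assumption: the first slash separates the volume name from the key or path.
    volume_name, separator, key_or_path = reference.partition("/")
    if not separator or not volume_name or not key_or_path:
        raise ValueError(f"expected <volume-name>/<key-or-path>, got {reference!r}")
    return volume_name, key_or_path


# References taken from the example Deployment.
for ref in ("cert/server.crt", "cert/server.key", "cert/ca.crt"):
    print(parse_volume_reference(ref))
```

In the example Deployment, all three references resolve to the volume named `cert`, which is backed by the secret of the same name.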
Obtaining Resource Monitoring Metrics
After configuring the preceding monitoring attributes, run the following command from a host in a VPC that can access the pod to obtain the pod monitoring data:
    curl $podIP:$port/metrics
$podIP indicates the IP address of the pod, and $port indicates the listening port, for example, curl 192.168.XXX.XXX:19100/metrics.
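The same request can be made from a script. The following Python sketch fetches the endpoint over plain HTTP and parses the response, assuming that it uses the Prometheus text exposition format that cAdvisor produces. The function names, the label in the sample text, and the sample values are hypothetical.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus-style text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            name, _, rest = line.partition("{")
            labels, _, tail = rest.rpartition("}")
        else:
            name, _, tail = line.partition(" ")
            labels = ""
        fields = tail.split()
        if not fields:
            continue  # no value on this line
        # The first field is the value; an optional timestamp may follow.
        samples.append((name.strip(), labels, float(fields[0])))
    return samples


def fetch_metrics(pod_ip, port=19100, timeout=5.0):
    """Fetch the raw metrics text over HTTP (the default when TLS is not configured)."""
    url = f"http://{pod_ip}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Hypothetical sample output, used here instead of a live pod.
SAMPLE = """\
# TYPE container_memory_usage_bytes gauge
container_memory_usage_bytes{name="container-0"} 1048576
container_threads 12
"""

for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

Against a live pod, `parse_metrics(fetch_metrics(pod_ip))` would return the same kind of list, where `pod_ip` is the pod's IP address. If the TLS annotations are configured, the endpoint is served over HTTPS, and the URL scheme and certificate handling would need to change accordingly.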