Pod Resource Monitoring Metric
CCI supports basic monitoring of pod resources with multiple metrics, such as metrics for CPU, memory, disk, and network.
Pods have built-in system agents that expose pod and container monitoring metrics over HTTP by default. Reserve 30 MB for the agent that is integrated into each pod.
Resource Metrics
Basic monitoring metrics include CPU, memory, and disk usage. The following table lists the metrics by category.
| Category | Metric | Description |
| --- | --- | --- |
| CPU | container_cpu_system_seconds_total | Cumulative system CPU time consumed (unit: second) |
| | container_cpu_usage_seconds_total | Cumulative time that the container consumed on all CPU cores (unit: second) |
| | container_cpu_user_seconds_total | Cumulative user CPU time consumed (unit: second) |
| | container_cpu_cfs_periods_total | Number of elapsed enforcement period intervals |
| | container_cpu_cfs_throttled_periods_total | Number of throttled period intervals |
| | container_cpu_cfs_throttled_seconds_total | Total time duration the container has been throttled (unit: second) |
| File system and disk I/O | container_fs_inodes_free | Number of available inodes in the file system |
| | container_fs_usage_bytes | File system usage (unit: byte) |
| | container_fs_inodes_total | Total number of inodes in the file system |
| | container_fs_io_current | Number of I/Os currently in progress in the disk or file system |
| | container_fs_io_time_seconds_total | Cumulative seconds spent on doing I/Os by the disk or file system |
| | container_fs_io_time_weighted_seconds_total | Cumulative weighted I/O time of the disk or file system |
| | container_fs_limit_bytes | Total disk or file system capacity that can be consumed by the container (unit: byte) |
| | container_fs_reads_bytes_total | Cumulative amount of disk or file system data read by the container (unit: byte) |
| | container_fs_read_seconds_total | Cumulative count of seconds the container spent on reading disk or file system data |
| | container_fs_reads_merged_total | Cumulative count of merged disk or file system reads made by the container |
| | container_fs_reads_total | Cumulative count of disk or file system reads completed by the container |
| | container_fs_sector_reads_total | Cumulative count of sector reads completed by the container in the disk or file system |
| | container_fs_sector_writes_total | Cumulative count of sector writes completed by the container to the disk or file system |
| | container_fs_writes_bytes_total | Total amount of data written by the container to the disk or file system (unit: byte) |
| | container_fs_write_seconds_total | Cumulative count of seconds the container spent on writing data to the disk or file system |
| | container_fs_writes_merged_total | Cumulative count of merged container writes to the disk or file system |
| | container_fs_writes_total | Cumulative count of completed container writes to the disk or file system |
| | container_blkio_device_usage_total | Blkio device usage (unit: byte) |
| Memory | container_memory_failures_total | Cumulative count of container memory allocation failures |
| | container_memory_failcnt | Number of times that memory usage hit the limit |
| | container_memory_cache | Total page cache memory of the container (unit: byte) |
| | container_memory_mapped_file | Size of memory mapped files (unit: byte) |
| | container_memory_max_usage_bytes | Maximum memory usage recorded for the container (unit: byte) |
| | container_memory_rss | Size of the resident memory set for the container (unit: byte) |
| | container_memory_swap | Container swap usage (unit: byte) |
| | container_memory_usage_bytes | Current memory usage of the container (unit: byte) |
| | container_memory_working_set_bytes | Memory usage of the working set of the container (unit: byte) |
| Network | container_network_receive_bytes_total | Total volume of data received by the container network (unit: byte) |
| | container_network_receive_errors_total | Cumulative count of errors encountered during reception |
| | container_network_receive_packets_dropped_total | Cumulative count of packets dropped during reception |
| | container_network_receive_packets_total | Cumulative count of packets received |
| | container_network_transmit_bytes_total | Total volume of data transmitted on the container network (unit: byte) |
| | container_network_transmit_errors_total | Cumulative count of errors encountered during transmission |
| | container_network_transmit_packets_dropped_total | Cumulative count of packets dropped during transmission |
| | container_network_transmit_packets_total | Cumulative count of packets transmitted |
| Process | container_processes | Number of processes running inside the container |
| | container_sockets | Number of open sockets for the container |
| | container_file_descriptors | Number of open file descriptors for the container |
| | container_threads | Number of threads running inside the container |
| | container_threads_max | Maximum number of threads allowed inside the container |
| | container_ulimits_soft | Soft ulimit value of process 1 in the container. A value of -1 means unlimited, except for priority and nice. |
| | container_spec_cpu_period | CPU period of the container |
| | container_spec_cpu_shares | CPU share of the container |
| | container_spec_memory_limit_bytes | Memory limit for the container |
| | container_spec_memory_reservation_limit_bytes | Memory reservation limit for the container |
| | container_spec_memory_swap_limit_bytes | Memory swap limit for the container |
| | container_start_time_seconds | Start time of the container, in seconds since the Unix epoch |
| | container_last_seen | Last time a container was seen by the exporter |
| GPU | container_accelerator_memory_used_bytes | GPU accelerator memory that is being used by the container (unit: byte) |
| | container_accelerator_memory_total_bytes | Total available GPU accelerator memory (unit: byte) |
| | container_accelerator_duty_cycle | Percentage of time when the GPU accelerator is actually running |
The total number of monitoring metrics is 59, which is the same as that provided by cAdvisor.
For details about each metric, see the cAdvisor documentation.
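Many of these metrics are cumulative counters, so a usage figure is derived from the difference between two samples rather than from a single value. The following Python sketch shows one way to do this, assuming two readings of container_cpu_usage_seconds_total and of the CFS period counters taken a known interval apart. The function names and sample values are illustrative and are not part of CCI.

```python
def cpu_cores_used(prev_seconds: float, curr_seconds: float, interval_seconds: float) -> float:
    """Average number of CPU cores used between two samples of
    container_cpu_usage_seconds_total (a cumulative counter)."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_seconds - prev_seconds
    if delta < 0:
        # The counter went backwards. Assumption: the container restarted
        # and the counter started again from zero.
        delta = curr_seconds
    return delta / interval_seconds


def throttled_ratio(prev_periods: float, curr_periods: float,
                    prev_throttled: float, curr_throttled: float) -> float:
    """Fraction of CFS enforcement periods in which the container was throttled,
    from container_cpu_cfs_periods_total and container_cpu_cfs_throttled_periods_total."""
    periods = curr_periods - prev_periods
    if periods <= 0:
        return 0.0
    return (curr_throttled - prev_throttled) / periods


if __name__ == "__main__":
    # 15 CPU seconds consumed over a 30-second window: half a core on average.
    print(cpu_cores_used(120.0, 135.0, 30.0))   # 0.5
    # 30 throttled periods out of 300 elapsed periods.
    print(throttled_ratio(1000, 1300, 40, 70))  # 0.1
```

A throttled ratio that stays high can indicate that the CPU limit of the container is too low for its workload.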
Basic Configuration
The following example describes how to configure pod resource monitoring metrics, including enabling or disabling pod-level features and customizing ports.
kind: Deployment
apiVersion: apps/v1
metadata:
  name: nginx-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-exporter
  template:
    metadata:
      labels:
        app: nginx-exporter
      annotations:
        monitoring.cci.io/enable-pod-metrics: "true"
        monitoring.cci.io/metrics-port: "19100"
    spec:
      containers:
        - name: container-0
          image: 'nginx:alpine'
          resources:
            limits:
              cpu: 1000m
              memory: 2048Mi
            requests:
              cpu: 1000m
              memory: 2048Mi
      imagePullSecrets:
        - name: imagepull-secret
| Annotation | Function | Available Value | Default Value |
| --- | --- | --- | --- |
| monitoring.cci.io/enable-pod-metrics | Whether to enable the monitoring metrics | true, false (case insensitive) | true |
| monitoring.cci.io/metrics-port | Listening port of the pod exporter | Valid ports (1 to 65535) | 19100 |
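Kubernetes annotation values are strings, which is why "true" and "19100" are quoted in the example. The following Python sketch is a client-side pre-check of the two annotations against the available values and defaults in the table, for use before a manifest is submitted. The helper name parse_metrics_annotations is hypothetical, and the way CCI itself validates the values is not documented here.

```python
ENABLE_KEY = "monitoring.cci.io/enable-pod-metrics"
PORT_KEY = "monitoring.cci.io/metrics-port"


def parse_metrics_annotations(annotations: dict) -> tuple:
    """Return (enabled, port) from pod annotations, applying the documented defaults."""
    raw_enabled = annotations.get(ENABLE_KEY, "true")
    raw_port = annotations.get(PORT_KEY, "19100")

    # The table states that true/false are case insensitive.
    lowered = raw_enabled.lower()
    if lowered not in ("true", "false"):
        raise ValueError(f"{ENABLE_KEY} must be true or false, got {raw_enabled!r}")

    try:
        port = int(raw_port)
    except ValueError:
        raise ValueError(f"{PORT_KEY} must be an integer, got {raw_port!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"{PORT_KEY} must be in the range 1 to 65535, got {port}")

    return lowered == "true", port


if __name__ == "__main__":
    print(parse_metrics_annotations({ENABLE_KEY: "True", PORT_KEY: "19100"}))  # (True, 19100)
    print(parse_metrics_annotations({}))                                       # (True, 19100)
```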
Advanced Configuration
Creating a Secret
A secret is a resource object for encrypted storage. You can store authentication information, certificates, and private keys in a secret to manage sensitive data such as passwords, tokens, and keys.
The secret defined in the following example contains three key-value pairs.
apiVersion: v1
kind: Secret
metadata:
  name: cert
type: Opaque
data:
  ca.crt: ...
  server.crt: ...
  server.key: ...
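Values under data in a Kubernetes secret must be Base64-encoded. The following Python sketch shows one way to produce such values from local certificate files. The helper names and the file paths passed to them are hypothetical. The elided values in the example above are left as they are.

```python
import base64
from pathlib import Path


def encode_secret_value(raw: bytes) -> str:
    """Base64-encode raw bytes for use as a value under `data` in a secret."""
    return base64.b64encode(raw).decode("ascii")


def encode_secret_data(files: dict) -> dict:
    """Map each secret key (for example "ca.crt") to the Base64-encoded
    content of the local file it points to."""
    return {key: encode_secret_value(Path(path).read_bytes())
            for key, path in files.items()}


if __name__ == "__main__":
    # A short literal is used here so that the sketch runs without any files.
    print(encode_secret_value(b"hello"))  # aGVsbG8=
```

To build the three entries of the example, you could call encode_secret_data with a mapping from ca.crt, server.crt, and server.key to the paths of your own files.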
Configuring a TLS Certificate
To encrypt communication, use annotations to specify the TLS certificate, private key, and CA file of the exporter server, and mount the secret that holds them as a volume. Example:
kind: Deployment
apiVersion: apps/v1
metadata:
  name: nginx-tls
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-tls
  template:
    metadata:
      labels:
        app: nginx-tls
      annotations:
        monitoring.cci.io/enable-pod-metrics: "true"
        monitoring.cci.io/metrics-port: "19100"
        monitoring.cci.io/metrics-tls-cert-reference: cert/server.crt
        monitoring.cci.io/metrics-tls-key-reference: cert/server.key
        monitoring.cci.io/metrics-tls-ca-reference: cert/ca.crt
        sandbox-volume.openvessel.io/volume-names: cert
    spec:
      volumes:
        - name: cert
          secret:
            secretName: cert
            defaultMode: 384
      containers:
        - name: container-0
          image: 'nginx:alpine'
          resources:
            limits:
              cpu: 1000m
              memory: 2048Mi
            requests:
              cpu: 1000m
              memory: 2048Mi
          volumeMounts:
            - name: cert
              mountPath: /tmp/secret0
      imagePullSecrets:
        - name: imagepull-secret
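The value defaultMode: 384 in the example is a decimal number. File modes are usually read in octal, and decimal 384 equals octal 0600, which grants read and write permission to the owner only. The following Python sketch shows the conversion. The helper name describe_mode is hypothetical.

```python
import stat


def describe_mode(mode: int) -> tuple:
    """Return the octal string and the ls-style permission string for a file mode."""
    octal = f"{mode:04o}"
    # S_IFREG marks a regular file so that filemode() renders a leading "-".
    symbolic = stat.filemode(stat.S_IFREG | mode)
    return octal, symbolic


if __name__ == "__main__":
    print(describe_mode(384))  # ('0600', '-rw-------')
```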
| Annotation | Function | Available Value | Default Value |
| --- | --- | --- | --- |
| monitoring.cci.io/metrics-tls-cert-reference | TLS certificate volume reference | ${volume-name}/${volume-keyOrPath} (Volume/Path) | None (HTTP is used.) |
| monitoring.cci.io/metrics-tls-key-reference | TLS private key volume reference | ${volume-name}/${volume-keyOrPath} | None (HTTP is used.) |
| monitoring.cci.io/metrics-tls-ca-reference | TLS CA volume reference | ${volume-name}/${volume-keyOrPath} | None (HTTP is used.) |
Each value consists of the name of the storage volume, followed by the key or path of the TLS certificate, private key, or CA file within that volume.
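The following Python sketch illustrates the ${volume-name}/${volume-keyOrPath} format by splitting a reference such as cert/server.crt and checking that the volume is declared in the pod. It assumes that the first slash separates the two parts, which is safe because Kubernetes volume names cannot contain slashes. The helper names are hypothetical, and the parsing done by the CCI agent may differ.

```python
def split_volume_reference(reference: str) -> tuple:
    """Split "volume-name/key-or-path" at the first slash."""
    volume_name, separator, key_or_path = reference.partition("/")
    if not separator or not volume_name or not key_or_path:
        raise ValueError(f"expected <volume-name>/<key-or-path>, got {reference!r}")
    return volume_name, key_or_path


def undeclared_volumes(references: list, declared_volumes: set) -> list:
    """Return the references whose volume name is not declared in the pod spec."""
    return [ref for ref in references
            if split_volume_reference(ref)[0] not in declared_volumes]


if __name__ == "__main__":
    refs = ["cert/server.crt", "cert/server.key", "cert/ca.crt"]
    print([split_volume_reference(r) for r in refs])
    print(undeclared_volumes(refs, {"cert"}))  # []
```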
Obtaining Resource Monitoring Metrics
After configuring the preceding monitoring attributes, run the following command in a VPC that can access the pod to obtain the pod monitoring data:
curl $podIP:$port/metrics
$podIP is the IP address of the pod, and $port is the listening port set by monitoring.cci.io/metrics-port. For example: curl 192.168.XXX.XXX:19100/metrics.
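The following Python sketch fetches the same endpoint and parses the response into samples. It assumes that the endpoint returns the Prometheus text exposition format, as cAdvisor does, and that plain HTTP is used. The function names and the sample text are illustrative.

```python
import urllib.request


def parse_metrics(text: str) -> list:
    """Parse Prometheus text-format lines into (name, labels, value) tuples.
    Labels are kept as the raw string between the braces."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            name, rest = line.split("{", 1)
            labels, rest = rest.rsplit("}", 1)
        else:
            name, rest = line.split(None, 1)
            labels = ""
        # The first field after the labels is the value; an optional timestamp may follow.
        value = float(rest.split()[0])
        samples.append((name, labels, value))
    return samples


def fetch_metrics(pod_ip: str, port: int = 19100, timeout: float = 5.0) -> list:
    """Fetch http://<pod_ip>:<port>/metrics and parse the response."""
    url = f"http://{pod_ip}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))


if __name__ == "__main__":
    sample = (
        "# TYPE container_threads gauge\n"
        'container_cpu_usage_seconds_total{cpu="total"} 12.5 1700000000000\n'
        "container_threads 7\n"
    )
    for name, labels, value in parse_metrics(sample):
        print(name, labels, value)
```

Call fetch_metrics with the pod IP address from a host in a VPC that can access the pod. The sample text in the sketch only demonstrates the parser and does not represent real pod output.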