CCE Network Metrics Exporter
Introduction
Dolphin is an add-on for monitoring and managing container network traffic. The current version of dolphin collects traffic statistics for containers that do not use the host network mode in CCE Turbo clusters and performs node-wide container connectivity checks.
You can use podSelector to select the monitoring backend. Multiple monitoring tasks and optional monitoring metrics are supported, and pod label information can be included in the metrics. Monitoring data is exposed in Prometheus format, so you can call the Prometheus API to view it.
Constraints
- This add-on can be installed only in CCE Turbo clusters of version 1.19 or later and deployed only on x86 nodes running EulerOS.
- This add-on can be installed on nodes that use the containerd or Docker container engine. On containerd nodes, it tracks pod updates in real time. On Docker nodes, it polls for pod updates.
- Only traffic statistics of secure containers (Kata as the container runtime) and common containers (runC as the container runtime) in a CCE Turbo cluster can be collected.
- After the add-on is installed, traffic is not monitored by default. To monitor traffic, create a MonitorPolicy that configures a monitoring task.
- Pods using the host network mode cannot be monitored.
- Ensure that there are sufficient resources on a node for installing the add-on.
- The sources of monitoring labels and user labels must already be available before a pod is created.
- You can specify a maximum of five labels. You cannot specify the labels used by the system. Labels used by the system include pod, task, ipfamily, srcip, dstip, srcport, dstport, and protocol.
Installing the Add-on
- Log in to the CCE console and click the CCE Turbo cluster name to access the cluster. Click Add-ons in the navigation pane, locate CCE Network Metrics Exporter on the right, and click Install.
- On the Install Add-on page, view the add-on configuration.
No parameter can be configured for the current add-on.
- Click Install.
After the add-on is installed, select the cluster and click Add-ons in the navigation pane. On the displayed page, view the add-on in the Add-ons Installed area.
Components
Component | Description | Resource Type |
---|---|---|
dolphin | Used to monitor the container network traffic of CCE Turbo clusters | DaemonSet |
Monitoring Metrics of dolphin
You can deliver a monitoring task by creating a MonitorPolicy. A MonitorPolicy can be created by calling an API or using the kubectl apply command after logging in to a worker node. A MonitorPolicy represents a monitoring task and provides optional parameters such as selector and podLabel. The following table describes the supported monitoring metrics.
Monitoring Metric | Monitoring Item | Granularity | Supported Runtime | Supported Cluster Version | Supported Add-on Version | Supported OS |
---|---|---|---|---|---|---|
Number of IPv4 packets sent to the Internet | dolphin_ip4_send_pkt_internet | Pod | runC/Kata | v1.19 or later | 1.1.2 | EulerOS 2.9, EulerOS 2.10 |
Number of IPv4 bytes sent to the Internet | dolphin_ip4_send_byte_internet | Pod | runC/Kata | v1.19 or later | 1.1.2 | EulerOS 2.9, EulerOS 2.10 |
Number of received IPv4 packets | dolphin_ip4_rcv_pkt | Pod | runC/Kata | v1.19 or later | 1.1.2 | EulerOS 2.9, EulerOS 2.10 |
Number of received IPv4 bytes | dolphin_ip4_rcv_byte | Pod | runC/Kata | v1.19 or later | 1.1.2 | EulerOS 2.9, EulerOS 2.10 |
Number of sent IPv4 packets | dolphin_ip4_send_pkt | Pod | runC/Kata | v1.19 or later | 1.1.2 | EulerOS 2.9, EulerOS 2.10 |
Number of sent IPv4 bytes | dolphin_ip4_send_byte | Pod | runC/Kata | v1.19 or later | 1.1.2 | EulerOS 2.9, EulerOS 2.10 |
Health status of the latest health check | dolphin_health_check_status | Pod | runC/Kata | v1.19 or later | 1.2.2 | EulerOS 2.9, EulerOS 2.10 |
Total number of successful health checks | dolphin_health_check_successful_counter | Pod | runC/Kata | v1.19 or later | 1.2.2 | EulerOS 2.9, EulerOS 2.10 |
Total number of failed health checks | dolphin_health_check_failed_counter | Pod | runC/Kata | v1.19 or later | 1.2.2 | EulerOS 2.9, EulerOS 2.10 |
Delivering a Monitoring Task
The template for creating a MonitorPolicy is as follows:
apiVersion: crd.dolphin.io/v1
kind: MonitorPolicy
metadata:
  name: example-task        # Monitoring task name.
  namespace: kube-system    # Mandatory. The value must be kube-system.
spec:
  # (Optional) Backend monitored by the dolphin add-on, for example, labelSelector.
  # By default, all containers on the node are monitored.
  selector:
    matchLabels:
      app: nginx
    matchExpressions:
      - key: app
        operator: In
        values:
          - nginx
  podLabel: [app]           # (Optional) Pod label.
  # (Optional) Whether to collect the number of sent IPv4 packets and sent IPv4 bytes.
  # Disabled by default.
  ip4Tx:
    enable: true
  # (Optional) Whether to collect the number of received IPv4 packets and received IPv4 bytes.
  # Disabled by default.
  ip4Rx:
    enable: true
  # (Optional) Whether to collect the number of IPv4 packets and IPv4 bytes sent to the Internet.
  # Disabled by default.
  ip4TxInternet:
    enable: true
  # (Optional) Whether to collect the result of the latest health check and the total numbers of
  # successful and failed pod health checks on the local node. Disabled by default.
  healthCheck:
    enable: true            # true or false
    failureThreshold: 3     # (Optional) Number of failed checks after which the pod is considered unhealthy. The default value is 1.
    periodSeconds: 5        # (Optional) Interval between health checks, in seconds. The default value is 60.
    command: ""             # (Optional) Health check command. The value can be ping (default), arping, or curl.
    ipFamilies: [""]        # (Optional) Health check IP address family. The value is IPv4 by default.
    port: 80                # (Optional) Port number, which is mandatory when curl is used.
    path: ""                # (Optional) HTTP API path, which is mandatory when curl is used.
podLabel: You can enter multiple pod labels and separate them with commas (,), for example, [app, version].
Labels must comply with the following rules. The corresponding regular expression is (^[a-zA-Z_]$)|(^([a-zA-Z][a-zA-Z0-9_]|_[a-zA-Z0-9])([a-zA-Z0-9_]){0,254}$).
- A maximum of five labels can be entered. A label can contain a maximum of 256 characters.
- A label cannot start with a digit or double underscores (__).
- A label can contain only letters, digits, and underscores (A-Za-z_0-9).
- If you modify or delete a monitoring task, monitoring data collected by the monitoring task will be lost. Therefore, exercise caution when performing this operation.
- If the add-on is uninstalled, the MonitorPolicy of the monitoring task will be removed together with the add-on.
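The label rules above can be checked before a MonitorPolicy is submitted. The following is a minimal Python sketch and not part of the add-on: the function name validate_pod_labels and the error messages are hypothetical, while the regular expression, the five-label limit, and the system label names are taken from this page.

```python
import re

# Regular expression quoted in the label rules on this page.
LABEL_RE = re.compile(
    r"(^[a-zA-Z_]$)|(^([a-zA-Z][a-zA-Z0-9_]|_[a-zA-Z0-9])([a-zA-Z0-9_]){0,254}$)"
)
# Labels used by the system, as listed under Constraints.
SYSTEM_LABELS = {"pod", "task", "ipfamily", "srcip", "dstip",
                 "srcport", "dstport", "protocol"}
MAX_LABELS = 5


def validate_pod_labels(labels):
    """Return a list of problems found in a podLabel list (empty if none)."""
    errors = []
    if len(labels) > MAX_LABELS:
        errors.append(f"too many labels: {len(labels)} > {MAX_LABELS}")
    for label in labels:
        if label in SYSTEM_LABELS:
            errors.append(f"{label!r} is used by the system")
        elif not LABEL_RE.fullmatch(label):
            errors.append(f"{label!r} does not match the label format")
    return errors


if __name__ == "__main__":
    print(validate_pod_labels(["app", "version"]))
    print(validate_pod_labels(["1app", "__meta", "pod"]))
```

An empty list means the labels pass all three checks. The check is purely local and does not confirm that the labels exist on any pod.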
Example application scenarios:
- The example below monitors all pods on a node that match the label selector app=nginx and generates the three health check metrics. By default, the ping command is used to check local pods. If the monitored container has the test and app labels, their key-value pairs are carried in the monitoring metrics. Otherwise, the value of the corresponding label is reported as "not found".
apiVersion: crd.dolphin.io/v1
kind: MonitorPolicy
metadata:
  name: example-task
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: nginx
  podLabel: [test, app]
  healthCheck:
    enable: true
    failureThreshold: 3
    periodSeconds: 5
- The example below monitors all pods on a node that match the label selector app=nginx and generates the three health check metrics. A custom curl command is used, which checks only network connectivity: regardless of the HTTP status code returned by the application, the pod is considered healthy as long as the network is connected. If the monitored container has the test and app labels, their key-value pairs are carried in the monitoring metrics. Otherwise, the value of the corresponding label is reported as "not found".
apiVersion: crd.dolphin.io/v1
kind: MonitorPolicy
metadata:
  name: example-task
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: nginx
  podLabel: [test, app]
  healthCheck:
    enable: true
    failureThreshold: 3
    periodSeconds: 5
    command: "curl"
    port: 80
    path: "healthz"
- The example below monitors all pods on a node and generates the number of sent IPv4 packets and the number of sent IPv4 bytes. If the monitored container has the app label, its key-value pair is carried in the monitoring metrics. Otherwise, the value of the label is reported as "not found".
apiVersion: crd.dolphin.io/v1
kind: MonitorPolicy
metadata:
  name: example-task
  namespace: kube-system
spec:
  podLabel: [app]
  ip4Tx:
    enable: true
- The example below monitors all pods on a node that match the label selector app=nginx and generates the number of sent IPv4 packets, received IPv4 packets, sent IPv4 bytes, received IPv4 bytes, IPv4 packets sent to the public network, and IPv4 bytes sent to the public network. If the monitored container has the test and app labels, their key-value pairs are carried in the monitoring metrics. Otherwise, the value of the corresponding label is reported as "not found".
apiVersion: crd.dolphin.io/v1
kind: MonitorPolicy
metadata:
  name: example-task
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: nginx
  podLabel: [test, app]
  ip4Tx:
    enable: true
  ip4Rx:
    enable: true
  ip4TxInternet:
    enable: true
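Manifests like the ones in these scenarios can also be generated by a script. The sketch below is illustrative only: build_monitor_policy and its parameters are hypothetical names, and it covers only the selector, podLabel, and traffic switches that appear in the examples on this page. It emits JSON, which kubectl apply -f accepts in addition to YAML.

```python
import json


def build_monitor_policy(name, match_labels=None, pod_labels=None,
                         ip4_tx=False, ip4_rx=False, ip4_tx_internet=False):
    """Build a MonitorPolicy manifest as a plain dict (hypothetical helper)."""
    spec = {}
    if match_labels:
        spec["selector"] = {"matchLabels": dict(match_labels)}
    if pod_labels:
        spec["podLabel"] = list(pod_labels)
    # Each metric group is disabled by default, so only enabled ones are emitted.
    for key, enabled in (("ip4Tx", ip4_tx), ("ip4Rx", ip4_rx),
                         ("ip4TxInternet", ip4_tx_internet)):
        if enabled:
            spec[key] = {"enable": True}
    return {
        "apiVersion": "crd.dolphin.io/v1",
        "kind": "MonitorPolicy",
        # The namespace must be kube-system, as stated in the template.
        "metadata": {"name": name, "namespace": "kube-system"},
        "spec": spec,
    }


if __name__ == "__main__":
    policy = build_monitor_policy("example-task", {"app": "nginx"},
                                  ["test", "app"], ip4_tx=True, ip4_rx=True,
                                  ip4_tx_internet=True)
    print(json.dumps(policy, indent=2))
```

Saving the printed JSON to a file and applying it should be equivalent to applying the last YAML example above, assuming the add-on treats both encodings the same way.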
Checking Traffic Statistics
The monitoring data collected by this add-on is exported in Prometheus exporter format, which can be obtained in either of the following ways:
- Directly access service port 10001 provided by the dolphin add-on, for example, http://{POD_IP}:10001/metrics.
Note that if you access the dolphin service port from a node, the security groups of the node and the pod must allow the access.
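As a rough illustration of the direct access method, the Python sketch below fetches the metrics endpoint and keeps only the dolphin samples. The helper names fetch_metrics and dolphin_samples are hypothetical, pod_ip stands for the {POD_IP} placeholder in the URL above, and the code assumes the endpoint is reachable without authentication from where the script runs.

```python
import urllib.request

METRICS_PORT = 10001  # service port of the dolphin add-on, per this page


def fetch_metrics(pod_ip, timeout=5.0):
    """Download the raw Prometheus text from http://{pod_ip}:10001/metrics."""
    url = f"http://{pod_ip}:{METRICS_PORT}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def dolphin_samples(text):
    """Keep only dolphin_* sample lines, dropping comments and other metrics."""
    return [line for line in text.splitlines() if line.startswith("dolphin_")]


if __name__ == "__main__":
    # Offline demonstration on a hand-written response body.
    # Against a real cluster: dolphin_samples(fetch_metrics("<POD_IP>"))
    body = (
        "# HELP some_other_metric Example comment line\n"
        "some_other_metric 12\n"
        'dolphin_ip4_send_pkt{app="nginx",pod="default/nginx-66c9c65dbf-zjg24",'
        'task="kube-system/example-task"} 379\n'
    )
    for line in dolphin_samples(body):
        print(line)
```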
Examples of the monitored information:
- Example 1 (number of IPv4 packets sent to the Internet):
dolphin_ip4_send_pkt_internet{app="nginx",pod="default/nginx-66c9c65dbf-zjg24",task="kube-system/example-task"} 241
In the preceding example, the namespace of the pod is default, the pod name is nginx-66c9c65dbf-zjg24, the label is app, and the value is nginx. This metric is created by monitoring task example-task, and the number of IPv4 packets sent by the pod to the public network is 241.
- Example 2 (number of IPv4 bytes sent to the Internet):
dolphin_ip4_send_byte_internet{app="nginx",pod="default/nginx-66c9c65dbf-zjg24",task="kube-system/example-task"} 23618
In the preceding example, the namespace of the pod is default, the pod name is nginx-66c9c65dbf-zjg24, the label is app, and the value is nginx. This metric is created by monitoring task example-task, and the number of IPv4 bytes sent by the pod to the public network is 23618.
- Example 3 (number of sent IPv4 packets):
dolphin_ip4_send_pkt{app="nginx",pod="default/nginx-66c9c65dbf-zjg24",task="kube-system/example-task"} 379
In the preceding example, the namespace of the pod is default, the pod name is nginx-66c9c65dbf-zjg24, the label is app, and the value is nginx. This metric is created by monitoring task example-task, and the number of IPv4 packets sent by the pod is 379.
- Example 4 (number of sent IPv4 bytes):
dolphin_ip4_send_byte{app="nginx",pod="default/nginx-66c9c65dbf-zjg24",task="kube-system/example-task"} 33129
In the preceding example, the namespace of the pod is default, the pod name is nginx-66c9c65dbf-zjg24, the label is app, and the value is nginx. This metric is created by monitoring task example-task, and the number of IPv4 bytes sent by the pod is 33129.
- Example 5 (number of received IPv4 packets):
dolphin_ip4_rcv_pkt{app="nginx",pod="default/nginx-66c9c65dbf-zjg24",task="kube-system/example-task"} 464
In the preceding example, the namespace of the pod is default, the pod name is nginx-66c9c65dbf-zjg24, the label is app, and the value is nginx. This metric is created by monitoring task example-task, and the number of IPv4 packets received by the pod is 464.
- Example 6 (number of received IPv4 bytes):
dolphin_ip4_rcv_byte{app="nginx",pod="default/nginx-66c9c65dbf-zjg24",task="kube-system/example-task"} 34654
In the preceding example, the namespace of the pod is default, the pod name is nginx-66c9c65dbf-zjg24, the label is app, and the value is nginx. This metric is created by monitoring task example-task, and the number of IPv4 bytes received by the pod is 34654.
- Example 7 (health check status):
dolphin_health_check_status{app="nginx",pod="default/nginx-b74766f5f-7582p",task="kube-system/example-task"} 0
In the preceding example, the namespace of the pod is default, the pod name is nginx-b74766f5f-7582p, the label is app, and the value is nginx. This metric is created by monitoring task example-task, and the network health status of the pod is 0 (healthy). If the network status is unhealthy, the value will be 1.
- Example 8 (number of successful health checks):
dolphin_health_check_successful_counter{app="nginx",pod="default/nginx-b74766f5f-7582p",task="kube-system/example-task"} 5
In the preceding example, the namespace of the pod is default, the pod name is nginx-b74766f5f-7582p, the label is app, and the value is nginx. This metric is created by monitoring task example-task, and the number of successful network health checks for the pod is 5.
- Example 9 (number of failed health checks):
dolphin_health_check_failed_counter{app="nginx",pod="default/nginx-b74766f5f-7582p",task="kube-system/example-task"} 0
In the preceding example, the namespace of the pod is default, the pod name is nginx-b74766f5f-7582p, the label is app, and the value is nginx. This metric is created by monitoring task example-task, and the number of failed network health checks for the pod is 0.
If the container does not have the specified label, the label value in the response body is "not found". The format is as follows:
dolphin_ip4_send_byte_internet{test="not found",pod="default/nginx-66c9c65dbf-zjg24",task="default"} 23618
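The sample lines above all follow the same pattern, so they can be split into the fields the explanations describe. The Python sketch below is a simplified reader for lines of this shape only, not a complete Prometheus exposition-format parser. The function name parse_sample and the keys of the returned dict are hypothetical.

```python
import re

# metric_name{label="value",...} number
SAMPLE_RE = re.compile(
    r"^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$"
)
LABEL_PAIR_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)\s*=\s*"((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one labelled sample line into metric, pod, task, labels, and value."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a labelled sample line: {line!r}")
    labels = dict(LABEL_PAIR_RE.findall(match.group("labels")))
    # The pod label has the form <namespace>/<pod name>.
    namespace, _, pod_name = labels.get("pod", "").partition("/")
    return {
        "metric": match.group("name"),
        "namespace": namespace,
        "pod": pod_name,
        "task": labels.get("task", ""),
        "labels": labels,
        # Requested pod labels that the container does not have.
        "missing_labels": sorted(k for k, v in labels.items() if v == "not found"),
        "value": float(match.group("value")),
    }


if __name__ == "__main__":
    line = ('dolphin_health_check_status{app="nginx",'
            'pod="default/nginx-b74766f5f-7582p",task="kube-system/example-task"} 0')
    sample = parse_sample(line)
    # Per the health check example, 0 means healthy and 1 means unhealthy.
    state = "healthy" if sample["value"] == 0 else "unhealthy"
    print(sample["namespace"], sample["pod"], state)
```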