Deploying Hubble for DataPlane V2 Network Observability
In a cluster with DataPlane V2 enabled, you can deploy the open source observability project Hubble to visualize container network traffic and obtain container network observability.
Prerequisites
- The DataPlane V2 feature is available on CCE under restricted access. Submit a service ticket to CCE to apply for it before use.
- Currently, Hubble can be deployed only in CCE Standard clusters with DataPlane V2 enabled whose versions are v1.27.16-r50, v1.28.15-r40, v1.29.15-r0, v1.30.14-r0, v1.31.10-r0, v1.32.6-r0, or later.
Step 1: Configure the ConfigMap and Recreate yangtse-cilium
- Use kubectl to connect to your cluster. For details, see Connecting to a Cluster Using kubectl.
- Create the following Cilium community-native ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cilium-config
  namespace: kube-system
data:
  enable-hubble: "true"
  hubble-disable-tls: "true"
  hubble-listen-address: ":4244"
  hubble-metrics: dns drop tcp flow port-distribution icmp http
  hubble-metrics-server: ":9965"
Table 1 Parameters

| Parameter | Description | Remarks |
| --- | --- | --- |
| enable-hubble | Enables Hubble network flow observability. Set it to "true" to enable Hubble or "false" to disable it. | Set to "true" in this document. |
| hubble-disable-tls | Disables TLS on the Hubble server. Set it to "true" to disable TLS or "false" to keep it enabled. | Set to "true" in this document. |
| hubble-listen-address | Listening address of the Hubble server. | Set to ":4244" in this document. The colon cannot be omitted. |
| hubble-metrics | Metrics to be collected by Hubble, separated by spaces. | The full list of metrics supported by Hubble is "dns drop tcp flow flows-to-world httpV2 icmp kafka port-distribution". For details, see the metrics supported by the community. Enabling a large number of metrics affects Hubble performance to some extent. |
| hubble-metrics-server | Address on which Hubble metrics are exposed. | Set to ":9965" in this document. The colon cannot be omitted. |
- After configuring cilium-config, run the following commands to trigger a rolling recreation of the yangtse-cilium DaemonSet so that the configuration takes effect.
uuid=$(uuidgen)
kubectl patch daemonset -n kube-system yangtse-cilium --type='json' -p="[{\"op\": \"add\", \"path\": \"/spec/template/metadata/annotations/change-id\", \"value\": \"$uuid\"}]"
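After the patch is applied, you can optionally confirm that all yangtse-cilium Pods have been recreated before proceeding, for example:
kubectl rollout status daemonset/yangtse-cilium -n kube-system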
Step 2: Deploy the Hubble Components
- Create a hubble.yaml file.
The content is as follows:
Replace the valid-cluster string in the following configuration file with the name of your cluster.
---
apiVersion: v1
automountServiceAccountToken: false
kind: ServiceAccount
metadata:
  name: hubble-relay
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: hubble-ui
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hubble-ui
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: hubble-ui
subjects:
- kind: ServiceAccount
  name: hubble-ui
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: hubble-ui
rules:
- apiGroups:
  - networking.k8s.io
  resources:
  - networkpolicies
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - componentstatuses
  - endpoints
  - namespaces
  - nodes
  - pods
  - services
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - cilium.io
  resources:
  - '*'
  verbs:
  - get
  - list
  - watch
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: hubble-relay
  name: hubble-relay
  namespace: kube-system
spec:
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - port: 80
    protocol: TCP
    targetPort: grpc
  selector:
    app: hubble-relay
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/port: "9965"
    prometheus.io/scrape: "true"
  labels:
    app: hubble
  name: hubble-metrics
  namespace: kube-system
spec:
  clusterIP: None
  clusterIPs:
  - None
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: hubble-metrics
    port: 9965
    protocol: TCP
    targetPort: hubble-metrics
  selector:
    app: yangtse-cilium
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: cilium
  name: hubble-peer
  namespace: kube-system
spec:
  internalTrafficPolicy: Local
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: peer-service
    port: 80
    protocol: TCP
    targetPort: 4244
  selector:
    app: yangtse-cilium
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: v1
data:
  nginx.conf: "server {\n listen 8081;\n listen [::]:8081;\n server_name localhost;\n root /app;\n index index.html;\n client_max_body_size 1G;\n\n location / {\n proxy_set_header Host $host;\n proxy_set_header X-Real-IP $remote_addr;\n\n location /api {\n proxy_http_version 1.1;\n proxy_pass_request_headers on;\n proxy_pass http://127.0.0.1:8090;\n }\n location / {\n # double `/index.html` is required here \n try_files $uri $uri/ /index.html /index.html;\n }\n\n # Liveness probe\n location /healthz {\n access_log off;\n add_header Content-Type text/plain;\n return 200 'ok';\n }\n }\n}"
kind: ConfigMap
metadata:
  name: hubble-ui-nginx
  namespace: kube-system
---
apiVersion: v1
data:
  # Replace valid-cluster with the name of your cluster
  config.yaml: "cluster-name: valid-cluster\npeer-service: \"hubble-peer.kube-system.svc.cluster.local.:80\"\nlisten-address: :4245\ngops: true\ngops-port: \"9893\"\nretry-timeout: \nsort-buffer-len-max: \nsort-buffer-drain-timeout: \ndisable-client-tls: true\n\ndisable-server-tls: true\n"
kind: ConfigMap
metadata:
  name: hubble-relay-config
  namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: hubble-relay
  name: hubble-relay
  namespace: kube-system
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: hubble-relay
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: hubble-relay
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: yangtse-cilium
            topologyKey: kubernetes.io/hostname
      automountServiceAccountToken: false
      containers:
      - args:
        - serve
        command:
        - hubble-relay
        image: quay.io/cilium/hubble-relay:v1.17.6
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 12
          grpc:
            port: 4222
            service: ""
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 10
        name: hubble-relay
        ports:
        - containerPort: 4245
          name: grpc
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          grpc:
            port: 4222
            service: ""
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
        resources: {}
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsGroup: 65532
          runAsNonRoot: true
          runAsUser: 65532
        startupProbe:
          failureThreshold: 20
          grpc:
            port: 4222
            service: ""
          initialDelaySeconds: 10
          periodSeconds: 3
          successThreshold: 1
          timeoutSeconds: 1
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /etc/hubble-relay
          name: config
          readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 65532
      serviceAccount: hubble-relay
      serviceAccountName: hubble-relay
      terminationGracePeriodSeconds: 1
      volumes:
      - configMap:
          defaultMode: 420
          items:
          - key: config.yaml
            path: config.yaml
          name: hubble-relay-config
        name: config
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: hubble-ui
  name: hubble-ui
  namespace: kube-system
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: hubble-ui
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: hubble-ui
    spec:
      automountServiceAccountToken: true
      containers:
      - image: quay.io/cilium/hubble-ui:v0.13.2
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 8081
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: frontend
        ports:
        - containerPort: 8081
          name: http
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /
            port: 8081
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /etc/nginx/conf.d/default.conf
          name: hubble-ui-nginx-conf
          subPath: nginx.conf
        - mountPath: /tmp
          name: tmp-dir
      - env:
        - name: EVENTS_SERVER_PORT
          value: "8090"
        - name: FLOWS_API_ADDR
          value: hubble-relay:80
        image: quay.io/cilium/hubble-ui-backend:v0.13.2
        imagePullPolicy: IfNotPresent
        name: backend
        ports:
        - containerPort: 8090
          name: grpc
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1001
        runAsGroup: 1001
        runAsUser: 1001
      serviceAccount: hubble-ui
      serviceAccountName: hubble-ui
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: hubble-ui-nginx
        name: hubble-ui-nginx-conf
      - emptyDir: {}
        name: tmp-dir
- Deploy the components.
kubectl apply -f hubble.yaml
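Optionally, you can check that the Hubble workloads and Services created by hubble.yaml are running before continuing. The label values and Service names below come from the manifest above:
kubectl get deploy,pod -n kube-system -l 'app in (hubble-relay,hubble-ui)'
kubectl get svc -n kube-system hubble-relay hubble-peer hubble-metrics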
Step 3: Deploy the hubble-ui Service
You can deploy a NodePort or LoadBalancer Service to expose the hubble-ui service externally.
This document uses a NodePort Service to expose hubble-ui.
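If you prefer the LoadBalancer approach, the following is only a minimal sketch. The kubernetes.io/elb.* annotation keys and values are assumptions here; check the CCE documentation on LoadBalancer Services for the exact annotations required to bind a load balancer in your cluster.
apiVersion: v1
kind: Service
metadata:
  name: hubble-ui
  namespace: kube-system
  labels:
    app: hubble-ui
  annotations:
    kubernetes.io/elb.class: union        # assumption: type of the associated ELB instance
    kubernetes.io/elb.id: <your-elb-id>   # assumption: ID of an existing ELB instance
spec:
  type: LoadBalancer
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8081
  selector:
    app: hubble-ui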
- Create a hubble-ui.yaml file.
The content is as follows:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: hubble-ui
  name: hubble-ui
  namespace: kube-system
spec:
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    nodePort: 32222
    port: 80
    protocol: TCP
    targetPort: 8081
  selector:
    app: hubble-ui
  sessionAffinity: None
  type: NodePort
- Create the NodePort Service.
kubectl apply -f hubble-ui.yaml
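You can then confirm that the Service exposes the expected node port (32222 in this example):
kubectl get svc hubble-ui -n kube-system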
Step 4: Access hubble-ui
Open the external access address of hubble-ui in a browser, for example, http://{EIP of a node}:32222.
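If the node EIP is not yet reachable from your browser, you can instead forward the Service to your local machine for a quick check and open http://127.0.0.1:8081:
kubectl port-forward -n kube-system svc/hubble-ui 8081:80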
Step 5: Install Related Add-ons
- Install the Cloud Native Cluster Monitoring add-on and the Grafana add-on in the cluster.
- Create a hubble-monitor.yaml file.
The content is as follows:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: hubble
  namespace: monitoring
spec:
  endpoints:
  - honorLabels: true
    path: /metrics
    port: hubble-metrics
    relabelings:
    - action: replace
      replacement: ${1}
      sourceLabels:
      - __meta_kubernetes_pod_node_name
      targetLabel: node
    scheme: http
  jobLabel: hubble
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      app: hubble
- Create the monitoring task.
kubectl create -f hubble-monitor.yaml
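To confirm that the Hubble metrics endpoint is being scraped, you can run a test query in Prometheus or the Grafana Explore view. The metric names below are examples that assume the flow and drop metrics enabled in Step 1; the node label is added by the relabeling rule in the ServiceMonitor above.
# Flows processed per node (from the "flow" metric)
sum(rate(hubble_flows_processed_total[5m])) by (node)
# Drops by reason (from the "drop" metric)
sum(rate(hubble_drop_total[5m])) by (reason)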
Step 6: Configure the Hubble Metrics Dashboard
The Cilium official website provides a Grafana dashboard for Hubble metrics, which presents the data far more clearly than raw Prometheus metrics. You can import this dashboard for viewing.
- Prepare the configuration file of the Hubble metrics dashboard: open the Prometheus example files of the Cilium open source community, locate hubble-dashboard.json, and copy it.
- Log in to Grafana, choose Dashboards, and import the hubble-dashboard.json file you obtained.
If you enabled "Interconnect Data Source with AOM" when installing the Grafana add-on in Step 5: Install Related Add-ons, change the datasource field in the hubble-dashboard.json file to prometheus-aom before importing it.
- You can view Hubble metrics on the dashboard. For details about the Hubble metrics, see hubble-metrics.