Overview
Observability is an approach that engineers use to monitor the infrastructure and applications in a cloud native environment with a variety of tools and techniques. By analyzing the collected metrics, logs, and traces, engineers can gain insights into the applications for easier troubleshooting. This section describes the observability architecture of CCE and its main observability capabilities.
The observability architecture consists of four parts: compute base, data collection, monitoring and logging, and O&M.
Compute Base
CCE allows you to create CCE Turbo clusters or CCE standard clusters as required. CCE provides a unified data collection solution for different cluster types, which ensures a consistent experience in cloud native observability. For details about CCE clusters, see CCE Service Overview.
Data Collection
Metric collection: An add-on based on Prometheus is provided for cloud native cluster monitoring. This add-on is lightweight and can be used out of the box. For details, see Cloud Native Cluster Monitoring.
Log collection: An add-on based on Fluent Bit and OpenTelemetry is provided for cloud native logging. This add-on features high performance and low resource usage. It also supports CRD-based log collection policies, which are flexible and easy to use. For details, see Cloud Native Logging.
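As an illustration of how metrics gathered by a Prometheus-based collector are commonly consumed, the following minimal Python sketch builds a request URL for the standard Prometheus HTTP API instant-query endpoint (`/api/v1/query`) and parses a response in that API's JSON format. The base URL, the PromQL expression, the `pod` label, and the sample response are hypothetical. Whether and where the CCE add-on exposes this endpoint is an assumption, so check Cloud Native Cluster Monitoring before relying on it.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(body: str) -> dict:
    """Map each sample's `pod` label to its value for an instant-vector response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"unexpected result type: {data['resultType']}")
    # Each sample is {"metric": {labels...}, "value": [timestamp, "value-as-string"]}.
    return {
        sample["metric"].get("pod", ""): float(sample["value"][1])
        for sample in data["result"]
    }


# Hypothetical response body, shaped like a Prometheus instant-query result.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"pod": "web-0"}, "value": [1700000000, "0.25"]},
            {"metric": {"pod": "web-1"}, "value": [1700000000, "0.5"]},
        ],
    },
})

if __name__ == "__main__":
    # Hypothetical endpoint and query; replace with values from your environment.
    print(build_query_url("http://prometheus.example:9090", "up"))
    print(parse_instant_vector(SAMPLE_RESPONSE))
```

The sketch keeps URL construction and response parsing separate from any network call, so the parsing logic can be checked against a recorded response before it is pointed at a real endpoint.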
Monitoring and Logging
Application Operations Management (AOM) is a one-stop, multi-dimensional O&M management platform for cloud applications. It monitors applications and related cloud resources in real time, analyzes application health, and provides flexible data visualization functions to help you detect faults in a timely manner.
Log Tank Service (LTS) collects log data from hosts and cloud services. LTS can process a massive number of logs efficiently, securely, and in real time, which enables you to gain insights into cloud services and applications and optimize their availability and performance. It also helps you in real-time decision-making, device O&M management, and service trend analysis.
Cloud Native Observability
CCE provides Health Center, Monitoring Center, Logging, and Alarm Center for cloud native observability.
- Health Center
Health diagnosis monitors cluster health by drawing on the experience of container O&M experts, so that cluster faults and risks are detected in a timely manner. It also provides rectification suggestions.
- Monitoring Center
Monitoring Center provides multi-dimensional data insights and dashboards. It offers monitoring views by cluster, node, workload, and pod, and supports multi-level drill-down and association analysis. The dashboard feature provides monitoring graphs for items such as the API server, CoreDNS, and PVCs.
- Logging
CCE works with LTS to collect logs of control plane components (kube-apiserver, kube-controller-manager, and kube-scheduler), Kubernetes audit logs, Kubernetes events, and container logs (stdout logs, text logs, and node logs).
- Alarm Center
Alarm Center works with AOM 2.0 to allow you to create alarm rules and view alarms of clusters and containers.
Resource Permissions
Health Center, Monitoring Center, Logging, and Alarm Center work closely with cloud services for cluster monitoring, alarm reporting, and notification. When you access Health Center, Monitoring Center, Logging, or Alarm Center for the first time, the system will request permissions to access the cloud services in the region where you run your applications.
The following table lists the permissions.
Assigned To | Permission | Description |
---|---|---|
CCE | IAM ReadOnlyAccess | IAM users need to access Monitoring Center and Alarm Center. |
CCE | Tenant Guest | Monitoring Center and Alarm Center check the configurations of global resources associated with clusters such as OBS and DNS resources to identify incorrect configurations. |
CCE | CCE Administrator | Monitoring Center and Alarm Center need to access CCE to obtain information about clusters, nodes, and workloads so that they can help ensure resource health. |
CCE | SWR Administrator | Monitoring Center and Alarm Center need to access SWR to obtain image information. |
CCE | SMN Administrator | Monitoring Center and Alarm Center need to access SMN to obtain contact group information. |
CCE | AOM Administrator | Monitoring Center and Alarm Center need to access AOM to obtain metrics. |
CCE | LTS Administrator | Monitoring Center and Alarm Center need to access LTS to obtain logs. |
AOM | DMS UserAccess | AOM obtains subscription data from DMS. |
AOM | ECS CommonOperations | AOM obtains system metrics and logs using UniAgents and ICAgents installed on ECSs. |
AOM | CES ReadOnlyAccess | AOM synchronizes metrics from Cloud Eye. |
AOM | CCE FullAccess | AOM synchronizes container metrics from CCE. |
AOM | RMS ReadOnlyAccess | AOM CMDB manages cloud service instance data. |
AOM | ECS ReadOnlyAccess | AOM obtains system metrics and logs using UniAgents and ICAgents installed on ECSs. |
AOM | LTS FullAccess | AOM obtains logs from LTS. |
AOM | CCI FullAccess | AOM synchronizes container metrics from CCI. |
After you agree to the authorization, agencies are automatically created in IAM to delegate required resource operation permissions in your account to Huawei Cloud CCE and AOM. For details about agencies, see Cloud Service Delegation. The following are agencies automatically created in IAM:
- cia_admin_trust
This agency has the Tenant Guest and IAM ReadOnlyAccess permissions in global projects as well as the Tenant Guest, CCE Administrator, and SWR Administrator permissions in regional projects. These permissions are required by Health Center, Monitoring Center, Logging, or Alarm Center to access other cloud services.
To use Health Center, Monitoring Center, Logging, or Alarm Center in multiple regions, you need to apply for the Tenant Guest, CCE Administrator, and SWR Administrator permissions in each region. (Go to the IAM console, choose Agencies, and click cia_admin_trust to view the authorization records in each region.)
- aom_admin_trust
For details about the aom_admin_trust agency, see AOM Cloud Service Authorization.
Health Center, Monitoring Center, Logging, or Alarm Center may fail to run as expected if the required permissions are not assigned. When using Health Center, Monitoring Center, Logging, or Alarm Center, do not delete or modify cia_admin_trust and aom_admin_trust.
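The per-region requirement for cia_admin_trust described above can be sketched as a simple check. This is a minimal illustration, not a call to any IAM API: the region names and the granted-permission data are hypothetical and would in practice be read from the authorization records shown for cia_admin_trust on the IAM console.

```python
# Regional permissions that cia_admin_trust needs in every region where
# Health Center, Monitoring Center, Logging, or Alarm Center is used.
REQUIRED_REGIONAL = ("Tenant Guest", "CCE Administrator", "SWR Administrator")


def missing_regional_permissions(granted_by_region: dict) -> dict:
    """Return, per region, the required permissions that have not been granted.

    Regions that already have every required permission are omitted.
    """
    missing = {}
    for region, granted in granted_by_region.items():
        gaps = [perm for perm in REQUIRED_REGIONAL if perm not in granted]
        if gaps:
            missing[region] = gaps
    return missing


if __name__ == "__main__":
    # Hypothetical authorization records, keyed by a made-up region name.
    records = {
        "region-a": {"Tenant Guest", "CCE Administrator", "SWR Administrator"},
        "region-b": {"Tenant Guest"},
    }
    for region, gaps in missing_regional_permissions(records).items():
        print(f"{region}: missing {', '.join(gaps)}")
```

A region that appears in the result still needs the listed permissions before the observability features can be expected to work there.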