Updated on 2024-07-09 GMT+08:00

Application Scenarios

O&M Situation Awareness BI Dashboard

The dedicated O&M BI dashboard caters to various O&M roles, aiding in optimization, insight generation, and decision-making.

Rich metrics: COC provides 30+ preset O&M metrics, delivering insights into your cloud resources across seven perspective-based dashboards and a comprehensive enterprise-level O&M sandbox.

Figure 1 O&M sandbox

Full-Lifecycle Resource Management

Full-lifecycle resource management covers resource definition, requesting, provisioning, O&M, changes, configuration, renewal, and recycling, and builds a unified resource management center.

  • Full-lifecycle management: eliminates breakpoints across the entire user resource management journey, ensuring smooth user resource management and efficient O&M.
  • Resource management center: enables visualized management of your resources from a global perspective, and supports multi-cloud and cross-account centralized O&M.
Figure 2 Full-lifecycle resource management

Change Risk Control and Operations Trustworthiness

Management and control models that integrate Huawei SRE best practices in secure production provide you with trustworthy, stable, and reliable O&M capabilities.

  • All-round operations trustworthiness: ensures operational security before, during, and after changes, backed by personnel risk assessment, high-risk command alerts, and automated inspection.
  • AI-powered risk assessment: helps you identify and mitigate operational risks by using AI-powered models to assess personnel competence and OREO algorithms to identify high-risk operations.
Figure 3 Change risk control and operations trustworthiness

Standardized Fault Management

The standardized fault management process and WarRoom facilitate efficient collaboration during faults and rapid fault recovery.
  • Standard process: provides a standardized troubleshooting process on Huawei Cloud. Backed by contingency plans and WarRoom-based collaboration among O&M engineers, R&D teams, and other personnel, this process helps you handle faults with ease.
  • O&M knowledge base: enables swift fault handling. A rich repository of O&M knowledge, built from historical fault handling and from experience accumulated in handling unknown faults, makes the fault handling process more efficient.
Figure 4 Standardized fault management

Intelligent Chaos Drills

Full-stack chaos engineering solutions enable you to quickly evaluate the potential resilience risks of applications and continuously monitor application architectures.
  • E2E chaos engineering solutions: provide E2E chaos drill capabilities tailored to your service scenarios across four dimensions: risk analysis, contingency plans, drill execution, and drill review.
  • Fault mode library: introduces a methodology for analyzing fault scenarios from the perspective of fault tolerance, and makes Huawei Cloud SREs' years of accumulated fault-handling experience available through this library.
Figure 5 Intelligent chaos drills