Updated on 2024-07-09 GMT+08:00

What Is COC?

Cloud Operations Center (COC) is a secure and efficient operations and maintenance (O&M) platform that offers one-stop, AI-powered solutions for centralized O&M. It covers Huawei Cloud deterministic operations scenarios and provides essential functions such as fault management, batch O&M, and chaos drills, improving cloud O&M efficiency while ensuring security compliance.

Figure 1 COC service overview

Unified Resource Management

  • Application management: models the associations between applications and resources to support centralized cloud resource management and cost reduction.
  • Resource management: synchronizes and manages resource instances across cloud platforms, laying the foundation for resource O&M.
  • Configuration management: manages applications and resources, and centrally monitors their parameter configurations throughout their lifecycles.
  • Compliance management: provides batch patch scanning and repair capabilities for resource O&M, ensuring both security compliance and efficiency.

Comprehensive Change Management

  • Solution review: applies standard operating procedures (SOPs) to change solutions, so that each solution is clearly defined, digitized, and archived after review. Rules and processes can be decoupled to ensure that changes are executed correctly and that change solutions are retained for reuse.
  • Change review: reviews change tickets according to the preset review process to ensure the reliability, efficiency, and process compliance of change solutions.
  • Risk assessment: manages changes based on scenario rules, process rules, and business rules to identify and prevent change risks in advance. The change calendar is used to identify change conflicts and reduce change risks caused by change dependencies between services.
  • Assurance implementation: presets change solutions, standardizes change steps, enables observation of change operations, and ensures timely handling of change exceptions, delivering controllable, visible, and manageable change processes.
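
As an illustration of how a change calendar can surface conflicts, the following is a minimal sketch and not COC's actual implementation. It assumes a hypothetical Change record (ticket ID, service, time window, and the services it depends on) and treats two changes as conflicting when their windows overlap and they either target the same service or one targets a service the other depends on. All names here are illustrative.

```python
from dataclasses import dataclass, field
from datetime import datetime
from itertools import combinations


@dataclass(frozen=True)
class Change:
    """Hypothetical change ticket; field names are illustrative only."""
    ticket: str
    service: str
    start: datetime
    end: datetime
    depends_on: frozenset = field(default_factory=frozenset)


def windows_overlap(a, b):
    # Two half-open windows [start, end) overlap if each starts before the other ends.
    return a.start < b.end and b.start < a.end


def related(a, b):
    # Same service, or one change targets a service the other depends on.
    return (a.service == b.service
            or a.service in b.depends_on
            or b.service in a.depends_on)


def find_conflicts(changes):
    """Return pairs of ticket IDs whose changes overlap in time and are related."""
    return [(a.ticket, b.ticket)
            for a, b in combinations(changes, 2)
            if windows_overlap(a, b) and related(a, b)]


changes = [
    Change("CHG-1", "orders", datetime(2024, 7, 9, 1), datetime(2024, 7, 9, 3)),
    Change("CHG-2", "payments", datetime(2024, 7, 9, 2), datetime(2024, 7, 9, 4),
           frozenset({"orders"})),
    Change("CHG-3", "search", datetime(2024, 7, 9, 2), datetime(2024, 7, 9, 4)),
]
print(find_conflicts(changes))  # [('CHG-1', 'CHG-2')]
```

In this example, CHG-3 overlaps both other windows but is not flagged, because it shares no service or dependency with them.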

Deterministic Fault Management

  • Unified incident center: provides an end-to-end (E2E), standardized incident handling mechanism that covers incident discovery, incident handling, recovery verification, and continuous improvement.
  • WarRoom and fault backtracking: intelligently triggers WarRoom requests for live-network incidents, shortening troubleshooting time, and lets you track troubleshooting progress in real time from the command center. Fault backtracking supports issue summarization and experience accumulation, preventing issues from recurring and shortening the MTTR.
  • Contingency plans: enables you to develop contingency plans for known faults and handle deterministic issues using the contingency plan automation mechanism.
  • Failure modes: leverages professional risk analysis methods and expert knowledge bases to accumulate a failure mode base, helping you analyze potential risks of cloud applications and pass on O&M experience.
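
To make the MTTR metric mentioned above concrete, here is a minimal sketch. It assumes MTTR is measured as the average time from incident discovery to verified recovery; how COC itself computes the metric is not stated here, and the function name and input shape are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average (recovered - discovered) over a list of (discovered, recovered) pairs.

    Assumption: each incident is a tuple of two datetime values.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum((recovered - discovered for discovered, recovered in incidents),
                timedelta())
    return total / len(incidents)


incidents = [
    (datetime(2024, 7, 1, 10, 0), datetime(2024, 7, 1, 10, 30)),  # 30 minutes
    (datetime(2024, 7, 3, 14, 0), datetime(2024, 7, 3, 15, 30)),  # 90 minutes
]
print(mean_time_to_recovery(incidents))  # 1:00:00
```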

Resilience Center Optimization

  • Full-lifecycle risk management: covers risk management in both application deployment and running scenarios throughout the lifecycles of applications and resources, drawing on years of dynamic risk management experience accumulated on Huawei Cloud.
  • Proactive O&M: improves the quality and resilience of your key services through proactive O&M methods, including performance stress tests, emergency drills and chaos engineering, and resilience evaluation.
  • Rich fault drill tools: uses over 50 built-in drill attack tools based on Huawei Cloud best practices, enabling you to simulate complex and diversified service exception scenarios and develop countermeasures.
  • Application HA improvement: the Production Readiness Review (PRR) feature applies SRE best practices for cloud application rollout reviews and provides online review workflows and review items, improving application high availability (HA).

Access Methods

You can access COC through the web-based management console or HTTPS-based application programming interfaces (APIs).

  • Accessing COC Through APIs

    Use this method if you need to integrate COC into a third-party system for custom development. For detailed operations, see the Cloud Operations Center API Reference.

  • Accessing COC Through the Management Console

    Use this method if you do not need to integrate COC into a third-party platform.

    Ensure that you have registered on Huawei Cloud. For details about how to register an account, see Registering a HUAWEI ID and Enabling HUAWEI CLOUD Services. Then, log in to the management console and click Cloud Operations Center.