OPS04-04 Automating O&M Tasks
Automate as many routine tasks in your daily development and O&M work as you can to ease management and reduce human error. To maximize the return on your automation investment, prioritize tasks that are simple, procedural, and recurring over the long term. Automation is not an all-or-nothing strategy: even workflows that require manual intervention, such as decision-making points, can benefit from automating the steps around them.
- Risk level
High
- Key strategies
Focus on tasks that benefit most from automation:
- Tasks that are highly procedural and prone to human error: These tasks are clearly defined, highly repeatable, have no variables that add complexity, and are executed as part of the normal path. Examples: restarting servers, creating accounts, and transferring logs to data storage. These tasks may run on a schedule, be triggered by events or monitoring alarms, or be initiated on demand in response to external factors.
- Tasks that O&M engineers currently handle: Provide automated, script-driven services that let application DevOps teams perform O&M operations themselves. For example, in a multi-tenant solution, database administrators are frequently asked to create new databases. If a self-service portal is built for operations personnel, they can create empty databases on their own without involving a database administrator.
- Tasks that can greatly enhance efficiency once automated: The most valuable automation delivers large efficiency gains for little management overhead. For example, if automating database entries saves the operations team an hour each day, the team gains time to improve other processes through further automation.
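The prioritization described above can be sketched as a simple ranking. This is a minimal illustration, not a prescribed method: the `Task` fields, the scoring formula (manual minutes removed per month minus estimated upkeep of the automation), and all sample figures are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # All fields are hypothetical inputs you would estimate yourself.
    name: str
    minutes_per_run: float           # manual effort for one execution
    runs_per_month: int              # how often the task is performed
    upkeep_minutes_per_month: float  # estimated cost of maintaining the automation


def net_minutes_saved(task: Task) -> float:
    """Monthly manual effort removed, minus the upkeep the automation adds."""
    return task.minutes_per_run * task.runs_per_month - task.upkeep_minutes_per_month


def rank_candidates(tasks: list[Task]) -> list[Task]:
    """Return tasks with a positive net saving, highest saving first."""
    worthwhile = [t for t in tasks if net_minutes_saved(t) > 0]
    return sorted(worthwhile, key=net_minutes_saved, reverse=True)


if __name__ == "__main__":
    # Illustrative numbers only.
    candidates = [
        Task("create tenant database", 60, 22, 120),
        Task("restart server", 10, 8, 30),
        Task("rarely used manual report", 240, 0, 60),
    ]
    for task in rank_candidates(candidates):
        print(f"{task.name}: {net_minutes_saved(task):.0f} min/month saved")
```

A score like this covers only time saved. Error-proneness and risk, which the first strategy emphasizes, would need their own weighting.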
- Design suggestions
- Pipeline definition, execution, and management: Use continuous integration and continuous delivery (CI/CD) tools, such as CodeArts Pipeline, to automatically define pipelines and their running modes.
- Deployment: Use tools such as Huawei Cloud Resource Formation Service (RFS), Terraform, and Ansible to automate workload deployment and release. With the infrastructure as code (IaC) approach, you can deploy and optimize infrastructure using the same automation platform.
- Testing: Many tools are available to automate the testing process. These tools can reduce the burden on the quality assurance team and ensure that tests are standardized and reliable.
- Scaling: Use the functions provided by the platform and other tools, such as RFS, to automatically scale the infrastructure when the load increases or decreases.
- Monitoring and alerting: Use tools provided by Cloud Operations Center (COC) and Cloud Eye to automatically register newly deployed resources and configure alarm-triggered operations to speed up problem rectification.
- Self-healing: Use alarms generated by Cloud Eye to automatically perform operations and recover faulty components or jobs.
- Configuration management: Use orchestration and policy tools to ensure that all resources run the same configuration and adhere to compliance requirements throughout the workload.
- Other management tasks: Use scripts to automatically perform repetitive tasks, such as updating database records or DNS records.
- Approval: Enable the system to automatically make approval decisions based on predefined rules to improve the efficiency of workflows with approval checkpoints. This approach promotes the use of standardized forms and templates to enhance process efficiency. Automated approvals can be risky in high-risk environments. Monitor and test your automatic approvals to ensure that approvals are granted only when clearly defined criteria are met.
- New user and new employee onboarding: You can automate many tasks associated with onboarding new application users or new employees, such as database updates and credential creation.
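As one way to picture the rule-based approval suggestion above, the following minimal sketch approves requests automatically only when predefined criteria are met and routes anything outside a low-risk environment to a human. The `ChangeRequest` fields, the environment names, and the specific rules are hypothetical and are not taken from any Huawei Cloud service.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"
    REJECTED = "rejected"
    NEEDS_HUMAN = "needs_human"


@dataclass(frozen=True)
class ChangeRequest:
    # Hypothetical request shape, for illustration only.
    requester: str
    environment: str     # e.g. "dev", "test", "prod"
    uses_template: bool  # submitted through a standardized form
    tests_passed: bool   # automated checks succeeded


# Assumption: only these environments are safe for automatic approval.
LOW_RISK_ENVIRONMENTS = {"dev", "test"}


def decide(request: ChangeRequest) -> Decision:
    """Apply the predefined rules in order; the first matching rule wins."""
    # Requests that bypass the standard form or fail checks are rejected.
    if not request.uses_template or not request.tests_passed:
        return Decision.REJECTED
    # Higher-risk environments keep a manual approval checkpoint.
    if request.environment not in LOW_RISK_ENVIRONMENTS:
        return Decision.NEEDS_HUMAN
    return Decision.APPROVED


if __name__ == "__main__":
    print(decide(ChangeRequest("alice", "dev", True, True)).value)
    print(decide(ChangeRequest("bob", "prod", True, True)).value)
    print(decide(ChangeRequest("carol", "test", False, True)).value)
```

Keeping the rules in one small pure function makes them easy to test and review, which supports the recommendation to monitor and test automatic approvals.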
- Related cloud services and tools