Drill Template Description
This section describes the standard drill template library, which provides 12 core templates covering scenarios such as emergency handling, process deduction, and contingency plan practice.
All templates follow industry best practices and have a complete structure and reusable content. Each template provides a standard framework that covers the drill background, process nodes, and roles and responsibilities. You can modify the scenario parameters, risk elements, and handling procedures to suit your requirements. The instructions and the notes on error-prone steps help you customize drill tasks efficiently from these templates.
| Template Name | Description | Label | Level | Task Group Name | Attack Scenario |
|---|---|---|---|---|---|
| Cross-AZ DR | This drill simulates how a DR failover is performed for the target service and the middleware it depends on when an AZ is faulty or the network is abnormal in a DR deployment architecture. | DR | Advanced | Cross-AZ DR | Server disconnection |
| | | | | | Powering off a DCS AZ |
| Initial Chaos Drill | An introductory drill that lets beginners experience the chaos drill process. | Nodes | Basic | Initial Chaos Drill | Qualifying practice |
| High System Resource Usage | This drill raises system resource usage to a specified level to test service performance under high load, so that you can handle insufficient host resources in advance. | Nodes | Medium | Disk Stress | Disk usage increase |
| | | | | Memory Stress | Memory usage increase |
| | | | | CPU Stress | CPU usage increase |
| HPA Configuration in Kubernetes | Auto scaling is an important feature of cloud native architectures. This drill simulates scale-up after pod resource usage (such as memory) increases in a short period of time, and scale-down after resource usage decreases. | Containers and clusters | Advanced | HPA Configuration in Kubernetes | Pod memory usage increase |
| Data Storage Exception | Service records are generally stored on the host or middleware where the service is located. Logs are stored on the host disk, and data is stored on middleware such as DDS. This drill simulates a scenario where ECS disk I/O is high and a primary/standby switchover is performed. | Services and data | Medium | Data Storage Exception | Disk I/O pressure increase |
| | | | | | Forcibly promoting a standby node to primary |
| Automatic Pod Recovery and Scheduling | Kubernetes schedules workloads at the pod level. When a workload is created, the scheduler automatically assigns its pods to nodes, for example, to nodes that have enough resources. | Clusters | Medium | Automatic Pod Recovery and Scheduling | Memory usage increase |
| | | | | | Forcible pod stopping |
| Network Instability Affecting Service Performance | This drill injects network delay on the NIC of the service host to simulate the impact of an unstable network on services. | Networks | Medium | Network Instability Affecting Service Performance | Network latency |
| Environment Overload in the Microservice Architecture | Microservices are a mainstream architecture. Their core value is to shorten the service release cycle and ensure reliable system operation. However, they also bring challenges, such as how to locate and rectify faults in a microservice architecture. This drill simulates overloaded nodes across multiple microservices for your reference. | DR | Medium | Environment Overload in the Microservice Architecture | CPU usage increase |
| | | | | | Connection exhaustion |
| | | | | | Process killing |
| Abnormal Server Power-off | This drill verifies whether services can be recovered with no data loss after a server is powered off. You can use the corresponding preset contingency plan to recover services after a node is powered off. | Services and data | Medium | Abnormal Server Power-off | Device shutdown |
| Data Loss in Service Middleware Cache | In large-scale concurrent query scenarios that require high query efficiency, Redis has become an essential service for internet applications because it is significantly faster than traditional databases. However, it may face data consistency and reliability issues. This chaos drill verifies whether services keep running normally after Redis data is cleared. | DR | Medium | Data Loss in Service Middleware Cache | DCS instance restart |
| Misoperations in the Host Configuration File | It is high-risk for O&M personnel to perform command-line ("black screen") operations directly on the service host. If the permissions of a service configuration file are modified directly, the service process may be unable to read or write the file. This chaos drill uses a custom script to modify or remove permissions on the host configuration file. You can use the prepared contingency plan to recover the service. | Services and data | Medium | Misoperations in the Host Configuration File | Custom scripts |
| Automatic Workload Switchover | FlexusL instances are new-generation, out-of-the-box lightweight application cloud servers designed for developers and small- and medium-sized enterprises. You can deploy databases or service applications on FlexusL instances. This drill simulates service workload switchover when processes terminate unexpectedly and database nodes are disconnected. | Networks | Advanced | Automatic Workload Switchover | Process killing |
| | | | | | Network disconnection |
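
To help interpret the results of the HPA Configuration in Kubernetes template, the following minimal sketch shows how the Kubernetes Horizontal Pod Autoscaler derives a replica count from a metric, based on the formula in the Kubernetes documentation. The function name, its parameters, and the sample values are illustrative assumptions and are not part of the drill template. Clamping to the minimum and maximum replica counts is deliberately omitted.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Return the replica count suggested by the documented HPA formula.

    desired = ceil(current_replicas * current_metric / target_metric)

    Hypothetical helper for illustration only. The tolerance of 0.1 mirrors
    the documented Kubernetes default; min/max replica limits are not applied.
    """
    ratio = current_metric / target_metric
    # When the ratio is close enough to 1.0, the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Sample values (assumed): average pod memory utilization versus a 60% target.
scale_up = desired_replicas(3, 90.0, 60.0)    # usage spikes during the drill
scale_down = desired_replicas(5, 30.0, 60.0)  # usage drops after the drill
unchanged = desired_replicas(4, 63.0, 60.0)   # within tolerance, no scaling
```

With these sample values, a memory spike from the Pod memory usage increase attack raises the suggested count from 3 to 5 replicas, and the later drop lowers it from 5 to 3, which is the scale-up and scale-down behavior the drill is meant to exercise.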
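
The permission fault described for the Misoperations in the Host Configuration File template can be sketched as follows. This is an assumption-laden illustration, not the custom script shipped with the template: the helper names are hypothetical, the sketch assumes a POSIX host, and it works on a temporary file rather than a real service configuration file.

```python
import os
import stat
import tempfile


def simulate_permission_fault(path):
    """Remove all permission bits from path and return the original mode."""
    original_mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, 0o000)
    return original_mode


def recover_permissions(path, original_mode):
    """Contingency step: restore the mode recorded before the fault."""
    os.chmod(path, original_mode)


def run_demo():
    """Inject and recover the fault on a throwaway file; return the three modes."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"port=8080\n")  # stand-in for a service configuration file
        path = tmp.name
    try:
        os.chmod(path, 0o644)  # known starting permissions
        before = simulate_permission_fault(path)
        during = stat.S_IMODE(os.stat(path).st_mode)
        recover_permissions(path, before)
        after = stat.S_IMODE(os.stat(path).st_mode)
        return before, during, after
    finally:
        os.remove(path)
```

Recording the original mode before the fault is injected keeps the recovery step deterministic, which is the role the prepared contingency plan plays in the drill.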