Attack Scenarios
Scenarios
Chaos drills support multiple attack scenarios, including disruptors for practicing, host resources, host processes, host networks, user-defined faults, and resource O&M. By integrating disruptor modules and functions, you can accurately simulate faults in the actual environment and identify system availability issues as early as possible, continuously improving application resilience. IPv6 fault drills of ECSs, BMSs, and on-premises IDC devices are supported. The drills of host network disruptors help you quickly master fault locating and emergency response capabilities in IPv6 networking environments, ensuring high network availability and security.
Constraints and Limitations
- FlexusL instance (HCSS) scenario: A drill task can be executed only on a single FlexusL host. High availability (HA) is not supported.
- Cloud Container Engine (CCE) scenario: The Kubernetes version supported by the drill task must be the same as that supported by CCE. For details, see Kubernetes Version Policy.
Attack Scenario Description
Source of Attack Target |
Attack Scenario |
Description |
|
---|---|---|---|
ECSs |
Disruptors for practicing |
Qualifying practice |
You can familiarize yourself with the chaos engineering process without worrying about real faults. |
Host resources |
CPU usage increase |
Simulate CPU usage surge. The drill can be terminated in an emergency scenario. |
|
Memory usage increase |
Simulate a memory usage surge. The drill can be terminated in an emergency scenario. |
||
Disk usage increase |
Simulate the disk usage surge. The drill can be terminated in an emergency scenario. |
||
Disk I/O pressure increase |
Continuously read and write files to increase disk I/O pressure. The drill can be terminated in an emergency scenario. |
||
Host process |
Process ID exhaustion |
The system process IDs (PIDs) are exhausted. Drills cannot be terminated in an emergency scenario. |
|
Process killing |
Kill processes repeatedly during the fault duration. The drill can be terminated in an emergency scenario. After the emergency termination or drill is complete, the drill system does not start the processes. The service needs to ensure that the processes are restored. |
||
Host network |
Network latency |
Simulate network faults to increase link latency. The drill can be terminated in an emergency scenario. |
|
Network packet loss |
Simulate network faults to cause packet loss on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet loss rate is 100%. |
||
Network error packets |
Simulate network faults to cause error packets on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet error rate reaches 100%. |
||
Duplicate packets |
Simulate duplicate packets generated on a link due to a network fault. The drill can be terminated in an emergency scenario. |
||
Network packet disorder |
Simulate packet disorder generated on a link due to a network fault. The drill can be terminated in an emergency scenario. |
||
Network disconnection |
Simulate the network disconnection between nodes. The drill can be terminated in an emergency scenario. Do not enter the IP addresses of the drill system and UniAgent server. Otherwise, the drill may fail. To interrupt an established persistent connection, select All for the interruption direction. |
||
NIC break-down |
Simulate the NIC break-down scenario. The NIC may fail to be started after the NIC breaks down due to different network configurations of hosts. Therefore, prepare a contingency plan for network recovery. The drill cannot be terminated in an emergency scenario. |
||
DNS tampering |
Tamper with the domain name address mapping. The drill can be terminated in an emergency scenario. |
||
Port occupation |
Simulate the scenario where network ports of the system are occupied (a maximum of 100 ports can be occupied). The drill can be terminated in an emergency scenario. |
||
Server disconnection |
Simulate the scenario where the entire server is disconnected, reject all TCP, UDP, and ICMP data packets, and open only ports 22, 8002, 39604, 33552, 33554, 33557, 32552, 32554, and 32557. The drill can be terminated in an emergency scenario. |
||
NIC bandwidth limiting |
Limit the NIC bandwidth. Multiple NICs are supported. The drill can be terminated in an emergency scenario. |
||
Connection exhaustion |
Create a large number of socket connections to the specified server end (combination of the IP address and port number) to exhaust the connections. As a result, normal requests of the node cannot connect to the server (the requests of other nodes on the server may also be affected). The drill can be terminated in an emergency scenario. |
||
Customizing a fault |
Customizing a script |
Users can create scripts using automated O&M scripts and run the scripts to simulate faults. The drill can be terminated in an emergency scenario. |
|
Resource O&M |
Device startup |
Start ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario. |
|
Device shutdown |
Shut down ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario. |
||
Device restart |
Restart ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario. |
||
BMSs |
Disruptors for practicing |
Qualifying practice |
You can familiarize yourself with the chaos engineering process without worrying about real faults. |
Host resources |
CPU usage increase |
Simulate CPU usage surge. The drill can be terminated in an emergency scenario. |
|
Memory usage increase |
Simulate a memory usage surge. The drill can be terminated in an emergency scenario. |
||
Disk usage increase |
Simulate the disk usage surge. The drill can be terminated in an emergency scenario. |
||
Disk I/O pressure increase |
Continuously read and write files to increase disk I/O pressure. The drill can be terminated in an emergency scenario. |
||
Host process |
Process ID exhaustion |
The system process IDs (PIDs) are exhausted. The drill cannot be terminated in an emergency scenario. |
|
Process killing |
Kill processes repeatedly during the fault duration. The drill can be terminated in an emergency scenario. After the emergency termination or drill is complete, the drill system does not start the processes. The service needs to ensure that the processes are restored. |
||
Host network |
Network latency |
Simulate network faults to increase link latency. The drill can be terminated in an emergency scenario. |
|
Network packet loss |
Simulate network faults to cause packet loss on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet loss rate is 100%. |
||
Network error packets |
Simulate network faults to cause error packets on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet error rate reaches 100%. |
||
Duplicate packets |
Simulate duplicate packets generated on a link due to a network fault. The drill can be terminated in an emergency scenario. |
||
Network packet disorder |
Simulate packet disorder generated on a link due to a network fault. The drill can be terminated in an emergency scenario. |
||
Network disconnection |
Simulate the network disconnection between nodes. The drill can be terminated in an emergency scenario. Do not enter the IP addresses of the drill system and UniAgent server. Otherwise, the drill may fail. To interrupt an established persistent connection, select All for the interruption direction. |
||
NIC break-down |
Simulate the NIC break-down scenario. The NIC may fail to be started after the NIC breaks down due to different network configurations of hosts. Therefore, prepare a contingency plan for network recovery. The drill cannot be terminated in an emergency scenario. |
||
DNS tampering |
Tamper with the domain name address mapping. The drill can be terminated in an emergency scenario. |
||
Port occupation |
Simulate the scenario where network ports of the system are occupied (a maximum of 100 ports can be occupied). The drill can be terminated in an emergency scenario. |
||
Server disconnection |
Simulate the scenario where the entire server is disconnected, reject all TCP, UDP, and ICMP data packets, and open only ports 22, 8002, 39604, 33552, 33554, 33557, 32552, 32554, and 32557. The drill can be terminated in an emergency scenario. |
||
Resource O&M |
Device startup |
Start ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario. |
|
Device shutdown |
Shut down ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario. |
||
Device restart |
Restart ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario. |
||
FlexusL instances (HCSS) |
Disruptors for practicing |
Qualifying practice |
You can familiarize yourself with the chaos engineering process without worrying about real faults. |
Host resources |
CPU usage increase |
Simulate CPU usage surge. The drill can be terminated in an emergency scenario. |
|
Memory usage increase |
Simulate a memory usage surge. The drill can be terminated in an emergency scenario. |
||
Disk usage increase |
Simulate the disk usage surge. The drill can be terminated in an emergency scenario. |
||
Disk I/O pressure increase |
Continuously read and write files to increase disk I/O pressure. The drill can be terminated in an emergency scenario. |
||
Host process |
Process ID exhaustion |
The system process IDs (PIDs) are exhausted. The drill cannot be terminated in an emergency scenario. |
|
Process killing |
Kill HCSS processes repeatedly during the fault duration. The drill can be terminated in an emergency scenario. After the emergency termination or drill is complete, the drill system does not start the processes. The service needs to ensure that the processes are restored. |
||
Host network |
Network latency |
Simulate network faults to increase link latency. The drill can be terminated in an emergency scenario. |
|
Network packet loss |
Simulate network faults to cause packet loss on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet loss rate is 100%. |
||
Network error packets |
Simulate network faults to cause error packets on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet error rate reaches 100%. |
||
Duplicate packets |
Simulate duplicate packets generated on a link due to a network fault. The drill can be terminated in an emergency scenario. |
||
Network packet disorder |
Simulate packet disorder generated on a link due to a network fault. The drill can be terminated in an emergency scenario. |
||
Network disconnection |
Simulate the network disconnection between nodes. The drill can be terminated in an emergency scenario. Do not enter the IP addresses of the drill system and UniAgent server. Otherwise, the drill may fail. To interrupt an established persistent connection, select All for the interruption direction. |
||
NIC break-down |
Simulate the NIC break-down scenario. The NIC may fail to be started after the NIC breaks down due to different network configurations of hosts. Therefore, prepare a contingency plan for network recovery. The drill cannot be terminated in an emergency scenario. |
||
DNS tampering |
Tamper with the domain name address mapping. The drill can be terminated in an emergency scenario. |
||
Port occupation |
Simulate the scenario where network ports of the system are occupied (a maximum of 100 ports can be occupied). The drill can be terminated in an emergency scenario. |
||
Server disconnection |
Simulate the scenario where the entire server is disconnected, reject all TCP, UDP, and ICMP data packets, and open only ports 22, 8002, 39604, 33552, 33554, 33557, 32552, 32554, and 32557. The drill can be terminated in an emergency scenario. |
||
Resource O&M |
Device startup |
Start ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario. |
|
Device shutdown |
Shut down ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario. |
||
Device restart |
Restart ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario. |
||
CCE nodes |
Disruptors for practicing |
Qualifying practice |
You can familiarize yourself with the chaos engineering process without worrying about real faults. |
Host resources |
CPU usage increase |
Simulate CPU usage surge. The drill can be terminated in an emergency scenario. |
|
Memory usage increase |
Simulate a memory usage surge. The drill can be terminated in an emergency scenario. |
||
Disk usage increase |
Simulate the disk usage surge. The drill can be terminated in an emergency scenario. |
||
Disk I/O pressure increase |
Continuously read and write files to increase disk I/O pressure. The drill can be terminated in an emergency scenario. |
||
Host process |
Process ID exhaustion |
The system process IDs (PIDs) are exhausted. The drill cannot be terminated in an emergency scenario. |
|
Process killing |
Kill processes repeatedly during the fault duration. The drill can be terminated in an emergency scenario. After the emergency termination or drill is complete, the drill system does not start the processes. The service needs to ensure that the processes are restored. |
||
Host network |
Network latency |
Simulate network faults to increase link latency. The drill can be terminated in an emergency scenario. |
|
Network packet loss |
Simulate network faults to cause packet loss on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet loss rate is 100%. |
||
Network error packets |
Simulate network faults to cause error packets on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet error rate reaches 100%. |
||
Duplicate packets |
Simulate duplicate packets generated on a link due to a network fault. The drill can be terminated in an emergency scenario. |
||
Network packet disorder |
Simulate packet disorder generated on a link due to a network fault. The drill can be terminated in an emergency scenario. |
||
Network disconnection |
Simulate the network disconnection between nodes. The drill can be terminated in an emergency scenario. Do not enter the IP addresses of the drill system and UniAgent server. Otherwise, the drill may fail. To interrupt an established persistent connection, select All for the interruption direction. |
||
NIC break-down |
Simulate the NIC break-down scenario. The NIC may fail to be started after the NIC breaks down due to different network configurations of hosts. Therefore, prepare a contingency plan for network recovery. The drill cannot be terminated in an emergency scenario. |
||
DNS tampering |
Tamper with the domain name address mapping. The drill can be terminated in an emergency scenario. |
||
Port occupation |
Simulate the scenario where network ports of the system are occupied (a maximum of 100 ports can be occupied). The drill can be terminated in an emergency scenario. |
||
Server disconnection |
Simulate the scenario where the entire server is disconnected, reject all TCP, UDP, and ICMP data packets, and open only ports 22, 8002, 39604, 33552, 33554, 33557, 32552, 32554, and 32557. The drill can be terminated in an emergency scenario. |
||
NIC bandwidth limiting |
Limit the NIC bandwidth. Multiple NICs are supported. The drill can be terminated in an emergency scenario. |
||
Connection exhaustion |
Create a large number of socket connections to the specified server end (combination of the IP address and port number) to exhaust the connections. As a result, normal requests of the node cannot connect to the server (the requests of other nodes on the server may also be affected). The drill can be terminated in an emergency scenario. |
||
CCE pods |
Pod resources |
Pod CPU usage increase |
Simulate a pod CPU usage surge. Ensure the attack target is writable. If it is not, the drill will fail. If the drill fails, you can use the emergency termination function. |
Pod memory usage increase |
Simulate a pod memory usage surge. Ensure the attack target is writable. If it is not, the drill will fail. In this case, you can use the emergency termination function. |
||
Pod disk I/O pressure |
Continuously simulates I/O reads and writes. The drill can be terminated in an emergency scenario. |
||
Pod disk usage increase |
Writes large files to a specified directory to simulate the pressure increase of the Kubernetes container file system. The drill can be terminated in an emergency scenario. |
||
Pod process |
Forcible pod stopping |
Forcibly stop a pod. The drill cannot be terminated in an emergency scenario. |
|
Forcibly killing containers in a pod |
Forcibly kill containers in a pod. The drill cannot be terminated in an emergency scenario. |
||
Pod network |
Pod network latency |
Simulate a network fault that incurs the network latency increase in a pod. Drills can be terminated in an emergency scenario. Drills cannot be terminated when the latency reaches 30,000 ms. |
|
Pod network packet loss |
Simulate a network fault that incurs packet loss in a pod. Drills can be terminated in an emergency scenario. |
||
Pod network interruption |
Simulate a network disconnection between a pod and other IP addresses. The drill can be terminated in an emergency scenario. To interrupt an established persistent connection, select All for the interruption direction. |
||
Pod network packet disorder |
Simulate packet disorder generated on a link due to a pod network fault. Drills can be terminated in an emergency scenario. |
||
Duplicate pod network packets |
Simulate duplicate packets generated on a link due to a pod network fault. Drills can be terminated in an emergency scenario. |
||
Pod DNS tampering |
Tamper with the domain name address mapping in a pod. Ensure that the attack target runs as the root user. Otherwise, the drill will fail due to insufficient permissions. The drill can be terminated in an emergency scenario. |
||
Pod port masking |
Simulate disabling of a pod port. The drill can be terminated in an emergency scenario. |
||
Pod network isolation |
Simulate the scenario where access from a pod to other IP addresses is directly rejected. The drill can be terminated in an emergency scenario. To reject established persistent connections, select All for Direction. |
||
RDS instances |
Instances |
RDS active/standby switchover |
Only MySQL and PostgreSQL engines in HA mode are supported. This operation is not allowed when you are creating or restarting instances, upgrading databases, recovering and modifying ports, or creating and deleting accounts. An active/standby switchover does not change the private IP address of an instance. The drill cannot be terminated in an emergency scenario. |
Stopping an RDS instance |
Stop both the primary and read-only instances. After the fault duration ends, start the instance. The drill can be terminated in an emergency. |
||
DCS instances |
Instances |
DCS active/standby switchover |
Switch the primary and standby DB instance nodes. This operation is supported only for primary/standby DB instances. The drill cannot be terminated in an emergency scenario. |
DCS instance restart |
Restart a running DCS instance. If you clear data of a Redis 4.0, 5.0, or 6.0 instance, the cleared data cannot be restored. Exercise caution when performing this operation. The drill cannot be terminated in an emergency scenario. |
||
Powering off a DCS AZ |
All nodes in the AZ are powered off centrally. The drill cannot be terminated in an emergency scenario. This disruptor is not supported in some areas. |
||
CSS instances |
Instances |
Restarting a CSS cluster |
Restart the CSS cluster that is in the available status. During the restart, Kibana and Cerebro may fail to be accessed. The drill cannot be terminated in an emergency scenario. |
DDS instances |
Instances |
Forcibly promoting a standby node to primary |
Forcible promotion of standby nodes to primary is supported for backup sets, shards, and config nodes. However, the promotion may fail when the primary/standby latency is large. The drill cannot be terminated in an emergency scenario. |
IDC offline resource VMs |
Disruptors for practicing |
Qualifying practice |
You can familiarize yourself with the chaos engineering process without worrying about real faults. |
Host resources |
CPU usage increase |
Simulate CPU usage surge. The drill can be terminated in an emergency scenario. |
|
Memory usage increase |
Simulate a memory usage surge. The drill can be terminated in an emergency scenario. |
||
Disk usage increase |
Simulate the disk usage surge. The drill can be terminated in an emergency scenario. |
||
Disk I/O pressure increase |
Continuously read and write files to increase disk I/O pressure. The drill can be terminated in an emergency scenario. |
||
Host process |
Process ID exhaustion |
The system process IDs (PIDs) are exhausted. The drill cannot be terminated in an emergency scenario. |
|
Process killing |
Kill processes repeatedly during the fault duration. The drill can be terminated in an emergency scenario. After the emergency termination or drill is complete, the drill system does not start the processes. The service needs to ensure that the processes are restored. |
||
Host network |
Network latency |
Simulate network faults to increase link latency. The drill can be terminated in an emergency scenario. |
|
Network packet loss |
Simulate network faults to cause packet loss on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet loss rate is 100%. |
||
Network error packets |
Simulate network faults to cause error packets on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet error rate reaches 100%. |
||
Duplicate packets |
Simulate duplicate packets generated on a link due to a network fault. The drill can be terminated in an emergency scenario. |
||
Network packet disorder |
Simulate packet disorder generated on a link due to a network fault. The drill can be terminated in an emergency scenario. |
||
Network disconnection |
Simulate the network disconnection between nodes. The drill can be terminated in an emergency scenario. Do not enter the IP addresses of the drill system and UniAgent server. Otherwise, the drill may fail. To interrupt an established persistent connection, select All for the interruption direction. |
||
NIC break-down |
Simulate the NIC break-down scenario. The NIC may fail to be started after the NIC breaks down due to different network configurations of hosts. Therefore, prepare a contingency plan for network recovery. The drill cannot be terminated in an emergency scenario. |
||
DNS tampering |
Tamper with the domain name address mapping. The drill can be terminated in an emergency scenario. |
||
Port occupation |
Simulate the scenario where network ports of the system are occupied (a maximum of 100 ports can be occupied). The drill can be terminated in an emergency scenario. |
||
Server disconnection |
Simulate the scenario where the entire server is disconnected, reject all TCP, UDP, and ICMP data packets, and open only ports 22, 8002, 39604, 33552, 33554, 33557, 32552, 32554, and 32557. The drill can be terminated in an emergency scenario. |
||
NIC bandwidth limiting |
Limit the NIC bandwidth. Multiple NICs are supported. The drill can be terminated in an emergency scenario. |
||
Connection exhaustion |
Create a large number of socket connections to the specified server end (combination of the IP address and port number) to exhaust the connections. As a result, normal requests of the node cannot connect to the server (the requests of other nodes on the server may also be affected). The drill can be terminated in an emergency scenario. |
||
Alibaba Cloud server |
Disruptors for practicing |
Qualifying practice |
You can familiarize yourself with the chaos engineering process without worrying about real faults. |
Host resources |
CPU usage increase |
Simulate CPU usage surge. The drill can be terminated in an emergency scenario. |
|
Memory usage increase |
Simulate a memory usage surge. The drill can be terminated in an emergency scenario. |
||
Disk usage increase |
Simulate the disk usage surge. The drill can be terminated in an emergency scenario. |
||
Disk I/O pressure increase |
Continuously read and write files to increase disk I/O pressure. The drill can be terminated in an emergency scenario. |
||
Host process |
Process ID exhaustion |
The system process IDs (PIDs) are exhausted. The drill cannot be terminated in an emergency scenario. |
|
Process killing |
Kill processes repeatedly during the fault duration. The drill can be terminated in an emergency scenario. After the emergency termination or drill is complete, the drill system does not start the processes. The service needs to ensure that the processes are restored. |
||
Host network |
Network latency |
Simulate network faults to increase link latency. The drill can be terminated in an emergency scenario. |
|
Network packet loss |
Simulate network faults to cause packet loss on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet loss rate is 100%. |
||
Network error packets |
Simulate network faults to cause error packets on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet error rate reaches 100%. |
||
Duplicate packets |
Simulate duplicate packets generated on a link due to a network fault. The drill can be terminated in an emergency scenario. |
||
Network packet disorder |
Simulate packet disorder generated on a link due to a network fault. The drill can be terminated in an emergency scenario. |
||
Network disconnection |
Simulate the network disconnection between nodes. The drill can be terminated in an emergency scenario. Do not enter the IP addresses of the drill system and UniAgent server. Otherwise, the drill may fail. To interrupt an established persistent connection, select All for the interruption direction. |
||
NIC break-down |
Simulate the NIC break-down scenario. The NIC may fail to be started after the NIC breaks down due to different network configurations of hosts. Therefore, prepare a contingency plan for network recovery. The drill cannot be terminated in an emergency scenario. |
||
DNS tampering |
Tamper with the domain name address mapping. The drill can be terminated in an emergency scenario. |
||
Port occupation |
Simulate the scenario where network ports of the system are occupied (a maximum of 100 ports can be occupied). The drill can be terminated in an emergency scenario. |
||
Server disconnection |
Simulate the scenario where the entire server is disconnected, reject all TCP, UDP, and ICMP data packets, and open only ports 22, 8002, 39604, 33552, 33554, 33557, 32552, 32554, and 32557. The drill can be terminated in an emergency scenario. |
||
NIC bandwidth limiting |
Limit the NIC bandwidth. Multiple NICs are supported. The drill can be terminated in an emergency scenario. |
||
Connection exhaustion |
Create a large number of socket connections to the specified server end (combination of the IP address and port number) to exhaust the connections. As a result, normal requests of the node cannot connect to the server (the requests of other nodes on the server may also be affected). The drill can be terminated in an emergency scenario. |
Customizing a Fault
Users can create scripts using automated O&M scripts and run the scripts to simulate faults. The drill can be terminated in an emergency scenario.

A custom fault is determined by the script you write. When scripts are used to attack ECSs, exceptions such as high resource usage and network faults may occur, and the status of the UniAgent installed on the ECSs may change to offline or abnormal. Exercise caution when performing this operation.
```bash
#!/bin/bash
set +x

function usage() {
    echo "Usage: {inject_fault|check_fault_status|rollback|clean}"
    exit 2
}

function inject_fault() {
    echo "inject fault"
}

function check_fault_status() {
    echo "check fault status"
}

function rollback() {
    echo "rollback"
}

function clean() {
    echo "clean"
}

case "$ACTION" in
    inject_fault)
        inject_fault
    ;;
    check_fault_status)
        check_fault_status
    ;;
    rollback)
        if [[ X"${CAN_ROLLBACK}" == X"true" ]]; then
            rollback
        else
            echo "not support to rollback"
        fi
    ;;
    clean)
        clean
    ;;
    *)
        usage
    ;;
esac
```
You are advised to define a custom fault script based on the preceding script specifications. You can define the fault injection, fault check, fault rollback, and environment cleanup logic by adding your own content to the inject_fault(), check_fault_status(), rollback(), and clean() functions.
According to the preceding specifications, there are two mandatory script parameters: ACTION and CAN_ROLLBACK. Whether other script parameters are required depends on your script content.
Parameter | Value | Description |
---|---|---|
ACTION | inject_fault | Drill operation action. The system automatically changes the value in each drill phase. The options handled by the preceding script are inject_fault, check_fault_status, rollback, and clean. |
CAN_ROLLBACK | false | Whether rollback is supported. The options are true (rollback is supported) and false (rollback is not supported). |

In the inject_fault function, add a label indicating that the fault injection is successful, and check whether the label exists in the check_fault_status function.
- If the label exists, the check_fault_status function can exit normally (for example, exit 0).
- If the label does not exist, the check_fault_status function should exit abnormally (for example, exit 1).
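The label mechanism described above can be sketched as follows. This is a minimal illustration, not the drill system's own implementation: the MARKER path and the messages are hypothetical, and the label is assumed to be a marker file.

```bash
#!/bin/bash
# Hypothetical marker location; the drill system does not prescribe one.
MARKER="${MARKER:-/tmp/chaos_demo/injected.flag}"

function inject_fault() {
    mkdir -p "$(dirname "${MARKER}")"
    # ... the actual fault logic would run here ...
    # Write the label only after the fault logic has succeeded.
    echo "successfully inject" > "${MARKER}"
}

function check_fault_status() {
    # Label present: report success. Label absent: report failure.
    if [ -f "${MARKER}" ] && grep -qx "successfully inject" "${MARKER}"; then
        echo "fault is active"
        return 0
    fi
    echo "fault label not found"
    return 1
}
```

When check_fault_status is the last command run by the case statement, as in the preceding script specifications, its return status becomes the exit status of the script, so return 1 here has the same effect as exit 1.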
Custom Script Example
The following is an example of a customized script.
```bash
#!/bin/bash
set +x

PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:~/bin
export PATH

function usage() {
    echo "Usage: {inject_fault|check_fault_status|rollback|clean}"
    exit 2
}

function inject_fault() {
    echo "============start inject fault============"
    if [ ! -d "${SCRIPT_PATH}/${DIR_NAME}" ]; then
        mkdir -p "${SCRIPT_PATH}/${DIR_NAME}"
        echo "mkdir ${SCRIPT_PATH}/${DIR_NAME} successfully"
    fi
    cd "${SCRIPT_PATH}/${DIR_NAME}"
    if [ ! -f "${FILE}" ]; then
        touch "${FILE}"
        echo "create tmp file ${FILE}"
        touch inject.log
        chmod u+x "${FILE}"
        chmod u+x inject.log
    else
        echo "append content" > "${FILE}"
    fi
    # The first line of ${FILE} is the label that check_fault_status looks for.
    echo "successfully inject" > "${FILE}"
    echo "============end inject fault============"
}

function check_fault_status() {
    echo "============start check fault status============"
    if [ ! -d "${SCRIPT_PATH}/${DIR_NAME}" ]; then
        echo "inject has been finished"
        exit 0
    fi
    cd "${SCRIPT_PATH}/${DIR_NAME}"
    SUCCESS_FLAG="successfully inject"
    if [ -f "${FILE}" ]; then
        if [[ "$(sed -n '1p' "${FILE}")" = "${SUCCESS_FLAG}" ]]; then
            echo "fault inject successfully"
        else
            echo "fault inject failed"
            exit 1
        fi
    else
        echo "inject finished"
        exit 0
    fi
    # Simulate the fault duration.
    sleep "${DURATION}"
    echo "============end check fault status============"
}

function rollback() {
    echo "============start rollback============"
    cd "${SCRIPT_PATH}"
    # DIR_NAME is quoted so that an empty value fails the test instead of passing it.
    if [ -d "${DIR_NAME}" ]; then
        rm -rf "${SCRIPT_PATH}/${DIR_NAME}"
    fi
    echo "============end rollback============"
}

function clean() {
    echo "============start clean============"
    cd "${SCRIPT_PATH}"
    if [ -d "${DIR_NAME}" ]; then
        rm -rf "${SCRIPT_PATH}/${DIR_NAME}"
    fi
    echo "============end clean============"
}

case "$ACTION" in
    inject_fault)
        inject_fault
    ;;
    check_fault_status)
        check_fault_status
    ;;
    rollback)
        if [[ X"${CAN_ROLLBACK}" == X"true" ]]; then
            rollback
        else
            echo "not support to rollback"
        fi
    ;;
    clean)
        clean
    ;;
    *)
        usage
    ;;
esac
```
The input parameters of the script are as follows:
Parameter | Value | Description |
---|---|---|
ACTION | inject_fault | Drill operation action |
CAN_ROLLBACK | false | Rollback is not supported. |
SCRIPT_PATH | /tmp | Root directory of the custom fault log |
DIR_NAME | test_script | Parent directory of the custom fault log |
FILE | test.log | Custom fault log name |
DURATION | 10 | Duration of a simulated custom fault, in seconds. (This parameter does not take effect when it is placed in the inject_fault function.) |

- In the sample inject_fault function, the injected fault is to create the ${FILE} file and write content to it. If the string "successfully inject" is written to the ${FILE} file, the fault injection is successful.
- In the example, the check_fault_status function checks whether the ${FILE} file exists. If it does not, the fault may have been rectified, and the function exits with exit 0. If it does, the function checks whether the first line of the file contains the label indicating that the fault injection is successful. If the label exists, the fault injection is successful, and sleep ${DURATION} is used to simulate the fault duration. If the label does not exist, the fault injection has failed, and the function exits with exit 1.
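Before uploading a custom script, it may help to walk through its phases on a test machine. The following sketch assumes that the script parameters reach the script as environment variables and that the phases run in the order inject_fault, check_fault_status, rollback, clean. Both are assumptions made for local testing and may differ from how the drill system invokes the script. The run_phase and dry_run helpers are hypothetical names, and DURATION defaults to 1 second here only to keep local runs short.

```bash
#!/bin/bash
# Run one phase of a custom fault script locally.
# Usage: run_phase <script> <action>
function run_phase() {
    local script="$1" action="$2"
    # Parameter values default to those in the table above, except DURATION.
    ACTION="${action}" \
    CAN_ROLLBACK="${CAN_ROLLBACK:-false}" \
    SCRIPT_PATH="${SCRIPT_PATH:-/tmp}" \
    DIR_NAME="${DIR_NAME:-test_script}" \
    FILE="${FILE:-test.log}" \
    DURATION="${DURATION:-1}" \
    bash "${script}"
}

# Run all phases in sequence and stop at the first one that fails.
# Usage: dry_run <script>
function dry_run() {
    local script="$1" phase
    for phase in inject_fault check_fault_status rollback clean; do
        echo "--- ${phase} ---"
        run_phase "${script}" "${phase}" || {
            echo "${phase} failed with status $?"
            return 1
        }
    done
}
```

For example, if the sample script is saved as custom_fault.sh (a hypothetical file name), dry_run ./custom_fault.sh runs the four phases in sequence, and CAN_ROLLBACK=true dry_run ./custom_fault.sh also exercises the rollback branch.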