Attack Scenarios

Scenarios

Chaos drills support multiple attack scenarios, including disruptors for practicing, host resources, host processes, host networks, user-defined faults, and resource O&M. By combining disruptor modules and functions, you can accurately simulate real-environment faults and identify system availability issues as early as possible, continuously improving application resilience. IPv6 fault drills are supported for ECSs, BMSs, and on-premises IDC devices. Drills with host network disruptors help you quickly build fault locating and emergency response capabilities in IPv6 networking environments, ensuring high network availability and security.

Constraints and Limitations

FlexusL instance (HCSS) scenario: A drill task can be executed only on a single FlexusL host. High availability (HA) is not supported.
Cloud Container Engine (CCE) scenario: The Kubernetes version supported by drill tasks must be the same as that supported by CCE instances. For details, see Kubernetes Version Policy.

Attack Scenario Description

**Table 1** Attack scenario description
Source of Attack Target	Attack Scenario		Description
ECSs	Disruptors for practicing	Qualifying practice	You can familiarize yourself with the chaos engineering process without worrying about real faults.
	Host resources	CPU usage increase	Simulate a CPU usage surge. The drill can be terminated in an emergency scenario.
		Memory usage increase	Simulate a memory usage surge. The drill can be terminated in an emergency scenario.
		Disk usage increase	Simulate the disk usage surge. The drill can be terminated in an emergency scenario.
		Disk I/O pressure increase	Continuously read and write files to increase disk I/O pressure. The drill can be terminated in an emergency scenario.
	Host process	Process ID exhaustion	The system process IDs (PIDs) are exhausted. Drills cannot be terminated in an emergency scenario.
	Host process	Process killing	Kill processes repeatedly during the fault duration. The drill can be terminated in an emergency scenario. After the emergency termination or drill is complete, the drill system does not start the processes. The service needs to ensure that the processes are restored.
	Host network	Network latency	Simulate network faults to increase link latency. The drill can be terminated in an emergency scenario.
		Network packet loss	Simulate network faults to cause packet loss on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet loss rate is 100%.
		Network error packets	Simulate network faults to cause error packets on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet error rate reaches 100%.
		Duplicate packets	Simulate duplicate packets generated on a link due to a network fault. The drill can be terminated in an emergency scenario.
		Network packet disorder	Simulate packet disorder generated on a link due to a network fault. The drill can be terminated in an emergency scenario.
		Network disconnection	Simulate the network disconnection between nodes. The drill can be terminated in an emergency scenario. Do not enter the IP addresses of the drill system and UniAgent server. Otherwise, the drill may fail. To interrupt an established persistent connection, select All for the interruption direction.
		NIC break-down	Simulate the NIC break-down scenario. The NIC may fail to be started after the NIC breaks down due to different network configurations of hosts. Therefore, prepare a contingency plan for network recovery. The drill cannot be terminated in an emergency scenario.
		DNS tampering	Tamper with the domain name address mapping. The drill can be terminated in an emergency scenario.
		Port occupation	Simulate the scenario where network ports of the system are occupied (a maximum of 100 ports can be occupied). The drill can be terminated in an emergency scenario.
		Server disconnection	Simulate the scenario where the entire server is disconnected, reject all TCP, UDP, and ICMP data packets, and open only ports 22, 8002, 39604, 33552, 33554, 33557, 32552, 32554, and 32557. The drill can be terminated in an emergency scenario.
		NIC bandwidth limiting	Limit the NIC bandwidth. Multiple NICs are supported. The drill can be terminated in an emergency scenario.
		Connection exhaustion	Create a large number of socket connections to the specified server end (combination of the IP address and port number) to exhaust the connections. As a result, normal requests of the node cannot connect to the server (the requests of other nodes on the server may also be affected). The drill can be terminated in an emergency scenario.
	Customizing a fault	Customizing a script	You can create custom scripts from automated O&M scripts and run them to simulate faults. The drill can be terminated in an emergency scenario.
	Resource O&M	Device startup	Start ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario.
		Device shutdown	Shut down ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario.
		Device restart	Restart ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario.
BMSs	Disruptors for practicing	Qualifying practice	You can familiarize yourself with the chaos engineering process without worrying about real faults.
	Host resources	CPU usage increase	Simulate a CPU usage surge. The drill can be terminated in an emergency scenario.
		Memory usage increase	Simulate a memory usage surge. The drill can be terminated in an emergency scenario.
		Disk usage increase	Simulate the disk usage surge. The drill can be terminated in an emergency scenario.
		Disk I/O pressure increase	Continuously read and write files to increase disk I/O pressure. The drill can be terminated in an emergency scenario.
	Host process	Process ID exhaustion	The system process IDs (PIDs) are exhausted. The drill cannot be terminated in an emergency scenario.
	Host process	Process killing	Kill processes repeatedly during the fault duration. The drill can be terminated in an emergency scenario. After the emergency termination or drill is complete, the drill system does not start the processes. The service needs to ensure that the processes are restored.
	Host network	Network latency	Simulate network faults to increase link latency. The drill can be terminated in an emergency scenario.
		Network packet loss	Simulate network faults to cause packet loss on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet loss rate is 100%.
		Network error packets	Simulate network faults to cause error packets on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet error rate reaches 100%.
		Duplicate packets	Simulate duplicate packets generated on a link due to a network fault. The drill can be terminated in an emergency scenario.
		Network packet disorder	Simulate packet disorder generated on a link due to a network fault. The drill can be terminated in an emergency scenario.
		Network disconnection	Simulate the network disconnection between nodes. The drill can be terminated in an emergency scenario. Do not enter the IP addresses of the drill system and UniAgent server. Otherwise, the drill may fail. To interrupt an established persistent connection, select All for the interruption direction.
		NIC break-down	Simulate the NIC break-down scenario. The NIC may fail to be started after the NIC breaks down due to different network configurations of hosts. Therefore, prepare a contingency plan for network recovery. The drill cannot be terminated in an emergency scenario.
		DNS tampering	Tamper with the domain name address mapping. The drill can be terminated in an emergency scenario.
		Port occupation	Simulate the scenario where network ports of the system are occupied (a maximum of 100 ports can be occupied). The drill can be terminated in an emergency scenario.
		Server disconnection	Simulate the scenario where the entire server is disconnected, reject all TCP, UDP, and ICMP data packets, and open only ports 22, 8002, 39604, 33552, 33554, 33557, 32552, 32554, and 32557. The drill can be terminated in an emergency scenario.
	Resource O&M	Device startup	Start ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario.
		Device shutdown	Shut down ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario.
		Device restart	Restart ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario.
FlexusL instances (HCSS)	Disruptors for practicing	Qualifying practice	You can familiarize yourself with the chaos engineering process without worrying about real faults.
	Host resources	CPU usage increase	Simulate a CPU usage surge. The drill can be terminated in an emergency scenario.
		Memory usage increase	Simulate a memory usage surge. The drill can be terminated in an emergency scenario.
		Disk usage increase	Simulate the disk usage surge. The drill can be terminated in an emergency scenario.
		Disk I/O pressure increase	Continuously read and write files to increase disk I/O pressure. The drill can be terminated in an emergency scenario.
	Host process	Process ID exhaustion	The system process IDs (PIDs) are exhausted. The drill cannot be terminated in an emergency scenario.
	Host process	Process killing	Kill HCSS processes repeatedly during the fault duration. The drill can be terminated in an emergency scenario. After the emergency termination or drill is complete, the drill system does not start the processes. The service needs to ensure that the processes are restored.
	Host network	Network latency	Simulate network faults to increase link latency. The drill can be terminated in an emergency scenario.
		Network packet loss	Simulate network faults to cause packet loss on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet loss rate is 100%.
		Network error packets	Simulate network faults to cause error packets on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet error rate reaches 100%.
		Duplicate packets	Simulate duplicate packets generated on a link due to a network fault. The drill can be terminated in an emergency scenario.
		Network packet disorder	Simulate packet disorder generated on a link due to a network fault. The drill can be terminated in an emergency scenario.
		Network disconnection	Simulate the network disconnection between nodes. The drill can be terminated in an emergency scenario. Do not enter the IP addresses of the drill system and UniAgent server. Otherwise, the drill may fail. To interrupt an established persistent connection, select All for the interruption direction.
		NIC break-down	Simulate the NIC break-down scenario. The NIC may fail to be started after the NIC breaks down due to different network configurations of hosts. Therefore, prepare a contingency plan for network recovery. The drill cannot be terminated in an emergency scenario.
		DNS tampering	Tamper with the domain name address mapping. The drill can be terminated in an emergency scenario.
		Port occupation	Simulate the scenario where network ports of the system are occupied (a maximum of 100 ports can be occupied). The drill can be terminated in an emergency scenario.
		Server disconnection	Simulate the scenario where the entire server is disconnected, reject all TCP, UDP, and ICMP data packets, and open only ports 22, 8002, 39604, 33552, 33554, 33557, 32552, 32554, and 32557. The drill can be terminated in an emergency scenario.
	Resource O&M	Device startup	Start ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario.
		Device shutdown	Shut down ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario.
		Device restart	Restart ECSs, BMSs, and Flexus instances in batches. Status may not be synchronized in a timely manner. The drill can be terminated in an emergency scenario.
CCE nodes	Disruptors for practicing	Qualifying practice	You can familiarize yourself with the chaos engineering process without worrying about real faults.
	Host resources	CPU usage increase	Simulate a CPU usage surge. The drill can be terminated in an emergency scenario.
		Memory usage increase	Simulate a memory usage surge. The drill can be terminated in an emergency scenario.
		Disk usage increase	Simulate the disk usage surge. The drill can be terminated in an emergency scenario.
		Disk I/O pressure increase	Continuously read and write files to increase disk I/O pressure. The drill can be terminated in an emergency scenario.
	Host process	Process ID exhaustion	The system process IDs (PIDs) are exhausted. The drill cannot be terminated in an emergency scenario.
	Host process	Process killing	Kill processes repeatedly during the fault duration. The drill can be terminated in an emergency scenario. After the emergency termination or drill is complete, the drill system does not start the processes. The service needs to ensure that the processes are restored.
	Host network	Network latency	Simulate network faults to increase link latency. The drill can be terminated in an emergency scenario.
		Network packet loss	Simulate network faults to cause packet loss on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet loss rate is 100%.
		Network error packets	Simulate network faults to cause error packets on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet error rate reaches 100%.
		Duplicate packets	Simulate duplicate packets generated on a link due to a network fault. The drill can be terminated in an emergency scenario.
		Network packet disorder	Simulate packet disorder generated on a link due to a network fault. The drill can be terminated in an emergency scenario.
		Network disconnection	Simulate the network disconnection between nodes. The drill can be terminated in an emergency scenario. Do not enter the IP addresses of the drill system and UniAgent server. Otherwise, the drill may fail. To interrupt an established persistent connection, select All for the interruption direction.
		NIC break-down	Simulate the NIC break-down scenario. The NIC may fail to be started after the NIC breaks down due to different network configurations of hosts. Therefore, prepare a contingency plan for network recovery. The drill cannot be terminated in an emergency scenario.
		DNS tampering	Tamper with the domain name address mapping. The drill can be terminated in an emergency scenario.
		Port occupation	Simulate the scenario where network ports of the system are occupied (a maximum of 100 ports can be occupied). The drill can be terminated in an emergency scenario.
		Server disconnection	Simulate the scenario where the entire server is disconnected, reject all TCP, UDP, and ICMP data packets, and open only ports 22, 8002, 39604, 33552, 33554, 33557, 32552, 32554, and 32557. The drill can be terminated in an emergency scenario.
		NIC bandwidth limiting	Limit the NIC bandwidth. Multiple NICs are supported. The drill can be terminated in an emergency scenario.
		Connection exhaustion	Create a large number of socket connections to the specified server end (combination of the IP address and port number) to exhaust the connections. As a result, normal requests of the node cannot connect to the server (the requests of other nodes on the server may also be affected). The drill can be terminated in an emergency scenario.
CCE pods	Pod resources	Pod CPU usage increase	Simulate a pod CPU usage surge. Ensure the attack target is writable. If it is not, the drill will fail. If the drill fails, you can use the emergency termination function.
		Pod memory usage increase	Simulate a pod memory usage surge. Ensure the attack target is writable. If it is not, the drill will fail. In this case, you can use the emergency termination function.
		Pod disk I/O pressure	Continuously simulate I/O reads and writes. The drill can be terminated in an emergency scenario.
		Pod disk usage increase	Write large files to a specified directory to simulate increased pressure on the Kubernetes container file system. The drill can be terminated in an emergency scenario.
	Pod process	Forcible pod stopping	Forcibly stop a pod. The drill cannot be terminated in an emergency scenario.
	Pod process	Forcibly killing containers in a pod	Forcibly kill containers in a pod. The drill cannot be terminated in an emergency scenario.
	Pod network	Pod network latency	Simulate a network fault that increases network latency in a pod. Drills can be terminated in an emergency scenario. Drills cannot be terminated when the latency reaches 30,000 ms.
		Pod network packet loss	Simulate a network fault that causes packet loss in a pod. Drills can be terminated in an emergency scenario.
		Pod network interruption	Simulate a network disconnection between a pod and other IP addresses. The drill can be terminated in an emergency scenario. To interrupt an established persistent connection, select All for the interruption direction.
		Pod network packet disorder	Simulate packet disorder generated on a link due to a pod network fault. Drills can be terminated in an emergency scenario.
		Duplicate pod network packets	Simulate duplicate packets generated on a link due to a pod network fault. Drills can be terminated in an emergency scenario.
		Pod DNS tampering	Tamper with the domain name address mapping in the pod. Ensure that the attack target runs as the root user. Otherwise, the drill will fail due to insufficient permissions. The drill can be terminated in an emergency scenario.
		Pod port masking	Simulate disabling of a pod port. The drill can be terminated in an emergency scenario.
		Pod network isolation	Simulate the scenario where access from a pod to another IP network is directly rejected. The drill can be terminated in an emergency scenario. If you need to reject established persistent connections, select All for Direction.
RDS instances	Instances	RDS active/standby switchover	Only MySQL and PostgreSQL engines in HA mode are supported. This operation is not allowed while instances are being created or restarted, databases are being upgraded, ports are being recovered or modified, or accounts are being created or deleted. An active/standby switchover does not change the internal network IP address of an instance. The drill cannot be terminated in an emergency scenario.
RDS instances	Instances	Stopping an RDS instance	Stop both the primary and read-only instances. After the fault duration ends, start the instance. The drill can be terminated in an emergency.
DCS instances	Instances	DCS active/standby switchover	Switch the primary and standby DB instance nodes. This operation is supported only for primary/standby DB instances. The drill cannot be terminated in an emergency scenario.
		DCS instance restart	Restart a running DCS instance. If you clear data of a Redis 4.0, 5.0, or 6.0 instance, the cleared data cannot be restored. Exercise caution when performing this operation. The drill cannot be terminated in an emergency scenario.
		Powering off a DCS AZ	All nodes in the AZ are powered off centrally. The drill cannot be terminated in an emergency scenario. This disruptor is not supported in some areas.
CSS instances	Instances	Restarting a CSS cluster	Restart the CSS cluster that is in the available status. During the restart, Kibana and Cerebro may fail to be accessed. The drill cannot be terminated in an emergency scenario.
DDS instances	Instances	Forcibly promoting a standby node to primary	Forcibly promote a standby node to primary. This is supported for backup sets, shards, and config nodes. The promotion may fail when the primary/standby latency is high. The drill cannot be terminated in an emergency scenario.
IDC offline resource VMs	Disruptors for practicing	Qualifying practice	You can familiarize yourself with the chaos engineering process without worrying about real faults.
	Host resources	CPU usage increase	Simulate a CPU usage surge. The drill can be terminated in an emergency scenario.
		Memory usage increase	Simulate a memory usage surge. The drill can be terminated in an emergency scenario.
		Disk usage increase	Simulate the disk usage surge. The drill can be terminated in an emergency scenario.
		Disk I/O pressure increase	Continuously read and write files to increase disk I/O pressure. The drill can be terminated in an emergency scenario.
	Host process	Process ID exhaustion	The system process IDs (PIDs) are exhausted. The drill cannot be terminated in an emergency scenario.
	Host process	Process killing	Kill processes repeatedly during the fault duration. The drill can be terminated in an emergency scenario. After the emergency termination or drill is complete, the drill system does not start the processes. The service needs to ensure that the processes are restored.
	Host network	Network latency	Simulate network faults to increase link latency. The drill can be terminated in an emergency scenario.
		Network packet loss	Simulate network faults to cause packet loss on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet loss rate is 100%.
		Network error packets	Simulate network faults to cause error packets on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet error rate reaches 100%.
		Duplicate packets	Simulate duplicate packets generated on a link due to a network fault. The drill can be terminated in an emergency scenario.
		Network packet disorder	Simulate packet disorder generated on a link due to a network fault. The drill can be terminated in an emergency scenario.
		Network disconnection	Simulate the network disconnection between nodes. The drill can be terminated in an emergency scenario. Do not enter the IP addresses of the drill system and UniAgent server. Otherwise, the drill may fail. To interrupt an established persistent connection, select All for the interruption direction.
		NIC break-down	Simulate the NIC break-down scenario. The NIC may fail to be started after the NIC breaks down due to different network configurations of hosts. Therefore, prepare a contingency plan for network recovery. The drill cannot be terminated in an emergency scenario.
		DNS tampering	Tamper with the domain name address mapping. The drill can be terminated in an emergency scenario.
		Port occupation	Simulate the scenario where network ports of the system are occupied (a maximum of 100 ports can be occupied). The drill can be terminated in an emergency scenario.
		Server disconnection	Simulate the scenario where the entire server is disconnected, reject all TCP, UDP, and ICMP data packets, and open only ports 22, 8002, 39604, 33552, 33554, 33557, 32552, 32554, and 32557. The drill can be terminated in an emergency scenario.
		NIC bandwidth limiting	Limit the NIC bandwidth. Multiple NICs are supported. The drill can be terminated in an emergency scenario.
		Connection exhaustion	Create a large number of socket connections to the specified server end (combination of the IP address and port number) to exhaust the connections. As a result, normal requests of the node cannot connect to the server (the requests of other nodes on the server may also be affected). The drill can be terminated in an emergency scenario.
Alibaba Cloud server	Disruptors for practicing	Qualifying practice	You can familiarize yourself with the chaos engineering process without worrying about real faults.
	Host resources	CPU usage increase	Simulate a CPU usage surge. The drill can be terminated in an emergency scenario.
		Memory usage increase	Simulate a memory usage surge. The drill can be terminated in an emergency scenario.
		Disk usage increase	Simulate the disk usage surge. The drill can be terminated in an emergency scenario.
		Disk I/O pressure increase	Continuously read and write files to increase disk I/O pressure. The drill can be terminated in an emergency scenario.
	Host process	Process ID exhaustion	The system process IDs (PIDs) are exhausted. The drill cannot be terminated in an emergency scenario.
	Host process	Process killing	Kill processes repeatedly during the fault duration. The drill can be terminated in an emergency scenario. After the emergency termination or drill is complete, the drill system does not start the processes. The service needs to ensure that the processes are restored.
	Host network	Network latency	Simulate network faults to increase link latency. The drill can be terminated in an emergency scenario.
		Network packet loss	Simulate network faults to cause packet loss on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet loss rate is 100%.
		Network error packets	Simulate network faults to cause error packets on links. The drill can be terminated in an emergency scenario. The drill cannot be terminated when the packet error rate reaches 100%.
		Duplicate packets	Simulate duplicate packets generated on a link due to a network fault. The drill can be terminated in an emergency scenario.
		Network packet disorder	Simulate packet disorder generated on a link due to a network fault. The drill can be terminated in an emergency scenario.
		Network disconnection	Simulate the network disconnection between nodes. The drill can be terminated in an emergency scenario. Do not enter the IP addresses of the drill system and UniAgent server. Otherwise, the drill may fail. To interrupt an established persistent connection, select All for the interruption direction.
		NIC break-down	Simulate the NIC break-down scenario. The NIC may fail to be started after the NIC breaks down due to different network configurations of hosts. Therefore, prepare a contingency plan for network recovery. The drill cannot be terminated in an emergency scenario.
		DNS tampering	Tamper with the domain name address mapping. The drill can be terminated in an emergency scenario.
		Port occupation	Simulate the scenario where network ports of the system are occupied (a maximum of 100 ports can be occupied). The drill can be terminated in an emergency scenario.
		Server disconnection	Simulate the scenario where the entire server is disconnected, reject all TCP, UDP, and ICMP data packets, and open only ports 22, 8002, 39604, 33552, 33554, 33557, 32552, 32554, and 32557. The drill can be terminated in an emergency scenario.
		NIC bandwidth limiting	Limit the NIC bandwidth. Multiple NICs are supported. The drill can be terminated in an emergency scenario.
		Connection exhaustion	Create a large number of socket connections to the specified server end (combination of the IP address and port number) to exhaust the connections. As a result, normal requests of the node cannot connect to the server (the requests of other nodes on the server may also be affected). The drill can be terminated in an emergency scenario.

Customizing a Fault

You can create custom scripts from automated O&M scripts and run them to simulate faults. The drill can be terminated in an emergency scenario.

The behavior of a custom fault is determined entirely by the script you write. When a script attacks ECSs, it may cause exceptions such as high resource usage or network faults, which can change the status of the UniAgent installed on the ECSs to offline or abnormal. Exercise caution when performing this operation.
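One way to reduce the risk of a runaway custom fault is to make the fault stop itself after a fixed time. The following is a minimal sketch of that idea, not part of the drill system: MAX_SECONDS and PID_FILE are hypothetical names, and the sketch assumes the GNU coreutils timeout command is available on the target host.

```shell
#!/bin/bash
# Hypothetical sketch: a CPU-load fault that stops itself after MAX_SECONDS,
# so the host is not left saturated if rollback is never called.
MAX_SECONDS="${MAX_SECONDS:-30}"                       # assumed upper bound
PID_FILE="${PID_FILE:-/tmp/custom_fault_load.pid}"     # assumed location

inject_fault() {
    # timeout (GNU coreutils) ends the busy loop when the limit is reached.
    timeout "${MAX_SECONDS}" bash -c 'while :; do :; done' &
    # Remember the background PID so that rollback can stop the load early.
    echo $! > "${PID_FILE}"
}

rollback() {
    if [ -f "${PID_FILE}" ]; then
        # Stop the load if it is still running; ignore an already-ended process.
        kill "$(cat "${PID_FILE}")" 2>/dev/null || true
        rm -f "${PID_FILE}"
    fi
    return 0
}
```

With this pattern the load ends either when rollback runs or when the time limit expires, whichever comes first.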

Custom scripts must follow the specifications shown in the following template:

#!/bin/bash
set +x

function usage() {
    echo "Usage: {inject_fault|check_fault_status|rollback|clean}"
    exit 2
}

function inject_fault()
{
    echo "inject fault"
}

function check_fault_status()
{
    echo "check fault status"
}

function rollback()
{
    echo "rollback"
}

function clean()
{
    echo "clean"
}

case "$ACTION" in
    inject_fault)
        inject_fault
    ;;
    check_fault_status)
        check_fault_status
    ;;
    rollback)
        if [[ X"${CAN_ROLLBACK}" == X"true" ]]; then
            rollback
        else
            echo "not support to rollback"
        fi
    ;;
    clean)
        clean
    ;;
    *)
        usage
    ;;
esac

You are advised to write custom fault scripts based on the preceding specifications. In the template, you define the fault injection, fault check, fault rollback, and environment clearing logic by writing your own content in the inject_fault(), check_fault_status(), rollback(), and clean() functions.

The preceding specifications require two mandatory script parameters, ACTION and CAN_ROLLBACK. Whether other script parameters are needed depends on your script content.

**Table 2** Mandatory parameters for customizing a fault script
Parameter	Value	Description
ACTION	inject_fault	Drill action. The system automatically changes the value in each drill phase. The options are as follows: inject_fault: The drill is in the fault injection phase. check_fault_status: The drill is in the fault query phase. rollback: The drill is in the phase of canceling the fault injection. clean: The drill is in the environment clearing phase.
CAN_ROLLBACK	false	Whether rollback is supported. The options are as follows: true: When the drill is in the phase of canceling the fault injection, the rollback() function is executed. false: When the drill is in the phase of canceling the fault injection, the rollback() function is not executed.
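Because both parameters are mandatory, a script may want to fail fast when either one is missing or has an unexpected value. The following is a minimal sketch; validate_params is a hypothetical helper name, and the sketch assumes the parameters are visible to the script as shell variables, as the template's use of $ACTION suggests.

```shell
#!/bin/bash
# Hypothetical helper: check the two mandatory parameters before dispatching.
# Returns 0 when both are valid and 2 otherwise.
validate_params() {
    case "${ACTION:-}" in
        inject_fault|check_fault_status|rollback|clean) ;;
        *)
            echo "invalid or missing ACTION: '${ACTION:-}'" >&2
            return 2
        ;;
    esac
    case "${CAN_ROLLBACK:-}" in
        true|false) ;;
        *)
            echo "CAN_ROLLBACK must be true or false" >&2
            return 2
        ;;
    esac
    return 0
}
```

A script could call `validate_params || exit 2` just before its `case "$ACTION" in` block.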

In the inject_fault function, add a label indicating that the fault injection is successful, and check whether the label exists in the check_fault_status function.

If the label exists, the check_fault_status function returns normally (for example, exit 0).
If the label does not exist, the check_fault_status function returns an error (for example, exit 1).
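The label mechanism described above can be sketched with a marker file. This is an illustration only: MARKER_FILE and its default path are hypothetical, and the places where the real fault is applied or removed are left as comments.

```shell
#!/bin/bash
# MARKER_FILE is a hypothetical name; any path the script can write to works.
MARKER_FILE="${MARKER_FILE:-/tmp/custom_fault.injected}"

inject_fault() {
    # ... apply the real fault here ...
    # Write the label only after the fault has been applied successfully.
    echo "injected" > "${MARKER_FILE}"
}

check_fault_status() {
    if [ -f "${MARKER_FILE}" ]; then
        echo "fault is active"
        exit 0   # label found: report a normal status
    fi
    echo "fault label not found" >&2
    exit 1       # label missing: report an abnormal status
}

rollback() {
    # ... undo the real fault here ...
    # Remove the label so that later checks no longer see the fault.
    rm -f "${MARKER_FILE}"
}
```

Keeping the label in a single file makes the check cheap and lets rollback and clean remove it with one command.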

Custom Script Example

The following is an example of a custom script. The script content is as follows:

#!/bin/bash
set +x
PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:~/bin
export PATH


function usage() {
    echo "Usage: {inject_fault|check_fault_status|rollback|clean}"
    exit 2
}

function inject_fault()
{
    echo "============start inject fault============"
    if [ ! -d "${SCRIPT_PATH}/${DIR_NAME}" ]; then
        mkdir -p "${SCRIPT_PATH}/${DIR_NAME}"
        echo "mkdir ${SCRIPT_PATH}/${DIR_NAME} successfully"
    fi

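    cd "${SCRIPT_PATH}/${DIR_NAME}" || exit 1

    if [ ! -f "${FILE}" ]; then
        touch "${FILE}"
        echo "create tmp file ${FILE}"
        touch inject.log
        chmod u+x "${FILE}"
        chmod u+x inject.log
    else
        echo "append content" > "${FILE}"
    fi
    # Overwrite the file so that its first line is the success label
    # that check_fault_status looks for.
    echo "successfully inject" > "${FILE}"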
    echo "============end inject fault============"
}

function check_fault_status()
{
    echo "============start check fault status============"
    if [ ! -d "${SCRIPT_PATH}/${DIR_NAME}" ]; then
        echo "inject has been finished"
        exit 0
    fi
    cd "${SCRIPT_PATH}/${DIR_NAME}" || exit 1
    SUCCESS_FLAG="successfully inject"

    if [ -f "${FILE}" ]; then
        if [[ "$(sed -n '1p' "${FILE}")" = "${SUCCESS_FLAG}" ]]; then
            echo "fault inject successfully"
        else
            echo "fault inject failed"
            exit 1
        fi
    else
        echo "inject finished"
        exit 0
    fi
    # Wait for DURATION seconds to simulate the fault duration.
    sleep "${DURATION}"
    echo "============end check fault status============"
}

function rollback()
{
    echo "============start rollback============"
    cd "${SCRIPT_PATH}" || exit 1
    if [ -d "${DIR_NAME}" ]; then
        rm -rf "${SCRIPT_PATH}/${DIR_NAME}"
    fi
    echo "============end rollback============"
}

function clean()
{
    echo "============start clean============"
    cd "${SCRIPT_PATH}" || exit 1
    if [ -d "${DIR_NAME}" ]; then
        rm -rf "${SCRIPT_PATH}/${DIR_NAME}"
    fi
    echo "============end clean============"
}

case "$ACTION" in
    inject_fault)
        inject_fault
    ;;
    check_fault_status)
        check_fault_status
    ;;
    rollback)
        if [[ X"${CAN_ROLLBACK}" == X"true" ]]; then
            rollback
        else
            echo "not support to rollback"
        fi
    ;;
    clean)
        clean
    ;;
    *)
        usage
    ;;
esac

The input parameters of the script are as follows:

**Table 3** Script input parameters of the customized script example
Parameter	Value	Description
ACTION	inject_fault	Drill operation action
CAN_ROLLBACK	false	Rollback is not supported.
SCRIPT_PATH	/tmp	Root directory of the custom fault log
DIR_NAME	test_script	Parent directory of the custom fault log
FILE	test.log	Custom fault log name
DURATION	10	Duration of a simulated custom fault, in seconds. (This parameter does not take effect when it is placed in the inject_fault function.)

In the sample inject_fault function, the injected fault creates the {FILE} file and writes content to it. If the string "successfully inject" is written to the {FILE} file, the fault injection is successful.
In the example, the check_fault_status function first checks whether the {FILE} file exists. If it does not, the fault may already have been rectified, and exit 0 is returned. If it does, the function checks whether the label indicating that the fault injection is successful exists. If the label exists, the fault injection is successful, and sleep {DURATION} is used to simulate the fault duration. If the label does not exist, the fault injection fails and exit 1 is returned.
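Before uploading a custom script, it can be useful to walk it through the drill phases locally. The following is a minimal sketch of such a dry run. It is not how the drill system invokes scripts: run_drill_phases is a hypothetical helper, the phase order is assumed from the order of the ACTION options listed earlier, and the sketch assumes the script reads its parameters from environment variables. The defaults mirror the input parameters of the example.

```shell
#!/bin/bash
# Hypothetical local harness: run a custom fault script once per drill phase.
# Parameters are passed as environment variables (an assumption of this sketch).
run_drill_phases() {
    local script="$1"
    local phase
    for phase in inject_fault check_fault_status rollback clean; do
        echo "--- phase: ${phase}"
        ACTION="${phase}" \
        CAN_ROLLBACK="${CAN_ROLLBACK:-false}" \
        SCRIPT_PATH="${SCRIPT_PATH:-/tmp}" \
        DIR_NAME="${DIR_NAME:-test_script}" \
        FILE="${FILE:-test.log}" \
        DURATION="${DURATION:-10}" \
        bash "${script}" || { echo "phase ${phase} failed" >&2; return 1; }
    done
}

# Usage (the path is a placeholder): run_drill_phases ./custom_fault.sh
```

The harness stops at the first phase that exits with a non-zero status, which makes a failing check_fault_status easy to spot.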