Updated on 2024-09-06 GMT+08:00

Playbook Overview

Background

A malware attack is a process of spreading malware (such as viruses, worms, Trojans, and ransomware) to users through emails, remote downloads, and malicious advertisements, and executing malicious programs on target hosts. In this way, the attacker can manipulate remote hosts, hack the network system, steal sensitive information, or carry out other malicious activities. Such attacks pose a serious threat to the security of computer systems, networks, and personal devices, and may cause data leakage, system breakdown, personal privacy leakage, financial loss, and other security risks.

To address these threats, the solution uses program feature and behavior detection, AI image fingerprint algorithms, and cloud-based antivirus to identify malicious programs such as backdoors, Trojans, mining software, worms, and viruses, and to detect unknown malicious programs and virus variants on hosts. It can also detect ransomware embedded in media such as web pages, software, emails, and storage media. Preventing such attacks and reducing the risks they pose is critical.

The following describes how this playbook isolates and kills malware and ransomware.

Response Solutions

This built-in playbook automatically isolates and kills malware detected on servers protected by HSS.

The HSS file isolation and killing playbook is associated with the HSS file isolation and killing workflow. When a malware or ransomware alarm is generated, the system checks the HSS edition that protects the attacked asset. If the asset uses the professional edition or later and automatic isolation and killing is not enabled, the conditions for isolation and killing are met. After the operation is manually approved, the alarm is handled through HSS file isolation and killing. If the malicious file is isolated successfully, the alarm is closed. If isolation fails, a comment is added to the alarm, indicating that manual handling is required.
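
The decision flow described above can be sketched in Python. This is a minimal illustration, not the playbook's actual implementation: the `Alarm` fields, the `Outcome` values, the edition ranking in `EDITION_RANK`, and the `isolate` callback (a stand-in for the real HSS isolation call) are all hypothetical names. The handling of a rejected approval is also an assumption, because the text only describes the approved path.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Outcome(Enum):
    SKIPPED = "conditions not met"
    REJECTED = "isolation not approved"  # assumed branch, not described in the text
    CLOSED = "file isolated, alarm closed"
    MANUAL = "isolation failed, comment added, manual handling required"


# Hypothetical ranking; only "professional edition or later" comes from the text.
EDITION_RANK = {"basic": 0, "professional": 1, "enterprise": 2, "premium": 3}


@dataclass
class Alarm:
    alarm_type: str               # "malware" or "ransomware"
    hss_edition: str              # HSS edition protecting the attacked asset
    auto_isolation_enabled: bool  # True if HSS already isolates and kills automatically


def meets_conditions(alarm: Alarm) -> bool:
    """Check the conditions the playbook evaluates before isolation and killing."""
    is_target_type = alarm.alarm_type in {"malware", "ransomware"}
    rank = EDITION_RANK.get(alarm.hss_edition, -1)  # unknown editions never qualify
    edition_ok = rank >= EDITION_RANK["professional"]
    return is_target_type and edition_ok and not alarm.auto_isolation_enabled


def run_playbook(alarm: Alarm, approved: bool,
                 isolate: Callable[[Alarm], bool]) -> Outcome:
    """Walk one alarm through the flow and report how it ends."""
    if not meets_conditions(alarm):
        return Outcome.SKIPPED
    if not approved:
        return Outcome.REJECTED
    if isolate(alarm):
        return Outcome.CLOSED
    return Outcome.MANUAL
```

For example, `run_playbook(Alarm("ransomware", "enterprise", False), True, lambda a: False)` ends in `Outcome.MANUAL`, which corresponds to the case where isolation fails and a comment requests manual handling.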

Incident Response

  1. Obtain, store, and record evidence.

    1. Based on your cloud environment configuration, you can configure HSS to detect security threats, such as malware and ransomware, through antivirus and HIPS detection.
    2. You can also access the ECS using SSH and view the instance status and monitoring information to check whether any exception has occurred, or receive attack or ransomware information through other channels, to discover potential threats.
    3. Once an attack is confirmed as an incident, the affected scope, attacked machines, affected services, and data information need to be evaluated.
    4. The event conversion capability of SecMaster is used to continuously trace corresponding events and record information about all involved events. For details, see Converting Alerts to Incidents.
    5. In addition, trace log information. Review all related logs through the security analysis capability, then record and archive them in the event management module for subsequent tracing.

  2. Contain incidents.

    1. Determine the attack type, affected hosts, and service processes based on alarms and logs.
    2. Use the HSS file isolation and killing playbook to kill the involved processes and isolate the malicious software, reducing the subsequent impact.
    3. Check the infection scope. If there is a risk of further infection, investigate it and handle it in a timely manner.
    4. Other playbooks and workflows, such as host isolation, can also be used for risk control. Security group access control policies can be used to isolate infected machines and prevent risks from spreading further.

  3. Eradicate incidents.

    1. Evaluate whether the affected hosts need to be hardened and restored. If the host has been damaged, you need to harden and restore the host based on the source tracing result. If attacks are caused by security credential leakage, delete any unauthorized IAM users, roles, and policies, and revoke credentials to harden the host.
    2. Check infected machines for unpatched vulnerabilities and outdated software, which may lead to machines being compromised again. You can use the vulnerability management function to check and fix the vulnerabilities of the affected machines. Also check for risky configurations. You can use the baseline check function to check the host configurations and rectify risky configurations in a timely manner.
    3. Evaluate the impact scope. If other hosts have been affected, handle all affected hosts.

  4. Recover from incidents.

    1. Determine the restoration points of all restoration operations performed from the backup.
    2. View the backup policy to determine whether all objects and files can be restored, depending on the lifecycle policy applied to the resource.
    3. Use the forensic method to confirm that the data is secure before the restoration, and then restore the data from the backup or restore the data to an earlier snapshot of the ECS instance.
    4. If you have successfully recovered data using an open-source decryption tool, move the data off the instance and perform the necessary analysis to confirm that the data is secure. Then either restore the instance, or terminate or isolate it, create a new instance, and restore the data to the new instance.
    5. If restoring or decrypting data from a backup is not feasible, evaluate the possibility of restarting in a new environment.

  5. Perform post-incident activities.

    1. Analyze alarm details across the entire alarm handling process, and continuously operate and optimize the model to improve its alarm accuracy. If an alarm is determined to be caused by normal service activity and poses no risk, it can be filtered out directly by the model.
    2. By tracing alarms, you can better understand the entire process of an event, continuously optimize asset protection policies, reduce resource risks, and reduce the attack surface.
    3. Optimize the automatic processing playbook based on the actual service scenario. For example, once analysis has improved the alarm accuracy, you can replace the manual review policy with the automatic processing policy to improve processing efficiency and handle risks faster.
    4. Risk analysis can be performed based on all similar malware and ransomware attack points to control risks before incidents occur.
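
The alarm accuracy review in the post-incident step can be sketched as follows. This is an illustrative calculation only, assuming that each handled alarm is exported as a dictionary with a hypothetical `model` field (the alarm model that raised it) and a hypothetical `verdict` field set by the analyst. The 0.8 threshold is an arbitrary example value, not a recommended setting.

```python
from collections import defaultdict


def precision_by_model(triaged_alarms):
    """Return {model: share of its alarms that analysts confirmed as real threats}."""
    totals = defaultdict(int)
    confirmed = defaultdict(int)
    for alarm in triaged_alarms:
        totals[alarm["model"]] += 1
        if alarm["verdict"] == "true_positive":
            confirmed[alarm["model"]] += 1
    return {model: confirmed[model] / totals[model] for model in totals}


def models_to_tune(triaged_alarms, min_precision=0.8):
    """List models whose accuracy falls below the (example) threshold."""
    scores = precision_by_model(triaged_alarms)
    return sorted(model for model, score in scores.items() if score < min_precision)


# Hypothetical triage results collected during alarm handling.
handled = [
    {"model": "malware", "verdict": "true_positive"},
    {"model": "malware", "verdict": "true_positive"},
    {"model": "ransomware", "verdict": "false_positive"},
    {"model": "ransomware", "verdict": "true_positive"},
]
```

With the sample data, `models_to_tune(handled)` lists only the `ransomware` model, because half of its alarms were false positives. Models that appear in this list are candidates for optimization or for filtering of known no-risk service alarms.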