
How Do I Configure atop and kdump on Linux ECSs for Performance Analysis?

Updated on 2024-08-15 GMT+08:00

Introduction to atop

atop is a monitor for Linux that reports the activity of all processes and their resource consumption at regular intervals. It shows system-level activity related to the CPU, memory, disks, and network layers, as well as the resource usage of every process. It also logs system and process activity daily and saves the logs to disk for long-term analysis.

Preparing for atop Installation

  • Ensure that the target ECS already has an EIP bound.
  • Ensure that the target ECS can access YUM.

Configuring atop for CentOS 7/8, AlmaLinux, and Rocky Linux

  1. Run the following command to install atop:

    yum install -y atop

  2. Run the following command to modify the configuration file of atop:

    vi /etc/sysconfig/atop

    Modify the following parameters, save the modification, and exit:

    • Change the value of LOGINTERVAL to, for example, 15. The default value of LOGINTERVAL is 600, in seconds.
    • Change the value of LOGGENERATIONS to, for example, 3. The default retention period of atop logs is 28 days.
      LOGINTERVAL=15
      LOGGENERATIONS=3 
  3. Run the following command to start atop:

    systemctl start atop

  4. Run the following command to check the status of atop. If active (running) is displayed in the output, atop is running properly.

    systemctl status atop

    atop.service - Atop advanced performance monitor 
     Loaded: loaded (/usr/lib/systemd/system/atop.service; enabled; vendor preset: disabled) 
     Active: active (running) since Sat 2024-03-6 11:49:47 CST; 2h 27min ago 
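
The two parameter changes in step 2 can also be made without an interactive editor. The following is a minimal sketch, assuming the configuration file uses plain KEY=VALUE lines and that GNU sed is available. The function name set_atop_option is hypothetical and is not part of atop.

```shell
# Set KEY=VALUE in a shell-style config file: replace an existing
# assignment, or append one if the key is absent.
# set_atop_option is a hypothetical helper name, not an atop command.
set_atop_option() {
    conf_file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$conf_file"; then
        # Key already present: rewrite the whole line in place (GNU sed).
        sed -i "s|^${key}=.*|${key}=${value}|" "$conf_file"
    else
        # Key missing: append a new assignment.
        printf '%s=%s\n' "$key" "$value" >> "$conf_file"
    fi
}

# Example (run as root; not executed here):
#   set_atop_option /etc/sysconfig/atop LOGINTERVAL 15
#   set_atop_option /etc/sysconfig/atop LOGGENERATIONS 3
#   systemctl restart atop
```

The sketch does not handle commented-out or exported assignments, so check the file afterwards if it deviates from the plain KEY=VALUE layout. On distributions that keep these parameters in /etc/default/atop, pass that path instead.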

Configuring atop for CentOS 6

  1. Run the following command to install atop:

    yum install -y atop

  2. Run the following command to modify the configuration file of atop:

    vi /etc/sysconfig/atop

    Modify the following parameters, save the modification, and exit:

    The default value of LOGINTERVAL is 600 (seconds), but you can change it to, for example, 15.

    LOGINTERVAL=15

    Run the following command to modify the log rotation configuration of atop:

    vi /etc/logrotate.d/atop

    Modify the following parameters, save the modification, and exit:

    You can change the value of -mtime to, for example, 3. The default retention period of atop logs is 40 days.

        postrotate
          /usr/bin/find /var/log/atop/ -maxdepth 1 -mount -name atop_\[0-9\]\[0-9\]\[0-9\]\[0-9\]\[0-9\]\[0-9\]\[0-9\]\[0-9\]\* -mtime +3 -exec /bin/rm {} \;
        endscript
  3. Run the following command to start atop:

    service atop start

  4. Run the following command to check the status of atop. If is running is displayed in the output, atop is running properly.

    service atop status

    atop (pid 3170) is running
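
Before lowering the -mtime value, it can help to see which log files a given retention period would delete. The following is a minimal sketch that reuses the find test from the postrotate script but only prints the matches. The function name list_expired_atop_logs is hypothetical.

```shell
# Print the atop logs in a directory that are older than the given
# number of days, without deleting anything.
# list_expired_atop_logs is a hypothetical helper name.
list_expired_atop_logs() {
    log_dir=$1
    retention_days=$2
    # -mtime +N matches files last modified more than N full days ago.
    find "$log_dir" -maxdepth 1 -name 'atop_[0-9]*' -mtime +"$retention_days" -print
}

# Example (not executed here):
#   list_expired_atop_logs /var/log/atop 3
```

Because -mtime counts whole 24-hour periods, a value of +3 matches files that are at least four days old.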

Configuring atop for Ubuntu 20/22 and Debian 10/11

  1. Run the following command to install atop:

    apt-get install -y atop

  2. Run the following command to modify the configuration file of atop:

    vi /etc/default/atop

    Modify the following parameters, save the modification, and exit:

    • Change the value of LOGINTERVAL to, for example, 15. The default value of LOGINTERVAL is 600, in seconds.
    • Change the value of LOGGENERATIONS to, for example, 3. The default retention period of atop logs is 28 days.
      LOGINTERVAL=15
      LOGGENERATIONS=3 
  3. Run the following command to start atop:

    systemctl start atop

  4. Run the following command to check the status of atop. If active (running) is displayed in the output, atop is running properly.

    systemctl status atop

    atop.service - Atop advanced performance monitor 
     Loaded: loaded (/etc/init.d/atop; bad; vendor preset: disabled) 
     Active: active (running) since Sat 2024-03-11 14:09:47 CST; 16s ago

Configuring atop for Ubuntu 18 and Debian 8/9

  1. Run the following command to install atop:

    apt-get install -y atop

  2. Run the following command to modify the configuration file of atop:

    vi /usr/share/atop/atop.daily

    Modify the following parameters, save the modification, and exit:

    • The default value of LOGINTERVAL is 600 (seconds), but you can change it to, for example, 15.
    • You can change the value of -mtime to, for example, 3. The default retention period of atop logs is 28 days.
      LOGINTERVAL=15
      ……
      ( (sleep 3; find $LOGPATH -name 'atop_*' -mtime +3 -exec rm {} \;)& ) 
  3. Run the following command to start atop:

    systemctl start atop

  4. Run the following command to check the status of atop. If active (running) is displayed in the output, atop is running properly.

    systemctl status atop

    atop.service - Atop advanced performance monitor 
     Loaded: loaded (/etc/init.d/atop; bad; vendor preset: disabled) 
     Active: active (running) since Sat 2024-03-6 14:09:47 CST; 15s ago 

Configuring atop for Ubuntu 16

  1. Run the following command to install atop:

    apt-get install -y atop

  2. Run the following command to modify the configuration file of atop:

    vi /etc/default/atop

    Modify the following parameters, save the modification, and exit:

    • The default value of LOGINTERVAL is 600 (seconds), but you can change it to, for example, 15.
    • The default retention period of atop logs is 28 days and cannot be modified.
      LOGINTERVAL=15
  3. Run the following command to start atop:

    systemctl start atop

  4. Run the following command to check the status of atop. If active (running) is displayed in the output, atop is running properly.

    systemctl status atop

    atop.service - LSB: Monitor for system resources and process activity
       Loaded: loaded (/etc/init.d/atop; bad; vendor preset: enabled)
       Active: active (running) since Mon 2024-04-29 19:33:22 CST; 38s ago

Configuring atop for SUSE 12 or SUSE 15

  1. Run the following command to download the atop source package:

    wget https://www.atoptool.nl/download/atop-2.6.0-1.src.rpm

  2. Run the following command to install the package:

    rpm -ivh atop-2.6.0-1.src.rpm

  3. Run the following command to install atop dependencies:

    zypper -n install rpm-build ncurses-devel zlib-devel

  4. Run the following command to compile atop:

    cd /usr/src/packages/SPECS

    rpmbuild -bb atop-2.6.0.spec

  5. Run the following command to install atop:

    cd /usr/src/packages/RPMS/x86_64

    rpm -ivh atop-2.6.0-1.x86_64.rpm

  6. Run the following command to modify the configuration file of atop:

    vi /etc/default/atop

    Modify the following parameters, save the modification, and exit:

    • Change the value of LOGINTERVAL to, for example, 15. The default value of LOGINTERVAL is 600, in seconds.
    • Change the value of LOGGENERATIONS to, for example, 3. The default retention period of atop logs is 28 days.
    LOGINTERVAL=15
    LOGGENERATIONS=3 
  7. Run the following command to restart atop:

    systemctl restart atop

  8. Run the following command to check the status of atop. If active (running) is displayed in the output, atop is running properly.

    systemctl status atop

    atop.service - Atop advanced performance monitor 
     Loaded: loaded (/usr/lib/systemd/system/atop.service; enabled; vendor preset: disabled) 
     Active: active (running) since Sat 2021-06-19 16:50:01 CST; 6s ago

Installing atop by Compiling the Source Code (for CentOS Stream 9, openEuler or EulerOS)

  1. Download the atop source package.

    wget https://www.atoptool.nl/download/atop-2.6.0.tar.gz

  2. Decompress the source package.

    tar -zxvf atop-2.6.0.tar.gz

  3. Query the systemctl version.

    systemctl --version

    If the version is 220 or later, go to the next step.

    Otherwise, delete parameter --now from the Makefile of atop.

    vi atop-2.6.0/Makefile

    Delete parameter --now following the systemctl command.

                    then   /bin/systemctl disable  atop     2> /dev/null; \
                            /bin/systemctl disable  atopacct 2> /dev/null; \
                            /bin/systemctl daemon-reload;                   \
                            /bin/systemctl enable   atopacct;          \
                            /bin/systemctl enable   atop;              \
                            /bin/systemctl enable   atop-rotate.timer; \
  4. Install atop dependencies.
    • Installation command for SUSE 12 or SUSE 15

      zypper -n install make gcc zlib-devel ncurses-devel

    • Installation command for EulerOS or Fedora

      yum install make gcc zlib-devel ncurses-devel -y

    • Installation command for Debian 9, Debian 10, or Ubuntu

      apt install make gcc zlib1g-dev libncurses5-dev libncursesw5-dev -y

  5. Run the following commands to compile and install atop.

    cd atop-2.6.0

    make systemdinstall

  6. Modify the configuration file of atop.

    vi /etc/default/atop

    Modify the following parameters, save the modification, and exit:

    • Change the value of LOGINTERVAL to, for example, 15. The default value of LOGINTERVAL is 600, in seconds.
    • Change the value of LOGGENERATIONS to, for example, 3. The default retention period of atop logs is 28 days.
      LOGOPTS=""
      LOGINTERVAL=15
      LOGGENERATIONS=3
      LOGPATH=/var/log/atop 
  7. Restart atop.

    systemctl restart atop

  8. Run the following command to check the status of atop. If active (running) is displayed in the output, atop is running properly.

    systemctl status atop

    atop.service - Atop advanced performance monitor
     Loaded: loaded (/lib/systemd/system/atop.service; enabled)
     Active: active (running) since Sun 2021-07-25 19:29:40 CST; 4s ago
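
The version check before editing the Makefile can be scripted. The following is a minimal sketch, assuming the first line of the systemctl --version output has the form "systemd NNN ...". The function name needs_now_removed is hypothetical, and the sed command in the comment is an assumption about how --now appears in the Makefile, so review the file before and after running it.

```shell
# Succeed (exit status 0) if the systemd version is older than 220,
# that is, if --now must be deleted from the atop Makefile.
# needs_now_removed is a hypothetical helper name.
needs_now_removed() {
    # $1 = first line of the `systemctl --version` output, e.g. "systemd 219"
    version=$(printf '%s\n' "$1" | awk '{ print $2 }')
    [ "$version" -lt 220 ]
}

# Example (not executed here; the sed expression is an assumption):
#   if needs_now_removed "$(systemctl --version | head -n 1)"; then
#       sed -i 's/ --now//' atop-2.6.0/Makefile
#   fi
```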

Analyzing atop Logs

After startup, atop stores collection records in /var/log/atop.

Run the following command to check the log file:

atop -r /var/log/atop/atop_2024XXXX
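
Daily log files are named atop_YYYYMMDD, so the newest one can be located by name. The following is a minimal sketch. The function name latest_atop_log is hypothetical, and the -b and -e options in the example limit the replay to a time window.

```shell
# Print the newest daily atop log in a directory (default: /var/log/atop).
# Names have the form atop_YYYYMMDD, so a lexical sort is also a date sort.
# latest_atop_log is a hypothetical helper name.
latest_atop_log() {
    log_dir=${1:-/var/log/atop}
    ls -1 "$log_dir"/atop_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] 2>/dev/null |
        sort | tail -n 1
}

# Example (not executed here): replay today's samples between 14:00 and 14:30.
#   atop -r "$(latest_atop_log)" -b 14:00 -e 14:30
```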

  • Common atop commands
    After opening the log file, you can use the following commands to sort data.
    • c: used to sort processes by CPU usage in descending order.
    • m: used to sort processes by memory usage in descending order.
    • d: used to sort processes by disk usage in descending order.
    • a: used to sort processes by the overall resource usage in descending order.
    • n: used to sort processes by network usage in descending order.
    • t: used to go to the next monitoring collection point.
    • T: used to go to the previous monitoring collection point.
    • b: used to specify a time point in the format of YYYYMMDDhhmm.
  • System resource monitoring fields

    The following figure shows some monitoring fields and values. The values vary according to the sampling period and atop version. The figure is for reference only.

    Figure 1 System resource monitoring fields
    The major fields are described as follows:
    • ATOP row: Specifies the host name and the sampling date and time.
    • PRC row: Specifies the running status of processes.
      • #sys and user: Specifies how long the CPU is occupied when the system is running in kernel mode and user mode.
      • #proc: Specifies the total number of processes.
      • #zombie: Specifies the number of zombie processes.
      • #exit: Specifies the number of processes that exited during the sampling period.
    • CPU row: Specifies the overall CPU usage (all cores treated as a single CPU). The sum of the values in the CPU row is N x 100%, where N is the number of vCPUs.
      • #sys and user: Specifies the percentage of time the CPU spends in kernel mode and user mode.
      • #irq: Specifies the percentage of time the CPU spends servicing interrupts.
      • #idle: Specifies the percentage of time the CPU is idle.
      • #wait: Specifies the percentage of time the CPU is idle while waiting for I/O.
    • CPL row: Specifies the CPU load.
      • #avg1, avg5 and avg15: Specifies the average number of running processes in the past 1, 5, and 15 minutes, respectively.
      • #csw: Specifies the number of context switches.
      • #intr: Specifies the number of interrupts.
    • MEM row: Specifies the memory usage.
      • #tot: Specifies the physical memory size.
      • #free: Specifies the size of available physical memory.
      • #cache: Specifies the memory size used for page cache.
      • #buff: Specifies the memory size used for file system metadata buffers.
      • #slab: Specifies the memory size occupied by kernel slab allocations.
    • SWP row: Specifies the usage of swap space.
      • #tot: Specifies the total swap space.
      • #free: Specifies the size of available swap space.
    • DSK row: Specifies the disk usage. Each disk device has its own DSK row. For example, if there is an sdb device, an additional DSK row is displayed.
      • #sda: Specifies the disk device identifier.
      • #busy: Specifies the percentage of time the disk is busy.
      • #read and write: Specifies the number of read and write requests.
    • NET row: Specifies the network status, covering the transport layer (TCP and UDP), IP layer, and active network ports.
      • #xxxxxi: Specifies the number of packets received by each layer or active network port.
      • #xxxxxo: Specifies the number of packets sent by each layer or active network port.
  • Stopping atop

    Running atop consumes extra system and disk resources, so do not leave it running for a long time in a production environment. After the fault is rectified, run the following command to stop atop:

    systemctl stop atop

    For CentOS 6, run the following command to stop atop:

    service atop stop

Precautions for Configuring kdump

The method for configuring kdump described in this section applies to KVM ECSs running EulerOS or CentOS 7.x. For details, see Documentation for kdump.

Introduction to kdump

kdump is a feature of the Linux kernel that creates crash dumps when the kernel crashes. When a crash occurs, kdump boots a second Linux kernel (the capture kernel) and uses it to export an image of RAM. This image is known as vmcore and can be used to debug and determine the cause of the crash.

Configuring kdump

  1. Run the following command to check whether kexec-tools is installed:

    rpm -q kexec-tools

    If it is not installed, run the following command to install it:

    yum install -y kexec-tools

  2. Run the following command to enable kdump to run at system startup:

    systemctl enable kdump

  3. Configure the parameters for the crash kernel to reserve the memory for the capture kernel.

    Check whether the parameters are configured.

    grep crashkernel /proc/cmdline

    If the command returns output, the parameter has already been configured.

    If no output is returned, edit the /etc/default/grub file, locate parameter GRUB_CMDLINE_LINUX, and add crashkernel=auto to its value:
    GRUB_TIMEOUT=5
    GRUB_DEFAULT=saved
    GRUB_DISABLE_SUBMENU=true
    GRUB_TERMINAL_OUTPUT="console"
    GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=rhel00/root rd.lvm.lv=rhel00/swap rhgb quiet"
    GRUB_DISABLE_RECOVERY="true"

  4. Run the following command for the configuration to take effect:

    grub2-mkconfig -o /boot/grub2/grub.cfg

  5. Open the /etc/kdump.conf file, locate parameter path, and add /var/crash after it.
    path  /var/crash

    By default, the vmcore file is saved in the /var/crash directory.

    You can save the file to another directory, for example, /home/kdump. Then add /home/kdump after parameter path:
    path  /home/kdump
    NOTE:

    There must be enough space in the specified path for storing the vmcore file. It is recommended that the available space be greater than or equal to the RAM size. You can also store the vmcore file on a shared device such as SAN or NFS.

  6. Set the vmcore dump level.

    Add the following content to file /etc/kdump.conf. If the content already exists, skip this step.

    core_collector makedumpfile -d 31 -c

    where

    -c indicates compressing the vmcore file.

    -d indicates the dump level, which determines the page types left out of the vmcore file. The value is the sum of the values of the page types to exclude. The commonly used value 31 (1 + 2 + 4 + 8 + 16) excludes all of the following page types. You can adjust the value if needed.

    zero pages    = 1
    cache pages   = 2
    cache private = 4
    user pages    = 8
    free pages    = 16
  7. Run the following command to restart the system for the configurations to take effect:

    reboot
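
The dump level set in step 6 is a bit mask built from the page-type values listed there. The following is a minimal sketch of the arithmetic. The function name dump_level and the short page-type keywords are hypothetical labels for illustration and are not makedumpfile options.

```shell
# Combine page-type flags into a value for makedumpfile -d.
# dump_level and its keywords are hypothetical, for illustration only.
dump_level() {
    level=0
    for page_type in "$@"; do
        case $page_type in
            zero)          level=$((level | 1)) ;;
            cache)         level=$((level | 2)) ;;
            cache_private) level=$((level | 4)) ;;
            user)          level=$((level | 8)) ;;
            free)          level=$((level | 16)) ;;
            *) echo "unknown page type: $page_type" >&2; return 1 ;;
        esac
    done
    echo "$level"
}

# Example: exclude every listed page type.
#   dump_level zero cache cache_private user free
```

Excluding fewer page types gives a lower dump level and a larger vmcore file, so check the free space in the dump path before lowering the value.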

Checking Whether kdump Configurations Have Taken Effect

  1. Run the following command and check whether crashkernel=auto is displayed:

    cat /proc/cmdline |grep crashkernel

    BOOT_IMAGE=/boot/vmlinuz-3.10.0-514.44.5.10.h142.x86_64 root=UUID=6407d6ac-c761-43cc-a9dd-1383de3fc995 ro crash_kexec_post_notifiers softlockup_panic=1 panic=3 reserve_kbox_mem=16M nmi_watchdog=1 rd.shell=0 fsck.mode=auto fsck.repair=yes net.ifnames=0 spectre_v2=off nopti noibrs noibpb crashkernel=auto LANG=en_US.UTF-8
  2. Run the following command and check whether the configuration in the output is correct:

    grep core_collector /etc/kdump.conf |grep -v ^"#"

    core_collector makedumpfile -l --message-level 1 -d 31
  3. Run the following command and check whether the path configuration in the output is correct:

    grep path /etc/kdump.conf |grep -v ^"#"

    path /var/crash
  4. Run the following command and check whether the value of Active in the output is active (exited):

    systemctl status kdump

    ● kdump.service - Crash recovery kernel arming
    Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
    Active: active (exited) since Tue 2019-04-09 19:30:24 CST; 8min ago
    Process: 495 ExecStart=/usr/bin/kdumpctl start (code=exited, status=0/SUCCESS)
    Main PID: 495 (code=exited, status=0/SUCCESS)
    CGroup: /system.slice/system-hostos.slice/kdump.service
  5. Run the following test command:

    echo c > /proc/sysrq-trigger

    After the command is executed, kdump will be triggered, the system will be restarted, and the generated vmcore file will be saved to the path specified by path.

  6. Run the following command to check whether the vmcore file has been generated in the specified path, for example, /var/crash/:

    ll /var/crash/
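
The checks in steps 1 to 3 can be wrapped in small helpers for repeated use. The following is a minimal sketch. The function names has_crashkernel and kdump_setting are hypothetical, and the sketch assumes that each directive in /etc/kdump.conf sits on its own line as "name value".

```shell
# Succeed if the given kernel command line contains a crashkernel= parameter.
# has_crashkernel is a hypothetical helper name.
has_crashkernel() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the value of the first uncommented occurrence of a directive
# in a kdump.conf-style file.
# kdump_setting is a hypothetical helper name.
kdump_setting() {
    # $1 = config file, $2 = directive name (for example: path)
    awk -v key="$2" '$1 == key { $1 = ""; sub(/^ +/, ""); print; exit }' "$1"
}

# Example (not executed here):
#   has_crashkernel "$(cat /proc/cmdline)" && echo "crashkernel is set"
#   kdump_setting /etc/kdump.conf path
#   kdump_setting /etc/kdump.conf core_collector
```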
