Help Center> Cloud Eye> User Guide> Server Monitoring> Installing the Plug-in for Collecting GPU and RAID Metrics (Linux)

Installing the Plug-in for Collecting GPU and RAID Metrics (Linux)

Scenarios

This topic describes how to install the plug-in to collect ECS GPU and BMS RAID metrics.

  • ECSs support GPU metrics while BMSs do not.
  • BMSs support RAID metrics while ECSs do not.
  • If the Agent is upgraded to 1.0.5 or later, the corresponding plug-in must use the latest version. Otherwise, the metric collection will fail.

Prerequisites

  • The Agent has been installed and is running properly.
  • GPU metric collection requires ECSs to support GPU.
  • Run the following command to check the Agent version:

    if [[ -f /usr/local/uniagent/extension/install/telescope/bin/telescope ]]; then /usr/local/uniagent/extension/install/telescope/bin/telescope -v; else echo 0; fi

    • If 0 is returned, the old Agent is used.
    • If other values are returned, the new Agent is used.

Procedure

  1. Log in to an ECS as user root.
    • To monitor the BMS software RAID metrics, log in to a BMS.
    • The examples in the following procedure are based on the GPU plug-in installation. The installation for the software RAID plug-in is similar.
  2. Run the following command to go to the Agent installation path /usr/local/telescope:

    Old Agent:

    cd /usr/local/telescope

    New Agent:

    cd /usr/local/uniagent/extension/install/telescope

  3. Run the following command to create the plugins folder:

    mkdir plugins

  4. Run the following command to enter the plugins folder:

    cd plugins

  5. To download the script of the GPU metric collection plug-in, run the following command:

    wget https://telescope.obs.cn-north-1.myhuaweicloud.com/gpu_collector

    Table 1 Obtaining the plug-in installation package

    Name

    Download Path

    Linux 64-bit installation package of the GPU metric collection plug-in

    CN North-Beijing1: https://obs.cn-north-1.myhuaweicloud.com/uniagent-cn-north-1/extension/gpu/gpu_collector

    CN North-Beijing4: https://obs.cn-north-4.myhuaweicloud.com/uniagent-cn-north-4/extension/gpu/gpu_collector

    CN South-Guangzhou: https://obs.cn-south-1.myhuaweicloud.com/uniagent-cn-south-1/extension/gpu/gpu_collector

    CN East-Shanghai2: https://obs.cn-east-2.myhuaweicloud.com/uniagent-cn-east-2/extension/gpu/gpu_collector

    AP-Hong-Kong: https://obs.ap-southeast-1.myhuaweicloud.com/uniagent-ap-southeast-1/extension/gpu/gpu_collector

    AP-Bangkok: https://obs.ap-southeast-2.myhuaweicloud.com/uniagent-ap-southeast-2/extension/gpu/gpu_collector

    AP-Bangkok: https://obs.ap-southeast-3.myhuaweicloud.com/uniagent-ap-southeast-3/extension/gpu/gpu_collector

    Linux 64-bit installation package of the RAID metric collection plug-in

    CN North-Beijing1: https://obs.cn-north-1.myhuaweicloud.com/uniagent-cn-north-1/extension/raid/raid_monitor.sh

    CN North-Beijing4: https://obs.cn-north-4.myhuaweicloud.com/uniagent-cn-north-4/extension/raid/raid_monitor.sh

    CN South-Guangzhou: https://obs.cn-south-1.myhuaweicloud.com/uniagent-cn-south-1/extension/raid/raid_monitor.sh

    CN East-Shanghai2: https://obs.cn-east-2.myhuaweicloud.com/uniagent-cn-east-2/extension/raid/raid_monitor.sh

    AP-Hong-Kong: https://obs.ap-southeast-1.myhuaweicloud.com/uniagent-ap-southeast-1/extension/raid/raid_monitor.sh

    AP-Bangkok: https://obs.ap-southeast-2.myhuaweicloud.com/uniagent-ap-southeast-2/extension/raid/raid_monitor.sh

  6. Run the following command to add the script execution permissions:

    chmod 755 gpu_collector

  7. Run the following command to create the conf.json file, add the configuration content, and configure the plug-in path and metric collection period crontime, which is measured in seconds:

    vi conf.json

    Configuring the GPU metric plug-in old Telescope

    { 
       "plugins": [ 
         { 
           "path": "/usr/local/telescope/plugins/gpu_collector", 
           "crontime": 60 
         }
      ] 
    }

    Configuring the GPU metric plug-in (new Telescope)

    { 
       "plugins": [ 
         { 
           "path": "/usr/local/uniagent/extension/install/telescope/plugins/gpu_collector", 
           "crontime": 60 
         }
      ] 
    }

    Configuring the RAID metric plug-in (old Telescope)

    { 
       "plugins": [ 
         { 
           "path": "/usr/local/telescope/plugins/raid_monitor.sh", 
           "crontime": 60 
         }
      ] 
    }

    Configuring the RAID metric plug-in (new Telescope)

    { 
       "plugins": [ 
         { 
           "path": "/usr/local/uniagent/extension/install/telescope/plugins/raid_monitor.sh", 
           "crontime": 60 
         }
      ] 
    }
    • The parameters gpu_collector and raid_monitor.sh indicate the GPU plug-in and RAID plug-in configuration.
    • The collection period of the plug-in is 60 seconds. If the collection period is incorrectly configured, the metric collection will be abnormal.
    • Do not change the plug-in path without permission. Otherwise, the GPU metric collection will be abnormal.
  8. Open the conf_ces.json file in the /usr/local/telescope/bin directory. Add "EnablePlugin": true to the file to enable the metric collection function of the plug-in.

    The path of the new Telescope is /usr/local/uniagent/extension/install/telescope/bin.

    {    
        "Endpoint": "Region address. Retain the default value.",
        "EnablePlugin": true
    }
  9. Restart the Agent:

    Old Telescope:

    /usr/local/telescope/telescoped restart

    New Telescope:

    ps -ef | grep telescope | grep -v grep | awk '{print $2}' | xargs kill -9