Installing the GPU Metrics Collection Plug-in (Linux)
Scenarios
This topic describes how to install the plug-in to collect GPU and RAID metrics.
- ECSs support GPU metrics while BMSs do not.
- BMSs support RAID metrics while ECSs do not.
- If the Agent is upgraded to 1.0.5 or later, the corresponding plug-in must use the latest version. Otherwise, the metric collection will fail.
Prerequisites
- The Agent has been installed and is running properly.
- GPU metric collection requires ECSs to support GPU.
- Run the following command to check the Agent version:
if [[ -f /usr/local/uniagent/extension/install/telescope/bin/telescope ]]; then /usr/local/uniagent/extension/install/telescope/bin/telescope -v; elif [[ -f /usr/local/telescope/bin/telescope ]]; then echo "old agent"; else echo 0; fi
- If old agent is displayed, the early version of the Agent is used.
- If a version is returned, the new version of the Agent is used.
- If 0 is returned, the Agent is not installed.
Procedure (New Version)
- Log in to an ECS as user root.
- To monitor the BMS software RAID metrics, log in to a BMS.
- The examples in the following procedure are based on the GPU plug-in installation. The installation for the software RAID plug-in is similar.
- Run the following command to go to the Agent installation path /usr/local/telescope:
cd /usr/local/uniagent/extension/install/telescope
- Run the following command to create the plugins folder:
mkdir plugins
- Run the following command to enter the plugins folder:
cd plugins
- To download the script of the GPU metric collection plug-in, run the following command:
wget https://telescope-eu-west-101.obs.eu-west-101.myhuaweicloud.eu/gpu_collector
Table 1 Obtaining the plug-in installation package Name
Download Path
Linux 64-bit installation package of the GPU metric collection plug-in
eu-west-101: https://telescope-eu-west-101.obs.eu-west-101.myhuaweicloud.eu/gpu_collector
- Run the following command to add the script execution permissions:
- Run the following command to create the conf.json file, add the configuration content, and configure the plug-in path and metric collection period crontime, which is measured in seconds:
vi conf.json
GPU metric plug-in configuration
{ "plugins": [ { "path": "/usr/local/uniagent/extension/install/telescope/plugins/gpu_collector", "crontime": 60 } ] }
RAID metric plug-in configuration
{ "plugins": [ { "path": "/usr/local/uniagent/extension/install/telescope/plugins/raid_monitor.sh", "crontime": 60 } ] }
- The parameters gpu_collector and raid_monitor.sh indicate the GPU plug-in and RAID plug-in configuration.
- The collection period of the plug-in is 60 seconds. If the collection period is incorrectly configured, the metric collection will be abnormal.
- Do not change the plug-in path without permission. Otherwise, the metric collection will be abnormal.
- Open the conf_ces.json file in the /usr/local/uniagent/extension/install/telescope/bin directory. Add "EnablePlugin": true to the file to enable the plug-in to collect metric data.
{ "Endpoint": "Region address. Retain the default value.", "EnablePlugin": true }
- Restart the Agent:
ps -ef | grep telescope | grep -v grep | awk '{print $2}' | xargs kill -9
Procedure (for the Early Version of the Agent)
- Log in to an ECS as user root.
- To monitor the BMS software RAID metrics, log in to a BMS.
- The examples in the following procedure are based on the GPU plug-in installation. The installation for the software RAID plug-in is similar.
- Run the following command to go to the Agent installation path /usr/local/telescope:
cd /usr/local/telescope
- Run the following command to create the plugins folder:
mkdir plugins
- Run the following command to enter the plugins folder:
cd plugins
- To download the script of the GPU metric collection plug-in, run the following command:
wget https://telescope-eu-west-101.obs.eu-west-101.myhuaweicloud.eu/gpu_collector
Table 2 Obtaining the plug-in installation package Name
Download Path
Linux 64-bit installation package of the GPU metric collection plug-in
eu-west-101: https://telescope-eu-west-101.obs.eu-west-101.myhuaweicloud.eu/gpu_collector
- Run the following command to add the script execution permissions:
- Run the following command to create the conf.json file, add the configuration content, and configure the plug-in path and metric collection period crontime, which is measured in seconds:
vi conf.json
GPU metric plug-in configuration
{ "plugins": [ { "path": "/usr/local/telescope/plugins/gpu_collector", "crontime": 60 } ] }
RAID metric plug-in configuration
{ "plugins": [ { "path": "/usr/local/telescope/plugins/raid_monitor.sh", "crontime": 60 } ] }
- The parameters gpu_collector and raid_monitor.sh indicate the GPU plug-in and RAID plug-in configuration.
- The collection period of the plug-in is 60 seconds. If the collection period is incorrectly configured, the metric collection will be abnormal.
- Do not change the plug-in path without permission. Otherwise, the metric collection will be abnormal.
- Open the conf_ces.json file in the /usr/local/telescope/bin directory. Add "EnablePlugin": true to the file to enable the plug-in to collect metric data.
{ "Endpoint": "Region address. Retain the default value.", "EnablePlugin": true }
- Run the following command to restart the Agent:
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.