Updated on 2024-01-11 GMT+08:00

Environment Constraints for GPU Monitoring

  1. Only Linux OSs are supported, and only some Linux public image versions support GPU monitoring. For details, see What OSs Does the Agent Support?
  2. Supported flavors: G6v, G6, P2s, P2v, P2vs, G5, Pi2, Pi1, and P1 series ECSs, and P, Pi, G, and KP series BMSs.
  3. The lspci tool must be installed on the ECS. If it is not installed, GPU metric data cannot be collected and events cannot be reported.

    To install the lspci tool, perform the following steps:

    1. Log in to the ECS.
    2. Update the image source to obtain the installation dependencies.

      wget http://mirrors.myhuaweicloud.com/repo/mirrors_source.sh && bash mirrors_source.sh

      For more information, see How Can I Use an Automated Tool to Configure a HUAWEI CLOUD Image Source (x86_64 and Arm)?

    3. Run the following command to install the lspci tool:
      • CentOS:

        yum install pciutils

      • Ubuntu:

        apt install pciutils

    4. Run the following command to verify the installation. It lists the PCI devices whose vendor ID is 10de (NVIDIA):

      lspci -d 10de:

      Figure 1 Example installation result
  4. GPU metric collection depends on the following driver files. Check that the driver file for your environment exists at the corresponding path.
    1. Linux driver file
      nvmlUbuntuNvidiaLibraryPath = "/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1"
      nvmlCentosNvidiaLibraryPath = "/usr/lib64/libnvidia-ml.so.1"
      nvmlCceNvidiaLibraryPath    = "/opt/cloud/cce/nvidia/lib64/libnvidia-ml.so.1"
    2. Windows driver file
      DefaultNvmlDLLPath = "C:\\Program Files\\NVIDIA Corporation\\NVSMI\\nvml.dll"
      WHQLNvmlDLLPath    = "C:\\Windows\\System32\\nvml.dll"
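
The lspci and Linux driver file checks above can be combined into a short preflight script. The following is a minimal sketch, not part of the Agent: the function names find_first_existing and check_gpu_prereqs are hypothetical, and the library paths are copied from the list above.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument that exists on disk.
# Returns 1 if none of the given paths exists.
find_first_existing() {
  for candidate in "$@"; do
    if [ -e "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Hypothetical helper: report whether lspci and an NVML library are present.
# Returns 0 only if both checks pass.
check_gpu_prereqs() {
  status=0

  # lspci is provided by the pciutils package.
  if command -v lspci >/dev/null 2>&1; then
    echo "lspci: found"
  else
    echo "lspci: missing (install the pciutils package)"
    status=1
  fi

  # Linux driver file paths taken from the list above.
  if nvml_lib=$(find_first_existing \
      "/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1" \
      "/usr/lib64/libnvidia-ml.so.1" \
      "/opt/cloud/cce/nvidia/lib64/libnvidia-ml.so.1"); then
    echo "NVML library: $nvml_lib"
  else
    echo "NVML library: not found in the listed paths"
    status=1
  fi

  return "$status"
}
```

After the functions are defined, running check_gpu_prereqs prints one line per check and returns a non-zero status if either prerequisite is missing.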
