Updated on 2022-06-01 GMT+08:00

Host Monitoring

Hosts include Elastic Cloud Servers (ECSs). AOM monitors hosts created during Cloud Container Engine (CCE) or ServiceStage cluster creation as well as hosts that you create directly. Ensure that directly created hosts meet the operating system (OS) and version requirements, and install the ICAgent on them as described in Installing the ICAgent. Otherwise, AOM cannot monitor these hosts. Hosts support both IPv4 and IPv6 addresses.

AOM monitors the resource usage and health status of hosts, common system devices such as disks and file systems of hosts, and service processes or instances running on hosts.

Precautions

  • A maximum of five tags can be added to a host, and each tag must be unique.
  • The same tag can be added to different hosts.
  • You cannot select clusters or create aliases for hosts created on the CCE or ServiceStage console.
  • The host status can be Normal, Abnormal, Warning, Silent, or Deleted. A host is displayed as Abnormal when it is faulty because of a network failure, when it is powered off or shut down, or when it generates a threshold alarm. For more information, see What Can I Do If Resources Are Not Running Properly?.
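
The tag rules in the precautions above can be illustrated with a short sketch. This is not AOM code or an AOM API: the `host_tags` mapping, the `add_tag` function, the `TagError` exception, and the host names are all hypothetical, chosen only to show how the limit of five tags per host and the per-host uniqueness rule interact, and that the same tag may still appear on different hosts.

```python
from typing import Dict, List

MAX_TAGS_PER_HOST = 5  # limit stated in the precautions


class TagError(ValueError):
    """Raised when adding a tag would break one of the per-host rules."""


def add_tag(host_tags: Dict[str, List[str]], host: str, tag: str) -> None:
    """Add `tag` to `host` in a hypothetical in-memory mapping.

    `host_tags` maps a host name to its list of tags. It only models the
    documented rules and is not an AOM data structure.
    """
    tags = host_tags.setdefault(host, [])
    if tag in tags:
        # Each tag on a single host must be unique.
        raise TagError(f"{host} already has tag {tag!r}")
    if len(tags) >= MAX_TAGS_PER_HOST:
        # A host can carry at most five tags.
        raise TagError(f"{host} already has {MAX_TAGS_PER_HOST} tags")
    tags.append(tag)


# Example: the same tag is accepted on two different (hypothetical) hosts.
hosts: Dict[str, List[str]] = {}
add_tag(hosts, "ecs-01", "prod")
add_tag(hosts, "ecs-02", "prod")
```

In this sketch, adding "prod" to "ecs-01" a second time, or adding a sixth tag to any host, raises `TagError`.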

Procedure

  1. In the navigation pane, choose Monitoring > Host Monitoring.

    To hide master hosts, click the icon in the upper right corner and select Hide master host.

  2. Perform the following operations as required:

    • Adding an alias

      If a host name is too complex to recognize easily, you can add an alias that makes the host easier to identify.

      Click Add alias in the Operations column of the target host.

    • Adding a tag

      Tags identify hosts and let you manage them. After a tag is added, you can quickly identify and select the host.

      In the host list, choose More > Add tags in the Operation column, enter a tag, click the icon, and then click OK. The Tags column of the host list is hidden by default. To show or hide it, click the icon in the upper right corner and select or deselect Tags.

  3. Set filter criteria to search for the desired host.
  4. Click the host name to enter the Host Details page. In the instance list, monitor the resource usage and health status of the instances running on the host. Click the View Monitor Graphs tab to monitor all the metrics of the host.
  5. Monitor common system devices such as GPUs and NICs of the host.

    • Click the Instance List tab to view the basic information such as the instance status and type. Click an instance to view its metrics on the details page.
    • Click the GPUs tab to view the basic information about the GPUs of the host. Click a GPU to view its metrics on the View Monitor Graphs page.
    • Click the NIC tab to view the basic information about the NICs of the host. Click a NIC to monitor its metrics on the View Monitor Graphs page.
    • Click the Disks tab to view the basic information about the disks of the host. Click a disk to monitor its metrics on the View Monitor Graphs page.
    • Click the File System tab to view the basic information about the file system of the host. Click a disk file partition to monitor its metrics on the View Monitor Graphs page.
    • Click the Alarm Analysis tab to view the alarm details.