Updated on 2023-12-14 GMT+08:00

Resource Monitoring

Log in to FusionInsight Manager, choose Cluster > Name of the desired cluster > Services, and click Resource. The resource monitoring page is displayed.

Some services in the cluster provide service-level resource monitoring metrics. By default, monitoring data for the last 12 hours is displayed. You can customize the time range; the options are 12h, 1d, 1w, and 1m. You can also export the corresponding report, but a report cannot be exported for a monitoring item that has no data. Table 1 lists the services and monitoring items that support resource monitoring.
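When post-processing an exported report, it can help to turn a time-range label into an explicit start and end time. The following is a minimal sketch, not part of FusionInsight Manager: the helper name time_window and the mapping are hypothetical, and treating 1m as 30 days is an assumption, since the page does not state how a month is measured.

```python
from datetime import datetime, timedelta

# Assumed durations for the time-range labels listed on the page.
# "1m" is treated as 30 days here; the product may define a month differently.
RANGE_DURATIONS = {
    "12h": timedelta(hours=12),
    "1d": timedelta(days=1),
    "1w": timedelta(weeks=1),
    "1m": timedelta(days=30),
}


def time_window(label, now=None):
    """Return (start, end) datetimes for a time-range label (hypothetical helper)."""
    if label not in RANGE_DURATIONS:
        raise ValueError(f"unsupported time range: {label!r}")
    end = now or datetime.now()
    return end - RANGE_DURATIONS[label], end


# Example: the default 12-hour window ending at a fixed reference time.
start, end = time_window("12h", now=datetime(2023, 12, 14, 12, 0))
print(start, end)
```

Rows of an exported report could then be kept only if their timestamp falls between start and end.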

Table 1 Service resource monitoring

Service

Metrics

Description

HDFS

Resource Usage (by Tenant)

  • Collects statistics on HDFS resource usage by tenant.
  • You can switch the metric between Capacity and Number of File Objects.

Resource Usage (by User)

  • Collects statistics on HDFS resource usage by user.
  • You can switch the metric between Used Capacity and Number of File Objects.

Resource Usage (by Directory)

  • Collects statistics on HDFS resource usage by directory.
  • You can switch the metric between Used Capacity and Number of File Objects.
  • You can configure space monitoring, or specify an HDFS file system directory to monitor.

Resource Usage (by Replica)

  • Collects statistics on HDFS resource usage by replica count.
  • You can switch the metric between Used Capacity and File Count.

Resource Usage (by File Size)

  • Collects statistics on HDFS resource usage by file size.
  • You can switch the metric between Used Capacity and File Count.

Recycle Bin (by User)

  • Collects statistics on the usage of the HDFS recycle bin by user.
  • You can switch the metric between Recycle Bin Capacity and Number of File Objects.

Operation Count

  • Collects the number of operations in HDFS.

Automatic Balancer

  • Collects statistics on the execution speed of the HDFS automatic balancer and the total capacity migrated in the current balancer run.

NameNode RPC Open Connections (by User)

  • Displays the number of client RPC connections to NameNodes by user.

Slow DataNodes

Displays the DataNodes that transmit or process data slowly in the cluster.

Slow Disks

Displays the disks that process data slowly on DataNodes in the cluster.

HBase

Operation Requests in Tables

Displays the number of PUT, DELETE, GET, SCAN, INCREMENT, and APPEND operation requests in all tables on all RegionServers.

Operation Requests on RegionServers

Displays the number of PUT, DELETE, GET, SCAN, INCREMENT, and APPEND operation requests, and the total number of operation requests, on each RegionServer.

Operation Requests for Service

Displays the number of PUT, DELETE, GET, SCAN, INCREMENT, and APPEND operation requests in all regions on RegionServers.

HFiles on RegionServers

Displays the number of HFiles on all RegionServers.

HetuEngine

Coordinator Resource Usage

Displays the coordinator resource usage in the selected queue.

Coordinator Resource Usage Ratio

Displays the coordinator resource usage ratio in the selected queue.

Worker Resource Usage

Displays the worker resource usage in the selected queue.

Worker Resource Usage Ratio

Displays the worker resource usage ratio in the selected queue.

Number of Coordinators and Workers

Displays the number of coordinators and workers in the selected queue.

Hive

HiveServer2-Background-Pool Threads (by IP)

Displays the number of HiveServer2-Background-Pool threads of top users. These threads are measured and displayed in a measurement period.

HiveServer2-Handler-Pool Threads (by IP)

Displays the number of HiveServer2-Handler-Pool threads of top users. These threads are measured and displayed in a measurement period.

Used MetaStore Number (by IP)

Collects statistics on and displays the MetaStore usage of top users in a period.

Number of Hive jobs

Displays the number of user-related jobs collected by Hive in a period.

Number of Files Accessed in the Split Phase

Displays the number of files accessed in the underlying file storage system (HDFS by default) during the Split phase in a period.

Hive Basic Operation Time

Collects the time taken in a period to create a directory (mkdirTime), create a file (touchTime), write a file (writeFileTime), rename a file (renameTime), move a file (moveTime), delete a file (deleteFileTime), and delete a directory (deleteCatalogTime).

Table Partitions

Displays the number of partitions in each Hive table, in the following format: database # table name, number of table partitions.

HQL Map Count

Collects statistics on HQL statements executed in a period and the number of Map statements invoked during the execution. The displayed information includes users, HQL statements, and the number of Map statements.

HQL Access Statistics

Displays the number of HQL accesses in a period.

Kafka

Kafka Disk Usage Distribution

Displays the disk usage distribution statistics of the Kafka cluster.

Spark/Spark2x

HQL Access Statistics

Collects HQL access statistics in a period, including the username, the HQL statement, and the number of times the HQL statement was executed.

Yarn

Used resources (by task)

  • Displays the number of CPU cores and memory used by a task.
  • You can switch the view between By memory and By CPU.

Resource usage (by tenant)

  • Displays the number of CPU cores and memory used by a tenant.
  • You can switch the view between By memory and By CPU.

Resource usage ratio (by tenant)

  • Displays the usage ratios of CPU cores and memory for a tenant.
  • You can switch the view between By memory and By CPU.

Task Duration Ranking

Displays Yarn tasks sorted by time consumption.

ResourceManager RPC Open Connections (by User)

Displays the number of client RPC connections to ResourceManager by user.

Operation Count

Collects statistics on the number and proportion of operations corresponding to each Yarn operation type.

Ranking of Tasks in a Queue by Resource Usage

  • Displays the resources consumed by the tasks running in a queue after the queue (tenant) is selected on the GUI.
  • You can switch the view between By memory and By CPU.

Ranking of Users in a Queue by Resource Usage

  • Displays the resources consumed by the users who are running tasks in the queue after a queue (tenant) is selected on the GUI.
  • You can switch the view between By memory and By CPU.

ZooKeeper

Used Resources (By Second-Level Znode)

  • Displays the ZooKeeper level-2 znode resource status.
  • You can switch the view between By Znode quantity and By capacity.

Number of Connections (by Client IP Address)

Displays the ZooKeeper client connection resource status.
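The Table Partitions metric for Hive in Table 1 is shown in the format "database # table name, number of table partitions". As an illustration only, the sketch below splits entries of that form into their parts and ranks tables by partition count. The function name parse_partition_entry and the sample entries are hypothetical, and the exact spacing around the separators is an assumption that may differ in real output.

```python
def parse_partition_entry(entry):
    """Split 'database # table name, number of table partitions' into its parts."""
    # The partition count follows the last comma in the entry.
    name_part, _, count_part = entry.rpartition(",")
    # The database and table name are separated by '#'.
    database, sep, table = name_part.partition("#")
    if not sep or not count_part.strip().isdigit():
        raise ValueError(f"unexpected entry format: {entry!r}")
    return database.strip(), table.strip(), int(count_part.strip())


# Hypothetical sample entries, not taken from a real cluster.
entries = ["default # sales, 12", "logs # events_2023, 365"]
parsed = [parse_partition_entry(e) for e in entries]

# List tables by partition count, largest first.
for database, table, count in sorted(parsed, key=lambda p: p[2], reverse=True):
    print(f"{database}.{table}: {count}")
```

Ranking tables this way makes it easier to spot tables whose partition count is growing unusually large.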