Updated on 2024-09-23 GMT+08:00

Viewing MRS Cluster Component Monitoring Metrics

On the MRS console, you can view the status and metrics of all components, including their role instances. Status information covers the operation, health, configuration, and role instance status. Metric information covers the key monitoring metrics of each component.

Prerequisites

  • IAM users have been synchronized. To do this, click Synchronize next to IAM User Sync on the Dashboard page of the cluster details.
  • You have logged in to MRS Manager. For how to log in, see Accessing MRS Manager.

Viewing the Monitoring Information on the MRS Management Console

  1. Log in to the MRS console.
  2. Choose Active Clusters and click a cluster name to go to the cluster details page.
  3. On the Dashboard tab, click Synchronize next to IAM User Sync to synchronize IAM users.
  4. On the MRS cluster details page, click Components.
  5. View component monitoring information.

    1. Click a service in the list to view its status and metric information.
    2. Select and view component-level monitoring metrics.
      1. In the Charts area, click Customize to customize service monitoring metrics.
      2. In the Period area, select a time period and click View to view the monitoring data within that period.

  6. View role instance monitoring information.

    1. In the component list, click a service name.
    2. Click Instance to view the status of each role instance in the component.

      You can filter all instances of the same role in the upper right corner of the list. You can set search criteria in the role search area by clicking Advanced Search, and click Search to view specified role information. You can click Reset to reset the search criteria. Fuzzy search is supported.

    3. Click the target role instance to view its status and metric information.
    4. Customize and view monitoring graphs.
      1. In the Charts area, click Customize to customize service monitoring metrics.
      2. In the Period area, select a time period and click View to view the monitoring data within that period.

Viewing Component Monitoring Information on Manager

  1. Log in to Manager.
  2. Go to the Service Management page.

    • For MRS 3.x and later, choose Cluster > Services.
    • For MRS 2.x and earlier, click Services.

  3. View component monitoring information.

    1. Click a service in the list to view its status and metrics.
    2. Customize and export monitoring charts.
      1. On the Chart tab, click Customize to customize service monitoring metrics.
      2. Set a time range and click Export to export the monitoring data.

  4. View role instance monitoring information.

    1. Click the service name.
    2. Click Instances to view the role status.
    3. Click the target role instance to view its status and metric information.
    4. Customize and export monitoring charts.
      1. On the Chart tab, click Customize to customize the metric charts.
      2. Select a time range. The monitoring data within the time range is displayed.

Component Resource Monitoring Summary

This function is supported only in MRS 3.x and later versions.

Log in to FusionInsight Manager and choose Cluster > Services > Target service. Click Resource. The resource monitoring page is displayed.

Some services in the cluster provide service-level resource monitoring metrics. By default, the monitoring data of the latest 12 hours is displayed. You can set a different time range and export the corresponding report. If a monitoring item has no data, its report cannot be exported. The following table lists the services and monitoring items that support resource monitoring.

Table 1 Service resource monitoring

Service

Monitoring Metric

Description

HDFS

Resource Usage (by Tenant)

  • Collects statistics on HDFS resource usage by tenant.
  • You can view the metrics Capacity or Number of File Objects.

Resource Usage (by User)

  • Collects statistics on HDFS resource usage by user.
  • You can view the metrics Used Capacity or Number of File Objects.

Resource Usage (by Directory)

  • Collects statistics on HDFS resource usage by directory.
  • You can view the metrics Used Capacity or Number of File Objects.
  • You can configure space monitoring or specify an HDFS file system directory to monitor.

Resource Usage (by Replica)

  • Collects statistics on HDFS resource usage by replica count.
  • You can view the metrics Used Capacity or File Count.

Resource Usage (by File Size)

  • Collects statistics on HDFS resource usage by file size.
  • You can view the metrics Used Capacity or File Count.

Recycle Bin (by User)

  • Collects statistics on the usage of the HDFS recycle bin by user.
  • You can view the metrics Recycle Bin Capacity or Number of File Objects.

Operation Count

  • Collects the number of operations in HDFS.

Automatic Balancer

Collects statistics on the execution speed of the HDFS automatic balancer and the total capacity migrated by the current balancing operation.

NameNode RPC Open Connections (by User)

Displays the number of client RPC connections to NameNodes by user.

Slow DataNodes

Displays the DataNodes that transmit or process data slowly in the cluster.

Slow Disks

Displays the disks that process data slowly on DataNodes in the cluster.

HBase

Operation Requests in Tables

Displays the number of PUT, DELETE, GET, SCAN, INCREMENT, and APPEND operation requests in all tables on all RegionServers.

Operation Requests on RegionServers

Displays the number of PUT, DELETE, GET, SCAN, INCREMENT, and APPEND operation requests and the total number of operation requests on RegionServers.

Operation Requests for Service

Displays the number of PUT, DELETE, GET, SCAN, INCREMENT, and APPEND operation requests in all regions on RegionServers.

HFiles on RegionServers

Displays the number of HFiles on all RegionServers.

HetuEngine

Coordinator Resource Usage

Displays the coordinator resource usage in the selected queue.

Coordinator Resource Usage Ratio

Displays the coordinator resource usage ratio in the selected queue.

Worker Resource Usage

Displays the worker resource usage in the selected queue.

Worker Resource Usage Ratio

Displays the worker resource usage ratio in the selected queue.

Number of Coordinators and Workers

Displays the number of coordinators and workers in the selected queue.

Hive

HiveServer2-Background-Pool Threads (by IP)

Displays the number of HiveServer2-Background-Pool threads of top users in a period.

HiveServer2-Handler-Pool Threads (by IP)

Displays the number of HiveServer2-Handler-Pool threads of top users in a period.

Used MetaStore Number (by IP)

Collects statistics on and displays the MetaStore usage of top users in a period.

Number of Hive jobs

Displays the number of user-related jobs collected by Hive in a period.

Number of Files Accessed in the Split Phase

Displays the number of files accessed by the underlying file storage system (HDFS by default) in the Split phase in a period.

Hive Basic Operation Time

Collects statistics on the time taken in a period to create a directory (mkdirTime), create a file (touchTime), write a file (writeFileTime), rename a file (renameTime), move a file (moveTime), delete a file (deleteFileTime), and delete a directory (deleteCatalogTime).

Table Partitions

Displays the number of partitions in all Hive tables in the following format: database # table name, number of table partitions.

HQL Map Count

Collects statistics on HQL statements executed in a period and the number of Map statements invoked during the execution. The displayed information includes users, HQL statements, and the number of Map statements.

HQL Access Statistics

Displays the number of HQL access times in a period.

Kafka

Kafka Disk Usage Distribution

Displays the disk usage distribution statistics of the Kafka cluster.

Spark/Spark2x

HQL Access Statistics

Collects HQL access statistics in a period, including the username, HQL statement, and HQL statement execution times.

Yarn

Used resources (by Task)

  • Displays the number of CPU cores and memory used by a task.
  • You can view the metrics By memory or By CPU.

Resource Usage (by Tenant)

  • Displays the number of CPU cores and memory used by a tenant.
  • You can view the metrics By memory or By CPU.

Resource usage ratio (by Tenant)

  • Displays the usage ratio of the CPU cores and memory used by a tenant.
  • You can view the metrics By memory or By CPU.

Task Duration Ranking

Displays Yarn tasks sorted by time consumption.

ResourceManager RPC Open Connections (by User)

Displays the number of client RPC connections to ResourceManager by user.

Yarn Operation Count

Collects statistics on the number and proportion of operations corresponding to each Yarn operation type.

Ranking of Tasks in a Queue by Resource Usage

  • Displays the resources consumed by the tasks running in a queue after the queue (tenant) is selected on the GUI.
  • You can view the metrics By memory or By CPU.

Ranking of Users in a Queue by Resource Usage

  • Displays the resources consumed by the users who are running tasks in the queue after a queue (tenant) is selected on the GUI.
  • You can view the metrics By memory or By CPU.

ZooKeeper

Used Resources (By Second-Level Znode)

  • Displays the ZooKeeper level-2 znode resource status.
  • You can view the metrics By Znode quantity or By capacity.

Number of Connections (by Client IP Address)

Displays the ZooKeeper client connection resource status.
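
The resource monitoring rules described before Table 1 (a default window of the latest 12 hours, and no export when a monitoring item has no data) can be modeled in a few lines. The following Python sketch is illustrative only and does not call any MRS or FusionInsight Manager API. The names RESOURCE_MONITORED_SERVICES, default_time_range, supports_resource_monitoring, and can_export are hypothetical and introduced here for illustration. The service set is copied from the Service column of Table 1.

```python
from datetime import datetime, timedelta, timezone

# Services from the Service column of Table 1.
# "Spark/Spark2x" is a single row there; both names are listed here (assumption).
RESOURCE_MONITORED_SERVICES = frozenset({
    "HDFS", "HBase", "HetuEngine", "Hive", "Kafka",
    "Spark", "Spark2x", "Yarn", "ZooKeeper",
})

# The page shows the latest 12 hours of monitoring data by default.
DEFAULT_WINDOW = timedelta(hours=12)


def supports_resource_monitoring(service):
    """Return True if the service appears in Table 1."""
    return service in RESOURCE_MONITORED_SERVICES


def default_time_range(now=None):
    """Return (start, end) covering the latest 12 hours up to `now`."""
    end = now if now is not None else datetime.now(timezone.utc)
    return end - DEFAULT_WINDOW, end


def can_export(samples):
    """A report can be exported only if the monitoring item has data."""
    return len(samples) > 0
```

For example, calling default_time_range with a fixed time of 12:00 returns a window that starts at 00:00 on the same day, and can_export returns False for an empty list of samples.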