HBase Cluster Supported Metrics
Description
Monitoring is critical to ensuring the reliability, availability, and performance of CloudTable. You can monitor the running status of CloudTable servers.
This section describes the metrics that Cloud Eye (CES) can monitor, together with their namespace and dimensions. You can use the management console or the APIs provided by Cloud Eye to query the metrics of monitored objects and the alarms generated for CloudTable.
Namespace
SYS.CloudTable
CloudTable HBase HMaster Instance Monitoring Metrics
Metric ID | Metric Name | Meaning | Value Range | Monitoring Period (Raw Data) |
---|---|---|---|---|
disk_throughput_write_rate | Disks Write Rate | Volume of data written to the monitored object per second | ≥ 0 bytes/s | 1 minute |
disk_throughput_read_rate | Disks Read Rate | Volume of data read from the monitored object per second | ≥ 0 bytes/s | 1 minute |
cmdForTotalMemory | Total Memory | Total memory size of the monitored object | > 0 bytes | 1 minute |
cmdProcessCPU | CPU Usage | CPU usage of the monitored object | 0%–100% | 1 minute |
cmdProcessMem | Memory Usage | Memory usage of the monitored object | 0%–100% | 1 minute |
hm_deadregionservernum | Faulty RegionServers | Number of faulty RegionServers in the cluster | ≥ 0 | 1 minute |
hm_regionservernum | Normal RegionServers | Number of normal RegionServers in the cluster | ≥ 0 | 1 minute |
hm_ritCount | RIT Count | Number of regions in the Region In Transition (RIT) state in the cluster where the monitored object is located | ≥ 0 | 1 minute |
hm_ritCountOverThreshold | RIT Count Over Threshold | Number of regions that have been in the RIT state longer than the threshold in the cluster where the monitored object is located | ≥ 0 | 1 minute |
rs_queuecalltime_max | RPC Queue Call Time (Max) | Maximum RPC queue call time | ≥ 0 ms | 1 minute |
rs_queuecalltime_mean | RPC Queue Call Time (Mean) | Mean RPC queue call time | ≥ 0 ms | 1 minute |
nn_percentallused | Disk Utilization Rate | Disk space usage of the cluster | 0%–100% | 1 minute |
nn_capacityremaining | Disk capacity remaining of cluster | Remaining disk space of the cluster | Depends on the cluster disk capacity. | 1 minute |
nn_capacityused | Disk capacity used of cluster | Disk space used in the cluster | Depends on the cluster disk capacity. | 1 minute |
HMaster instances include hmaster-active (active) and hmaster-standby (standby). If hmaster-active becomes faulty, hmaster-standby takes over as the active instance and continues to provide services.
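The HMaster metrics listed above can be combined into a simple health check. The following is a minimal sketch, assuming you have already fetched the latest value of each metric into a dictionary keyed by metric ID. The function name `hmaster_warnings` and the 80% disk threshold are hypothetical and are not CloudTable recommendations.

```python
def hmaster_warnings(sample, max_disk_pct=80.0):
    """Return warnings for one snapshot of HMaster metric values.

    `sample` maps metric IDs (for example "hm_deadregionservernum") to their
    latest values. Missing metrics are treated as healthy.
    The default disk threshold is a hypothetical value for illustration.
    """
    warnings = []
    # Any faulty RegionServer is worth reporting.
    if sample.get("hm_deadregionservernum", 0) > 0:
        warnings.append("faulty RegionServers present")
    # Regions that stay in the RIT state beyond the threshold may be stuck.
    if sample.get("hm_ritCountOverThreshold", 0) > 0:
        warnings.append("regions in RIT state beyond threshold")
    # nn_percentallused is a percentage in the range 0 to 100.
    if sample.get("nn_percentallused", 0.0) >= max_disk_pct:
        warnings.append("cluster disk usage high")
    return warnings
```

In practice, alarm rules configured in Cloud Eye would normally cover this kind of check; the sketch only illustrates how the metric IDs relate to each other.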
CloudTable HBase RegionServer Instance Monitoring Metrics
Table 2 lists the monitoring metrics supported by CloudTable HBase RegionServer instances.
Metric ID | Metric Name | Meaning | Value Range | Monitoring Period (Raw Data) |
---|---|---|---|---|
cmdProcessCPU | CPU Usage | CPU usage of the monitored object. Unit: % | 0%–100% | 1 minute |
cmdForTotalMemory | Total Memory | Total memory size of the monitored object. Unit: byte | > 0 bytes | 1 minute |
cmdProcessMem | Memory Usage | Memory usage of the monitored object. Unit: % | 0%–100% | 1 minute |
disk_throughput_write_rate | Disks Write Rate | Volume of data written to the monitored object per second. Unit: byte/s | ≥ 0 bytes/s | 1 minute |
disk_throughput_read_rate | Disks Read Rate | Volume of data read from the monitored object per second. Unit: byte/s | ≥ 0 bytes/s | 1 minute |
hm_regionservernum | Normal RegionServers | Number of normal RegionServers | ≥ 0 | 1 minute |
hm_deadregionservernum | Faulty RegionServers | Number of faulty RegionServers | ≥ 0 | 1 minute |
hm_ritCountOverThreshold | RIT Count Over Threshold | Number of regions that have been in the Region In Transition (RIT) state longer than the threshold | ≥ 0 | 1 minute |
hm_ritCount | RIT Count | Number of regions in the Region In Transition (RIT) state | ≥ 0 | 1 minute |
rs_requests | Requests Per Second | Number of requests of a RegionServer per second. Unit: request/s | ≥ 0 requests/s | 1 minute |
rs_regions | Regions | Number of regions of a RegionServer | ≥ 0 | 1 minute |
rs_writerequestscount | Write Requests | Number of write requests of a RegionServer | ≥ 0 | 1 minute |
rs_readrequestscount | Read Requests | Number of read requests of a RegionServer | ≥ 0 | 1 minute |
rs_blockcachehitcachingratio | Hit Cache Block Caching Ratio | Block cache hit caching ratio. Unit: % | 0%–100% | 1 minute |
rs_blockCacheCountHitPercent | Hit Cache Block Ratio | Block cache hit ratio. Unit: % | 0%–100% | 1 minute |
rs_getavgtime | Get Delay (Avg) | Average Get operation delay of the RegionServer per unit time. Unit: millisecond | ≥ 0 ms | 1 minute |
rs_putavgtime | Put Delay (Avg) | Average Put operation delay of the RegionServer per unit time. Unit: millisecond | ≥ 0 ms | 1 minute |
rs_deleteavgtime | Delete Delay (Avg) | Average Delete operation delay of the RegionServer per unit time. Unit: millisecond | ≥ 0 ms | 1 minute |
rs_getnumops | Get Operations | Number of Get operations of the RegionServer per unit time | ≥ 0 | 1 minute |
rs_putnumops | Put Operations | Number of Put operations of the RegionServer per unit time | ≥ 0 | 1 minute |
rs_deletenumops | Delete Operations | Number of Delete operations of the RegionServer per unit time | ≥ 0 | 1 minute |
rs_queuecalltime_max | RPC Queue Call Time (Max) | Maximum RPC queue call time. Unit: millisecond | ≥ 0 ms | 1 minute |
rs_queuecalltime_mean | RPC Queue Call Time (Mean) | Mean RPC queue call time. Unit: millisecond | ≥ 0 ms | 1 minute |
rs_flushtime_mean | Flush Time (Mean) | Mean flush time. Unit: millisecond | ≥ 0 ms | 1 minute |
rs_compactionqueuesize | Compaction Queue Size | Current length of the compaction queue, that is, the number of Stores waiting for compaction on the RegionServer | ≥ 0 | 1 minute |
rs_flushqueuesize | Flush Queue Size | Current length of the flush queue | ≥ 0 | 1 minute |
rs_compactionscompletedcount | Compaction Count | Number of completed compactions | ≥ 0 | 1 minute |
rs_flushtimeops_num | Flush Operation Count | Number of flush operations | ≥ 0 | 1 minute |
rs_blockcacheevictedcount | Discarded Cache Blocks | Number of blocks evicted from the block cache | ≥ 0 | 1 minute |
rs_syncTime_max | Sync WAL Time (Max) | Maximum time taken to sync the WAL to HDFS. Unit: millisecond | ≥ 0 ms | 1 minute |
rs_syncTime_mean | Sync WAL Time (Mean) | Mean time taken to sync the WAL to HDFS. Unit: millisecond | ≥ 0 ms | 1 minute |
dn_byteswritten_speed | Bytes written per second | Number of bytes written by the node per second | ≥ 0 bytes/s | 1 minute |
dn_bytesread_speed | Bytes read per second | Number of bytes read by the node per second | ≥ 0 bytes/s | 1 minute |
rs_numActiveHandler | Number of RegionServer Active Handlers | Number of active RegionServer handlers (total number of handlers for processing user table requests, meta table requests, and replication requests) | ≥ 0 | 1 minute |
rs_numActiveGeneralHandler | Number of RegionServer Active Handlers for Processing User Table Requests | Number of active RegionServer handlers for processing user table requests | ≥ 0 | 1 minute |
rs_scanTime_p999 | 99.9th Percentile of the Scan Operation Delay | 99.9th percentile of the RegionServer Scan operation delay | ≥ 0 ms | 1 minute |
rs_syncTime_p999 | 99.9th Percentile of the WAL Sync Operation Delay | 99.9th percentile of the RegionServer WAL Sync operation delay | ≥ 0 ms | 1 minute |
rs_Get_99th_percentile | 99th Percentile of the Get Operation Delay | 99th percentile of the RegionServer Get operation delay | ≥ 0 ms | 1 minute |
rs_Put_99th_percentile | 99th Percentile of the Put Operation Delay | 99th percentile of the RegionServer Put operation delay | ≥ 0 ms | 1 minute |
rs_Delete_99th_percentile | 99th Percentile of the Delete Operation Delay | 99th percentile of the RegionServer Delete operation delay | ≥ 0 ms | 1 minute |
rs_Get_999th_percentile | 99.9th Percentile of the Get Operation Delay | 99.9th percentile of the RegionServer Get operation delay | ≥ 0 ms | 1 minute |
rs_Put_999th_percentile | 99.9th Percentile of the Put Operation Delay | 99.9th percentile of the RegionServer Put operation delay | ≥ 0 ms | 1 minute |
rs_Delete_999th_percentile | 99.9th Percentile of the Delete Operation Delay | 99.9th percentile of the RegionServer Delete operation delay | ≥ 0 ms | 1 minute |
Dimension
Key | Value |
---|---|
cluster_id | CloudTable cluster ID. |
instance_name | Name of a CloudTable cluster node. |
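To show how the namespace, a metric ID, and the two dimensions fit together, the following minimal sketch builds the query string for one metric. The parameter names (`namespace`, `metric_name`, `dim.0`, `dim.1`, `from`, `to`, `period`, `filter`) and the meaning of `period=1` as raw data are assumptions about the Cloud Eye metric data query; check them against the Cloud Eye API reference before use. The function name `build_metric_query` and the sample IDs are hypothetical.

```python
from urllib.parse import urlencode

NAMESPACE = "SYS.CloudTable"  # namespace given on this page


def build_metric_query(metric_name, cluster_id, instance_name,
                       start_ms, end_ms, period=1, stat="average"):
    """Build a query string for one CloudTable metric.

    Timestamps are in milliseconds. The parameter names are assumed,
    not confirmed by this page; verify them in the Cloud Eye API reference.
    """
    if end_ms <= start_ms:
        raise ValueError("end_ms must be later than start_ms")
    params = [
        ("namespace", NAMESPACE),
        ("metric_name", metric_name),
        # Each dimension is passed as "key,value".
        ("dim.0", f"cluster_id,{cluster_id}"),
        ("dim.1", f"instance_name,{instance_name}"),
        ("from", start_ms),
        ("to", end_ms),
        ("period", period),   # assumed: 1 selects raw data
        ("filter", stat),     # assumed aggregation method
    ]
    # Keep commas readable instead of percent-encoding them.
    return urlencode(params, safe=",")
```

For example, `build_metric_query("rs_requests", "c-123", "rs-1", 1000, 2000)` yields a query string that starts with `namespace=SYS.CloudTable&metric_name=rs_requests&dim.0=cluster_id,c-123`. The request URL, authentication, and response handling are deliberately left out.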