Updated on 2024-11-29 GMT+08:00

Configuring Alarm Thresholds

Scenario

You can configure thresholds for monitoring metrics on FusionInsight Manager to monitor their health status. If a metric value becomes abnormal and the preset conditions are met, the system triggers an alarm and displays the alarm information on the alarm page.

Procedure

  1. Log in to FusionInsight Manager.
  2. Choose O&M > Alarm > Thresholds.
  3. Select a monitoring metric for a host or service in the cluster.

    Figure 1 Configuring the threshold for a metric
    For example, after you select Host Memory Usage, the threshold information for this metric is displayed.
    • When Switch is turned on, an alarm is triggered once the threshold condition is met.
    • When Alarm Severity is turned on, hierarchical alarms are enabled: the system reports an alarm of the corresponding severity based on the real-time metric value and the threshold configured for each severity.
    • Alarm ID and Alarm Name: the alarm that is generated when the threshold is reached.
    • Trigger Count: FusionInsight Manager checks whether the metric value crosses the threshold. If the threshold is crossed in this many consecutive checks, an alarm is generated. Trigger Count is configurable (see the sketch after this list).
    • Check Period (s): interval, in seconds, at which the system checks the monitoring metric.
    • The rules in the rule list determine when alarms are triggered.
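    The effect of Trigger Count and Check Period (s) can be pictured with a short sketch. The following Python snippet is illustrative only (the class and function names are hypothetical, not a FusionInsight Manager interface): one metric sample is evaluated per check period, and an alarm is raised only after the threshold has been crossed in the configured number of consecutive checks.

      # Illustrative sketch only: hypothetical names, not a FusionInsight Manager API.
      from dataclasses import dataclass

      @dataclass
      class ThresholdRule:
          threshold: float        # e.g. 90.0 for Host Memory Usage
          max_type: bool = True   # True: Max value (alarm when value > threshold)
          trigger_count: int = 3  # consecutive breaches required before an alarm

      class ThresholdMonitor:
          """Evaluates one metric sample per check period and decides when to alarm."""
          def __init__(self, rule: ThresholdRule):
              self.rule = rule
              self.breaches = 0   # consecutive checks in which the threshold was crossed

          def check(self, value: float) -> bool:
              crossed = (value > self.rule.threshold if self.rule.max_type
                         else value < self.rule.threshold)
              self.breaches = self.breaches + 1 if crossed else 0
              # An alarm is generated only after trigger_count consecutive breaches.
              return self.breaches >= self.rule.trigger_count

      monitor = ThresholdMonitor(ThresholdRule(threshold=90.0, trigger_count=3))
      for sample in (85.0, 92.0, 93.0, 95.0):   # one sample per check period
          print(sample, "alarm" if monitor.check(sample) else "ok")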

  4. Click Create Rule to add a rule for monitoring the metric.

    Table 1 Monitoring indicator rule parameters

    Rule Name
    Description: Name of the rule.
    Example value: CPU_MAX

    Severity
    Description: Alarm severity of the rule. After Alarm Severity is turned on, the alarm severity is configured in Thresholds instead.
    Example values: Critical, Major, Minor, Warning

    Threshold Type
    Description: Whether the maximum or minimum value of the metric is used as the alarm trigger threshold. If Threshold Type is set to Max value, the system generates an alarm when the metric value is greater than the threshold; if it is set to Min value, an alarm is generated when the metric value is less than the threshold.
    Example values: Max value, Min value

    Date
    Description: Date on which the rule takes effect. If Alarm Severity is on, only Daily is supported.
    Example values: Daily, Weekly, Others

    Add Date
    Description: Available only when Date is set to Others. Specifies the dates on which the rule takes effect; multiple dates can be selected.
    Example value: 09-30

    Thresholds
    Description: Time range in which the rule takes effect and the thresholds of the monitored metric within that range. If Alarm Severity is on, the start time and end time cannot be changed and default to 00:00-23:59, and different thresholds can be set for different alarm severities (illustrated in the sketch after this table). You can add multiple time ranges for the threshold or delete existing ones.
    Example values: Start and End Time 00:00–08:30; an alarm severity and its threshold
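    To see how hierarchical thresholds map a metric value to an alarm severity when Alarm Severity is on, consider the minimal sketch below (a hypothetical helper, not product code). With a Max value threshold type, the system reports the highest severity whose threshold the current value exceeds:

      # Illustrative sketch only: hypothetical helper, not a FusionInsight Manager API.
      # Severity thresholds as configured in the Thresholds parameter (Max value type),
      # for example Host Memory Usage: 95% -> critical, 90% -> major.
      SEVERITY_ORDER = ("critical", "major", "minor", "warning")

      def pick_severity(value, thresholds):
          """Return the highest severity whose threshold the value exceeds, or None."""
          for severity in SEVERITY_ORDER:           # most severe first
              limit = thresholds.get(severity)
              if limit is not None and value > limit:
                  return severity
          return None

      thresholds = {"critical": 95.0, "major": 90.0}
      print(pick_severity(97.0, thresholds))   # critical
      print(pick_severity(92.0, thresholds))   # major
      print(pick_severity(80.0, thresholds))   # None -> no alarm is generated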

  5. Click OK to save the rules.
  6. Locate the row that contains an added rule, and click Apply in the Operation column. The value of Effective for this rule changes to Yes.

    If a rule is already applied to the metric, the new rule can be applied only after you click Cancel for the existing rule.

Monitoring Metric Reference

FusionInsight Manager alarm monitoring metrics are classified into node metrics and cluster service metrics. Table 2 describes the node metrics for which you can configure thresholds, and Table 3 describes the cluster service metrics.

Table 2 Node monitoring metrics

Metric Group

Metric

Description

Default Threshold

CPU

Host CPU Usage

This indicator reflects the computing and control capabilities of the current cluster in a measurement period. By observing the indicator value, you can better understand the overall resource usage of the cluster.

90.0%

Disk

Disk Usage

Indicates the disk usage of a host.

95% (critical)

85% (major)

Disk Inode Usage

Indicates the disk inode usage in a measurement period.

95% (critical)

80% (major)

Memory

Host Memory Usage

Indicates the average memory usage at the current time.

95% (critical)

90% (major)

Host Status

Host File Handle Usage

Indicates the usage of file handles of the host in a measurement period.

95% (critical)

80% (major)

Host PID Usage

Indicates the PID usage of a host.

95% (critical)

90% (major)

Network Status

TCP Ephemeral Port Usage

Indicates the usage of temporary TCP ports of the host in a measurement period.

95% (critical)

80% (major)

Network Reading

Read Packet Error Rate

Indicates the read packet error rate of the network interface on the host in a measurement period.

5% (critical)

0.5% (major)

Read Packet Dropped Rate

Indicates the read packet dropped rate of the network interface on the host in a measurement period.

5% (critical)

0.5% (major)

Read Throughput Rate

Indicates the average read throughput (at MAC layer) of the network interface in a measurement period.

80%

Network Writing

Write Packet Error Rate

Indicates the write packet error rate of the network interface on the host in a measurement period.

5% (critical)

0.5% (major)

Write Packet Dropped Rate

Indicates the write packet dropped rate of the network interface on the host in a measurement period.

5% (critical)

0.5% (major)

Write Throughput Rate

Indicates the average write throughput (at MAC layer) of the network interface in a measurement period.

80%

Process

Uninterruptible Sleep Process

Number of D state and Z state processes on the host in a measurement period

0

omm Process Usage

omm process usage in a measurement period

95% (critical)

90% (major)

Table 3 Cluster service indicators

Service

Metric Group

Metric

Description

Default Threshold

DBService

Database

Usage of the Number of Database Connections

Indicates the usage of the number of database connections.

95% (critical)

90% (major)

Disk Space Usage of the Data Directory

Disk space usage of the data directory

85% (critical)

80% (major)

MOTService

Database

MOT Connections Usage

Usage of MOTService database connections

90%

MOT Disk Space Usage of the Data Directory

Disk space usage of the MOTService data directory

80%

MOT Used Memory Percentage

MOTService memory usage

85%

MOT Used CPU Percentage

MOTService CPU usage

80%

Elasticsearch

Disk

Data Directory Usage

Elasticsearch data directory usage

80%

Garbage Collection

GC Time

Garbage collection duration of the Elasticsearch instance process

30000 ms

Memory

Heap Memory Usage

Elasticsearch heap memory usage

90%

Shard

Elasticsearch Shard Document Number

Number of Elasticsearch sharded files

100000000

Elasticsearch Shard Data Volume

Size of Elasticsearch shards

41943040

Number of Instance Shards

Total number of Elasticsearch instance shards

400

Replica Quantity Statistics

Total shard number

Number of primary shards whose Elasticsearch status is down

70000

Flume

Agent

Flume Heap Memory Usage Calculate

Indicates the Flume heap memory usage.

95.0% (critical)

90.0% (major)

Flume Direct Memory Usage Statistics

Indicates the Flume direct memory usage.

90.0% (critical)

80.0% (major)

Flume Non-heap Memory Usage

Indicates the Flume non-heap memory usage.

80.0%

Total GC duration of Flume process

Indicates the Flume total GC time.

12000 ms

FTP-Server

Process

FTP-Server Heap Memory Usage Calculate

Indicates the FTP-Server heap memory usage.

95.0%

FTP-Server Direct Buffer Usage Statistics

Indicates the FTP-Server direct memory usage.

80.0%

FTP-Server Non-Heap Memory Usage

Indicates the FTP-Server non-heap memory usage.

80.0%

Total GC duration of FTP-Server process

Indicates the total GC time of FTP-Server.

12000 ms

HBase

GC

GC time for old generation

Total GC time of RegionServer

5000 ms

GC time for old generation

Total GC time of HMaster

5000 ms

CPU & memory

RegionServer Direct Memory Usage Statistics

RegionServer direct memory usage

90%

RegionServer Heap Memory Usage Statistics

RegionServer heap memory usage

90%

HMaster Direct Memory Usage

HMaster direct memory usage

90%

HMaster Heap Memory Usage Statistics

HMaster heap memory usage

90%

Service

Number of Online Regions of a RegionServer

Number of regions of a RegionServer

5000 (critical)

2000 (major)

Regions in Transition Count Over Threshold

Number of regions that are in the RIT state and reach the threshold duration

1

Handler

RegionServer Handler Usage

Handler usage of RegionServer

100% (critical)

90% (major)

Replication

Replication sync failed times (RegionServer)

Number of times that DR data fails to be synchronized

1

Number of Log Files to Be Synchronized in the Active Cluster

Number of log files to be synchronized in the active cluster

128

Number of HFiles to Be Synchronized in the Active Cluster

Number of HFiles to be synchronized in the active cluster

128

RPC

Number of RegionServer Opened Connections

Number of open RegionServer RPC connections

200 (critical)

100 (major)

99th Percentile of the RegionServer RPC Request Response Time

99th percentile of the RegionServer RPC request response time

10000 ms (critical)

5000 ms (major)

99th Percentile of the RegionServer RPC Request Processing Time

99th percentile of the RegionServer RPC request processing time

10000 ms (critical)

5000 ms (major)

Operation statistics

Number of Timed-Out WAL Writes in RegionServers

Number of timed-out WAL writes in RegionServers

500 (critical)

300 (major)

Queue

Number of Tasks in RegionServer RPC Write Queues

Number of tasks in RegionServer RPC write queues

2000 (critical)

1600 (major)

Number of Tasks in RegionServer RPC Read Queues

Number of tasks in RegionServer RPC read queues

2000 (critical)

1600 (major)

RegionServer Call Queue Size

RegionServer call queue size

838860800 (critical)

629145600 (major)

Compaction Queue Size

Size of the Compaction queue

100

HDFS

File and Block

Lost Blocks

Number of backup blocks that the HDFS file system lacks

0

Blocks Under Replicated

Total number of blocks that need to be replicated by the NameNode

1000

RPC

Average Time of Active NameNode RPC Processing

Average NameNode RPC processing time

100 ms (major)

200 ms (critical)

Average Time of Active NameNode RPC Queuing

Average NameNode RPC queuing time

200 ms (major)

300 ms (critical)

Disk

HDFS Disk Usage

HDFS disk usage

80% (major)

90% (critical)

DataNode Disk Usage

Disk usage of DataNodes in the HDFS

80%

Percentage of Reserved Space for Replicas of Unused Space

Percentage of the disk space reserved for replicas to the total unused disk space on DataNodes

90%

Resource

Faulty DataNodes

Number of faulty DataNodes

3

NameNode Non-Heap Memory Usage Statistics

Percentage of NameNode non-heap memory usage

90%

NameNode Direct Memory Usage Statistics

Percentage of direct memory used by NameNodes

90%

NameNode Heap Memory Usage Statistics

Percentage of NameNode heap memory usage

95%

DataNode Direct Memory Usage Statistics

Percentage of direct memory used by DataNodes

90%

DataNode Heap Memory Usage Statistics

DataNode heap memory usage

95%

DataNode Non-Heap Memory Usage Statistics

Percentage of DataNode non-heap memory usage

90%

Garbage Collection

GC Time (NameNode)

Garbage collection (GC) duration of NameNodes per minute

10000 ms (major)

15000 ms (critical)

GC Time (DataNode)

GC duration of DataNodes per minute

12000 ms (major)

20000 ms (critical)

Hive

HQL

Percentage of HQL Statements That Are Executed Successfully by Hive

Percentage of HQL statements that are executed successfully by Hive

90% (critical)

80% (major)

Connections

Percentage of Number of Sessions Connected to the MetaStore to the Maximum Allowed (MetaStore)

Percentage of the number of sessions connected to MetaStore to the maximum number of sessions allowed by MetaStore

90% (critical)

80% (major)

Background

Background Thread Usage

Background thread usage

90% (critical)

80% (major)

GC

Total GC time of MetaStore

Total GC time of MetaStore

12000 ms

HiveServer Total GC Time in Milliseconds

Total GC time of HiveServer

12000 ms

Capacity

Percentage of HDFS Space Used by Hive to the Available Space

Percentage of HDFS space used by Hive to the available space

95% (critical)

85% (major)

CPU & memory

MetaStore Direct Memory Usage Statistics

MetaStore direct memory usage

95% (critical)

85% (major)

MetaStore Non-Heap Memory Usage Statistics

MetaStore non-heap memory usage

95% (critical)

85% (major)

MetaStore Heap Memory Usage Statistics

MetaStore heap memory usage

95% (critical)

85% (major)

HiveServer Direct Memory Usage Statistics

HiveServer direct memory usage

95% (critical)

85% (major)

HiveServer Non-Heap Memory Usage Statistics

HiveServer non-heap memory usage

95% (critical)

85% (major)

HiveServer Heap Memory Usage Statistics

HiveServer heap memory usage

95% (critical)

85% (major)

Session

Percentage of Sessions Connected to the HiveServer to Maximum Number of Sessions Allowed by the HiveServer

Indicates the percentage of the number of sessions connected to the HiveServer to the maximum number of sessions allowed by the HiveServer.

90% (critical)

80% (major)

Kafka

Partition

Percentage of Partitions That Are Not Completely Synchronized

Indicates the percentage of partitions that are not completely synchronized to total partitions.

60% (critical)

50% (major)

Disk

Broker Disk Usage

Indicates the disk usage of the disk where the Broker data directory is located.

90% (critical)

85% (major)

Disk I/O Rate of a Broker

I/O usage of the disk where the Broker data directory is located

80%

Process

Broker GC Duration per Minute

Indicates the GC duration of the Broker process per minute.

12000 ms

Heap Memory Usage of Kafka

Indicates the Kafka heap memory usage.

95%

Kafka Direct Memory Usage

Indicates the Kafka direct memory usage.

100% (critical)

95% (major)

Others

User Connection Usage on Broker

Usage of user connections on Broker

90% (critical)

85% (major)

Loader

Memory

Heap Memory Usage Calculate

Indicates the Loader heap memory usage.

95% (critical)

80% (major)

Direct Memory Usage of Loader

Indicates the Loader direct memory usage.

95% (critical)

80% (major)

Non-heap Memory Usage of Loader

Indicates the Loader non-heap memory usage.

95% (critical)

80% (major)

GC

Total GC time of Loader

Indicates the total GC time of Loader.

20000 ms (critical)

12000 ms (major)

MapReduce

Garbage Collection

GC Time

Indicates the GC time.

20000 ms (critical)

12000 ms (major)

Resource

JobHistoryServer Direct Memory Usage Statistics

Indicates the JobHistoryServer direct memory usage.

95% (critical)

90% (major)

JobHistoryServer Non-Heap Memory Usage Statistics

Indicates the JobHistoryServer non-heap memory usage.

95% (critical)

90% (major)

JobHistoryServer Heap Memory Usage Statistics

Indicates the JobHistoryServer heap memory usage.

95% (critical)

90% (major)

Metadata

Others

Heap Memory Usage Calculate

Indicates the Metadata heap memory usage.

95%

Metadata Direct Memory Usage Statistics

Indicates the metadata direct memory usage.

80.0%

Metadata Non-heap Memory Usage

Indicates the metadata non-heap memory usage.

80.0%

Total GC time of Metadata

Indicates the metadata total GC time.

20000 ms (critical)

12000 ms (major)

Oozie

Memory

Oozie Heap Memory Usage Calculate

Indicates the Oozie heap memory usage.

95%

Oozie Direct Memory Usage

Indicates the Oozie direct memory usage.

90%

Oozie Non-heap Memory Usage

Indicates the Oozie non-heap memory usage.

90%

GC

Total GC duration of Oozie

Indicates the Oozie total GC time.

20000 ms (critical)

12000 ms (major)

Solr

Replica Quantity Statistics

Bad Replica Number

Number of bad replicas of a Solr instance

0

Garbage Collection

GC Time

Garbage collection duration of the Solr instance process

12000 ms

Memory

Heap Memory Usage

Indicates the heap memory usage.

99% (critical)

95% (major)

Shard

Solr Shard Data Volume

Data volume of Solr shards

83886080 (critical)

41943040 (major)

Solr Shard Document Number

Number of Solr shard documents

400000000

Spark

Memory

JDBCServer Heap Memory Usage Statistics

JDBCServer heap memory usage

95% (critical)

85% (major)

JDBCServer Direct Memory Usage Statistics

JDBCServer direct memory usage

95% (critical)

85% (major)

JDBCServer Non-Heap Memory Usage Statistics

JDBCServer non-heap memory usage

95% (critical)

85% (major)

JobHistory Direct Memory Usage Statistics

JobHistory direct memory usage

95% (major)

85% (minor)

JobHistory Non-Heap Memory Usage Statistics

JobHistory non-heap memory usage

95% (major)

85% (minor)

JobHistory Heap Memory Usage Statistics

JobHistory heap memory usage

95% (major)

85% (minor)

IndexServer Direct Memory Usage Statistics

IndexServer direct memory usage

95% (critical)

85% (major)

IndexServer Heap Memory Usage Statistics

IndexServer heap memory usage

95% (critical)

85% (major)

IndexServer Non-Heap Memory Usage Statistics

IndexServer non-heap memory usage

95% (critical)

85% (major)

GC Count

Full GC Number of JDBCServer

Full GC times of JDBCServer

12 (critical)

9 (major)

Full GC Number of JobHistory

Full GC times of JobHistory

12 (critical)

9 (major)

Full GC Number of IndexServer

Full GC times of IndexServer

12 (critical)

9 (major)

GC Time

JDBCServer Total GC Time in Milliseconds

Total GC time of JDBCServer

12000 ms (critical)

9600 ms (major)

JobHistory Total GC Time in Milliseconds

Total GC time of JobHistory

12000 ms (major)

9600 ms (minor)

IndexServer Total GC Time in Milliseconds

Total GC time of IndexServer

12000 ms (critical)

9600 ms (major)

Yarn

Resources

NodeManager Direct Memory Usage Statistics

Indicates the percentage of direct memory used by NodeManagers.

90%

NodeManager Heap Memory Usage Statistics

Indicates the percentage of NodeManager heap memory usage.

95%

NodeManager Non-Heap Memory Usage Statistics

Indicates the percentage of NodeManager non-heap memory usage.

90%

ResourceManager Direct Memory Usage Statistics

Indicates the ResourceManager direct memory usage.

90%

ResourceManager Heap Memory Usage Statistics

Indicates the ResourceManager heap memory usage.

95%

ResourceManager Non-Heap Memory Usage Statistics

Indicates the ResourceManager non-heap memory usage.

90%

Garbage collection

GC Time

Indicates the GC duration of NodeManager per minute.

12000 ms (major)

20000 ms (critical)

GC Time

Indicates the GC duration of ResourceManager per minute.

10000 ms (major)

15000 ms (critical)

Others

Failed Applications of root queue

Number of failed tasks in the root queue

50

Terminated Applications of root queue

Number of killed tasks in the root queue

50

CPU & memory

Pending Memory

Pending memory capacity

83886080 MB

Application

Pending Applications

Pending tasks

60

ZooKeeper

Connection

ZooKeeper Connections Usage

Indicates the percentage of the used connections to the total connections of ZooKeeper.

80% (major)

90% (critical)

CPU & memory

ZooKeeper Heap Memory Usage

Indicates the ZooKeeper heap memory usage.

95%

ZooKeeper Direct Memory Usage

Indicates the ZooKeeper direct memory usage.

80%

GC

ZooKeeper GC Duration per Minute

Indicates the GC time of ZooKeeper every minute.

5000 ms (major)

10000 ms (critical)

meta

OBS data write operation

Total Number of Failed OBS Write API Calls

Total number of failed OBS write API calls

10

OBS exception

Total Number of OBSFileConflictException Errors

Total number of OBSFileConflictException errors

5

Total Number of OBS AccessControlExceptions Errors

Total number of OBS AccessControlExceptions errors

5

Total Number of OBS EOFException Errors

Total number of OBS EOFException errors

5

Total Number of OBSMethodNotAllowedException Errors

Total number of OBSMethodNotAllowedException errors

5

Total Number of OBSIOException Errors

Total number of OBSIOException errors

5

Total Number of OBS FileNotFoundException Errors

Total number of OBS FileNotFoundException errors

5

Total Number of Throttled OBS Operations

Total number of throttled OBS operations

5

Total Number of OBSIllegalArgumentExceptions Errors

Total number of OBSIllegalArgumentExceptions errors

5

Total Number of Other OBS Exceptions

Total number of other OBS exceptions reported by all nodes

5

OBS data read operation

Total Number of Failed OBS Read API Calls

Total number of failed OBS read API calls

10

Total Number of Failed OBS readFully API Calls

Total number of failed OBS readFully API calls

10

Ranger

GC

UserSync GC Duration

UserSync garbage collection (GC) duration

20000 ms (critical)

12000 ms (major)

PolicySync GC Duration

PolicySync GC Duration

20000 ms (critical)

12000 ms (major)

RangerAdmin GC Duration

RangerAdmin GC duration

20000 ms (critical)

12000 ms (major)

TagSync GC Duration

TagSync GC duration

20000 ms (critical)

12000 ms (major)

CPU & memory

UserSync Non-Heap Memory Usage

UserSync non-heap memory usage

80.0%

UserSync Direct Memory Usage

UserSync direct memory usage

80.0%

UserSync Heap Memory Usage

UserSync heap memory usage

95.0%

PolicySync Direct Memory Usage

Percentage of the PolicySync direct memory usage

80.0%

PolicySync Heap Memory Usage

Percentage of PolicySync heap memory usage

95.0%

PolicySync Non-Heap Memory Usage

Percentage of PolicySync non-heap memory usage

80.0%

RangerAdmin Non-Heap Memory Usage

RangerAdmin non-heap memory usage

80.0%

RangerAdmin Heap Memory Usage

RangerAdmin heap memory usage

95.0%

RangerAdmin Direct Memory Usage

RangerAdmin direct memory usage

80.0%

TagSync Direct Memory Usage

TagSync direct memory usage

80.0%

TagSync Non-Heap Memory Usage

TagSync non-heap memory usage

80.0%

TagSync Heap Memory Usage

TagSync heap memory usage

95.0%

ClickHouse

Cluster Quota

Clickhouse service quantity quota usage in ZooKeeper

Quota of the ZooKeeper nodes used by a ClickHouse service

95% (critical)

90% (major)

Capacity quota usage of the Clickhouse service in ZooKeeper

Capacity quota of ZooKeeper directory used by the ClickHouse service

95% (critical)

90% (major)

Concurrencies

Concurrency Number (ClickHouseServer)

Actual number of concurrent SQL statements of the ClickHouse service

90

IoTDB

Merge

Maximum Task Merge (Intra-Space Merge) Latency

Maximum latency of IoTDBServer intra-space merge

300000 ms

Maximum Merge Task (Flush) Latency

Maximum latency of IoTDBServer flush execution

300000 ms

Maximum Task Merge (Cross-Space Merge) Latency

Maximum latency of IoTDBServer cross-space merge

300000 ms

RPC

Maximum RPC (executeStatement) Latency

Maximum latency of IoTDBServer RPC execution

10000 s

GC

Total GC duration of IoTDBServer

Total time used for IoTDBServer garbage collection (GC)

30000 ms (critical)

12000 ms (major)

Total GC Duration of ConfigNode

Total time used for ConfigNode garbage collection (GC)

30000 ms (critical)

12000 ms (major)

Memory

IoTDBServer Heap Memory Usage

IoTDBServer heap memory usage

100% (critical)

90% (major)

IoTDBServer Direct Memory Usage

IoTDBServer direct memory usage

100% (critical)

90% (major)

ConfigNode Heap Memory Usage

Percentage of the ConfigNode heap memory usage

100% (critical)

90% (major)

ConfigNode Direct Memory Usage

Percentage of the ConfigNode direct memory usage

100% (critical)

90% (major)

Containers

Others

Metaspace Usage

WebContainer metaspace usage

75.0%

Non-Heap Memory Usage

WebContainer non-heap memory usage

75.0%

Heap Memory Usage

WebContainer heap memory usage

95.0%

Failure Rate of Application Service Calling

Failure rate of application service calling (SGP)

10.0

Application Service Calling Latency

Application service calling latency (SGP)

10000.0

Maximum Number of Concurrent Application Services

Maximum number of concurrent application services (SGP)

120

BLU Health Status

BLU health status statistics

50.0%

LdapServer

Others

Process Connections of a Single SlapdServer Instance

Number of SlapdServer process connections

1000

CPU Usage of a Single SlapdServer Instance

SlapdServer CPU usage

1200%

Guardian

GC

TokenServer GC Duration

TokenServer GC duration

12000 ms

CPU & memory

TokenServer Heap Memory Usage

Percentage of the heap memory used by the TokenServer process

95.0%

TokenServer Non-Heap Memory Usage

Percentage of the non-heap memory used by the TokenServer process

80.0%

TokenServer Direct Memory Usage

Percentage of the TokenServer direct memory usage

80.0%

Doris

JVM

Accumulated Old-Generation GC Duration

Accumulated old-generation GC duration of the FE process

3000 ms

Connection

Ratio of the Number of MySQL Port Connections (FE)

Proportion of connections to the MySQL port of the FE node

95%

Disk

BE Data Disk Usage

BE data disk usage

95%

Disk Status of a Specified Data Directory

Statistics on abnormal disk status of a specified data directory on the BE.

1

Performance

Maximum Compaction Score of All BE Nodes

Maximum compaction score of all BE nodes

10

Maximum Duration of RPC Requests Received by Each Method of the FE Thrift Interface

Maximum duration of RPC requests received by each method of the FE thrift interface.

5000 ms

Queue

Queue Length of BE Periodic Report Tasks on the FE

Queue length of BE periodic report tasks on the FE node

10

Number of FE Tasks Queuing in the Thread Pool Interacting with the BE

Number of FE tasks queuing in the thread pool interacting with the BE node

10

Number of FE Tasks Queuing in the Task Processing Thread Pool

Number of FE tasks that are queuing in the task processing thread pool on the FE node

10

Queue Length of Query Execution Thread Pool

Queue length of query execution thread pool

20

Exception

Failed Metadata Image Generation

Failed metadata image generation on the FE node

1

Failed Historical Metadata Image Clearing

Failed historical metadata image clearing on the FE node

1

Status of the Doris FE instance (FE)

Process status statistics of the Doris FE instance.

0

Status of the Doris BE instance (BE)

Process status statistics of the Doris BE instance.

0

Error Rate of TCP Packet Receiving (BE)

Error rate of TCP packet receiving on the BE

5%

Whether the Number of Task Failures of a Certain Type Increases (BE)

Whether the number of failures of a certain type of tasks executed on the BE increases

1

CPU and Memory

FE CPU Usage

CPU usage statistics on FE nodes

95% (critical)

90% (major)

FE Memory Usage

Memory usage statistics on FE nodes

90% (critical)

85% (major)

FE Memory Usage

Memory usage of FE nodes

95%

FE Heap Memory Usage Rate

Heap memory usage of FE nodes

95%

BE Memory Usage Rate

Memory usage statistics on BE nodes

90% (critical)

85% (major)

Maximum BE Memory and Remaining Machine Memory on the BE

Whether the maximum memory required by the BE exceeds the remaining available memory on the node

1

BE CPU Usage

CPU usage statistics on BE nodes

95% (critical)

90% (major)