ALM-50226 Unavailable BE Instances
Alarm Description
The system checks the BE process status every 30 seconds. This alarm is generated when the check result is greater than 0 (0 indicates that the BE process is normal, and 1 indicates that the BE process is abnormal).
This alarm is cleared when the system detects that the BE process becomes normal.
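The generation and clearance rule described above can be sketched as a small state machine. This is only an illustration of the rule as stated, not the actual monitoring code: the names AlarmState and evaluate are hypothetical, and the sequence of check results is made up for the example.

```python
from dataclasses import dataclass

CHECK_INTERVAL_SECONDS = 30  # check period stated in the alarm description


@dataclass
class AlarmState:
    """Tracks whether the alarm is currently raised (hypothetical helper)."""
    active: bool = False


def evaluate(state: AlarmState, be_status: int) -> str:
    """Apply one check result (0 = normal, 1 = abnormal) and return the action."""
    if be_status > 0 and not state.active:
        state.active = True
        return "raise"
    if be_status == 0 and state.active:
        state.active = False
        return "clear"
    return "none"


# Four consecutive checks, 30 seconds apart: normal, abnormal, abnormal, normal.
state = AlarmState()
actions = [evaluate(state, status) for status in [0, 1, 1, 0]]
print(actions)  # ['none', 'raise', 'none', 'clear']
```

The alarm is raised once on the first abnormal result and is not raised again while the process stays abnormal; it is cleared on the first normal result that follows.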
Alarm Attributes
Alarm ID | Alarm Severity | Alarm Type | Service Type | Auto Cleared |
---|---|---|---|---|
50226 | Critical | Error handling | Doris | Yes |
Alarm Parameters
Type | Parameter | Description |
---|---|---|
Location Information | Source | Specifies the cluster or system for which the alarm is generated. |
Location Information | ServiceName | Specifies the service for which the alarm is generated. |
Location Information | RoleName | Specifies the role for which the alarm is generated. |
Location Information | HostName | Specifies the host for which the alarm is generated. |
Additional Information | Detail | Specifies the alarm triggering condition. |
Impact on the System
The BE instance is unavailable and cannot provide the data read and write functions.
Possible Causes
- The BE instance is faulty or restarted.
- The BE node disks are abnormal.
- The local disk space of BE nodes is insufficient.
Handling Procedure
View the BE instance status.
1. Log in to FusionInsight Manager and choose O&M > Alarm > Alarms. In the alarm list, find the alarm whose ID is 50226, view the role name, and obtain the IP address of the instance from Location.
2. Choose Cluster > Services > Doris > Instances, click the BE instance for which the alarm is generated, and check whether Running Status of the instance is Unknown or Restoring.
3. Return to the Instances page, select the BE instance, and choose More > Restart Instance.
4. After the BE instance is restarted, choose O&M > Alarm > Alarms. In the alarm list, check whether the "Unavailable BE Instances" alarm is cleared.
   - If yes, no further action is required.
   - If no, go to 5.
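As an optional cross-check outside FusionInsight Manager, the list of backends can also be inspected from a MySQL client session with the Doris SHOW BACKENDS statement. The sketch below only filters rows that have already been fetched. It assumes each row is a dictionary with an Alive column holding "true" or "false" and a host column named Host (the host column name may differ between Doris versions); the function name and the sample addresses are hypothetical.

```python
def unavailable_backends(rows, host_key="Host"):
    """Return the hosts of backends whose Alive column is not "true".

    rows: result rows of SHOW BACKENDS as dictionaries (assumed shape).
    host_key: name of the host column; "Host" is an assumption.
    """
    return [
        row[host_key]
        for row in rows
        if str(row.get("Alive", "")).lower() != "true"
    ]


# Made-up sample rows using documentation-reserved IP addresses.
sample_rows = [
    {"Host": "192.0.2.11", "Alive": "true"},
    {"Host": "192.0.2.12", "Alive": "false"},
]
print(unavailable_backends(sample_rows))  # ['192.0.2.12']
```

A host returned here should match the IP address obtained from the alarm's location information.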
Check BE node disks.
5. In the alarm list, check whether the BE instance obtained in 1 reports the "Disk Status of a Specified Data Directory on BE Is Abnormal" alarm.
6. Contact O&M engineers to repair the disk.
7. In the alarm list, check whether the "Unavailable BE Instances" alarm is cleared.
   - If yes, no further action is required.
   - If no, go to 8.
Check the local disk space of BE nodes.
8. In the alarm list, check whether the BE instance obtained in 1 reports the "BE Data Disk Usage Exceeds the Threshold" alarm.
9. Perform any of the following operations to increase the available BE disk space:
   - Check the value of storage_root_path in the ${BIGDATA_HOME}/FusionInsight_Doris_*/*_*_BE/etc/be.conf file and mount more disks to the directory as needed.
   - Delete data from table partitions that are no longer used, based on service requirements.
   - On FusionInsight Manager, choose Cluster > Services > Doris > Instances > Add Instance, and add BE nodes as needed.
   - After the MySQL client is connected to Doris, run the following command to reduce the number of table replicas based on service requirements:
     alter table tblName set ("replication_allocation" = "tag.location.default: xxx");
10. In the alarm list, check whether the "Unavailable BE Instances" alarm is cleared.
   - If yes, no further action is required.
   - If no, go to 11.
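When checking storage_root_path, it can help to list the configured data directories and their current usage on the BE node. The following is a minimal sketch, assuming the value is a semicolon-separated list of directories, each optionally followed by comma-separated attributes such as medium:HDD. The function names and the sample directories are hypothetical; read the real value from the be.conf file on the node.

```python
import shutil


def parse_storage_root_path(conf_text: str) -> list[str]:
    """Return the directories listed in storage_root_path, without attributes."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines that are not key = value
        key, _, value = line.partition("=")
        if key.strip() == "storage_root_path":
            entries = [e.strip() for e in value.split(";") if e.strip()]
            # Drop trailing attributes such as ",medium:HDD" (assumed format).
            return [e.split(",", 1)[0].strip() for e in entries]
    return []


def usage_percent(path: str) -> float:
    """Return the used share of the file system that holds path, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


# Made-up be.conf content; the directories are placeholders.
sample_conf = (
    "# sample be.conf fragment\n"
    "storage_root_path = /srv/BigData/data1/doris_be,medium:HDD;/srv/BigData/data2/doris_be\n"
)
print(parse_storage_root_path(sample_conf))
# ['/srv/BigData/data1/doris_be', '/srv/BigData/data2/doris_be']
```

On a BE node, calling usage_percent on each returned directory shows which data directories are close to full before more disks are mounted or data is deleted.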
Collect fault information.
11. On FusionInsight Manager, choose O&M. In the navigation pane on the left, choose Log > Download.
12. Expand the Service drop-down list, select Doris for the target cluster, and click OK.
13. Click the edit icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
14. Contact O&M engineers and provide the collected logs.
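The log collection window can be computed from the alarm generation time as shown below. This is a small sketch; the function name log_window and the sample timestamp are hypothetical.

```python
from datetime import datetime, timedelta


def log_window(alarm_time: datetime, margin_minutes: int = 10):
    """Return (start, end) covering margin_minutes before and after alarm_time."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Made-up alarm generation time.
start, end = log_window(datetime(2024, 1, 1, 12, 0, 0))
print(start.isoformat(), end.isoformat())
# 2024-01-01T11:50:00 2024-01-01T12:10:00
```

The two values correspond to Start Date and End Date in the log download dialog.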
Alarm Clearance
This alarm is automatically cleared after the fault is rectified.
Related Information
None.