ALM-19026 Damaged WAL Files in HBase
Alarm Description
The system checks the hdfs://hacluster/hbase/corrupt directory on the HDFS of each HBase service every 120 seconds. This alarm is generated when there are WAL files in the /hbase/corrupt directory.
This alarm is cleared when the /hbase/corrupt directory does not exist or does not contain WAL files.
hdfs://hacluster indicates the name of the file system used by HBase, and /hbase indicates the root directory of HBase in the file system. To view the values used in your cluster, log in to FusionInsight Manager, choose Cluster > Services > HBase, click Configuration, and search for fs.defaultFS and hbase.data.rootdir.
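If the two values differ from the defaults shown here, the directory that the alarm checks changes accordingly. The following is a minimal sketch of how the path is composed, assuming hbase.data.rootdir holds a plain path such as /hbase. The helper name corrupt_wal_dir is hypothetical and is not part of any HBase or FusionInsight tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join fs.defaultFS and hbase.data.rootdir into the
# corrupt-WAL directory URI described above.
corrupt_wal_dir() {
  local fs_default="${1%/}"   # fs.defaultFS with one trailing slash removed
  local root_dir="/${2#/}"    # hbase.data.rootdir, forced to start with a slash
  root_dir="${root_dir%/}"    # drop one trailing slash, if present
  printf '%s%s/corrupt\n' "$fs_default" "$root_dir"
}

# Example with the values used in this document:
corrupt_wal_dir "hdfs://hacluster" "/hbase"
```

With the values from this document, the function prints hdfs://hacluster/hbase/corrupt.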
Alarm Attributes
Alarm ID | Alarm Severity | Alarm Type | Service Type | Auto Cleared |
---|---|---|---|---|
19026 | Major | Error handling | HBase | Yes |
Alarm Parameters
Type | Parameter | Description |
---|---|---|
Location Information | Source | Specifies the cluster for which the alarm is generated. |
Location Information | ServiceName | Specifies the service for which the alarm is generated. |
Location Information | RoleName | Specifies the role for which the alarm is generated. |
Location Information | HostName | Specifies the host for which the alarm is generated. |
Impact on the System
If data in a damaged WAL file has not yet been flushed to disk, that data is lost. As a result, queries from the service may return inconsistent data.
Possible Causes
The WAL files are damaged.
Handling Procedure
- Log in to FusionInsight Manager and choose O&M. In the navigation pane on the left, choose Alarm > Alarms. On the page that is displayed, locate the row containing the alarm whose Alarm ID is 19026, and view the service in Location.
- Log in to the node where the HDFS clients are installed as the client installation user and run the following commands:
cd Client installation directory
source bigdata_env
kinit Component service user (Skip this step if Kerberos authentication is disabled for the cluster, that is, the cluster is in normal mode.)
- Run the following command to list the damaged WAL files, and then go to Collect fault information:
hdfs dfs -ls hdfs://hacluster/hbase/corrupt/*%2C*
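To see at a glance how many damaged WAL files the listing contains, the output can be piped through a small counter. This is only a sketch: count_wal_files is a hypothetical helper, and it relies on the usual hdfs dfs -ls layout in which each regular file is printed on a line whose permission string starts with "-".

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `hdfs dfs -ls` output on stdin and print the number
# of regular-file entries. Header lines ("Found N items") and directory entries
# (lines starting with "d") are not counted.
count_wal_files() {
  # grep -c prints 0 but exits non-zero when nothing matches; `|| true`
  # keeps the function usable in scripts that run with `set -e`.
  grep -c '^-' || true
}

# Usage on a client node (not executed here). Quoting the pattern is a
# precaution so that the local shell does not try to expand it:
#   hdfs dfs -ls 'hdfs://hacluster/hbase/corrupt/*%2C*' | count_wal_files
```

A result of 0 corresponds to the condition under which the alarm is cleared; any other value means that damaged WAL files are still present.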
Collect fault information.
- On FusionInsight Manager, choose O&M. In the navigation pane on the left, choose Log > Download.
- Expand the Service drop-down list, and select HBase for the target cluster.
- Click the edit icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
- Contact O&M engineers and provide the collected logs.
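The Start Date and End Date values for the log collection step above can be derived from the alarm generation time. The sketch below assumes GNU date (the -d option) is available on the machine where it is run; log_window is a hypothetical helper name, and the timestamp in the usage comment is an illustrative value only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the start and end of the log collection window,
# 10 minutes (600 seconds) before and after the alarm generation time.
# Assumes GNU date; the input is interpreted in the local time zone.
log_window() {
  local alarm_epoch
  alarm_epoch=$(date -d "$1" +%s) || return 1
  date -d "@$((alarm_epoch - 600))" '+%Y-%m-%d %H:%M:%S'   # Start Date
  date -d "@$((alarm_epoch + 600))" '+%Y-%m-%d %H:%M:%S'   # End Date
}

# Usage (illustrative timestamp):
#   log_window "2024-05-01 12:00:00"
```

The first output line is the value for Start Date and the second is the value for End Date.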
Alarm Clearance
This alarm is automatically cleared after the fault is rectified.
Related Information
None.