ALM-12014 Device Partition Lost
Alarm Description
This alarm is generated when the system detects that a partition to which service directories are mounted is lost (because the device is removed or offline, or the partition is deleted). The system checks the partition status every 60 seconds.
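As a rough illustration of the condition this alarm describes, the following minimal sketch checks whether the partitions that service directories are expected to use still appear in `/proc/partitions`-style text. It is not how FusionInsight Manager implements its check. The function names, the sample device names, and the `/data1` and `/data2` mount directories are assumptions made for the example.

```python
def list_partitions(proc_partitions_text):
    """Return the device names listed in /proc/partitions-style text."""
    names = set()
    for line in proc_partitions_text.splitlines():
        fields = line.split()
        # Data rows have four columns: major, minor, #blocks, name.
        # The header row and the blank line after it are skipped.
        if len(fields) == 4 and fields[0].isdigit():
            names.add(fields[3])
    return names


def lost_partitions(expected, proc_partitions_text):
    """Return the (partition, mount_dir) pairs whose partition is no longer listed."""
    present = list_partitions(proc_partitions_text)
    return [(part, mnt) for part, mnt in expected if part not in present]


# Sample text in the layout of /proc/partitions (values are made up).
SAMPLE = """major minor  #blocks  name

   8        0  104857600 sda
   8        1  104856576 sda1
   8       16  209715200 sdb
"""

# Hypothetical mapping of partitions to service mount directories.
EXPECTED = [("sda1", "/data1"), ("sdb1", "/data2")]

print(lost_partitions(EXPECTED, SAMPLE))
```

On a real host, the text could be read from `/proc/partitions`, and a scheduler could repeat the check at a fixed interval such as the 60 seconds mentioned above.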
Alarm Attributes
Alarm ID | Alarm Severity | Alarm Type | Service Type | Auto Cleared |
---|---|---|---|---|
12014 | Major | Physical resource | FusionInsight Manager | Yes (Versions earlier than MRS 3.3.0 do not support automatic clearance.) |
Alarm Parameters
Type | Parameter | Description |
---|---|---|
Location Information | Source | Specifies the cluster or system for which the alarm was generated. |
Location Information | ServiceName | Specifies the service for which the alarm was generated. |
Location Information | RoleName | Specifies the role for which the alarm was generated. |
Location Information | HostName | Specifies the host for which the alarm was generated. |
Location Information | MountDirectoryName | Specifies the directory for which the alarm was generated. |
Location Information | PartitionName | Specifies the device partition for which the alarm was generated. |
Additional Information | Details | Specifies alarm details. |
Additional Information | Disk ESN | Specifies the serial number of the disk in the device partition for which the alarm was generated. |
Impact on the System
- Data loss: Data stored in the lost device partition is lost.
- System breakdown: If the system disk is lost, the system deployed on the node cannot run properly. In some cases, the system may break down and fail to start.
- Service failure: Read and write jobs on the lost device partition fail or run slowly.
- Service interruption: Customers may need time to restore data and systems, during which services cannot be provided.
- Security risk: Important data may be stolen or disclosed, which severely affects customer services.
Possible Causes
- The disk is removed.
- The disk is offline, or a bad sector exists on the disk.
Handling Procedure
1. Log in to FusionInsight Manager, choose O&M > Alarm > Alarms, and click in the row that contains the alarm.
2. Obtain HostName, PartitionName, and DirName (the mount directory, listed as MountDirectoryName under Alarm Parameters) from the Location area.
3. Check whether the disk of PartitionName on HostName is inserted into the correct server slot.
4. Contact hardware engineers to remove the faulty disk.
5. Log in to the host for which the alarm is generated as user root and check whether the /etc/fstab file contains a line with the mount directory name.
6. Run the vi /etc/fstab command to edit the file and delete the line containing the mount directory name.
7. Contact hardware engineers to insert a new disk. For details, see the hardware product document of the relevant model. If the faulty disk is in a RAID group, configure the RAID group. For details, see the configuration methods of the relevant RAID controller card.
8. Wait 20 to 30 minutes (the waiting time depends on the disk size), then run the mount command to check whether the disk has been mounted to the specified directory.
9. Wait 2 minutes and check whether the alarm is automatically cleared.
   - If yes, no further action is required.
   - If no, go to step 10.
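The /etc/fstab check and the mount check in the steps above can be sketched as follows. This is a minimal example, assuming standard Linux `fstab` entries (mount point in the second field) and `mount` output lines of the form `<device> on <dir> type <fs> (<options>)`. The function names, sample file contents, and the `/data2` directory are hypothetical. Unlike a plain text search for the directory name, the fstab helper matches only the mount point field, which avoids flagging unrelated lines.

```python
def fstab_entries_for(mount_dir, fstab_text):
    """Return the non-comment fstab lines whose mount point is mount_dir."""
    target = mount_dir.rstrip("/")
    matches = []
    for line in fstab_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        fields = stripped.split()
        # The second field of an fstab entry is the mount point.
        if len(fields) >= 2 and fields[1].rstrip("/") == target:
            matches.append(line)
    return matches


def is_mounted(mount_dir, mount_output):
    """Return True if mount_dir appears as a mount point in `mount` output."""
    target = mount_dir.rstrip("/")
    for line in mount_output.splitlines():
        parts = line.split(" on ", 1)
        if len(parts) != 2:
            continue
        mounted_dir = parts[1].split(" type ", 1)[0]
        if mounted_dir.rstrip("/") == target:
            return True
    return False


# Hypothetical file contents and command output for illustration.
FSTAB = """# /etc/fstab
UUID=1111-aaaa / ext4 defaults 1 1
/dev/sdb1 /data2 ext4 defaults,noatime 0 0
"""

MOUNT_OUT = """/dev/sda1 on / type ext4 (rw,relatime)
/dev/sdb1 on /data2 type ext4 (rw,noatime)
"""

print(fstab_entries_for("/data2", FSTAB))  # lines to review before deleting
print(is_mounted("/data2", MOUNT_OUT))     # whether the new disk is mounted
```

On the affected host, `fstab_text` would come from reading /etc/fstab and `mount_output` from running the mount command. The sketch only reports matches and does not modify any file.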
Collect fault information.
10. On FusionInsight Manager, choose O&M. In the navigation pane on the left, choose Log > Download.
11. Expand the Service drop-down list, select OmmServer for the target cluster, and click OK.
12. Click in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
13. Contact O&M engineers and provide the collected logs.
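The log collection window described above is simple arithmetic on the alarm generation time. A minimal sketch follows, in which the function name and the sample timestamp are assumptions made for the example:

```python
from datetime import datetime, timedelta


def log_window(alarm_time, margin_minutes=10):
    """Return (start, end) covering margin_minutes before and after alarm_time."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Hypothetical alarm generation time taken from the alarm list.
generated = datetime(2024, 5, 1, 14, 30, 0)
start, end = log_window(generated)

print("Start Date:", start.isoformat(sep=" "))
print("End Date:  ", end.isoformat(sep=" "))
```

For the sample time 14:30:00, this gives a Start Date of 14:20:00 and an End Date of 14:40:00 on the same day.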
Alarm Clearance
MRS 3.3.0 and later patch versions: After the fault is rectified, the system automatically clears the alarm.
Versions earlier than MRS 3.3.0: After the fault is rectified, the system does not automatically clear the alarm. You need to clear the alarm manually.
Related Information
None.