ALM-12085 Service Audit Log Dump Failure
Description
The system dumps service audit logs at 03:00 every day and stores them on the OMS node. This alarm is generated when the dump fails. This alarm is cleared when the next dump succeeds.
Attribute
Alarm ID | Alarm Severity | Auto Clear |
---|---|---|
12085 | Minor | Yes |
Parameters
Name | Meaning |
---|---|
Source | Specifies the cluster or system for which the alarm is generated. |
ServiceName | Specifies the service for which the alarm is generated. |
RoleName | Specifies the role for which the alarm is generated. |
HostName | Specifies the host for which the alarm is generated. |
Impact on the System
If the audit logs of a component fail to be dumped, they cannot be retrieved once they have aged out locally. This hinders service analysis and troubleshooting for that component.
Possible Causes
- The service audit logs are oversized.
- The OMS backup storage space is insufficient.
- The storage space of a host where the service is located is insufficient.
Procedure
Check whether the service audit logs are oversized.
- In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the additional information and the IP address of the host for which the alarm is generated.
- Log in to the host where the alarm is generated as user root.
- Run the vi ${BIGDATA_LOG_HOME}/controller/scriptlog/getLogs.log command to check whether the keyword "LOG SIZE is more than 5000MB" can be found.
- Check whether the oversized service audit logs are caused by exceptions.
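The keyword check above can also be done without opening the log in an editor. The following is a minimal sketch, not part of the product tooling: log_has_keyword is a hypothetical helper name, and it only assumes that grep is available on the host.

```shell
# Hypothetical helper: succeed if a keyword appears in a log file.
# Usage: log_has_keyword <file> <keyword>
log_has_keyword() {
    # -q: print nothing, only set the exit status
    # -F: treat the keyword as a fixed string, not a regular expression
    grep -qF -- "$2" "$1" 2>/dev/null
}

# Example call on the host (log path and keyword taken from the step above):
# log_has_keyword "${BIGDATA_LOG_HOME}/controller/scriptlog/getLogs.log" \
#     "LOG SIZE is more than 5000MB" && echo "audit logs are oversized"
```

The same helper can be reused for the other keywords in this procedure by changing the file and keyword arguments.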
Check whether the OMS backup storage space is insufficient.
- Run the vi ${BIGDATA_LOG_HOME}/controller/scriptlog/getLogs.log command to check whether the keyword "Collect log failed, too many logs on" can be found.
- Log in as user root to the host whose IP address was obtained in the previous step.
- Run the vi ${BIGDATA_LOG_HOME}/nodeagent/scriptlog/collectLog.log command to check whether the keyword "log size exceeds" can be found.
- Check whether the alarm additional information contains the keyword "no enough space".
- Perform the following operations to expand the disk capacity (only for MRS 3.1.2 and earlier versions) or reduce the maximum number of audit log backups:
- Expand the capacity of the OMS node.
- Run the following command to edit the file and decrease the value of MAX_NUM_BK_AUDITLOG.
vi ${CONTROLLER_HOME}/etc/om/componentsauditlog.properties
- In the next execution period, 03:00, check whether the alarm is cleared.
- If it is, no further action is required.
- If it is not, check whether the space of the host where the service is located is insufficient.
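Reducing MAX_NUM_BK_AUDITLOG can be scripted instead of edited by hand in vi. The sketch below is illustrative only: the function names are hypothetical, and it assumes the properties file stores the setting on a single line in the form MAX_NUM_BK_AUDITLOG=<number>. Check the actual file format before using anything like this.

```shell
# Hypothetical helpers. Assumption: the file holds one line of the form
# MAX_NUM_BK_AUDITLOG=<number> (optional spaces around "=" are tolerated).

# Print the current value of MAX_NUM_BK_AUDITLOG from file $1.
get_max_audit_backups() {
    sed -n 's/^MAX_NUM_BK_AUDITLOG[[:space:]]*=[[:space:]]*//p' "$1"
}

# Set MAX_NUM_BK_AUDITLOG to $2 in file $1.
# A copy of the original file is kept next to it with a .bak suffix.
set_max_audit_backups() {
    sed -i.bak "s/^\(MAX_NUM_BK_AUDITLOG[[:space:]]*=[[:space:]]*\).*/\1$2/" "$1"
}

# Example call (path taken from the step above; the value 30 is arbitrary):
# set_max_audit_backups "${CONTROLLER_HOME}/etc/om/componentsauditlog.properties" 30
```

Reading the value first with get_max_audit_backups makes it easy to confirm that the new value is actually lower than the old one.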
Check whether the space of the host where the service is located is insufficient.
- Run the vi ${BIGDATA_LOG_HOME}/controller/scriptlog/getLogs.log command to check whether the keyword "Collect log failed, no enough space on hostIp" can be found.
- Log in as user root to the host with the obtained IP address, and run the df "$BIGDATA_HOME/tmp" -lP | tail -1 | awk '{print ($4/1024)}' command to obtain the remaining space, in MB, of the host log directory. Check whether the value is less than 1000 MB.
- Expand the capacity of the node.
- In the next execution period, 03:00, check whether the alarm is cleared.
- If it is, no further action is required.
- If it is not, go to Collect fault information.
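The free-space check above combines two things: converting the "Available" column of df from 1 KB blocks to MB, and comparing the result with the 1000 MB limit. A minimal sketch of both follows; the function names are hypothetical, and it assumes df reports 1024-byte blocks, as the division by 1024 in the original command implies.

```shell
# Hypothetical helpers for the free-space check.

# Read `df -P` style output on stdin and print the available space of the
# last listed filesystem in whole MB (column 4 is "Available", in 1 KB blocks).
avail_mb_from_df() {
    tail -1 | awk '{print int($4/1024)}'
}

# Succeed if the free space $1 (MB) is below the threshold $2 (MB).
# The threshold defaults to 1000 MB, the limit named in the procedure.
is_space_low() {
    [ "$1" -lt "${2:-1000}" ]
}

# Example call on the host (directory taken from the step above):
# free=$(df "$BIGDATA_HOME/tmp" -lP | avail_mb_from_df)
# is_space_low "$free" && echo "only ${free} MB left, expand the node"
```

Truncating to whole MB with int() keeps the comparison in plain shell integer arithmetic, at the cost of up to 1 MB of precision.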
Collect fault information.
- On FusionInsight Manager, choose O&M > Log > Download.
- Select Controller for Service and click OK.
- Click the icon in the upper right corner. In the displayed dialog box, set Start Date and End Date to 10 minutes before and 10 minutes after the alarm generation time, respectively, and click OK. Then, click Download.
- Contact the O&M personnel and send the collected log information.
Alarm Clearing
This alarm will be automatically cleared after the fault is rectified.
Related Information
None