ALM-12099 Core Dump for Cluster Processes
Alarm Description
Core files generated when applications crash are centrally managed on a cluster. This alarm is generated when a new core file is detected.
Alarm Attributes
| Alarm ID | Alarm Severity | Alarm Type | Service Type | Auto Cleared |
|---|---|---|---|---|
| 12099 | Minor | Quality of service | FusionInsight Manager | No |
Alarm Parameters
| Type | Parameter | Description |
|---|---|---|
| Location Information | Source | Specifies the cluster or system for which the alarm was generated. |
| | ServiceName | Specifies the service for which the alarm was generated. |
| | RoleName | Specifies the role for which the alarm was generated. |
| | HostName | Specifies the host for which the alarm was generated. |
| Viewing the timestamp | Timestamp | |
| Additional Information | Details | Specifies alarm details. |
Impact on the System
If a key process crashes, the cluster may be unavailable for a short period of time.
Possible Causes
Processes crash.
Handling Procedure
- The following operations for parsing and viewing core file stack information may involve sensitive user data. Developers or O&M engineers can perform these operations only after being authorized by users.
- By default, the system retains the core files for which this alarm is generated for 72 hours. The system automatically deletes the files when this period expires or if the files are too large. If this alarm is reported, contact O&M engineers as soon as possible.
- In the alarm list on FusionInsight Manager, locate the row that contains the alarm. In the alarm details, view the address of the host for which the alarm is generated. In the additional information, view the DumpedFilePath attribute, which specifies the path where the core files are stored.
- Log in to the host for which the alarm is generated as user omm and run the gdb --version command to check whether the gdb tool is installed on the host.
- Use the gdb tool to view the detailed stack information about the core files.
  - Go to the DumpedFilePath directory and find the core files.
  - Run the following commands to obtain the symbol table of the core files:
    source $BIGDATA_HOME/mppdb/.mppdbgs_profile
    cd ${BIGDATA_HOME}/FusionInsight_MPPDB_XXX/install/FusionInsight-MPPDB-XXX/package/MPPDB_ALL_PACKAGE
    tar -xzvf GaussDB-Kernel-V300R002C00-Operating system-64bit-symbol.tar.gz
    cd symbols/bin/
    In the package name, Operating system is a placeholder for the operating system of the host.
    Find the symbol table file whose name is the same as the process name in the alarm. For example, the symbol table of cm_agent is cm_agent.symbol.
    Copy the symbol table to the ${GAUSSHOME}/bin directory.
  - Run the gdb --batch -n -ex thread -ex bt Core file name command to view the detailed stack information about the core file. Core file name is the name of a core file found in the DumpedFilePath directory.
- Contact O&M engineers and provide the collected logs.
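The gdb-related steps of this procedure can be sketched as a small helper script. This is a minimal sketch, not part of the product: the function names (symbol_file_for, core_age_hours, print_core_stack) are hypothetical, GNU coreutils (stat -c) is assumed to be available on the host, and the 72-hour value is the default retention period mentioned in the procedure. The gdb command is the one given in the procedure.

```shell
#!/usr/bin/env bash
# Sketch only. Helper names are hypothetical; GNU coreutils (stat -c) is assumed.

RETENTION_HOURS=72   # default retention period for core files, per the procedure

# Print the symbol table file name for a process, e.g. cm_agent -> cm_agent.symbol.
symbol_file_for() {
    printf '%s.symbol\n' "$1"
}

# Print the age of a file in whole hours. The optional second argument
# overrides "now" (seconds since the epoch).
core_age_hours() {
    local file="$1"
    local now="${2:-$(date +%s)}"
    local mtime
    mtime=$(stat -c %Y "$file") || return 1
    echo $(( (now - mtime) / 3600 ))
}

# Print the stack of one core file using the gdb command from the procedure.
print_core_stack() {
    local core_file="$1"
    if ! gdb --version >/dev/null 2>&1; then
        echo "gdb is not installed on this host" >&2
        return 1
    fi
    gdb --batch -n -ex thread -ex bt "$core_file"
}

# Run only when invoked with a core file path as the first argument.
if [ "$#" -gt 0 ]; then
    age=$(core_age_hours "$1") || exit 1
    echo "Core file age: ${age} h (cleared by default after ${RETENTION_HOURS} h)"
    print_core_stack "$1"
fi
```

If the script is saved under a name of your choice and run as user omm with the path of a core file from the DumpedFilePath directory as its only argument, it reports how old the file is relative to the retention period and then prints the stack. Redirecting the output to a file gives a log that can be passed to O&M engineers.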
Alarm Clearance
The system does not automatically clear this alarm. After the fault is rectified, clear the alarm manually.
Related Information
None.