ALM-43204 GC Duration of the Elasticsearch Process Exceeds the Threshold
Alarm Description
The system checks the garbage collection (GC) duration of the Elasticsearch process every 60s. This alarm is generated when the GC duration exceeds the threshold.
If Trigger Count is set to 1, this alarm is cleared when the GC duration of the Elasticsearch process is less than or equal to the threshold. If Trigger Count is greater than 1, this alarm is cleared when the GC duration of the Elasticsearch process is less than or equal to 90% of the threshold.
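For illustration only, the following minimal sketch captures the clearing rule described above; the function name and inputs are hypothetical and do not reflect the product's internal implementation.

```python
# Illustrative sketch of the clearing rule above; names and inputs are
# hypothetical and not the product's actual implementation.
def alarm_cleared(gc_duration_ms: float, threshold_ms: float, trigger_count: int) -> bool:
    """Return True if ALM-43204 would be cleared for the current check."""
    if trigger_count == 1:
        # Trigger Count = 1: clear once the GC duration is at or below the threshold.
        return gc_duration_ms <= threshold_ms
    # Trigger Count > 1: clear only once the GC duration drops to 90% of the threshold.
    return gc_duration_ms <= 0.9 * threshold_ms

print(alarm_cleared(25000, 30000, trigger_count=1))  # True: at or below the threshold
print(alarm_cleared(28000, 30000, trigger_count=3))  # False: 28000 ms > 27000 ms
```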
Alarm Attributes
| Alarm ID | Alarm Severity | Alarm Type | Service Type | Auto Cleared |
| --- | --- | --- | --- | --- |
| 43204 | Major (default threshold: 30000 ms); Critical (default threshold: 60000 ms) | Quality of service | Elasticsearch | Yes |
Alarm Parameters
| Type | Parameter | Description |
| --- | --- | --- |
| Location Information | Source | Specifies the cluster for which the alarm is generated. |
| Location Information | ServiceName | Specifies the service for which the alarm is generated. |
| Location Information | RoleName | Specifies the role for which the alarm is generated. |
| Location Information | HostName | Specifies the host for which the alarm is generated. |
| Additional Information | Trigger Condition | Specifies the threshold for triggering the alarm. |
Impact on the System
If the GC time of the Elasticsearch instance process is too long, the index data read/write performance of Elasticsearch may be affected, and requests may time out.
Possible Causes
The service load of the Elasticsearch instance on the node is high, or the heap memory is not properly configured. As a result, GC occurs frequently.
Handling Procedure
Check the configured heap memory.
1. Log in to FusionInsight Manager, choose O&M > Alarm > Alarms, view the location information of this alarm, and check the IP address of the instance for which the alarm is generated.
2. On the FusionInsight Manager home page, choose Cluster > Name of the desired cluster > Services > Elasticsearch. On the displayed page, click Instance, click the drop-down list in the Chart area, choose Customize > Clear All > Garbage Collection > EsMaster GC Time Stats, and click OK. Check whether the GC duration is greater than the threshold. (The same GC statistics can also be retrieved over the REST API; see the script sketch after 10.)
3. Choose Cluster > Name of the desired cluster > Services > Elasticsearch. On the displayed page, click Configurations.
4. In the upper right corner of the Configurations page, enter GC_OPTS in the search box and click the search icon. The GC_OPTS parameter values of all instances are displayed.
5. Select the instance whose GC_OPTS value needs to be changed, and check whether the differentiated configuration icon is displayed next to the instance value configuration box.
6. If the icon is displayed, click it. In the displayed dialog box, click the delete icon in the right pane and click OK to save the settings.
7. Adjust the values of -Xms and -Xmx in the GC_OPTS parameter by referring to the following note.
Suggestions on configuring the GC parameter of Elasticsearch:
- It is recommended that 50% of the memory be reserved for the Lucene cache and 50% for the Elasticsearch heap. On machines with large memory, you are advised to allocate 30 GB (no more than 31 GB) of heap memory. Confirm that the JVM Compressed Oops function has been enabled. You can run the following command to check:
java -server -Xms28G -Xmx28G -XX:+UseConcMarkSweepGC -XX:+UnlockDiagnosticVMOptions -XX:+PrintCompressedOopsMode -version
If the returned value of Compressed Oops mode is Zero based, the JVM Compressed Oops function is enabled and you can increase the allocated memory. Change 28 GB to 29 GB and check again whether the function is still enabled. Repeat this until you find the maximum memory size at which the Compressed Oops function remains enabled.
If the returned value of Compressed Oops mode is Non-zero based, the JVM Compressed Oops function is disabled and you need to decrease the allocated memory. Change 28 GB to 27 GB and check again whether the function is enabled. Repeat this until you find the maximum memory size at which the Compressed Oops function remains enabled.
- It is recommended that -Xms and -Xmx be set to the same value to prevent dynamic adjustment of heap memory size by JVM from affecting the performance.
- If half of the computer memory is less than the number of instances multiplied by 30 GB, allocate the memory by referring to the following:
Instance memory = (Computer memory x 0.5)/Number of instances on the computer
For example, if a computer has 128 GB of memory and runs three Elasticsearch instances, the heap memory of each instance is 128 GB x 0.5/3 ≈ 21 GB, and you need to confirm that the JVM Compressed Oops function is still enabled at this size. (A short sketch after 10 works through this calculation.)
8. After the modification, click Save in the upper left corner. In the Save Configuration dialog box displayed, click OK.
9. On FusionInsight Manager, choose Cluster > Name of the desired cluster > Services > Elasticsearch. On the displayed page, click Instance, select the instances whose Configuration Status is Expired, and restart the instances.
10. Five minutes later, check whether the alarm is cleared.
    - If yes, no further action is required.
    - If no, go to 11.
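As a complement to checking the EsMaster GC Time Stats chart in 2, the following sketch pulls per-node heap usage and cumulative GC statistics from the Elasticsearch nodes stats REST API. The endpoint address and the use of unauthenticated HTTP are assumptions; FusionInsight clusters typically require authentication and TLS, so adapt the connection details to your environment.

```python
import requests

# Assumed endpoint; replace with the HTTP(S) address, credentials, and
# certificates required by your cluster.
ES_URL = "http://127.0.0.1:9200"

# _nodes/stats/jvm returns heap usage and cumulative GC statistics per node.
resp = requests.get(f"{ES_URL}/_nodes/stats/jvm", timeout=10)
resp.raise_for_status()

for node_id, node in resp.json()["nodes"].items():
    mem = node["jvm"]["mem"]
    print(f"{node['name']}: heap {mem['heap_used_in_bytes'] / 2**30:.1f} GB "
          f"of {mem['heap_max_in_bytes'] / 2**30:.1f} GB used")
    for collector, stats in node["jvm"]["gc"]["collectors"].items():
        count = stats["collection_count"]
        total_ms = stats["collection_time_in_millis"]
        avg_ms = total_ms / count if count else 0.0
        print(f"  {collector} GC: {count} collections, "
              f"{total_ms} ms total, {avg_ms:.1f} ms on average")
```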
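The sizing rule in the note attached to 7 can also be worked through in a few lines. The helper below only illustrates the arithmetic (half of the node's memory split evenly across instances, capped below the roughly 31 GB Compressed Oops limit); it is not a tuning tool, and the function name is hypothetical.

```python
# Illustrative only: the heap sizing rule from the note above.
COMPRESSED_OOPS_LIMIT_GB = 31  # stay below this so Compressed Oops remains enabled

def heap_per_instance_gb(node_memory_gb: float, instance_count: int) -> int:
    """Half of the node's memory, split evenly, capped below the Compressed Oops limit."""
    share = (node_memory_gb * 0.5) / instance_count
    return int(min(share, COMPRESSED_OOPS_LIMIT_GB - 1))

# Example from the note: a 128 GB node running three Elasticsearch instances.
heap = heap_per_instance_gb(128, 3)
print(f"-Xms{heap}G -Xmx{heap}G")  # -Xms21G -Xmx21G
```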
Collect fault information.
11. On FusionInsight Manager, choose O&M > Log > Download.
12. Select Elasticsearch in the required cluster for Service.
13. Click the edit icon in the upper right corner. In the displayed dialog box, set Start Date and End Date to 10 minutes before and after the alarm generation time, respectively, and click OK. Then, click Download.
14. Contact O&M engineers and send the collected logs.
Alarm Clearance
After the fault is rectified, the system automatically clears this alarm.
Related Information
None.