ALM-13007 Available ZooKeeper Client Connections Are Insufficient
Description
Every 60 seconds, the system checks the number of active connections between the ZooKeeper client and the ZooKeeper server. This alarm is generated when the number of connections exceeds the threshold.
Attribute
Alarm ID | Alarm Severity | Automatically Cleared |
---|---|---|
13007 | Minor | Yes |
Parameters
Name | Meaning |
---|---|
Source | Specifies the cluster for which the alarm is generated. |
ServiceName | Specifies the service name for which the alarm is generated. |
RoleName | Specifies the role name for which the alarm is generated. |
HostName | Specifies the host name for which the alarm is generated. |
ClientIP | Specifies the client IP address. |
ServerIP | Specifies the server IP address. |
Trigger Condition | Specifies the cause of the alarm. |
Impact on the System
A large number of processes are connected to ZooKeeper, and the available ZooKeeper connections are exhausted. As a result, upstream components that depend on ZooKeeper (such as Yarn, Flink, and Spark) become abnormal.
Possible Causes
- A large number of client processes are connected to ZooKeeper.
- The thresholds are not appropriate.
Procedure
Check whether there are a large number of client processes connected to ZooKeeper.
- On FusionInsight Manager, choose O&M > Alarm > Alarms. On the displayed page, click the drop-down button of Available ZooKeeper Client Connections Are Insufficient. In Location Information, confirm the IP address of the node for which the alarm is generated.
- Open the ZooKeeper service page and click Resource. In Number of Connections (by Client IP Address), check whether the client IP address reported in the alarm has a large number of connections.
- Check whether the client process is leaking connections.
- In Number of Connections (by Client IP Address), click the icon that opens the Thresholds page, and then click Modify under Operation. Increase the threshold, using the value of maxClientCnxns as a reference. To view maxClientCnxns, choose Cluster > Name of the desired cluster > Services > ZooKeeper > Configurations > All Configurations > quorumpeer.
- Check whether the alarm is cleared.
- If it is, no further action is required.
- If it is not, go to Collect fault information.
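The per-client connection check above can also be approximated from the command line. The following is a minimal sketch, assuming the ZooKeeper `cons` four-letter command is enabled on the server (in ZooKeeper 3.5 and later it must be listed in `4lw.commands.whitelist`) and that the client port is reachable without extra authentication. The function names, the default port 2181, and the limit passed to `clients_over_limit` are illustrative assumptions, not part of FusionInsight Manager; use the port and the maxClientCnxns value configured in your cluster.

```python
import re
import socket
from collections import Counter

# Matches the client address at the start of each connection line in `cons`
# output, for example " /10.0.0.11:40001[1](queued=0,recved=5,sent=5)".
_CONS_LINE = re.compile(r"^\s*/(?P<ip>.+):(?P<port>\d+)\[")


def count_connections_by_ip(cons_output: str) -> Counter:
    """Count connections per client IP in the text returned by `cons`."""
    counts = Counter()
    for line in cons_output.splitlines():
        match = _CONS_LINE.match(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


def clients_over_limit(counts: Counter, limit: int) -> dict:
    """Return the client IPs whose connection count has reached the limit."""
    return {ip: n for ip, n in counts.items() if n >= limit}


def fetch_cons(host: str, port: int = 2181, timeout: float = 5.0) -> str:
    """Send `cons` to one ZooKeeper server and return its reply as text.

    The port is an assumption; use the client port of your deployment.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"cons")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the socket after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

For example, `clients_over_limit(count_connections_by_ip(fetch_cons("zk-host")), limit=60)` would list the clients at or above 60 connections on that server, where `zk-host` and `60` are placeholders. A client whose count keeps growing between runs is a candidate for a connection leak.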
Collect fault information.
- On the FusionInsight Manager portal, choose O&M > Log > Download.
- For Service, select ZooKeeper in the required cluster.
- Click the icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
- Contact the O&M personnel and send the collected logs.
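The log collection window in the steps above is simple date arithmetic. As a small sketch, the hypothetical helper below (not part of FusionInsight Manager) computes the Start Date and End Date from the alarm generation time, with the 10-minute margin taken from the procedure.

```python
from datetime import datetime, timedelta


def log_collection_window(alarm_time: datetime, margin_minutes: int = 10):
    """Return (start, end) covering margin_minutes before and after the alarm."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Illustrative alarm time; replace it with the time shown for the alarm.
start, end = log_collection_window(datetime(2024, 1, 1, 0, 5))
print(start.isoformat(sep=" "), "to", end.isoformat(sep=" "))
```

Using `datetime` arithmetic rather than editing the minutes by hand keeps the window correct when it crosses an hour or day boundary, as in the example.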
Alarm Clearing
After the fault is rectified, the system automatically clears this alarm.
Related Information
None