ALM-12052 TCP Temporary Port Usage Exceeds the Threshold
Alarm Description
The system checks the TCP temporary port usage every 30 seconds and compares it with the threshold. This alarm is generated when the usage exceeds the threshold for a configured number of consecutive checks (5 by default).
To change the threshold, choose O&M > Alarm > Thresholds > Name of the desired cluster > Host > Network Status > TCP Ephemeral Port Usage.
When the Trigger Count is 1, this alarm is cleared when the TCP temporary port usage is less than or equal to the threshold. When the Trigger Count is greater than 1, this alarm is cleared when the TCP temporary port usage is less than or equal to 90% of the threshold.
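The trigger and clear rules above can be sketched as a small simulation. This is only one reading of the description, not FusionInsight Manager's actual implementation; the function name `alarm_state` and its argument order are hypothetical. It takes the threshold, the Trigger Count, and a series of usage samples (one per 30-second check), and prints the resulting alarm state.

```shell
# Hypothetical helper: alarm_state THRESHOLD TRIGGER_COUNT SAMPLE...
# Prints "raised" or "cleared" after processing all usage samples (percent).
alarm_state() {
    threshold="$1"
    trigger_count="$2"
    shift 2
    printf '%s\n' "$@" | awk -v t="$threshold" -v n="$trigger_count" '
        BEGIN {
            # Clear level: the threshold itself when Trigger Count is 1,
            # otherwise 90% of the threshold.
            clear = (n > 1) ? t * 0.9 : t
            raised = 0
            streak = 0
        }
        {
            # Count consecutive checks above the threshold.
            if ($1 > t) streak++; else streak = 0
            if (!raised && streak >= n) raised = 1
            else if (raised && $1 <= clear) raised = 0
        }
        END { print (raised ? "raised" : "cleared") }
    '
}

alarm_state 80 5 85 85 85 85 85      # five consecutive checks above 80%
alarm_state 80 5 85 85 85 85 85 75   # 75% is still above the 72% clear level
```

With a threshold of 80% and a Trigger Count of 5, the alarm in this sketch stays raised until a sample falls to 72% or lower.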
Alarm Attributes
Alarm ID | Alarm Severity | Alarm Type | Service Type | Auto Cleared |
---|---|---|---|---|
12052 | Critical (default threshold: 95%); Major (default threshold: 80%) | Environment | FusionInsight Manager | Yes |
Alarm Parameters
Type | Parameter | Description |
---|---|---|
Location Information | Source | Specifies the cluster or system for which the alarm is generated. |
Location Information | ServiceName | Specifies the service for which the alarm is generated. |
Location Information | RoleName | Specifies the role for which the alarm is generated. |
Location Information | HostName | Specifies the host for which the alarm is generated. |
Additional Information | Trigger Condition | Specifies the threshold for triggering the alarm. |
Impact on the System
Services on the host cannot establish external connections, and therefore they are interrupted.
Possible Causes
- The number of available temporary ports cannot meet the current service requirements.
- The system is abnormal.
Handling Procedure
Expand the temporary port number range.
1. On FusionInsight Manager, click in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host for which the alarm is generated.
2. Log in to the host for which the alarm is generated as user omm.
3. Run the cat /proc/sys/net/ipv4/ip_local_port_range |cut -f 1 command to obtain the start port, and run the cat /proc/sys/net/ipv4/ip_local_port_range |cut -f 2 command to obtain the end port. The total number of temporary ports is the value of the end port minus the value of the start port. If the total number of temporary ports is smaller than 28,232, the random port range of the OS is too narrow. Contact the system administrator to increase the port range.
4. Run the following command to count the used temporary ports:
ss -ant 2>/dev/null | grep -v LISTEN | awk 'NR > 2 {print $4}'| awk -F':' '{print $NF}' | awk '$1 >"Value of the start port" {print $1}' | sort -u | wc -l
5. Calculate the temporary port usage with the following formula: Usage of the temporary ports = (Number of used temporary ports/Total number of temporary ports) x 100%. Check whether the temporary port usage exceeds the threshold.
6. Wait for 5 minutes, and check whether the alarm is cleared.
   - If yes, no further action is required.
   - If no, go to 7.
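The port range lookup, the used-port count, and the usage formula in the steps above can be combined into one script. The following is a minimal sketch, not the check that FusionInsight Manager itself runs; the function names are hypothetical. It follows the text's formula for the total (end port minus start port) and compares ports numerically rather than as quoted strings.

```shell
# Total number of temporary ports, following the text: end port minus start port.
total_ports() {
    echo $(( $2 - $1 ))
}

# Usage in percent = used / total x 100, printed with two decimals.
usage_percent() {
    awk -v used="$1" -v total="$2" 'BEGIN { printf "%.2f\n", used / total * 100 }'
}

# Reads `ss -ant` output on stdin and counts distinct local ports above the
# start port, ignoring the header line and listening sockets.
count_used_ports() {
    grep -v -e LISTEN -e '^State' \
        | awk '{print $4}' | awk -F':' '{print $NF}' \
        | awk -v start="$1" '$1 + 0 > start + 0 {print $1}' \
        | sort -u | wc -l | tr -d ' '
}

# Ties the pieces together on a live Linux host (assumes ss is installed).
report_port_usage() {
    read -r start end < /proc/sys/net/ipv4/ip_local_port_range
    used=$(ss -ant 2>/dev/null | count_used_ports "$start")
    total=$(total_ports "$start" "$end")
    echo "start=$start end=$end used=$used total=$total usage=$(usage_percent "$used" "$total")%"
}
```

Running `report_port_usage` on the affected host prints a single summary line whose usage value can be compared with the alarm threshold.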
Check whether the system environment is abnormal.
7. Run the following command to save the current TCP connections to a temporary file, and then view the frequently used ports in the port_result.txt file:
netstat -tnp|sort > $BIGDATA_HOME/tmp/port_result.txt
Example output of netstat -tnp|sort:
Active Internet connections (w/o servers)
Proto Recv Send LocalAddress ForeignAddress State PID/ProgramName
tcp 0 0 10-120-85-154:45433 10-120-85-154:9866 CLOSE_WAIT 94237/java
tcp 0 0 10-120-85-154:45434 10-120-85-154:9866 CLOSE_WAIT 94237/java
tcp 0 0 10-120-85-154:45435 10-120-85-154:9866 CLOSE_WAIT 94237/java
...
8. Run the following command to view the processes that occupy a large number of ports:
ps -ef |grep PID
   - PID is the process ID obtained in 7.
   - Run the following command to collect information about all processes and check the processes that occupy a large number of ports:
9. After obtaining the administrator's approval, clear the processes that occupy a large number of ports. Wait for 5 minutes, and check whether the alarm is cleared.
   - If yes, no further action is required.
   - If no, go to 10.
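To find the process IDs that hold the most connections in port_result.txt without reading the file by eye, the entries can be grouped by their PID/ProgramName column. The following is a minimal sketch that assumes the seven-column `netstat -tnp` layout shown in the example output; the function name `top_port_holders` is hypothetical.

```shell
# Hypothetical helper: reads `netstat -tnp` output on stdin and prints
# "<connection count> <PID/ProgramName>", highest count first.
# Optional argument: number of rows to print (default 10).
top_port_holders() {
    awk '$1 ~ /^tcp/ {print $7}' \
        | sort | uniq -c | sort -rn \
        | awk '{print $1, $2}' \
        | head -n "${1:-10}"
}

# Example on the host (path taken from the step that creates the file):
#   top_port_holders 5 < "$BIGDATA_HOME/tmp/port_result.txt"
```

The PID part of each printed entry is the value to substitute into the ps -ef |grep PID command.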
Collect fault information.
10. On the FusionInsight Manager home page of the active cluster, choose O&M > Log > Download.
11. Select OMS for Service and click OK.
12. Set Host to the node for which the alarm is generated and the active OMS node.
13. Click the edit button in the upper right corner, and set Start Date and End Date for log collection to 30 minutes before and 30 minutes after the alarm generation time, respectively. Then, click Download.
14. Contact the O&M engineers and send them the collected logs and the port_result.txt and ps_result.txt files. Then, delete these two temporary files from the environment.
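The log collection window in the steps above is the alarm generation time minus and plus 30 minutes. The following is a minimal sketch for computing the two values to enter; it assumes GNU date and treats the time as UTC for simplicity, so adjust it to the time zone that Manager displays. The function name `log_window` is hypothetical.

```shell
# Hypothetical helper: log_window "YYYY-MM-DD HH:MM:SS"
# Prints the Start Date and End Date 30 minutes (1800 seconds) before and
# after the given alarm generation time. Assumes GNU date; times are UTC.
log_window() {
    alarm_epoch=$(date -u -d "$1" +%s) || return 1
    start=$(date -u -d "@$((alarm_epoch - 1800))" '+%Y-%m-%d %H:%M:%S')
    end=$(date -u -d "@$((alarm_epoch + 1800))" '+%Y-%m-%d %H:%M:%S')
    echo "Start Date: $start"
    echo "End Date: $end"
}

log_window "2024-05-01 10:00:00"
```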
Alarm Clearance
After the fault is rectified, the system automatically clears this alarm.
Related Information
None.