Worker Runs Abnormally After a Topology Is Submitted and Error "Failed to bind to:host:ip" Is Displayed
Symptom
After the service topology is submitted, the Worker fails to start. The Worker log contains the error "Failed to bind to: host:ip".
Possible Causes
The random port range of the OS is incorrectly configured and overlaps the MRS service port range.
Troubleshooting Process
1. Check related information in the Worker log.
2. Check which process occupies the bound port.
3. Check the random port range.
Cause Analysis
- Use SSH to log in to the host where the Worker fails to start and run the netstat -anp | grep <port> command to find the ID of the process that occupies the port. Replace <port> with the actual port number.
- Run the ps -ef | grep <pid> command to view the process details. Replace <pid> with the actual process ID.
The output shows that the port is occupied by a worker process that belongs to another topology. According to the process details, port 29122 is allocated to that process.
- Run the lsof -i:<port> command to view the connection details. Replace <port> with the actual port number.
The output shows that port 29101 is connected to port 21005 of the peer end, and port 21005 is the Kafka server port.
This indicates that the service layer connects to Kafka as a client to obtain messages. The local port of such a client connection is allocated from the random port range of the OS.
- Run the cat /proc/sys/net/ipv4/ip_local_port_range command to check the random port range.
The output shows that the random port range is too wide and overlaps the MRS service port range, which is 20000 to 30000.
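The range check can also be scripted. The following is a minimal sketch, assuming a Linux host and the 20000 to 30000 service range stated in this section. The check_range function and the variable names are hypothetical and are not part of MRS.

```shell
#!/bin/sh
# MRS service port range as stated in this section.
SERVICE_LOW=20000
SERVICE_HIGH=30000

# Print "overlap" if the range [low, high] intersects the service range,
# otherwise print "ok". Both bounds are inclusive.
check_range() {
    low=$1
    high=$2
    if [ "$low" -le "$SERVICE_HIGH" ] && [ "$high" -ge "$SERVICE_LOW" ]; then
        echo "overlap"
    else
        echo "ok"
    fi
}

# Read the current random port range when the file is readable (Linux only).
RANGE_FILE=/proc/sys/net/ipv4/ip_local_port_range
if [ -r "$RANGE_FILE" ]; then
    read -r cur_low cur_high < "$RANGE_FILE"
    echo "random port range $cur_low-$cur_high: $(check_range "$cur_low" "$cur_high")"
fi
```

If the script reports "overlap", the OS may hand out a port from the service range to a client connection, which is the conflict described in this section.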
Procedure
- As user root, change the random port range to 32768 61000 so that it no longer overlaps the MRS service port range.
echo "32768 61000" > /proc/sys/net/ipv4/ip_local_port_range
- Stop the service topology whose process occupies the service port so that the port is released.
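A value written under /proc is lost when the host restarts. The following is a minimal sketch of applying the range with sysctl and keeping it across restarts, assuming that the host reads /etc/sysctl.conf at boot. The sysctl_entry and apply_range functions and the CONF_FILE variable are hypothetical helpers, and the call that changes the host is left commented out.

```shell
#!/bin/sh
# Range used in the procedure; it lies outside the 20000-30000 service range.
NEW_LOW=32768
NEW_HIGH=61000

# Build the configuration entry for the random port range.
sysctl_entry() {
    printf 'net.ipv4.ip_local_port_range = %s %s\n' "$1" "$2"
}

# Apply the range immediately and append it to the configuration file.
# Must be run as root. CONF_FILE is an assumption; some distributions
# use files under /etc/sysctl.d/ instead. Check the file for an existing
# entry first, because this function only appends.
apply_range() {
    conf_file=${CONF_FILE:-/etc/sysctl.conf}
    sysctl -w net.ipv4.ip_local_port_range="$1 $2" || return 1
    sysctl_entry "$1" "$2" >> "$conf_file"
}

# Uncomment to apply on the affected host:
# apply_range "$NEW_LOW" "$NEW_HIGH"
```

Changing the range affects only ports allocated afterwards. A port that is already in use stays occupied until its process exits, which is why the topology that holds the port still has to be stopped.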