What Should I Do If a Backend Server Is Unhealthy?

Symptom

If a client fails to access a backend server through a load balancer, the backend server is declared unhealthy.

Background

The load balancer uses IP addresses in 100.125.0.0/16 to send heartbeats to backend servers and check their health. For health checks to work normally, IP addresses in 100.125.0.0/16 must be allowed to access the backend servers.

If a backend server is detected unhealthy, the load balancer will remove this server from the backend server group and stop forwarding traffic to it, until it is declared healthy again.

- When a backend server is detected unhealthy, the load balancer stops routing requests to it.
- When health checks are disabled, the load balancer considers the backend server healthy by default and still routes requests to it.
- ELB uses IP addresses in 100.125.0.0/16 to perform health checks and route requests to backend servers.
- Traffic is not routed to a backend server with a weight of 0, so its health check result is meaningless.
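
When reading backend logs or packet captures, it can help to tell quickly whether a source address belongs to the health check range. The following is a minimal sketch, not an official tool: the function name `in_health_check_range` is hypothetical, and it assumes its argument is a well-formed dotted-decimal IPv4 address.

```shell
# Sketch: succeed (exit status 0) when $1 is an IPv4 address in 100.125.0.0/16.
# Because the prefix length is 16, only the first two octets need to match.
# Assumes $1 is a well-formed dotted-decimal IPv4 address.
in_health_check_range() {
    case "$1" in
        100.125.*.*) return 0 ;;
        *)           return 1 ;;
    esac
}
```

For example, `in_health_check_range 100.125.3.7 && echo "health check range"` prints the message, while an address such as 100.126.0.1 does not match.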

Troubleshooting Procedure

Possible causes are sequenced based on their occurrence probability.

If the fault persists after you have ruled out a cause, check other causes.

After you change the health check configuration, it takes a while for the modification to take effect. The required time depends on the health check interval and timeout duration. View the health check result in the backend server list of the target load balancer.

Figure 1 Troubleshooting process

**Table 1** Troubleshooting process

| Possible Cause | Solution |
| --- | --- |
| Health check configuration | See Check the Health Check Configuration. |
| Security group rules | See Check Security Group Rules. |
| Network ACL rules | See Check Network ACL Rules. |
| Backend server listening configuration | See Check the Backend Server. |
| Backend server firewall configuration | See Check the Backend Server Firewall. |
| Backend server route configuration | See Check the Backend Server Route. |
| Backend server load | See Check the Backend Server Load. |
| Backend server host.deny file | See Check the Backend Server host.deny File. |

Check the Health Check Configuration

Click the name of the target load balancer to view its details. On the Backend Server Group tab page, click the name of the target backend server group. In the Basic Information area, click Configure on the right of Health Check and then check the following parameters:

- Protocol
- Port
- Check Path: If HTTP is used for health checks, you must check this parameter. A simple static HTML file is recommended.

Check Security Group Rules

TCP, HTTP, or HTTPS listeners: Verify that the inbound rules of the security group containing the backend server allow access from 100.125.0.0/16 and allow TCP traffic to the health check port.
- If the health check port is the same as the backend port, the inbound rule must allow traffic to the backend port, for example, 80.
- If the health check port is different from the backend port, the inbound rule must allow traffic to both the health check port and the backend port, for example, 443 and 80.
  
  You can check the protocol and port in the basic information area of the backend server group.
Figure 2 Example inbound rule
UDP listeners: Verify that the inbound rules of the security group allow traffic from 100.125.0.0/16 over the health check protocol to the health check port. In addition, ICMP traffic must be allowed in the inbound direction.
Figure 3 Example inbound rule that allows ICMP traffic

Access to the backend server from IP addresses in 100.125.0.0/16 must be allowed. Load balancers communicate with backend servers using these IP addresses. After traffic is routed to backend servers, source IP addresses are converted to IP addresses starting with 100.125. In addition, the IP address of the health check node is allocated from 100.125.0.0/16.
If you are not sure about the security group rules, change the protocol and port range to All for testing.
For UDP listeners, see What Are the Precautions of Using UDP for Health Checks?
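
The port rule described above for TCP, HTTP, or HTTPS listeners can be sketched as a small helper that lists the ports the inbound rules need to cover. This is only an illustration: the function name `required_inbound_ports` is hypothetical, and it does not query or modify any security group.

```shell
# Sketch: print the ports that inbound rules must allow from 100.125.0.0/16.
# $1: health check port, $2: backend port.
required_inbound_ports() {
    health_port=$1
    backend_port=$2
    if [ "$health_port" = "$backend_port" ]; then
        # Same port: one rule for the backend port is enough.
        printf '%s\n' "$backend_port"
    else
        # Different ports: both must be allowed.
        printf '%s\n' "$health_port" "$backend_port"
    fi
}
```

With a health check port of 443 and a backend port of 80, the helper lists both ports, which matches the second example above.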

Check Network ACL Rules

A network ACL is an optional subnet-level security configuration. You can associate one or more subnets with a network ACL to control traffic in and out of those subnets. Like security groups, network ACLs provide access control, and they add another layer of defense to your VPC. Default network ACL rules reject all inbound and outbound traffic. If a network ACL is associated with the subnet where the load balancer resides or with the subnet where its backend servers reside, the load balancer cannot receive traffic from the public or private network, or the backend servers become unhealthy, unless the network ACL rules allow the required traffic.

You can configure an inbound network ACL rule to permit access from 100.125.0.0/16.

Log in to the management console.
In the upper left corner of the page, select the desired region and project.
Under Network, click Virtual Private Cloud.
In the navigation pane on the left, choose Network ACLs.
Locate the target network ACL, and click the network ACL name to switch to the network ACL details page.
On the Inbound Rules or Outbound Rules tab page, click Add Rule to add an inbound or outbound rule.
- Action: Select Allow.
- Protocol: The protocol must be the same as the frontend protocol set when the listener is added.
- Source: Set the value to 100.125.0.0/16.
- Source Port Range: Select the port range of the service.
- Destination: Enter the default value 0.0.0.0/0, which matches all destination IP addresses.
- Destination Port Range: Select the port range of the service.
- Description: Provide supplementary information about the network ACL rule. This parameter is optional.
Click OK.

Check the Backend Server

If the backend server runs a Windows OS, use a browser to access https://Backend server IP address:Health check port. If a 2xx or 3xx code is returned, the backend server is working properly.

Run the following command on the backend server to check whether a process is listening on the health check port (replace port with the health check port number):
```
netstat -anlp | grep port
```
If the health check port and LISTEN are displayed, the port is in the listening state. In Figure 4, a process is listening on TCP port 880.

If no health check port is specified, the backend port is used by default.
Figure 4 Backend server port listened on

Figure 5 Backend server port not listened on
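
Matching on the port number alone with grep can also hit other ports that contain the same digits (for example, 80 matches 880 and 8080). The following is a minimal sketch of a stricter check, assuming the usual netstat column layout in which the local address is the fourth field and the state is the sixth. The function name `port_is_listening` is hypothetical.

```shell
# Sketch: read netstat output on stdin and succeed if a TCP socket is in the
# LISTEN state on the port given as $1.
# Assumes netstat's usual columns: field 4 = local address, field 6 = state.
port_is_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

On the backend server this could be used as `netstat -anlp | port_is_listening 880 && echo "listening"`.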
For HTTP health checks, run the following command on the backend server to check the status code:
```
curl Private IP address of the backend server:Health check port/Health check path -iv
```
To perform an HTTP health check, the load balancer initiates a GET request to the backend server. If the following response status codes are displayed, the backend server is considered healthy:

TCP listeners: 200

The status code is 200, 202, or 401 if the backend server is healthy.

Figure 6 Unhealthy backend server

Figure 7 Healthy backend server
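
To compare only the status code rather than reading the full verbose output, the check can be sketched as follows. This is an illustration under assumptions: the function names `fetch_status` and `status_is_healthy` are hypothetical, the URL is a placeholder, and the accepted codes are passed as arguments because they depend on the listener type described above.

```shell
# Sketch: print only the HTTP status code returned by a GET request to $1.
# Uses curl's -w '%{http_code}' write-out variable and discards the body.
fetch_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Sketch: succeed if status code $1 equals any of the remaining arguments.
status_is_healthy() {
    code=$1
    shift
    for accepted in "$@"; do
        [ "$code" = "$accepted" ] && return 0
    done
    return 1
}

# Example (placeholders, not run here):
#   status_is_healthy "$(fetch_status "http://<private-ip>:<port>/<path>")" 200 202 401
```
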
If HTTP is used for health checks and the backend server is detected unhealthy, perform the following steps to configure a TCP health check:
On the Listeners tab page, modify the target listener, select the backend server group for which TCP health check has been configured, or add a backend server group and select TCP as the health check protocol. After the configuration is complete, wait for a while and check the health check result.

Check the Backend Server Firewall

The firewall or other security protection software on the backend server may block IP addresses in 100.125.0.0/16. Ensure that access from 100.125.0.0/16 is allowed in the security group containing the backend server.
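
If the backend server happens to use iptables (an assumption; other firewalls need different commands), the rule listing can be scanned for entries that explicitly drop or reject the health check range. The function name `blocking_rules_for_elb` is hypothetical, and this sketch only finds rules that name an address in 100.125.0.0/16 directly. It does not detect a default DROP policy or a broader range that also covers these addresses.

```shell
# Sketch: read `iptables -S` output on stdin and print rules whose source is
# in 100.125.x.x and whose target is DROP or REJECT.
blocking_rules_for_elb() {
    grep -E -e '-s 100\.125\.[0-9]+\.[0-9]+(/[0-9]+)? .*-j (DROP|REJECT)'
}
```

On the backend server this could be run as `iptables -S | blocking_rules_for_elb`. No output means no explicit blocking rule for the range was found.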

Check the Backend Server Route

Check whether the default route configured for the primary NIC has been manually changed. If the default route has been changed, health check packets may fail to reach the backend server.

Run the following command on the backend server to check whether the default route points to the gateway (for Layer 3 communications, the default route must be configured to point to the gateway):

ip route

Alternatively, run the following command:

route -n

If the command output does not contain the highlighted route or the IP address to which the route points is not the gateway address of the VPC subnet, change the route to the default one.

Figure 8 Example default route pointing to the gateway
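
To pull the gateway address out of the route listing for comparison with the gateway address of the VPC subnet, a minimal sketch is shown below. The function name `default_gateway` is hypothetical, and it assumes the `ip route` output format in which the default route line starts with `default via`.

```shell
# Sketch: read `ip route` output on stdin and print the gateway address of
# the first default route. Prints nothing if no default route is present.
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}
```

On the backend server this could be run as `ip route | default_gateway`. Empty output suggests that no default route through a gateway is configured.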