Help Center/ Cloud Container Engine/ FAQs/ Networking/ Network Exception Troubleshooting/ Why the ELB Address Cannot Be used to Access Workloads in a Cluster?
Updated on 2024-11-13 GMT+08:00

Why the ELB Address Cannot Be used to Access Workloads in a Cluster?

Symptom

In a cluster (on a node or in a container), the ELB address cannot be used to access workloads.

Possible Cause

If the service affinity of a Service is set to the node level, that is, the value of externalTrafficPolicy is Local, the Service may fail to be accessed from within the cluster (specifically, nodes or containers). Information similar to the following is displayed:
upstream connect error or disconnect/reset before headers. reset reason: connection failure
Or
curl: (7) Failed to connect to 192.168.10.36 port 900: Connection refused

It is common that a load balancer in a cluster cannot be accessed. The reason is as follows: When Kubernetes creates a Service, kube-proxy adds the access address of the load balancer as an external IP address (External-IP, as shown in the following command output) to iptables or IPVS. If a client inside the cluster initiates a request to access the load balancer, the address is considered as the external IP address of the Service, and the request is directly forwarded by kube-proxy without passing through the load balancer outside the cluster.

When the value of externalTrafficPolicy is Local, the access failures in different container network models and service forwarding modes are as follows:
  • For a multi-pod workload, ensure that all pods are accessible. Otherwise, there is a possibility that the access to the workload fails.
  • In a CCE Turbo cluster that utilizes a Cloud Native 2.0 network model, node-level affinity is supported only when the Service backend is connected to a HostNetwork pod.
  • The table lists only the scenarios where the access may fail. Other scenarios that are not listed in the table indicate that the access is normal.

Service Type Released on the Server

Access Type

Request Initiation Location on the Client

Tunnel Network Cluster (IPVS)

VPC Network Cluster (IPVS)

Tunnel Network Cluster (iptables)

VPC Network Cluster (iptables)

NodePort Service

Public/Private network

Same node as the service pod

Access the IP address and NodePort on the node where the server is located: The access is successful.

Access the IP address and NodePort on a node other than the node where the server is located: The access failed.

Access the IP address and NodePort on the node where the server is located: The access is successful.

Access the IP address and NodePort on a node other than the node where the server is located: The access failed.

Access the IP address and NodePort on the node where the server is located: The access is successful.

Access the IP address and NodePort on a node other than the node where the server is located: The access failed.

Access the IP address and NodePort on the node where the server is located: The access is successful.

Access the IP address and NodePort on a node other than the node where the server is located: The access failed.

Different nodes from the service pod

Access the IP address and NodePort on the node where the server is located: The access is successful.

Access the IP address and NodePort on a node other than the node where the server is located: The access failed.

Access the IP address and NodePort on the node where the server is located: The access is successful.

Access the IP address and NodePort on a node other than the node where the server is located: The access failed.

The access is successful.

The access is successful.

Other containers on the same node as the service pod

Access the IP address and NodePort on the node where the server is located: The access is successful.

Access the IP address and NodePort on a node other than the node where the server is located: The access failed.

The access failed.

Access the IP address and NodePort on the node where the server is located: The access is successful.

Access the IP address and NodePort on a node other than the node where the server is located: The access failed.

The access failed.

Other containers on different nodes from the service pod

Access the IP address and NodePort on the node where the server is located: The access is successful.

Access the IP address and NodePort on a node other than the node where the server is located: The access failed.

Access the IP address and NodePort on the node where the server is located: The access is successful.

Access the IP address and NodePort on a node other than the node where the server is located: The access failed.

Access the IP address and NodePort on the node where the server is located: The access is successful.

Access the IP address and NodePort on a node other than the node where the server is located: The access failed.

Access the IP address and NodePort on the node where the server is located: The access is successful.

Access the IP address and NodePort on a node other than the node where the server is located: The access failed.

LoadBalancer Service using a dedicated load balancer

Private network

Same node as the service pod

The access failed.

The access failed.

The access failed.

The access failed.

Other containers on the same node as the service pod

The access failed.

The access failed.

The access failed.

The access failed.

DNAT gateway Service

Public network

Same node as the service pod

The access failed.

The access failed.

The access failed.

The access failed.

Different nodes from the service pod

The access failed.

The access failed.

The access failed.

The access failed.

Other containers on the same node as the service pod

The access failed.

The access failed.

The access failed.

The access failed.

Other containers on different nodes from the service pod

The access failed.

The access failed.

The access failed.

The access failed.

nginx-ingress add-on connected with a dedicated load balancer (Local)

Private network

Same node as cceaddon-nginx-ingress-controller pod

The access failed.

The access failed.

The access failed.

The access failed.

Other containers on the same node as the cceaddon-nginx-ingress-controller pod

The access failed.

The access failed.

The access failed.

The access failed.

Solution

The following methods can be used to solve this problem:

  • (Recommended) In the cluster, use the ClusterIP Service or service domain name for access.
  • Set externalTrafficPolicy of the Service to Cluster, which means cluster-level service affinity. Note that this affects source address persistence.
    apiVersion: v1 
    kind: Service
    metadata: 
      annotations:   
        kubernetes.io/elb.class: union
        kubernetes.io/elb.autocreate: '{"type":"public","bandwidth_name":"cce-bandwidth","bandwidth_chargemode":"traffic","bandwidth_size":5,"bandwidth_sharetype":"PER","eip_type":"5_bgp","name":"james"}'
      labels: 
        app: nginx 
      name: nginx 
    spec: 
      externalTrafficPolicy: Cluster
      ports: 
      - name: service0 
        port: 80
        protocol: TCP 
        targetPort: 80
      selector: 
        app: nginx 
      type: LoadBalancer
  • Leveraging the pass-through feature of the Service, kube-proxy is bypassed when the ELB address is used for access. The ELB load balancer is accessed first, and then the workload.
    • In a CCE standard cluster, after passthrough networking is configured for a dedicated load balancer, the private IP address of the load balancer cannot be accessed from the node where the workload pod resides or other containers on the same node as the workload.
    • Passthrough networking is not supported for clusters of v1.15 or earlier.
    • In IPVS network mode, the passthrough settings of Services connected to the same load balancer must be the same.
    • If node-level (local) service affinity is used, kubernetes.io/elb.pass-through is automatically set to onlyLocal to enable pass-through.
    apiVersion: v1 
    kind: Service 
    metadata: 
      annotations:   
        kubernetes.io/elb.pass-through: "true"
        kubernetes.io/elb.class: union
        kubernetes.io/elb.autocreate: '{"type":"public","bandwidth_name":"cce-bandwidth","bandwidth_chargemode":"traffic","bandwidth_size":5,"bandwidth_sharetype":"PER","eip_type":"5_bgp","name":"james"}'
      labels: 
        app: nginx 
      name: nginx 
    spec: 
      externalTrafficPolicy: Local
      ports: 
      - name: service0 
        port: 80
        protocol: TCP 
        targetPort: 80
      selector: 
        app: nginx 
      type: LoadBalancer