Updated on 2026-03-10 GMT+08:00

Configuring LoadBalancer Ingress Load Balancing for a Persistent-Connection Service

When a containerized persistent-connection service exposes traffic to external systems through a LoadBalancer ingress, pressure-testing may reveal uneven load distribution. This can cause certain backend servers to become overloaded, ultimately affecting overall service performance. To ensure balanced traffic distribution and efficient services, you can fine-tune client connection idle timeout, client request timeout, backend response timeout, and other configurations.

  • Client connection idle timeout: defines how long a connection can remain open when no data is being transmitted. For persistent connections, configuring an appropriate idle timeout prevents connections from occupying resources unnecessarily while still keeping them available when needed. If the timeout is too short, connections may disconnect and reconnect frequently, increasing server load. If the timeout is too long, idle connections may consume resources that could be used more efficiently elsewhere.
  • Client request timeout: defines how long a server waits for a client to send a complete request after the connection is established. For persistent connections, this prevents the server from waiting indefinitely for client data and keeps resources from being tied up for long periods. A reasonable client request timeout improves server responsiveness and resource utilization.
  • Backend response timeout: defines how long a server waits for a backend server to return a response after forwarding a request. For persistent connections, a reasonable backend response timeout prevents congestion caused by slow backend responses. This improves system performance and stability.

Prerequisites

  • A CCE Turbo cluster is available. The cluster has the Cloud Native Cluster Monitoring add-on installed.
  • A dedicated load balancer is available.
  • An ECS that can access the Internet is available. The ECS also has Docker and wrk installed.

Step 1: Prepare a Test Image

  1. Log in to the ECS and create a dockerfile folder.

    mkdir ./dockerfile
    cd ./dockerfile

  2. Prepare the required Dockerfile, go.mod, and app.go files.

  3. After the files are created, check that the dockerfile directory contains the following files:

    app.go  Dockerfile  go.mod

  4. Build the image, log in to the SWR image repository, and push the image to it. For details, see Pushing an Image.

    docker build -t http-long-conn:v1 .
    docker tag http-long-conn:v1 {image-repository-address}/{organization}/http-long-conn:v1
    docker push {image-repository-address}/{organization}/http-long-conn:v1

    {image-repository-address} specifies an SWR image repository address. {organization} specifies an SWR organization name.

Step 2: Deploy a Workload in the Cluster

  1. Access the cluster using kubectl.
  2. Create a file named http-long-conn.yaml. You can name the file as required.

    vi http-long-conn.yaml

    The file content is as follows:

    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: http-long-conn
      namespace: default
      labels:
        app: http-long-conn
    spec:
      replicas: 4
      selector:
        matchLabels:
          app: http-long-conn
      template:
        metadata:
          labels:
            app: http-long-conn
        spec:
          containers:
            - name: http-long-conn
              image: {image-repository-address}/{organization}/http-long-conn:v1 # Replace it with the SWR image address you uploaded.
              ports:
                - containerPort: 8080
                  protocol: TCP
              env:
                - name: PORT
                  value: '8080'
              resources:
                limits:
                  cpu: 100m
                  memory: 128Mi
                requests:
                  cpu: 100m
                  memory: 128Mi
              livenessProbe:
                httpGet:
                  path: /health
                  port: 8080
                  scheme: HTTP
                initialDelaySeconds: 30
                timeoutSeconds: 1
                periodSeconds: 10
                successThreshold: 1
                failureThreshold: 3
              readinessProbe:
                httpGet:
                  path: /health
                  port: 8080
                  scheme: HTTP
                initialDelaySeconds: 5
                timeoutSeconds: 1
                periodSeconds: 5
                successThreshold: 1
                failureThreshold: 3
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              imagePullPolicy: IfNotPresent
          restartPolicy: Always
          terminationGracePeriodSeconds: 30
          dnsPolicy: ClusterFirst
          securityContext: {}
          schedulerName: default-scheduler
          imagePullSecrets:
            - name: default-secret
          tolerations: null
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 25%
          maxSurge: 25%
      revisionHistoryLimit: 10
      progressDeadlineSeconds: 600
    
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: http-long-conn
      labels:
        app: http-long-conn
      namespace: default
    spec:
      selector:
        app: http-long-conn
      ports:
        - name: http-0
          targetPort: 8080
          nodePort: 0
          port: 8080
          protocol: TCP
      type: ClusterIP

  3. Create the workload and the Service.

    kubectl create -f http-long-conn.yaml

  4. Check the workload status.

    kubectl get pod -l app=http-long-conn

Step 3: Create a LoadBalancer Ingress

  1. Create a file named elb-ingress.yaml. You can name the file as required.

    vi elb-ingress.yaml

    The file content is as follows:

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: http-long-conn
      namespace: default
      annotations:
        kubernetes.io/elb.port: '8080'
        kubernetes.io/elb.id: 1fce4b38-c72b-4fd4-8430-62d46c0a7998   # Replace it with the ID of your dedicated load balancer.
        kubernetes.io/elb.class: performance
        kubernetes.io/elb.keepalive_timeout: '300'  # Timeout setting for client connections
        kubernetes.io/elb.client_timeout: '60'      # Timeout for waiting for a request from a client
        kubernetes.io/elb.member_timeout: '60'      # Timeout for waiting for a response from a backend server
    spec:
      rules:
        - host: example.com # Use a custom domain name.
          http:
            paths:
              - path: /
                backend:
                  service:
                    name: http-long-conn
                    port:
                      number: 8080
                property:
                  ingress.beta.kubernetes.io/url-match-mode: STARTS_WITH
                pathType: ImplementationSpecific
      ingressClassName: cce

  2. Create the ingress.

    kubectl create -f elb-ingress.yaml

  3. Test the domain name connectivity.

Step 4: Perform a Pressure Test

Use wrk to perform a pressure test.

wrk -t2 -c100 -d300s -H "Connection: keep-alive" http://example.com/long-connection

Where:

  • -t2: specifies that two threads are enabled.
  • -c100: specifies that 100 concurrent connections are established.
  • -d300s: specifies that the test lasts 300 seconds.
  • -H "Connection: keep-alive": specifies an extra request header so that connections are kept alive and reused across requests.

In this example, pressure tests are performed with 100 concurrent connections (-c100) and then repeated with 200 concurrent connections (-c200). By viewing the pod monitoring metrics in the Monitoring Center, you can see that the load is evenly distributed across the pods.