
Using Multi-Cluster Workload Scaling to Scale Workloads

Application Scenarios

Some services have both predictable and unpredictable traffic peaks. If you use only the standard FederatedHPA, scaling out enough pods takes time, and services may become unavailable during the expected peak hours. To address this, UCS provides two scaling policies, FederatedHPA and CronFederatedHPA, which automatically scale the pods in a workload based on metric changes or at scheduled times.

This section uses hpa-example as an example to describe how you can use both FederatedHPA and CronFederatedHPA to scale workloads.

Solution Process

Figure 1 shows how to use both FederatedHPA and CronFederatedHPA.

  1. Make preparations. Before creating workload scaling policies, prepare two Huawei Cloud clusters that have been registered with UCS, install Kubernetes Metrics Server for each cluster, and create an image named hpa-example.
  2. Create a workload. Create a Deployment using the prepared image, create an application, and create and deploy a scheduling policy for the Deployment.
  3. Create scaling policies. Use the command line tool to create a FederatedHPA and a CronFederatedHPA.
  4. Observe scaling processes. View the number of pods in the Deployment and observe the effects of the scaling policies.
Figure 1 Process of using both FederatedHPA and CronFederatedHPA

Making Preparations

  • Register two Huawei Cloud clusters (cluster01 and cluster02) with UCS. For details about how to register Huawei Cloud clusters with UCS, see Huawei Cloud Clusters.
  • Install Kubernetes Metrics Server for the clusters. For details about how to install this add-on, see Kubernetes Metrics Server. A verification sketch is provided after this list.
  • Log in to a cluster node and build the image of a compute-intensive application. Each time a user sends a request, the application performs a calculation before returning the result. The details are as follows.
    1. Create a PHP file named index.php. For each request, the script calculates the square root 1,000,000 times and then returns "OK!".

      vi index.php

      The following provides an example index.php:

      <?php
        $x = 0.0001;
        for ($i = 0; $i <= 1000000; $i++) {
          $x += sqrt($x);
        }
        echo "OK!";
      ?>
    2. Compile a Dockerfile to create an image.

      vi Dockerfile

      The following provides an example Dockerfile:
      FROM php:5-apache
      COPY index.php /var/www/html/index.php
      RUN chmod a+rx index.php
    3. Create an image named hpa-example with the latest tag.

      docker build -t hpa-example:latest .

    4. (Optional) Log in to the SWR console. In the navigation pane, choose Organizations. In the upper right corner, click Create Organization. Skip this step if you already have an organization.
    5. In the navigation pane, choose My Images. In the upper right corner, click Upload Through Client. In the displayed dialog box, click Generate a temporary login command and copy the generated command.
    6. Run the login command copied in the previous step on the node. If the login is successful, "Login Succeeded" will be displayed.
    7. Add a tag to the hpa-example image.
      docker tag {Image name 1:Tag 1} {Image repository address}/{Organization name}/{Image name 2:Tag 2}
      Table 1 Tag parameters

      Parameter                       Description
      {Image name 1:Tag 1}            Name and tag of the image to be uploaded.
      {Image repository address}      Domain name at the end of the login command generated in step 5.
      {Organization name}             Name of the organization created in step 4.
      {Image name 2:Tag 2}            Image name and tag to be displayed in the SWR image repository.

      The following is a command example:

      docker tag hpa-example:latest swr.ap-southeast-1.myhuaweicloud.com/cloud-develop/hpa-example:latest

    8. Push the image to the image repository.

      docker push {Image repository address}/{Organization name}/{Image name 2:Tag 2}

      The following is a command example:

      docker push swr.ap-southeast-1.myhuaweicloud.com/cloud-develop/hpa-example:latest

      Check whether the following information is returned. If yes, the image push is successful.

      6d6b9812c8ae: Pushed 
      ... 
      fe4c16cbf7a4: Pushed 
      latest: digest: sha256:eb7e3bbd*** size: **
    9. To view the pushed image, go to the SWR console and refresh the My Images page.
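
To confirm that the environment is ready, you can check that Kubernetes Metrics Server is actually serving metrics in both clusters, because FederatedHPA relies on these metrics. The following is a minimal verification sketch; it assumes that your local kubeconfig has contexts named cluster01 and cluster02 pointing to the two registered clusters.

  # Minimal verification sketch. Assumption: kubectl contexts cluster01 and
  # cluster02 point to the two clusters registered with UCS.
  for ctx in cluster01 cluster02; do
    # The Metrics API should be registered and available.
    kubectl --context "$ctx" get apiservice v1beta1.metrics.k8s.io
    # Node metrics should be queryable. Errors here mean the add-on is not ready.
    kubectl --context "$ctx" top nodes
  done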

Creating a Workload

  1. Use the hpa-example image to create a Deployment with one pod. Replace the image path with the actual path of the image in your SWR repository.

    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: hpa-example
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: hpa-example
      template:
        metadata:
          labels:
            app: hpa-example
        spec:
          containers:
          - name: container-1
            image: 'hpa-example:latest'   # Replace it with the path of the image you uploaded to SWR.
            resources:
              limits:                      # Keep limits the same as requests to prevent flapping during scaling.
                cpu: 500m
                memory: 200Mi
              requests:                    
                cpu: 500m
                memory: 200Mi
          imagePullSecrets:
          - name: default-secret

  2. Create a NodePort Service that exposes port 80.

    kind: Service
    apiVersion: v1
    metadata:
      name: hpa-example
    spec:
      ports:
        - name: cce-service-0
          protocol: TCP
          port: 80
          targetPort: 80
          nodePort: 31144
      selector:
        app: hpa-example
      type: NodePort

  3. Create a scheduling policy that deploys the Deployment and Service to cluster01 and cluster02. Each cluster is given a static weight of 1 so that the two clusters have the same priority and the replicas are divided evenly between them. (A sketch for applying and verifying these manifests is provided after this list.)

    apiVersion: policy.karmada.io/v1alpha1
    kind: PropagationPolicy
    metadata:
      name: hpa-example-pp
      namespace: default
    spec:
      placement:
        clusterAffinity:
          clusterNames:
          - cluster01
          - cluster02
        replicaScheduling:
          replicaDivisionPreference: Weighted
          replicaSchedulingType: Divided
          weightPreference:
            staticWeightList:
            - targetCluster:
                clusterNames:
                - cluster01
              weight: 1
            - targetCluster:
                clusterNames:
                - cluster02
              weight: 1
      preemption: Never
      propagateDeps: true
      resourceSelectors:
      - apiVersion: apps/v1
        kind: Deployment
        name: hpa-example
        namespace: default
      - apiVersion: v1
        kind: Service
        name: hpa-example
        namespace: default
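
The three manifests above must be applied to the federation before the scaling policies can act on them. The following is a minimal sketch; the file names deployment.yaml, service.yaml, and propagationpolicy.yaml are assumptions, and the current kubeconfig context is assumed to point to the federation (Karmada) control plane.

  # Apply the Deployment, Service, and PropagationPolicy to the federation.
  # Assumption: the manifests above were saved under these file names.
  kubectl apply -f deployment.yaml -f service.yaml -f propagationpolicy.yaml

  # Check how the replicas were divided between the member clusters
  # (kubectl contexts cluster01 and cluster02 are assumed).
  kubectl --context cluster01 get deployment hpa-example
  kubectl --context cluster02 get deployment hpa-example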

Creating Scaling Policies

  1. Create a FederatedHPA.

    vi hpa-example-hpa.yaml

    As described in the YAML file, this policy is associated with the Deployment named hpa-example. The stabilization window is 0 seconds for a scale-out and 100 seconds for a scale-in. The maximum number of pods is 100 and the minimum number of pods is 2. This policy contains a system metric rule in which the desired CPU usage is 50%.

    apiVersion: autoscaling.karmada.io/v1alpha1     
    kind: FederatedHPA
    metadata:
      name: hpa-example-hpa                               # FederatedHPA name
      namespace: default                                  # Namespace where the Deployment resides
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: hpa-example                                 # Deployment name
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 100                 # The stabilization window is 100 seconds for a scale-in.
        scaleUp:
          stabilizationWindowSeconds: 0                   # The stabilization window is 0 seconds for a scale-out.
      minReplicas: 2                                      # The minimum number of pods is 2.
      maxReplicas: 100                                    # The maximum number of pods is 100.
      metrics:
        - type: Resource
          resource:
            name: cpu                                     # CPU-based scaling metrics
            target:
              type: Utilization                           # The metric type is resource usage.
              averageUtilization: 50                      # Desired average resource usage

  2. Create a CronFederatedHPA.

    vi cron-federated-hpa.yaml

    As described in the YAML file, this policy works with the FederatedHPA named hpa-example-hpa to scale the Deployment out to 10 pods at 08:30 and in to 2 pods at 10:00 every day. (A sketch for applying both policies is provided after this list.)

    apiVersion: autoscaling.karmada.io/v1alpha1 
    kind: CronFederatedHPA 
    metadata: 
      name: cron-federated-hpa                            # CronFederatedHPA name
    spec: 
      scaleTargetRef: 
        apiVersion: apps/v1 
        kind: FederatedHPA                               # CronFederatedHPA runs based on FederatedHPA.
        name: hpa-example-hpa                             # FederatedHPA name
      rules: 
      - name: "Scale-Up"                                  # Rule name
        schedule: 30 08 * * *                             # Time when the policy is triggered
        targetReplicas: 10                                # Desired number of pods, which is a non-negative integer
        timeZone: Asia/Shanghai                           # Time zone
      - name: "Scale-Down"                                # Rule name
        schedule: 0 10 * * *                              # Time when the policy is triggered
        targetReplicas: 2                                 # Desired number of pods, which is a non-negative integer
        timeZone: Asia/Shanghai                           # Time zone
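
After both YAML files are ready, apply them to the federation. The following is a minimal sketch, assuming the file names used above and a kubeconfig context that points to the federation control plane.

  # Apply the FederatedHPA and CronFederatedHPA created above.
  kubectl apply -f hpa-example-hpa.yaml
  kubectl apply -f cron-federated-hpa.yaml

  # Confirm that both policies exist before moving on to the observation steps.
  kubectl get federatedhpa hpa-example-hpa
  kubectl get cronfederatedhpa cron-federated-hpa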

Observing Scaling Processes

  1. View the FederatedHPA. You can see that the CPU usage of the Deployment is 0%.

    kubectl get FederatedHPA hpa-example-hpa
    NAME              REFERENCE                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE 
    hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       1          6m

  2. Access the Deployment. In the following command, {ip:port} indicates the access address of the Deployment obtained from its details page.

    while true;do wget -q -O- http://{ip:port}; done

  3. Observe the automatic scale-out process of the Deployment.

    kubectl get federatedhpa hpa-example-hpa --watch

    View the FederatedHPA. You can see that the CPU usage of the Deployment reaches 200% at 6m23s, which exceeds the target value, so the FederatedHPA scales the Deployment out to four pods. Over the next couple of minutes, the CPU usage does not decrease, because the new pods may not have been created successfully yet, for example, if resources are insufficient and the pods stay Pending while nodes are being added.

    At 8m16s, the CPU usage drops to 90%, indicating that the new pods are running and bearing traffic. The usage is still greater than the target value and beyond the tolerance range, so the number of pods is increased to 7 at 9m31s, and the CPU usage decreases to 51%, which is within the tolerance range. From then on, the number of pods remains 7. (A sketch for watching how these pods are distributed across cluster01 and cluster02 is provided after this list.)

    NAME              REFERENCE                TARGETS    MINPODS   MAXPODS   REPLICAS   AGE 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       1          6m 
    hpa-example-hpa   Deployment/hpa-example   200%/50%   1         100       1          6m23s 
    hpa-example-hpa   Deployment/hpa-example   200%/50%   1         100       4          6m31s 
    hpa-example-hpa   Deployment/hpa-example   210%/50%   1         100       4          7m16s 
    hpa-example-hpa   Deployment/hpa-example   210%/50%   1         100       4          7m16s 
    hpa-example-hpa   Deployment/hpa-example   90%/50%    1         100       4          8m16s 
    hpa-example-hpa   Deployment/hpa-example   85%/50%    1         100       4          9m16s 
    hpa-example-hpa   Deployment/hpa-example   51%/50%    1         100       7          9m31s 
    hpa-example-hpa   Deployment/hpa-example   51%/50%    1         100       7          10m16s 
    hpa-example-hpa   Deployment/hpa-example   51%/50%    1         100       7          11m 

    View the scaling events of the FederatedHPA to see when the policy took effect.

    kubectl describe federatedhpa hpa-example-hpa

  4. Stop accessing the Deployment and observe its automatic scale-in process.

    View the FederatedHPA. You can see that the CPU usage is 21% at 13m. The number of pods is reduced to 3 at 18m and then to 1 at 23m.

    kubectl get federatedhpa hpa-example-hpa --watch

    NAME              REFERENCE                TARGETS    MINPODS   MAXPODS   REPLICAS   AGE 
    hpa-example-hpa   Deployment/hpa-example   50%/50%    1         100       7          12m 
    hpa-example-hpa   Deployment/hpa-example   21%/50%    1         100       7          13m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       7          14m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       7          18m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          18m
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          19m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          19m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          19m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          19m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          23m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          23m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       1          23m

    View the scaling events of the FederatedHPA to see when the policy took effect.

    kubectl describe federatedhpa hpa-example-hpa

  5. When the triggering time of the CronFederatedHPA arrives, observe the automatic scaling process of the Deployment.

    The output shows that the number of pods is increased to 4 at 114m, to 7 at 119m, and then to 10 at 123m.

    kubectl get cronfederatedhpa cron-federated-hpa --watch

    NAME                 REFERENCE                TARGETS    MINPODS   MAXPODS   REPLICAS   AGE 
    cron-federated-hpa   Deployment/hpa-example   50%/50%    1         100       1          112m 
    cron-federated-hpa   Deployment/hpa-example   21%/50%    1         100       1          113m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       4          114m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       4          118m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       4          118m
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       4          119m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       7          119m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       7          119m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       7          119m
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       7          123m  
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       10         123m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       10         123m

    View the scaling events of the CronFederatedHPA to see when the policy took effect.

    kubectl describe cronfederatedhpa cron-federated-hpa
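
In addition to watching the FederatedHPA and CronFederatedHPA objects, you can watch how the scaled pods are distributed across the member clusters. The following is a minimal sketch; the kubectl contexts cluster01 and cluster02 are assumptions, and each command keeps running until you stop it with Ctrl+C.

  # Watch the hpa-example pods in each member cluster during scale-outs and
  # scale-ins. Run each command in a separate terminal.
  kubectl --context cluster01 get pods -l app=hpa-example --watch
  kubectl --context cluster02 get pods -l app=hpa-example --watch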