
Using Multi-Cluster Workload Scaling to Scale Workloads

Application Scenarios

Some services have both predictable and unpredictable traffic peaks. If you use only the standard FederatedHPA, scaling out enough pods takes time, and services may become unavailable during the expected peak hours. To address this, UCS provides two scaling policies, FederatedHPA and CronFederatedHPA, which automatically scale the pods in a workload based on metric changes or at scheduled times.

This section uses hpa-example as an example to describe how you can use both FederatedHPA and CronFederatedHPA to scale workloads.

Solution Process

Figure 1 shows how to use both FederatedHPA and CronFederatedHPA.

  1. Make preparations. Before creating workload scaling policies, prepare two Huawei Cloud clusters that have been registered with UCS, install Kubernetes Metrics Server for each cluster, and create an image named hpa-example.
  2. Create a workload. Create a Deployment using the prepared image, create an application, and create and deploy a scheduling policy for the Deployment.
  3. Create scaling policies. Use the command line tool to create a FederatedHPA and a CronFederatedHPA.
  4. Observe scaling processes. View the number of pods in the Deployment and observe the effects of the scaling policies.
Figure 1 Process of using both FederatedHPA and CronFederatedHPA

Making Preparations

  • Register two Huawei Cloud clusters (cluster01 and cluster02) with UCS. For details about how to register Huawei Cloud clusters with UCS, see Huawei Cloud Clusters.
  • Install Kubernetes Metrics Server for the clusters. For details about how to install this add-on, see Kubernetes Metrics Server. A verification sketch is provided after this list.
  • Log in to a cluster node and build the image of a compute-intensive application. Each time a user sends a request, the application performs a calculation before returning the result. The details are as follows.
    1. Create a PHP file named index.php. For each request, the script calculates the square root 1,000,000 times and then returns "OK!".

      vi index.php

      The following provides an example index.php:

      <?php
        $x = 0.0001;
        for ($i = 0; $i <= 1000000; $i++) {
          $x += sqrt($x);
        }
        echo "OK!";
      ?>
    2. Compile a Dockerfile to create an image.

      vi Dockerfile

      The following provides an example Dockerfile:
      FROM php:5-apache
      COPY index.php /var/www/html/index.php
      RUN chmod a+rx index.php
    3. Create an image named hpa-example with the latest tag.

      docker build -t hpa-example:latest .

    4. (Optional) Log in to the SWR console. In the navigation pane, choose Organizations. In the upper right corner, click Create Organization. Skip this step if you already have an organization.
    5. In the navigation pane, choose My Images. In the upper right corner, click Upload Through Client. In the displayed dialog box, click Generate a temporary login command and copy the generated command.
    6. Run the login command copied in the previous step on the node. If the login is successful, "Login Succeeded" will be displayed.
    7. Add a tag to the hpa-example image.
      docker tag {Image name 1:Tag 1} {Image repository address}/{Organization name}/{Image name 2:Tag 2}
      Table 1 Tag parameters

      Parameter                       Description
      {Image name 1:Tag 1}            Name and tag of the image to be uploaded.
      {Image repository address}      Domain name at the end of the login command generated in step 5.
      {Organization name}             Name of the organization created in step 4.
      {Image name 2:Tag 2}            Image name and tag to be displayed in the SWR image repository.

      The following is a command example:

      docker tag hpa-example:latest swr.ap-southeast-1.myhuaweicloud.com/cloud-develop/hpa-example:latest

    8. Push the image to the image repository.

      docker push {Image repository address}/{Organization name}/{Image name 2:Tag 2}

      The following is a command example:

      docker push swr.ap-southeast-1.myhuaweicloud.com/cloud-develop/hpa-example:latest

      Check whether the following information is returned. If yes, the image push is successful.

      6d6b9812c8ae: Pushed 
      ... 
      fe4c16cbf7a4: Pushed 
      latest: digest: sha256:eb7e3bbd*** size: **
    9. To view the pushed image, go to the SWR console and refresh the My Images page.
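
To confirm that the environment is ready, you can check that Kubernetes Metrics Server is actually serving metrics in both clusters, because FederatedHPA relies on these metrics. The following is a minimal verification sketch; it assumes that your local kubeconfig has contexts named cluster01 and cluster02 pointing to the two registered clusters.

  # Minimal verification sketch. Assumption: kubectl contexts cluster01 and
  # cluster02 point to the two clusters registered with UCS.
  for ctx in cluster01 cluster02; do
    # The Metrics API should be registered and available.
    kubectl --context "$ctx" get apiservice v1beta1.metrics.k8s.io
    # Node metrics should be queryable. Errors here mean the add-on is not ready.
    kubectl --context "$ctx" top nodes
  done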

Creating a Workload

  1. Use the hpa-example image to create a Deployment with one pod. Replace the image path with the actual path of the image in your SWR repository.

    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: hpa-example
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: hpa-example
      template:
        metadata:
          labels:
            app: hpa-example
        spec:
          containers:
          - name: container-1
            image: 'hpa-example:latest'   # Replace it with the path of the image you uploaded to SWR.
            resources:
              limits:                      # Keep limits the same as requests to prevent flapping during scaling.
                cpu: 500m
                memory: 200Mi
              requests:                    
                cpu: 500m
                memory: 200Mi
          imagePullSecrets:
          - name: default-secret

  2. Create a NodePort Service that exposes port 80.

    kind: Service
    apiVersion: v1
    metadata:
      name: hpa-example
    spec:
      ports:
        - name: cce-service-0
          protocol: TCP
          port: 80
          targetPort: 80
          nodePort: 31144
      selector:
        app: hpa-example
      type: NodePort

  3. Create a scheduling policy that deploys the Deployment and Service to cluster01 and cluster02. Each cluster is given a static weight of 1 so that the two clusters have the same priority and the replicas are divided evenly between them. (A sketch for applying and verifying these manifests is provided after this list.)

    apiVersion: policy.karmada.io/v1alpha1
    kind: PropagationPolicy
    metadata:
      name: hpa-example-pp
      namespace: default
    spec:
      placement:
        clusterAffinity:
          clusterNames:
          - cluster01
          - cluster02
        replicaScheduling:
          replicaDivisionPreference: Weighted
          replicaSchedulingType: Divided
          weightPreference:
            staticWeightList:
            - targetCluster:
                clusterNames:
                - cluster01
              weight: 1
            - targetCluster:
                clusterNames:
                - cluster02
              weight: 1
      preemption: Never
      propagateDeps: true
      resourceSelectors:
      - apiVersion: apps/v1
        kind: Deployment
        name: hpa-example
        namespace: default
      - apiVersion: v1
        kind: Service
        name: hpa-example
        namespace: default
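
The three manifests above must be applied to the federation before the scaling policies can act on them. The following is a minimal sketch; the file names deployment.yaml, service.yaml, and propagationpolicy.yaml are assumptions, and the current kubeconfig context is assumed to point to the federation (Karmada) control plane.

  # Apply the Deployment, Service, and PropagationPolicy to the federation.
  # Assumption: the manifests above were saved under these file names.
  kubectl apply -f deployment.yaml -f service.yaml -f propagationpolicy.yaml

  # Check how the replicas were divided between the member clusters
  # (kubectl contexts cluster01 and cluster02 are assumed).
  kubectl --context cluster01 get deployment hpa-example
  kubectl --context cluster02 get deployment hpa-example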

Creating Scaling Policies

  1. Create a FederatedHPA.

    vi hpa-example-hpa.yaml

    As described in the YAML file, this policy is associated with the Deployment named hpa-example. The stabilization window is 0 seconds for a scale-out and 100 seconds for a scale-in. The maximum number of pods is 100 and the minimum number of pods is 2. This policy contains a system metric rule in which the desired CPU usage is 50%.

    apiVersion: autoscaling.karmada.io/v1alpha1     
    kind: FederatedHPA
    metadata:
      name: hpa-example-hpa                               # FederatedHPA name
      namespace: default                                  # Namespace where the Deployment resides
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: hpa-example                                 # Deployment name
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 100                 # The stabilization window is 100 seconds for a scale-in.
        scaleUp:
          stabilizationWindowSeconds: 0                   # The stabilization window is 0 seconds for a scale-out.
      minReplicas: 2                                      # The minimum number of pods is 2.
      maxReplicas: 100                                    # The maximum number of pods is 100.
      metrics:
        - type: Resource
          resource:
            name: cpu                                     # CPU-based scaling metrics
            target:
              type: Utilization                           # The metric type is resource usage.
              averageUtilization: 50                      # Desired average resource usage

  2. Create a CronFederatedHPA.

    vi cron-federated-hpa.yaml

    As described in the YAML file, this policy works with the FederatedHPA named hpa-example-hpa to scale the Deployment out to 10 pods at 08:30 and in to 2 pods at 10:00 every day. (A sketch for applying both policies is provided after this list.)

    apiVersion: autoscaling.karmada.io/v1alpha1 
    kind: CronFederatedHPA 
    metadata: 
      name: cron-federated-hpa                            # CronFederatedHPA name
    spec: 
      scaleTargetRef: 
        apiVersion: apps/v1 
        kind: FederatedHPA                               # CronFederatedHPA runs based on FederatedHPA.
        name: hpa-example-hpa                             # FederatedHPA name
      rules: 
      - name: "Scale-Up"                                  # Rule name
        schedule: 30 08 * * *                             # Time when the policy is triggered
        targetReplicas: 10                                # Desired number of pods, which is a non-negative integer
        timeZone: Asia/Shanghai                           # Time zone
      - name: "Scale-Down"                                # Rule name
        schedule: 0 10 * * *                              # Time when the policy is triggered
        targetReplicas: 2                                 # Desired number of pods, which is a non-negative integer
        timeZone: Asia/Shanghai                           # Time zone
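
After both YAML files are ready, apply them to the federation. The following is a minimal sketch, assuming the file names used above and a kubeconfig context that points to the federation control plane.

  # Apply the FederatedHPA and CronFederatedHPA created above.
  kubectl apply -f hpa-example-hpa.yaml
  kubectl apply -f cron-federated-hpa.yaml

  # Confirm that both policies exist before moving on to the observation steps.
  kubectl get federatedhpa hpa-example-hpa
  kubectl get cronfederatedhpa cron-federated-hpa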

Observing Scaling Processes

  1. View the FederatedHPA. You can see that the CPU usage of the Deployment is 0%.

    kubectl get FederatedHPA hpa-example-hpa
    NAME              REFERENCE                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE 
    hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       1          6m

  2. Access the Deployment. In the following command, {ip:port} indicates the access address of the Deployment obtained from its details page.

    while true;do wget -q -O- http://{ip:port}; done

  3. Observe the automatic scale-out process of the Deployment.

    kubectl get federatedhpa hpa-example-hpa --watch

    View the FederatedHPA. You can see that the CPU usage of the Deployment reaches 200% at 6m23s, which exceeds the target value, so the FederatedHPA scales the Deployment out to four pods. Over the next couple of minutes, the CPU usage does not decrease, because the new pods may not have been created successfully yet, for example, if resources are insufficient and the pods stay Pending while nodes are being added.

    At 8m16s, the CPU usage drops to 90%, indicating that the new pods are running and bearing traffic. The usage is still greater than the target value and beyond the tolerance range, so the number of pods is increased to 7 at 9m31s, and the CPU usage decreases to 51%, which is within the tolerance range. From then on, the number of pods remains 7. (A sketch for watching how these pods are distributed across cluster01 and cluster02 is provided after this list.)

    NAME              REFERENCE                TARGETS    MINPODS   MAXPODS   REPLICAS   AGE 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       1          6m 
    hpa-example-hpa   Deployment/hpa-example   200%/50%   1         100       1          6m23s 
    hpa-example-hpa   Deployment/hpa-example   200%/50%   1         100       4          6m31s 
    hpa-example-hpa   Deployment/hpa-example   210%/50%   1         100       4          7m16s 
    hpa-example-hpa   Deployment/hpa-example   210%/50%   1         100       4          7m16s 
    hpa-example-hpa   Deployment/hpa-example   90%/50%    1         100       4          8m16s 
    hpa-example-hpa   Deployment/hpa-example   85%/50%    1         100       4          9m16s 
    hpa-example-hpa   Deployment/hpa-example   51%/50%    1         100       7          9m31s 
    hpa-example-hpa   Deployment/hpa-example   51%/50%    1         100       7          10m16s 
    hpa-example-hpa   Deployment/hpa-example   51%/50%    1         100       7          11m 

    View the scaling events of the FederatedHPA to see when the policy took effect.

    kubectl describe federatedhpa hpa-example-hpa

  4. Stop accessing the Deployment and observe its automatic scale-in process.

    View the FederatedHPA. You can see that the CPU usage is 21% at 13m. The number of pods is reduced to 3 at 18m and then to 1 at 23m.

    kubectl get federatedhpa hpa-example-hpa --watch

    NAME              REFERENCE                TARGETS    MINPODS   MAXPODS   REPLICAS   AGE 
    hpa-example-hpa   Deployment/hpa-example   50%/50%    1         100       7          12m 
    hpa-example-hpa   Deployment/hpa-example   21%/50%    1         100       7          13m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       7          14m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       7          18m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          18m
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          19m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          19m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          19m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          19m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          23m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          23m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       1          23m

    View the scaling events of the FederatedHPA to see when the policy took effect.

    kubectl describe federatedhpa hpa-example-hpa

  5. When the triggering time of the CronFederatedHPA arrives, observe the automatic scaling process of the Deployment.

    The output shows that the number of pods is increased to 4 at 114m, to 7 at 119m, and then to 10 at 123m.

    kubectl get cronfederatedhpa cron-federated-hpa --watch

    NAME                 REFERENCE                TARGETS    MINPODS   MAXPODS   REPLICAS   AGE 
    cron-federated-hpa   Deployment/hpa-example   50%/50%    1         100       1          112m 
    cron-federated-hpa   Deployment/hpa-example   21%/50%    1         100       1          113m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       4          114m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       4          118m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       4          118m
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       4          119m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       7          119m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       7          119m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       7          119m
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       7          123m  
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       10         123m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       10         123m

    View the scaling events of the CronFederatedHPA to see when the policy took effect.

    kubectl describe cronfederatedhpa cron-federated-hpa
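
In addition to watching the FederatedHPA and CronFederatedHPA objects, you can watch how the scaled pods are distributed across the member clusters. The following is a minimal sketch; the kubectl contexts cluster01 and cluster02 are assumptions, and each command keeps running until you stop it with Ctrl+C.

  # Watch the hpa-example pods in each member cluster during scale-outs and
  # scale-ins. Run each command in a separate terminal.
  kubectl --context cluster01 get pods -l app=hpa-example --watch
  kubectl --context cluster02 get pods -l app=hpa-example --watch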