Using Multi-Cluster Workload Scaling to Scale Workloads
Application Scenarios
Some services in complex scenarios experience both predictable and unpredictable traffic peaks. If you use only FederatedHPA, scaling out the pods in a workload takes time, and services may become unavailable during expected peak hours. To address this, UCS provides two scaling policies, FederatedHPA and CronFederatedHPA, which automatically scale the pods in workloads based on metric changes or on a regular schedule.
This section uses hpa-example as an example to describe how you can use both FederatedHPA and CronFederatedHPA to scale workloads.
Solution Process
Figure 1 shows how to use both FederatedHPA and CronFederatedHPA.
- Make preparations. Before creating workload scaling policies, prepare two Huawei Cloud clusters that have been registered with UCS, install Kubernetes Metrics Server for each cluster, and create an image named hpa-example.
- Create a workload. Create a Deployment using the prepared image, create a Service, and create and deploy a scheduling policy for the Deployment and Service.
- Create scaling policies. Use the command line tool to create a FederatedHPA and a CronFederatedHPA.
- Observe scaling processes. View the number of pods in the Deployment and observe the effects of the scaling policies.
Making Preparations
- Register two Huawei Cloud clusters (cluster01 and cluster02) with UCS. For details about how to register Huawei Cloud clusters with UCS, see Huawei Cloud Clusters.
- Install Kubernetes Metrics Server for the clusters. For details about how to install this add-on, see Kubernetes Metrics Server.
- Log in to the cluster node and deploy a compute-intensive application. When a user sends a request, the result needs to be calculated before being returned to the user. The following describes the details.
- Create a PHP file named index.php that performs 1,000,000 square root calculations for each request before returning "OK!".
vi index.php
The following provides an example index.php:
<?php
$x = 0.0001;
for ($i = 0; $i <= 1000000; $i++) {
    $x += sqrt($x);
}
echo "OK!";
?>
- Write a Dockerfile to build the image.
The following provides an example Dockerfile:
FROM php:5-apache
COPY index.php /var/www/html/index.php
RUN chmod a+rx index.php
- Create an image named hpa-example with the latest tag.
docker build -t hpa-example:latest .
- (Optional) Log in to the SWR console. In the navigation pane, choose Organizations. In the upper right corner, click Create Organization. Skip this step if you already have an organization.
- In the navigation pane, choose My Images. In the upper right corner, click Upload Through Client. In the displayed dialog box, click Generate a temporary login command. Then, copy the command.
- Run the login command copied in the previous step on the node. If the login is successful, "Login Succeeded" will be displayed.
- Add a tag to the hpa-example image.
docker tag {Image name 1:Tag 1} {Image repository address}/{Organization name}/{Image name 2:Tag 2}
Table 1 Tag parameters
Parameter | Description
{Image name 1:Tag 1} | Replace with the name and tag of the image to be uploaded.
{Image repository address} | Replace with the domain name at the end of the login command generated earlier.
{Organization name} | Replace with the name of the organization created earlier.
{Image name 2:Tag 2} | Replace with the image name and tag to be displayed in the SWR image repository.
The following is a command example:
docker tag hpa-example:latest swr.ap-southeast-1.myhuaweicloud.com/cloud-develop/hpa-example:latest
- Push the image to the image repository.
docker push {Image repository address}/{Organization name}/{Image name 2:Tag 2}
The following is a command example:
docker push swr.ap-southeast-1.myhuaweicloud.com/cloud-develop/hpa-example:latest
Check whether the following information is returned. If yes, the image push is successful.
6d6b9812c8ae: Pushed
...
fe4c16cbf7a4: Pushed
latest: digest: sha256:eb7e3bbd*** size: **
- To view the pushed image, go to the SWR console and refresh the My Images page.
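The tag and push commands above share one image reference built from the repository address, organization name, image name, and tag. The following is a minimal Python sketch of that composition. The helper name swr_image_ref is hypothetical and is not part of any Huawei Cloud tool; the values are the ones used in the command examples.

```python
def swr_image_ref(repository, organization, image, tag="latest"):
    """Compose {Image repository address}/{Organization name}/{Image name 2:Tag 2}.

    Hypothetical helper for illustration only; it performs a basic sanity
    check and does not contact SWR.
    """
    parts = {
        "repository": repository,
        "organization": organization,
        "image": image,
        "tag": tag,
    }
    for label, value in parts.items():
        if not value or " " in value:
            raise ValueError(f"{label} must be a non-empty string without spaces")
    return f"{repository}/{organization}/{image}:{tag}"


ref = swr_image_ref("swr.ap-southeast-1.myhuaweicloud.com", "cloud-develop", "hpa-example")
print(f"docker tag hpa-example:latest {ref}")
print(f"docker push {ref}")
```

Building the reference once and reusing it for both commands helps keep the tagged name and the pushed name identical.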
Creating a Workload
- Use the hpa-example image to create a Deployment with one pod. The image path varies with the SWR repository and needs to be replaced with the actual value.
kind: Deployment
apiVersion: apps/v1
metadata:
  name: hpa-example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hpa-example
  template:
    metadata:
      labels:
        app: hpa-example
    spec:
      containers:
        - name: container-1
          image: 'hpa-example:latest'   # Replace it with the path of the image you uploaded to SWR.
          resources:
            limits:                     # Keep the value same as that of requests to prevent flapping during scaling.
              cpu: 500m
              memory: 200Mi
            requests:
              cpu: 500m
              memory: 200Mi
      imagePullSecrets:
        - name: default-secret
- Create a Service with the port number being 80.
kind: Service
apiVersion: v1
metadata:
  name: hpa-example
spec:
  ports:
    - name: cce-service-0
      protocol: TCP
      port: 80
      targetPort: 80
      nodePort: 31144
  selector:
    app: hpa-example
  type: NodePort
- Create a scheduling policy for the Deployment and Service and deploy the Deployment and Service in cluster01 and cluster02, with the weight of each cluster being 1 to ensure that each cluster has the same priority.
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: hpa-example-pp
  namespace: default
spec:
  placement:
    clusterAffinity:
      clusterNames:
        - cluster01
        - cluster02
    replicaScheduling:
      replicaDivisionPreference: Weighted
      replicaSchedulingType: Divided
      weightPreference:
        staticWeightList:
          - targetCluster:
              clusterNames:
                - cluster01
            weight: 1
          - targetCluster:
              clusterNames:
                - cluster02
            weight: 1
  preemption: Never
  propagateDeps: true
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: hpa-example
      namespace: default
    - apiVersion: v1
      kind: Service
      name: hpa-example
      namespace: default
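To see what equal static weights mean for pod placement, the following Python sketch divides a replica count across clusters in proportion to their weights. It is an illustration only and not the scheduler's implementation. In particular, the handling of leftover replicas (higher weight first, then cluster name) is an assumption, and the function name divide_replicas is hypothetical.

```python
def divide_replicas(total, weights):
    """Split `total` replicas across clusters in proportion to static weights.

    Assumption: leftover replicas go to clusters in descending weight order,
    then by name. The real scheduler may break ties differently.
    """
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("at least one cluster needs a positive weight")
    # Integer share for each cluster, rounded down.
    shares = {name: total * w // weight_sum for name, w in weights.items()}
    leftover = total - sum(shares.values())
    order = sorted(weights, key=lambda name: (-weights[name], name))
    for name in order[:leftover]:
        shares[name] += 1
    return shares


weights = {"cluster01": 1, "cluster02": 1}
print(divide_replicas(2, weights))  # {'cluster01': 1, 'cluster02': 1}
print(divide_replicas(7, weights))  # {'cluster01': 4, 'cluster02': 3}
```

With a weight of 1 for each cluster, an even replica count is split equally, and an odd count differs by one pod between the two clusters.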
Creating Scaling Policies
- Create a FederatedHPA.
vi hpa-example-hpa.yaml
As described in the YAML file, this policy is associated with the Deployment named hpa-example. The stabilization window is 0 seconds for a scale-out and 100 seconds for a scale-in. The maximum number of pods is 100 and the minimum number of pods is 2. This policy contains a system metric rule in which the desired CPU usage is 50%.
apiVersion: autoscaling.karmada.io/v1alpha1
kind: FederatedHPA
metadata:
  name: hpa-example-hpa                 # FederatedHPA name
  namespace: default                    # Namespace where the Deployment resides
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-example                   # Deployment name
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 100   # The stabilization window is 100 seconds for a scale-in.
    scaleUp:
      stabilizationWindowSeconds: 0     # The stabilization window is 0 seconds for a scale-out.
  minReplicas: 2                        # The minimum number of pods is 2.
  maxReplicas: 100                      # The maximum number of pods is 100.
  metrics:
    - type: Resource
      resource:
        name: cpu                       # CPU-based scaling metrics
        target:
          type: Utilization             # The metric type is resource usage.
          averageUtilization: 50        # Desired average resource usage
- Create a CronFederatedHPA.
vi cron-federated-hpa.yaml
As described in the YAML file, this policy works with the FederatedHPA named hpa-example-hpa to scale the Deployment out to 10 pods at 08:30 and in to 2 pods at 10:00 every day.
apiVersion: autoscaling.karmada.io/v1alpha1
kind: CronFederatedHPA
metadata:
  name: cron-federated-hpa              # CronFederatedHPA name
spec:
  scaleTargetRef:
    apiVersion: autoscaling.karmada.io/v1alpha1   # API version of the FederatedHPA
    kind: FederatedHPA                  # CronFederatedHPA runs based on FederatedHPA.
    name: hpa-example-hpa               # FederatedHPA name
  rules:
    - name: "Scale-Up"                  # Rule name
      schedule: 30 08 * * *             # Time when the policy is triggered
      targetReplicas: 10                # Desired number of pods, which is a non-negative integer
      timeZone: Asia/Shanghai           # Time zone
    - name: "Scale-Down"                # Rule name
      schedule: 0 10 * * *              # Time when the policy is triggered
      targetReplicas: 2                 # Desired number of pods, which is a non-negative integer
      timeZone: Asia/Shanghai           # Time zone
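The schedule fields use the five-field cron format (minute, hour, day of month, month, day of week), evaluated in the rule's time zone. The following Python sketch shows how "30 08 * * *" and "0 10 * * *" map to wall-clock times. It is a simplified illustration and not the controller's parser: it handles only plain numbers and "*", and it models Asia/Shanghai as a fixed UTC+8 offset, which is an assumption. The function names cron_matches and rule_due are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Asia/Shanghai is modeled as a fixed UTC+8 offset (no daylight saving).
SHANGHAI = timezone(timedelta(hours=8))


def cron_matches(schedule, moment):
    """Return True if `moment` matches a five-field cron schedule.

    Only plain numbers and '*' are handled; ranges, lists, and steps
    are out of scope for this sketch.
    """
    fields = schedule.split()
    if len(fields) != 5:
        raise ValueError("expected five fields: minute hour day month weekday")
    # Cron counts Sunday as 0; isoweekday() returns 7 for Sunday.
    actual = (moment.minute, moment.hour, moment.day, moment.month,
              moment.isoweekday() % 7)
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


def rule_due(schedule, utc_moment, tz=SHANGHAI):
    """Check a UTC timestamp against a schedule expressed in the rule's time zone."""
    return cron_matches(schedule, utc_moment.astimezone(tz))


utc = datetime(2024, 1, 1, 0, 30, tzinfo=timezone.utc)  # 08:30 in UTC+8
print(rule_due("30 08 * * *", utc))  # True
print(rule_due("0 10 * * *", utc))   # False
```

The example shows why the time zone matters: 08:30 in Asia/Shanghai corresponds to 00:30 UTC, so a cluster clock running in UTC would otherwise trigger the rule eight hours late.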
Observing Scaling Processes
- View the FederatedHPA. You can see that the CPU usage of the Deployment is 0%.
kubectl get FederatedHPA hpa-example-hpa
NAME              REFERENCE                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       1          6m
- Access the Deployment. In the following command, {ip:port} indicates the access address of the Deployment obtained from its details page.
while true;do wget -q -O- http://{ip:port}; done
- Observe the automatic scale-out process of the Deployment.
kubectl get federatedhpa hpa-example-hpa --watch
View the FederatedHPA. At 6m23s, the CPU usage of the Deployment is 200%, which exceeds the target value, so the FederatedHPA scales the Deployment out to 4 pods. The CPU usage does not decrease until 8m16s. A possible cause is that the new pods have not been created yet: if resources are insufficient, the pods stay in the Pending state while nodes are added.
At 8m16s, the CPU usage decreases to 90%, indicating that the new pods have been created and have started to handle traffic. At 9m16s, the CPU usage is 85%, still greater than the target value and beyond the tolerance range. So, the Deployment is scaled out to 7 pods at 9m31s, and the CPU usage decreases to 51%, which is within the tolerance range. From then on, the number of pods remains 7.
NAME              REFERENCE                TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       1          6m
hpa-example-hpa   Deployment/hpa-example   200%/50%   1         100       1          6m23s
hpa-example-hpa   Deployment/hpa-example   200%/50%   1         100       4          6m31s
hpa-example-hpa   Deployment/hpa-example   210%/50%   1         100       4          7m16s
hpa-example-hpa   Deployment/hpa-example   210%/50%   1         100       4          7m16s
hpa-example-hpa   Deployment/hpa-example   90%/50%    1         100       4          8m16s
hpa-example-hpa   Deployment/hpa-example   85%/50%    1         100       4          9m16s
hpa-example-hpa   Deployment/hpa-example   51%/50%    1         100       7          9m31s
hpa-example-hpa   Deployment/hpa-example   51%/50%    1         100       7          10m16s
hpa-example-hpa   Deployment/hpa-example   51%/50%    1         100       7          11m
View the scaling event of the FederatedHPA, from which you can see the effective time of this policy.
kubectl describe federatedhpa hpa-example-hpa
- Stop accessing the Deployment and observe its automatic scale-in process.
View the FederatedHPA. You can see that the CPU usage is 21% at 13m. The number of pods is reduced to 3 at 18m and then to 1 at 23m.
kubectl get federatedhpa hpa-example-hpa --watch
NAME              REFERENCE                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-example-hpa   Deployment/hpa-example   50%/50%   1         100       7          12m
hpa-example-hpa   Deployment/hpa-example   21%/50%   1         100       7          13m
hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       7          14m
hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       7          18m
hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       3          18m
hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       3          19m
hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       3          19m
hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       3          19m
hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       3          19m
hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       3          23m
hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       3          23m
hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       1          23m
View the scaling event of the FederatedHPA, from which you can see the effective time of this policy.
kubectl describe federatedhpa hpa-example-hpa
- When the triggering time of the CronFederatedHPA arrives, observe the automatic scaling process of the Deployment.
The number of pods is increased to 4 at 114m, to 7 at 119m, and then to 10 at 123m.
kubectl get cronfederatedhpa cron-federated-hpa --watch
NAME                 REFERENCE                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
cron-federated-hpa   Deployment/hpa-example   50%/50%   1         100       1          112m
cron-federated-hpa   Deployment/hpa-example   21%/50%   1         100       1          113m
cron-federated-hpa   Deployment/hpa-example   0%/50%    1         100       4          114m
cron-federated-hpa   Deployment/hpa-example   0%/50%    1         100       4          118m
cron-federated-hpa   Deployment/hpa-example   0%/50%    1         100       4          118m
cron-federated-hpa   Deployment/hpa-example   0%/50%    1         100       4          119m
cron-federated-hpa   Deployment/hpa-example   0%/50%    1         100       7          119m
cron-federated-hpa   Deployment/hpa-example   0%/50%    1         100       7          119m
cron-federated-hpa   Deployment/hpa-example   0%/50%    1         100       7          119m
cron-federated-hpa   Deployment/hpa-example   0%/50%    1         100       7          123m
cron-federated-hpa   Deployment/hpa-example   0%/50%    1         100       10         123m
cron-federated-hpa   Deployment/hpa-example   0%/50%    1         100       10         123m
View the scaling event of the CronFederatedHPA, from which you can see the effective time of this policy.
kubectl describe cronfederatedhpa cron-federated-hpa
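The metric-based scale-out observed earlier (1 pod at 200%, then 4 pods at 85%, then 7 pods at 51%) is consistent with the standard Kubernetes HPA calculation, in which the desired pod count is the current count multiplied by the ratio of current to target usage, rounded up. The following Python sketch reproduces that arithmetic. It assumes that FederatedHPA follows the same formula and the Kubernetes default tolerance of 0.1, neither of which is stated on this page. It ignores stabilization windows and pod readiness, and the function name desired_replicas is hypothetical.

```python
import math


def desired_replicas(current, usage_pct, target_pct,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the HPA replica calculation for a utilization target.

    Assumptions: standard Kubernetes HPA formula, default tolerance 0.1.
    """
    ratio = usage_pct / target_pct
    # Within the tolerance range, the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current
    wanted = math.ceil(current * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


# Bounds taken from the MINPODS and MAXPODS columns of the command output.
print(desired_replicas(1, 200, 50, 1, 100))  # 4
print(desired_replicas(4, 85, 50, 1, 100))   # 7
print(desired_replicas(7, 51, 50, 1, 100))   # 7
```

The last call shows the tolerance range in action: 51% against a 50% target gives a ratio of 1.02, so the pod count stays at 7.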