Using Multi-Cluster Workload Scaling to Scale Workloads

Updated on 2024-11-01 GMT+08:00

Application Scenarios

Some services see both predictable and unpredictable traffic peaks. If you use only FederatedHPA, scaling starts only after metrics change, so adding pods takes time and services may be unavailable during an expected peak. To cover both cases, UCS provides two scaling policies: FederatedHPA, which scales the pods of a workload based on metric changes, and CronFederatedHPA, which scales them at scheduled times.

This section uses hpa-example as an example to describe how you can use both FederatedHPA and CronFederatedHPA to scale workloads.

Solution Process

Figure 1 shows how to use both FederatedHPA and CronFederatedHPA.

  1. Make preparations. Before creating workload scaling policies, prepare two Huawei Cloud clusters that have been registered with UCS, install Kubernetes Metrics Server for each cluster, and create an image named hpa-example.
  2. Create a workload. Create a Deployment using the prepared image, create a Service, and create a scheduling policy that deploys both to the clusters.
  3. Create scaling policies. Use the command line tool to create a FederatedHPA and a CronFederatedHPA.
  4. Observe scaling processes. View the number of pods in the Deployment and observe the effects of the scaling policies.
Figure 1 Process of using both FederatedHPA and CronFederatedHPA

Making Preparations

  • Register two Huawei Cloud clusters (cluster01 and cluster02) with UCS. For details about how to register Huawei Cloud clusters with UCS, see Huawei Cloud Clusters.
  • Install Kubernetes Metrics Server for the clusters. For details about how to install this add-on, see Kubernetes Metrics Server.
  • Log in to a cluster node and build a compute-intensive application. Each request triggers a calculation that must finish before the response is returned. The steps are as follows:
    1. Create a PHP file named index.php that performs about one million square root calculations and then returns "OK!".

      vi index.php

      The following provides an example index.php:

      <?php
        $x = 0.0001;
        for ($i = 0; $i <= 1000000; $i++) {
          $x += sqrt($x);
        }
        echo "OK!";
      ?>
    2. Write a Dockerfile for building the image.

      vi Dockerfile

      The following provides an example Dockerfile:
      FROM php:5-apache
      COPY index.php /var/www/html/index.php
      RUN chmod a+rx index.php
    3. Create an image named hpa-example with the latest tag.

      docker build -t hpa-example:latest .

    4. (Optional) Log in to the SWR console. In the navigation pane, choose Organizations. In the upper right corner, click Create Organization. Skip this step if you already have an organization.
    5. In the navigation pane, choose My Images. In the upper right corner, click Upload Through Client. In the displayed dialog box, click Generate a temporary login command. Then, copy the command.
    6. Run the login command copied in the previous step on the node. If the login is successful, "Login Succeeded" will be displayed.
    7. Add a tag to the hpa-example image.
      docker tag {Image name 1:Tag 1} {Image repository address}/{Organization name}/{Image name 2:Tag 2}
      Table 1 Tag parameters

      Parameter                      Description
      {Image name 1:Tag 1}           Name and tag of the image to be uploaded.
      {Image repository address}     Domain name at the end of the login command obtained in step 5.
      {Organization name}            Name of the organization created in step 4.
      {Image name 2:Tag 2}           Image name and tag to be displayed in the SWR image repository.

      The following is a command example:

      docker tag hpa-example:latest swr.ap-southeast-1.myhuaweicloud.com/cloud-develop/hpa-example:latest

    8. Push the image to the image repository.

      docker push {Image repository address}/{Organization name}/{Image name 2:Tag 2}

      The following is a command example:

      docker push swr.ap-southeast-1.myhuaweicloud.com/cloud-develop/hpa-example:latest

      If information similar to the following is returned, the image has been pushed successfully.

      6d6b9812c8ae: Pushed 
      ... 
      fe4c16cbf7a4: Pushed 
      latest: digest: sha256:eb7e3bbd*** size: **
    9. To view the pushed image, go to the SWR console and refresh the My Images page.
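
The target reference used in the tag and push steps is the concatenation of the parts listed in Table 1. The following is a minimal sketch of that composition. The helper name swr_image_ref is hypothetical and is not part of any Huawei Cloud tool; the sample values are the ones used in the command examples above.

```python
def swr_image_ref(repo_address, organization, image, tag="latest"):
    """Join the parts from Table 1 into {repository}/{organization}/{image}:{tag}."""
    parts = [repo_address.strip("/"), organization.strip("/"), image.strip("/")]
    if not all(parts) or not tag:
        raise ValueError("every part of the image reference must be non-empty")
    return "/".join(parts) + ":" + tag


if __name__ == "__main__":
    target = swr_image_ref(
        "swr.ap-southeast-1.myhuaweicloud.com",  # image repository address
        "cloud-develop",                         # organization name
        "hpa-example",                           # image name shown in SWR
    )
    # Print the two commands from steps 7 and 8 with the composed reference.
    print(f"docker tag hpa-example:latest {target}")
    print(f"docker push {target}")
```

Composing the reference once and reusing it for both commands avoids a mismatch between the tagged name and the pushed name.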

Creating a Workload

  1. Use the hpa-example image to create a Deployment with one pod. The image path varies with the SWR repository and needs to be replaced with the actual value.

    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: hpa-example
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: hpa-example
      template:
        metadata:
          labels:
            app: hpa-example
        spec:
          containers:
          - name: container-1
            image: 'hpa-example:latest'   # Replace it with the path of the image you uploaded to SWR.
            resources:
              limits:                      # Keep the value same as that of requests to prevent flapping during scaling.
                cpu: 500m
                memory: 200Mi
              requests:                    
                cpu: 500m
                memory: 200Mi
          imagePullSecrets:
          - name: default-secret

  2. Create a NodePort Service that exposes port 80.

    kind: Service
    apiVersion: v1
    metadata:
      name: hpa-example
    spec:
      ports:
        - name: cce-service-0
          protocol: TCP
          port: 80
          targetPort: 80
          nodePort: 31144
      selector:
        app: hpa-example
      type: NodePort

  3. Create a scheduling policy for the Deployment and Service and deploy the Deployment and Service in cluster01 and cluster02, with the weight of each cluster being 1 to ensure that each cluster has the same priority.

    apiVersion: policy.karmada.io/v1alpha1
    kind: PropagationPolicy
    metadata:
      name: hpa-example-pp
      namespace: default
    spec:
      placement:
        clusterAffinity:
          clusterNames:
          - cluster01
          - cluster02
        replicaScheduling:
          replicaDivisionPreference: Weighted
          replicaSchedulingType: Divided
          weightPreference:
            staticWeightList:
            - targetCluster:
                clusterNames:
                - cluster01
              weight: 1
            - targetCluster:
                clusterNames:
                - cluster02
              weight: 1
      preemption: Never
      propagateDeps: true
      resourceSelectors:
      - apiVersion: apps/v1
        kind: Deployment
        name: hpa-example
        namespace: default
      - apiVersion: v1
        kind: Service
        name: hpa-example
        namespace: default
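
To illustrate what the static weights in the scheduling policy mean, the following sketch splits a replica count across clusters in proportion to their weights. It is an approximation for intuition only. In particular, the rule for assigning leftover replicas (highest weight first, then cluster name) is an assumption made here, and the scheduler may distribute the remainder differently.

```python
def divide_replicas(total, weights):
    """Split `total` replicas across clusters in proportion to static weights."""
    weight_sum = sum(weights.values())
    if total < 0 or weight_sum <= 0:
        raise ValueError("total must be >= 0 and weights must sum to a positive value")

    # Each cluster first receives the floor of its proportional share.
    shares = {name: total * w // weight_sum for name, w in weights.items()}

    # Assumption: leftover replicas go to the heaviest clusters, ties broken by name.
    remainder = total - sum(shares.values())
    for name in sorted(weights, key=lambda n: (-weights[n], n))[:remainder]:
        shares[name] += 1
    return shares


if __name__ == "__main__":
    clusters = {"cluster01": 1, "cluster02": 1}
    for replicas in (2, 7, 10):
        print(replicas, divide_replicas(replicas, clusters))
```

With equal weights, an even replica count is split evenly between cluster01 and cluster02, and an odd count differs by one pod between the two clusters.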

Creating Scaling Policies

  1. Create a FederatedHPA.

    vi hpa-example-hpa.yaml

    As described in the YAML file, this policy is associated with the Deployment named hpa-example. The stabilization window is 0 seconds for a scale-out and 100 seconds for a scale-in. The maximum number of pods is 100 and the minimum number of pods is 2. This policy contains a system metric rule in which the desired CPU usage is 50%.

    apiVersion: autoscaling.karmada.io/v1alpha1     
    kind: FederatedHPA
    metadata:
      name: hpa-example-hpa                               # FederatedHPA name
      namespace: default                                  # Namespace where the Deployment resides
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: hpa-example                                 # Deployment name
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 100                 # The stabilization window is 100 seconds for a scale-in.
        scaleUp:
          stabilizationWindowSeconds: 0                   # The stabilization window is 0 seconds for a scale-out.
      minReplicas: 2                                      # The minimum number of pods is 2.
      maxReplicas: 100                                    # The maximum number of pods is 100.
      metrics:
        - type: Resource
          resource:
            name: cpu                                     # CPU-based scaling metrics
            target:
              type: Utilization                           # The metric type is resource usage.
              averageUtilization: 50                      # Desired average resource usage

  2. Create a CronFederatedHPA.

    vi cron-federated-hpa.yaml

    As described in the YAML file, this policy works with the FederatedHPA named hpa-example-hpa. Every day, it scales the Deployment out to 10 pods at 08:30 and scales it in to 2 pods at 10:00.

    apiVersion: autoscaling.karmada.io/v1alpha1 
    kind: CronFederatedHPA 
    metadata: 
      name: cron-federated-hpa                            # CronFederatedHPA name
    spec: 
      scaleTargetRef: 
        apiVersion: autoscaling.karmada.io/v1alpha1       # API version of the FederatedHPA
        kind: FederatedHPA                                # CronFederatedHPA runs based on FederatedHPA.
        name: hpa-example-hpa                             # FederatedHPA name
      rules: 
      - name: "Scale-Up"                                  # Rule name
        schedule: 30 08 * * *                             # Time when the policy is triggered
        targetReplicas: 10                                # Desired number of pods, which is a non-negative integer
        timeZone: Asia/Shanghai                           # Time zone
      - name: "Scale-Down"                                # Rule name
        schedule: 0 10 * * *                              # Time when the policy is triggered
        targetReplicas: 2                                 # Desired number of pods, which is a non-negative integer
        timeZone: Asia/Shanghai                           # Time zone
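
The two rules above can be read as a daily timetable. The following sketch works out which rule fired most recently at a given moment and the replica count that rule requested. It is illustrative only: it handles only schedules of the form "minute hour * * *", it models Asia/Shanghai as a fixed UTC+8 offset (an assumption that holds while the zone has no daylight saving time), and it does not model how the FederatedHPA adjusts the pod count afterwards.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Asia/Shanghai is treated as a fixed UTC+8 offset.
SHANGHAI = timezone(timedelta(hours=8))

# (rule name, schedule, targetReplicas), copied from the CronFederatedHPA above.
RULES = [
    ("Scale-Up", "30 08 * * *", 10),
    ("Scale-Down", "0 10 * * *", 2),
]


def daily_time(schedule):
    """Return (hour, minute) for a 'minute hour * * *' schedule."""
    minute, hour, day_of_month, month, day_of_week = schedule.split()
    if (day_of_month, month, day_of_week) != ("*", "*", "*"):
        raise ValueError("only daily schedules are handled in this sketch")
    return int(hour), int(minute)


def last_fired(now):
    """Return (rule name, targetReplicas) of the rule that fired most recently."""
    local = now.astimezone(SHANGHAI)
    candidates = []
    for name, schedule, target in RULES:
        hour, minute = daily_time(schedule)
        fired = local.replace(hour=hour, minute=minute, second=0, microsecond=0)
        if fired > local:
            fired -= timedelta(days=1)  # not reached today, so it last fired yesterday
        candidates.append((fired, name, target))
    _, name, target = max(candidates)
    return name, target


if __name__ == "__main__":
    for hour in (7, 9, 11):
        moment = datetime(2024, 11, 1, hour, 0, tzinfo=SHANGHAI)
        print(moment.isoformat(), last_fired(moment))
```

Between 08:30 and 10:00 the most recent rule is Scale-Up (10 pods). At any other time of day it is Scale-Down (2 pods).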

Observing Scaling Processes

  1. View the FederatedHPA. You can see that the CPU usage of the Deployment is 0%.

    kubectl get FederatedHPA hpa-example-hpa
    NAME              REFERENCE                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE 
    hpa-example-hpa   Deployment/hpa-example   0%/50%    1         100       1          6m

  2. Access the Deployment. In the following command, {ip:port} indicates the access address of the Deployment obtained from its details page.

    while true;do wget -q -O- http://{ip:port}; done

  3. Observe the automatic scale-out process of the Deployment.

    kubectl get federatedhpa hpa-example-hpa --watch

    View the FederatedHPA. At 6m23s, the CPU usage of the Deployment is 200%, which exceeds the target value, so the FederatedHPA scales the Deployment out to 4 pods. The CPU usage does not decrease until 8m16s. A possible cause is that the new pods cannot be created immediately: if resources are insufficient, the pods stay in the Pending state until nodes are added.

    At 8m16s, the CPU usage decreases to 90%, indicating that the new pods have been created and are handling traffic. At 9m16s, the CPU usage is 85%, which is still above the target value and outside the tolerance range, so the Deployment is scaled out to 7 pods at 9m31s. The CPU usage then decreases to 51%, which is within the tolerance range, and the number of pods remains 7.

    NAME              REFERENCE                TARGETS    MINPODS   MAXPODS   REPLICAS   AGE 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       1          6m 
    hpa-example-hpa   Deployment/hpa-example   200%/50%   1         100       1          6m23s 
    hpa-example-hpa   Deployment/hpa-example   200%/50%   1         100       4          6m31s 
    hpa-example-hpa   Deployment/hpa-example   210%/50%   1         100       4          7m16s 
    hpa-example-hpa   Deployment/hpa-example   210%/50%   1         100       4          7m16s 
    hpa-example-hpa   Deployment/hpa-example   90%/50%    1         100       4          8m16s 
    hpa-example-hpa   Deployment/hpa-example   85%/50%    1         100       4          9m16s 
    hpa-example-hpa   Deployment/hpa-example   51%/50%    1         100       7          9m31s 
    hpa-example-hpa   Deployment/hpa-example   51%/50%    1         100       7          10m16s 
    hpa-example-hpa   Deployment/hpa-example   51%/50%    1         100       7          11m 

    View the scaling events of the FederatedHPA to see when the policy took effect.

    kubectl describe federatedhpa hpa-example-hpa

  4. Stop accessing the Deployment and observe its automatic scale-in process.

    View the FederatedHPA. You can see that the CPU usage is 21% at 13m. The number of pods is reduced to 3 at 18m and then to 1 at 23m.

    kubectl get federatedhpa hpa-example-hpa --watch

    NAME              REFERENCE                TARGETS    MINPODS   MAXPODS   REPLICAS   AGE 
    hpa-example-hpa   Deployment/hpa-example   50%/50%    1         100       7          12m 
    hpa-example-hpa   Deployment/hpa-example   21%/50%    1         100       7          13m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       7          14m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       7          18m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          18m
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          19m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          19m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          19m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          19m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          23m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       3          23m 
    hpa-example-hpa   Deployment/hpa-example   0%/50%     1         100       1          23m

    View the scaling events of the FederatedHPA to see when the policy took effect.

    kubectl describe federatedhpa hpa-example-hpa

  5. When the triggering time of the CronFederatedHPA arrives, observe the automatic scaling process of the Deployment.

    The number of pods is increased to 4 at 114m, to 7 at 119m, and then to 10 at 123m.

    kubectl get cronfederatedhpa cron-federated-hpa --watch

    NAME                 REFERENCE                TARGETS    MINPODS   MAXPODS   REPLICAS   AGE 
    cron-federated-hpa   Deployment/hpa-example   50%/50%    1         100       1          112m 
    cron-federated-hpa   Deployment/hpa-example   21%/50%    1         100       1          113m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       4          114m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       4          118m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       4          118m
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       4          119m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       7          119m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       7          119m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       7          119m
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       7          123m  
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       10         123m 
    cron-federated-hpa   Deployment/hpa-example   0%/50%     1         100       10         123m

    View the scaling events of the CronFederatedHPA to see when the policy took effect.

    kubectl describe cronfederatedhpa cron-federated-hpa
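
The scale-out steps observed in step 3 (from 1 pod to 4, then from 4 to 7) are consistent with the replica formula used by the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current × current usage / target usage), with no change when the ratio is within a tolerance of 1. The following sketch reproduces those numbers. It assumes that FederatedHPA applies the same rule with the Kubernetes default tolerance of 0.1, which this document does not state, and it ignores stabilization windows and pods that are not yet ready.

```python
import math


def desired_replicas(current, usage_pct, target_pct,
                     min_replicas=1, max_replicas=100, tolerance=0.1):
    """Replica count suggested by the standard HPA formula (assumed here)."""
    ratio = usage_pct / target_pct
    # Within the tolerance band around the target, the replica count is kept.
    if abs(ratio - 1.0) <= tolerance:
        return current
    desired = math.ceil(current * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # (current pods, observed CPU usage in %) taken from the watch output in step 3.
    for pods, usage in ((1, 200), (4, 85), (7, 51)):
        print(pods, usage, desired_replicas(pods, usage, target_pct=50))
```

With a target of 50%, a usage of 200% on 1 pod yields 4 pods, 85% on 4 pods yields 7 pods, and 51% on 7 pods is inside the tolerance band, so the count stays at 7.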
