Creating a FederatedHPA to Scale Pods Based on Metric Changes
This section describes how you can create a FederatedHPA so that pods in workloads are automatically scaled in or out based on different metrics.
Before creating a FederatedHPA, make sure you understand the basic working principles and concepts of FederatedHPA (How FederatedHPA Works). For the differences between FederatedHPA and CronFederatedHPA, see Overview.
Constraints
A FederatedHPA can be configured only for clusters of version 1.19 or later. To check the cluster version, log in to the UCS console, click the name of the fleet that the cluster belongs to, and click Container Clusters.
Creating a FederatedHPA
Using the console
- Log in to the UCS console and choose Fleets in the navigation pane.
- Click the name of the fleet with federation enabled.
- Choose Workload Scaling in the navigation pane and click the Metric-based Policy tab. Then click Create Metric-based Policy in the upper right corner.
- Configure parameters for the FederatedHPA.
Table 1 FederatedHPA parameters

- Policy Name: Enter a name of 4 to 63 characters for the FederatedHPA.
- Namespace: Select the namespace of the workload for which you want to set automatic scaling. You can also create a namespace. For details, see Namespaces.
- Applicable Workload: Select the name of the workload for which you want to set automatic scaling. You can also create a workload. For details, see Workloads.
- Pod Range: Enter the minimum and maximum numbers of pods. When the FederatedHPA is triggered, pods are scaled within this range.
  - Min.: Enter an integer from 1 to 299.
  - Max.: Enter an integer from 1 to 1,500. The value must be greater than the minimum value.
- Stabilization Window: A scaling operation is initiated only when the metric continuously reaches the desired value within the stabilization window. By default, the stabilization window is 0 seconds for a scale-out and 300 seconds for a scale-in. For details about the stabilization window, see How Do I Ensure the Stability of Workload Scaling?.
  - Scale-out: Enter an integer from 0 to 3,600, in seconds.
  - Scale-in: Enter an integer from 0 to 3,600, in seconds.
- System rule: Configure this rule to scale pods for a workload based on system metrics.
  - Metric Name: Select CPU utilization or Memory utilization.
  - Expected Value: A scaling operation is triggered when the metric reaches the desired value.
- Custom rule: Configure this rule to scale pods for a workload based on custom metrics.
  - Metric Name: Select a name from the drop-down list.
  - Source: Select the object type described by the custom metric from the drop-down list. Currently, only Pod is supported.
  - Expected Value: A scaling operation is triggered when the metric reaches the desired value.

  CAUTION:
  - Custom rules can be created only for clusters of version 1.19 or later.
  - Before using a custom rule, install an add-on that supports custom metric collection for the cluster and ensure that the add-on can collect and report the custom metrics of the workloads. For details, see Installing a Metric Collection Add-on.
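As a sketch of what a custom (Pods-type) rule looks like in a FederatedHPA manifest: the metric name `http_requests_per_second` and the target value below are illustrative assumptions, not values from this page, and the metric must actually be reported by the installed metric collection add-on.

```yaml
# Hypothetical custom-metric rule fragment for spec.metrics in a FederatedHPA.
# The metric name and target value are assumptions for illustration only.
metrics:
  - type: Pods                           # Custom metric collected from pods
    pods:
      metric:
        name: http_requests_per_second   # Assumed metric name reported by the add-on
      target:
        type: AverageValue               # Total metric value divided by the current number of pods
        averageValue: "100"              # Scale so that each pod handles about 100 requests/s
```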
- Click Create in the lower right corner.
In the displayed policy list, you can view the policy details.
Using kubectl
- Use kubectl to connect to the federation. For details, see Using kubectl to Connect to a Federation.
- Create and edit an fhpa.yaml file. For details about the key parameters, see Table 2.
vi fhpa.yaml
In this example, the FederatedHPA is named hpa-example-hpa and associated with the workload named hpa-example. The stabilization window is 0 seconds for a scale-out and 300 seconds for a scale-in. The maximum number of pods is 100 and the minimum number of pods is 2. There are two system metric rules: memory and cpu. The memory rule sets the desired average memory usage to 50%, and the cpu rule sets the desired average CPU usage to 60%.
apiVersion: autoscaling.karmada.io/v1alpha1
kind: FederatedHPA
metadata:
  name: hpa-example-hpa                 # FederatedHPA name
  namespace: default                    # Namespace where the workload resides
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-example                   # Workload name
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # The stabilization window is 300 seconds for a scale-in.
    scaleUp:
      stabilizationWindowSeconds: 0     # The stabilization window is 0 seconds for a scale-out.
  minReplicas: 2                        # The minimum number of pods is 2.
  maxReplicas: 100                      # The maximum number of pods is 100.
  metrics:
    - type: Resource
      resource:
        name: memory                    # Name of the first rule
        target:
          type: Utilization             # The metric type is resource usage.
          averageUtilization: 50        # Desired average resource usage
    - type: Resource
      resource:
        name: cpu                       # Name of the second rule
        target:
          type: Utilization             # The metric type is resource usage.
          averageUtilization: 60        # Desired average resource usage
Table 2 Key parameters

- behavior.scaleDown.stabilizationWindowSeconds (Optional, Integer): Stabilization window for a scale-in, in seconds. Enter an integer from 0 to 3,600. If this parameter is not specified, the default value is 300.
  NOTE: A scaling operation is initiated only when the metric continuously reaches the desired value within the stabilization window. For details about the stabilization window, see How Do I Ensure the Stability of Workload Scaling?.
- behavior.scaleUp.stabilizationWindowSeconds (Optional, Integer): Stabilization window for a scale-out, in seconds. Enter an integer from 0 to 3,600. If this parameter is not specified, the default value is 0.
- minReplicas (Mandatory, Integer): Minimum number of pods that a workload can be scaled in to when the scaling policy is triggered. Enter an integer from 1 to 299.
- maxReplicas (Mandatory, Integer): Maximum number of pods that a workload can be scaled out to when the scaling policy is triggered. Enter an integer from 1 to 1,500. The value must be greater than the minimum value.
- name (Mandatory, String): Rule name. For scaling based on system metrics, memory is the rule name for memory usage and cpu is the rule name for CPU usage.
- type (Mandatory, String): Metric target type.
  - Value: the absolute value of the metric.
  - AverageValue: the total metric value divided by the current number of pods.
  - Utilization: the CPU or memory used by the pods in a workload divided by the requested CPU or memory.
- averageUtilization (Mandatory, Integer): Desired average resource usage. A scaling operation is triggered when the metric reaches this value.
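As a sketch of how the Utilization and AverageValue target types map to YAML for resource metrics (the threshold values below are illustrative assumptions): a Utilization target compares usage against the requested resources as a percentage, while an AverageValue target divides the metric total by the current pod count.

```yaml
# Illustrative resource-metric rules; the threshold values are assumptions.
metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization        # Used CPU / requested CPU, as a percentage
        averageUtilization: 60   # Desired average: 60%
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue       # Total memory in use / current number of pods
        averageValue: 500Mi      # Desired average per pod: 500 MiB
```

A Value target is typically used with object metrics rather than per-pod resource metrics, because an absolute value is not averaged across pods.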
- Create a FederatedHPA.
kubectl apply -f fhpa.yaml
If information similar to the following is displayed, the policy has been created:
FederatedHPA.autoscaling.karmada.io/hpa-example-hpa created
You can run the following commands to check the workload scaling:
- kubectl get deployments: checks the current number of pods in a workload.
- kubectl describe federatedhpa hpa-example-hpa: views scaling events (latest three records) of the FederatedHPA.
You can run the following commands to manage the FederatedHPA hpa-example-hpa (replace it with the actual name):
- kubectl get federatedhpa hpa-example-hpa: obtains the FederatedHPA.
- kubectl edit federatedhpa hpa-example-hpa: updates the FederatedHPA.
- kubectl delete federatedhpa hpa-example-hpa: deletes the FederatedHPA.