Updated on 2025-08-15 GMT+08:00

Workload Scaling

There are two workload scaling methods: auto scaling and manual scaling.

  • Auto scaling: triggered by metrics. Once configured, pods are automatically added or deleted as resource usage changes over time.
  • Manual scaling: pods are added or deleted immediately after you change the replica count.

Constraints

  • Auto scaling is only supported for Deployments.
  • CCI 2.0 supports auto scaling only in the TR-Istanbul, AF-Johannesburg, AP-Singapore, and ME-Riyadh regions.

Auto Scaling

  • If spec.metrics.resource.target.type is set to Utilization, you need to specify the resource requests value when creating a workload.
  • When spec.metrics.resource.target.type is set to Utilization, the resource usage is calculated as follows: Resource usage = Used resources/Total resources of the pod flavor. You can determine the actual flavor of a pod by referring to Pod Flavor.
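The utilization formula above can be illustrated with a short calculation. The pod flavor values below (2 vCPUs, 4 GiB of memory) are hypothetical examples, not CCI defaults:

```python
def utilization_percent(used: float, flavor_total: float) -> float:
    """Resource usage = used resources / total resources of the pod flavor,
    expressed as a percentage."""
    return used / flavor_total * 100

# Hypothetical pod with a 2-vCPU / 4-GiB flavor:
print(utilization_percent(1.0, 2.0))  # vCPU usage: 50.0
print(utilization_percent(3.0, 4.0))  # memory usage: 75.0
```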

A properly configured auto scaling policy includes the metric, threshold, and step. It eliminates the need to manually adjust resources in response to service changes and traffic bursts, reducing both manual effort and resource costs.

CCI 2.0 supports auto scaling based on metrics. The workloads are scaled out or in based on the vCPU or memory usage. You can specify a vCPU or memory usage threshold. If the usage is higher or lower than the threshold, pods are automatically added or deleted.
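Assuming CCI 2.0's HPA follows the standard Kubernetes HorizontalPodAutoscaler algorithm (an assumption worth verifying against the CCI API reference), the scaling decision can be sketched as:

```python
import math

def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     min_replicas: int,
                     max_replicas: int) -> int:
    """Standard Kubernetes HPA formula:
    desired = ceil(current * currentMetric / targetMetric),
    clamped to [minReplicas, maxReplicas]."""
    desired = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(desired, max_replicas))

# With a 50% target, 2 pods running at 80% usage scale out to 4 pods.
print(desired_replicas(2, 80, 50, 1, 5))  # 4
```

The min/max clamp is why the policy file below requires both minReplicas and maxReplicas: they bound how far a traffic burst can push the replica count.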

  • Configure a metric-based auto scaling policy.
    1. Log in to the CCI 2.0 console.
    2. In the navigation pane, choose Workloads. On the Deployments tab, locate the target Deployment and click its name.
    3. On the Auto Scaling tab, click Create from YAML to configure an auto scaling policy.

      The following describes the auto scaling policy file format:

      • Resource description in the hpa.yaml file
        kind: HorizontalPodAutoscaler
        apiVersion: cci/v2
        metadata:
          name: nginx              # Name of the HorizontalPodAutoscaler
          namespace: test          # Namespace of the HorizontalPodAutoscaler
        spec:
          scaleTargetRef:         # Reference the resource to be automatically scaled.
            kind: Deployment      # Type of the target resource, for example, Deployment
            name: nginx           # Name of the target resource
            apiVersion: cci/v2    # Version of the target resource
          minReplicas: 1          # Minimum number of replicas for HPA scaling
          maxReplicas: 5          # Maximum number of replicas for HPA scaling
          metrics:
            - type: Resource                 # Resource metrics are used.
              resource:
                name: memory                 # Resource name, for example, cpu or memory
                target:
                  type: Utilization          # Metric type. The value can be Utilization (percentage) or AverageValue (absolute value).
                  averageUtilization: 50     # Target usage. For example, when the memory usage exceeds 50%, a scale-out is triggered.
          behavior:
            scaleUp:
              stabilizationWindowSeconds: 30  # Scale-out stabilization duration, in seconds
              policies:
              - type: Pods           # Scale by a fixed number of pods per period
                value: 1
                periodSeconds: 30  # The check is performed once every 30 seconds.
            scaleDown:
              stabilizationWindowSeconds: 30  # Scale-in stabilization duration, in seconds
              policies:
              - type: Percent      # Scale in by a percentage of the current number of pods.
                value: 50
                periodSeconds: 30  # The check is performed once every 30 seconds.
      • Resource description in the hpa.json file (JSON does not support comments; the fields mirror the annotated hpa.yaml above)
        {
          "kind": "HorizontalPodAutoscaler",
          "apiVersion": "cci/v2",
          "metadata": {
            "name": "nginx",
            "namespace": "test"
          },
          "spec": {
            "scaleTargetRef": {
              "kind": "Deployment",
              "name": "nginx",
              "apiVersion": "cci/v2"
            },
            "minReplicas": 1,
            "maxReplicas": 5,
            "metrics": [
              {
                "type": "Resource",
                "resource": {
                  "name": "memory",
                  "target": {
                    "type": "Utilization",
                    "averageUtilization": 50
                  }
                }
              }
            ],
            "behavior": {
              "scaleUp": {
                "stabilizationWindowSeconds": 30,
                "policies": [
                  {
                    "type": "Pods",
                    "value": 1,
                    "periodSeconds": 30
                  }
                ]
              },
              "scaleDown": {
                "stabilizationWindowSeconds": 30,
                "policies": [
                  {
                    "type": "Percent",
                    "value": 50,
                    "periodSeconds": 30
                  }
                ]
              }
            }
          }
        }
    4. Click OK.
      You can view the auto scaling policy on the Auto Scaling tab.
      Figure 1 Auto scaling policy

      When the trigger condition is met, the auto scaling policy will be executed.
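Before submitting a policy file, you can sanity-check it locally. This is a minimal sketch using only the Python standard library; the helper name and the specific checks are illustrative, not a CCI tool:

```python
import json

# The same HPA resource as the hpa.json example above, as a string.
HPA_JSON = """
{
  "kind": "HorizontalPodAutoscaler",
  "apiVersion": "cci/v2",
  "metadata": {"name": "nginx", "namespace": "test"},
  "spec": {
    "scaleTargetRef": {"kind": "Deployment", "name": "nginx", "apiVersion": "cci/v2"},
    "minReplicas": 1,
    "maxReplicas": 5
  }
}
"""

def check_hpa(doc: dict) -> None:
    """Hypothetical sanity checks on required HPA fields."""
    assert doc["kind"] == "HorizontalPodAutoscaler"
    assert doc["spec"]["scaleTargetRef"]["kind"] == "Deployment"
    # Scaling bounds must be consistent.
    assert doc["spec"]["minReplicas"] <= doc["spec"]["maxReplicas"]

hpa = json.loads(HPA_JSON)  # also catches syntax errors such as stray comments
check_hpa(hpa)
print("hpa.json is structurally valid")
```

A plain `json.loads` call is enough to catch the most common mistake in hand-edited files: `#` comments, which are valid in YAML but not in JSON.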

Manual Scaling

  1. Log in to the CCI 2.0 console.
  2. In the navigation pane, choose Workloads. On the Deployments tab, locate the target Deployment and click Edit YAML.
  3. Change the value of spec.replicas, for example, to 3, and click OK.
  4. On the Pods tab, view the new pods being created. When all of the new pods are in the Running state, the scaling is complete.

    Figure 2 Pod list after a manual scaling
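The console edit in the steps above amounts to changing a single field on the Deployment. If you drive a Kubernetes-compatible API endpoint instead of the console (an assumption; confirm against the CCI 2.0 API reference), the equivalent JSON merge patch body is:

```python
import json

# JSON merge patch equivalent of setting spec.replicas to 3 in the console.
patch = {"spec": {"replicas": 3}}
body = json.dumps(patch)
print(body)  # {"spec": {"replicas": 3}}
```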