Updated on 2024-06-26 GMT+08:00

Egress Network Bandwidth Guarantee

Egress network bandwidth guarantee is implemented by setting network priorities. It has the following advantages:

  • Egress bandwidth is balanced between online and offline services so that online services always have sufficient bandwidth. When the bandwidth usage of online services reaches the threshold, the bandwidth available to offline services is reduced.
  • When online services use little bandwidth, offline services can use more. When online services use a large amount of bandwidth, the bandwidth available to offline services is reduced so that online services are prioritized.

Notes and Constraints

To use egress network bandwidth guarantee, the following requirements must be met:

  • Only nodes running Huawei Cloud EulerOS 2.0 are supported.
  • Only CCE Turbo clusters of v1.23 or later are supported.
  • The Volcano add-on of v1.9.0 or later must be installed in the cluster, and the hybrid deployment function must be enabled by setting colocation_enable in the advanced settings to true.
  • Before enabling, modifying, or disabling egress network bandwidth guarantee, ensure that the Volcano add-on is running properly.
  • Pods that were already running on a node before the Volcano add-on was installed must be manually restarted after egress network bandwidth guarantee is enabled so that the feature takes effect for them.
  • Uninstalling the Volcano add-on or disabling the hybrid deployment function (that is, setting colocation_enable in the advanced settings to false) does not affect the existing egress network bandwidth guarantee on the node. To disable this feature, see Disabling Egress Network Bandwidth Guarantee.
  • If bandwidth limiting takes effect, packets may accumulate in the protocol stack buffers. For protocols without a backpressure mechanism, such as UDP, packet loss and ENOBUFS errors may occur.
  • Bandwidth limiting increases the risk that offline services cannot obtain enough bandwidth. As a result, offline services may become abnormal or pod health checks may fail.
  • Egress network bandwidth guarantee is overridden or does not take effect in the following scenarios:
    • When a network bandwidth limit is also configured for online or offline pods in hybrid deployment, the network bandwidth limit takes priority over this feature.
    • When a pod uses the node network (hostNetwork), egress network bandwidth guarantee does not take effect.
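The version and configuration requirements above can be collected into a simple pre-check. The following is a minimal sketch only: the function names and the idea of passing the values in as plain strings are assumptions made for illustration, and how those values are obtained from a real cluster is not shown.

```python
import re


def parse_version(text):
    """Convert a version such as 'v1.9.0' or '1.23' into a 3-part tuple of ints."""
    parts = [int(p) for p in re.findall(r"\d+", text)][:3]
    parts += [0] * (3 - len(parts))  # pad '1.23' to (1, 23, 0)
    return tuple(parts)


def unmet_requirements(node_os, cluster_type, cluster_version,
                       volcano_version, colocation_enable):
    """Return a list describing every requirement from the list above that is not met."""
    problems = []
    if node_os != "Huawei Cloud EulerOS 2.0":
        problems.append("node OS must be Huawei Cloud EulerOS 2.0")
    if cluster_type != "CCE Turbo":
        problems.append("cluster must be a CCE Turbo cluster")
    if parse_version(cluster_version) < (1, 23, 0):
        problems.append("cluster version must be v1.23 or later")
    if parse_version(volcano_version) < (1, 9, 0):
        problems.append("Volcano add-on must be v1.9.0 or later")
    if not colocation_enable:
        problems.append("colocation_enable must be set to true")
    return problems


# Hypothetical input values, for illustration only.
ok = unmet_requirements("Huawei Cloud EulerOS 2.0", "CCE Turbo", "v1.23", "1.9.0", True)
bad = unmet_requirements("Ubuntu 22.04", "CCE Turbo", "v1.21", "1.7.1", False)
print(ok)
print(bad)
```

An empty list means that none of the checked requirements is violated. The remaining constraints, such as restarting existing pods, are operational and are not covered by this sketch.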

Procedure

The following describes how to enable or disable egress network bandwidth guarantee.

  1. Log in to the CCE console and click the cluster name to access the cluster console.
  2. In the navigation pane on the left, choose Nodes. Click the Node Pools tab. When creating or updating a node pool, enable hybrid deployment of online and offline services in Advanced Settings.

    • volcano.sh/oversubscription=true
    • volcano.sh/colocation=true
    Figure 1 Node label settings

  3. In the navigation pane, choose Add-ons, locate the Volcano add-on, and click Install. On the Install Add-on page, enable hybrid deployment in the Parameters area. For details about the installation, see Volcano Scheduler.

    If the Volcano add-on has been installed, click Edit to view or modify the parameter colocation_enable.

    Figure 2 Enabling hybrid deployment of online and offline services

    After CPU burst is disabled, this function is still enabled on the existing pods where CPU burst has been enabled. Disabling CPU burst takes effect only on new pods.

  4. (Optional) Modify parameters for egress network bandwidth guarantee.

    After confirming that the Volcano add-on is working, run the following command to edit the volcano-agent-configuration ConfigMap in the kube-system namespace. If enable under networkQosConfig is set to true (the default), egress network bandwidth guarantee is enabled and the related parameters can be modified.
    kubectl edit configmap -n kube-system volcano-agent-configuration

    Example:

    ...
    data:
      colocation-config: |
        {
            "globalConfig":{
                "cpuBurstConfig":{
                    "enable":true
                },
                "networkQosConfig":{
                    "enable":true,
                    "onlineBandwidthWatermarkPercent":80,
                    "offlineLowBandwidthPercent":10,
                    "offlineHighBandwidthPercent":40
                },
    ...

    The modified parameters take effect for all nodes running Huawei Cloud EulerOS 2.0 in the cluster.
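Because colocation-config is a JSON document stored as a string, a syntax error introduced during editing can be caught by parsing the value before saving it. The following is a minimal sketch under stated assumptions: the sample text contains only the fields shown in the example above (the elided parts of the real ConfigMap are omitted), and the helper name read_network_qos is hypothetical.

```python
import json

# Stand-in for the value of data["colocation-config"]. Only the fields shown in
# the example are included; the real ConfigMap contains more settings.
COLOCATION_CONFIG = """
{
    "globalConfig": {
        "cpuBurstConfig": {
            "enable": true
        },
        "networkQosConfig": {
            "enable": true,
            "onlineBandwidthWatermarkPercent": 80,
            "offlineLowBandwidthPercent": 10,
            "offlineHighBandwidthPercent": 40
        }
    }
}
"""


def read_network_qos(config_text):
    """Parse the JSON text and return the networkQosConfig section.

    json.loads raises json.JSONDecodeError if the text is not valid JSON.
    """
    config = json.loads(config_text)
    return config["globalConfig"]["networkQosConfig"]


qos = read_network_qos(COLOCATION_CONFIG)
for name, value in qos.items():
    print(f"{name} = {value}")
```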

    Table 1 networkQosConfig parameters

    Name: enable
    Description: Specifies whether to enable the egress network bandwidth guarantee feature.
    Default Value: true
    Configuration Range: true or false

    Name: onlineBandwidthWatermarkPercent
    Description: Ratio of the total bandwidth threshold of online services to the assured bandwidth of the node type.
    Total bandwidth threshold of online services = Assured bandwidth of the node type x onlineBandwidthWatermarkPercent/100
    Default Value: 80
    Configuration Range: 1 to 1000
    NOTE: The actual network bandwidth may be larger than the assured bandwidth but less than the maximum bandwidth. Therefore, the value can be greater than 100.

    Name: offlineLowBandwidthPercent
    Description: Ratio of the maximum total bandwidth usage of offline services to the assured bandwidth of the node type when the bandwidth usage of online services exceeds the threshold.
    If the total bandwidth usage of online services on the same node exceeds Assured bandwidth of the node type x onlineBandwidthWatermarkPercent/100, the total bandwidth usage of offline services on that node cannot exceed Assured bandwidth of the node type x offlineLowBandwidthPercent/100.
    Default Value: 10

    Name: offlineHighBandwidthPercent
    Description: Ratio of the maximum total bandwidth usage of offline services to the assured bandwidth of the node type when the bandwidth usage of online services does not exceed the threshold.
    If the total bandwidth usage of online services on the same node does not exceed Assured bandwidth of the node type x onlineBandwidthWatermarkPercent/100, the total bandwidth usage of offline services on that node cannot exceed Assured bandwidth of the node type x offlineHighBandwidthPercent/100.
    Default Value: 40

    Figure 3 Example of egress network bandwidth guarantee

    In the preceding figure, when the bandwidth used by the online job is below the bandwidth baseline, the bandwidth threshold of the offline job is relatively high, so the offline job can use more bandwidth. When the bandwidth used by the online job exceeds the baseline, the bandwidth threshold of the offline job is lowered, which reduces the bandwidth used by the offline job and reserves more bandwidth for the online job.
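The relationship between the three percentages can be worked through with concrete numbers. The following is a minimal sketch that models only the formulas from Table 1, not how the Volcano agent enforces them. The function name and the assured bandwidth of 1000 Mbit/s are assumptions chosen for illustration.

```python
def offline_bandwidth_cap(online_usage, assured_bandwidth,
                          online_watermark_percent=80,
                          offline_low_percent=10,
                          offline_high_percent=40):
    """Return (online threshold, offline cap) in the unit of assured_bandwidth.

    The defaults mirror the default values listed in Table 1.
    """
    if not 1 <= online_watermark_percent <= 1000:
        raise ValueError("onlineBandwidthWatermarkPercent must be between 1 and 1000")
    threshold = assured_bandwidth * online_watermark_percent / 100
    if online_usage > threshold:
        # Online usage exceeds the threshold: the lower offline cap applies.
        percent = offline_low_percent
    else:
        # Online usage does not exceed the threshold: the higher offline cap applies.
        percent = offline_high_percent
    return threshold, assured_bandwidth * percent / 100


ASSURED = 1000  # hypothetical assured bandwidth of the node type, in Mbit/s

for online_usage in (200, 800, 900):
    threshold, cap = offline_bandwidth_cap(online_usage, ASSURED)
    print(f"online={online_usage} threshold={threshold} offline_cap={cap}")
```

With these assumed numbers and the default percentages, the online threshold is 800 Mbit/s. Offline services are capped at 400 Mbit/s while online usage stays at or below 800 Mbit/s, and at 100 Mbit/s once online usage exceeds it. Setting onlineBandwidthWatermarkPercent to a value above 100, such as 120, moves the threshold above the assured bandwidth (1200 Mbit/s here), which matches the note in Table 1.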

  5. To disable egress network bandwidth guarantee, confirm that the Volcano add-on is working, and then run the following command to edit the volcano-agent-configuration ConfigMap in the kube-system namespace. Set enable under networkQosConfig to false.

    kubectl edit configmap -n kube-system volcano-agent-configuration

    Modify the following parameters:

    ...
    data:
      colocation-config: |
        {
            "globalConfig":{
                "cpuBurstConfig":{
                    "enable":true
                },
                "networkQosConfig":{
                    "enable":false,
                    "onlineBandwidthWatermarkPercent":80,
                    "offlineLowBandwidthPercent":10,
                    "offlineHighBandwidthPercent":40
                },
    ...
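Disabling the feature amounts to changing a single boolean inside the JSON value while leaving every other setting untouched. The following is a minimal sketch of that change on a sample string. The helper name set_network_qos_enabled is hypothetical, and the sample contains only the fields shown in the example above, not the full content of the real ConfigMap.

```python
import json

# Sample limited to the fields shown in the example; the real value has more.
SAMPLE_CONFIG = """
{
    "globalConfig": {
        "cpuBurstConfig": {"enable": true},
        "networkQosConfig": {
            "enable": true,
            "onlineBandwidthWatermarkPercent": 80,
            "offlineLowBandwidthPercent": 10,
            "offlineHighBandwidthPercent": 40
        }
    }
}
"""


def set_network_qos_enabled(config_text, enabled):
    """Return the JSON text with networkQosConfig.enable set to the given value."""
    config = json.loads(config_text)
    config["globalConfig"]["networkQosConfig"]["enable"] = bool(enabled)
    return json.dumps(config, indent=4)


updated = set_network_qos_enabled(SAMPLE_CONFIG, False)
print(updated)
```

Only the enable field of networkQosConfig changes. The cpuBurstConfig section and the three bandwidth percentages are written back with their original values.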