Updated on 2025-08-28 GMT+08:00

OpenKruise

Introduction

OpenKruise is a suite of extended components for Kubernetes. It uses CRDs to provide advanced workload and application management features, including automated deployment, release, O&M, and availability protection for cloud native applications. These features make application management simpler and more efficient.

OpenKruise has the following core capabilities:

  • Advanced workloads: It contains a set of advanced workloads, such as CloneSets and Advanced StatefulSets.
  • Application sidecar management: It provides SidecarSets to simplify sidecar injection and supports other capabilities, such as in-place sidecar upgrades.
  • Application security protection: It protects your Kubernetes resources from being affected by the cascading deletion mechanism.
  • Efficient application O&M: It provides advanced O&M capabilities to help you better manage applications. For example, you can use the ImagePullJob CRD to pre-pull images onto specified nodes, or use the ContainerRecreateRequest CRD to restart containers in a running pod.
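
As an illustration of the image pre-pull capability, an ImagePullJob could look like the following. This is a minimal sketch, not a configuration taken from this add-on: the job name, the node label, and all numeric values are assumptions chosen for the example, and the job only takes effect when kruise-daemon is enabled.

```yaml
apiVersion: apps.kruise.io/v1alpha1
kind: ImagePullJob
metadata:
  name: prepull-nginx            # hypothetical job name
spec:
  image: nginx:alpine            # image to pull onto the nodes in advance
  parallelism: 10                # assumed value: number of nodes that pull at the same time
  selector:                      # optional; if omitted, the image is pulled on all nodes
    matchLabels:
      node-type: worker          # hypothetical node label
  completionPolicy:
    type: Always                 # run once and then finish
    activeDeadlineSeconds: 1200  # assumed overall timeout for the job
    ttlSecondsAfterFinished: 300 # assumed retention time after the job finishes
  pullPolicy:
    backoffLimit: 3              # assumed number of retries per node
    timeoutSeconds: 300          # assumed timeout for a single pull
  pullSecrets:
  - default-secret               # secret used to pull the image, if required
```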

Open-source community: https://github.com/openkruise/kruise

Typical OpenKruise Workload Controllers

OpenKruise includes workload controllers like CloneSets, Advanced StatefulSets, and Advanced DaemonSets. The typical types of workloads are listed in the table below.

Controller

Description

Enhanced Features

CloneSet

An enhanced controller for stateless applications. It is positioned as an alternative to native Deployments and provides more flexible upgrade and management capabilities.

  • In-place upgrades are supported. There is no need to rebuild pods.
  • Refined upgrade policies are supported. Pods can be upgraded in batches by percentage or quantity. The maximum number of unavailable pods and the minimum number of ready pods can be controlled.
  • Historical versions are retained for quick rollbacks.
  • You can select specific pods for certain operations based on labels.

For more features, see CloneSet.

Advanced StatefulSet

An enhanced controller for stateful applications, which extends native StatefulSets.

  • In-place upgrades are supported. There is no need to rebuild pods.
  • Specific pods can be selected for scale-in.
  • You can customize the upgrade sequence, which removes the strict sequential update order imposed by native StatefulSets.

For more features, see Advanced StatefulSet.

Advanced DaemonSet

An enhanced controller for node-level services, which can replace native DaemonSets.

  • In-place upgrades are supported. There is no need to rebuild pods.
  • You can select pods to be upgraded through labels.
  • Batch upgrades are supported, and the upgrades can be suspended.

For more features, see Advanced DaemonSet.

SidecarSet

A centralized management controller for sidecar containers, which decouples sidecar containers from service containers.

  • Automatic injection is supported. Sidecar containers can be injected into pods that meet the requirements through label selectors.
  • In-place upgrades are supported. Sidecar containers can be upgraded separately without affecting the service containers.
  • Priority control is supported, which allows you to define the startup and termination sequence of sidecar containers.

For more features, see SidecarSet.

UnitedDeployment

A distributed application management controller across domains, which supports unified deployment across multiple domains or node groups.

  • Domains can be defined by node label, region, and more. You can create different workloads in each domain.
  • Elastic allocation is supported. You can scale in or out workloads based on the domain list.
  • Adaptive scheduling is supported. Pending pods can be automatically scheduled to available domains.

For more features, see UnitedDeployment.
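
To make the SidecarSet row above more concrete, the following sketch shows how a sidecar container could be injected into pods selected by label. The SidecarSet name, the sidecar container, and the update values are hypothetical. The selector reuses the app: sample label from the CloneSet example later in this document.

```yaml
apiVersion: apps.kruise.io/v1alpha1
kind: SidecarSet
metadata:
  name: sample-sidecarset        # hypothetical name
spec:
  selector:
    matchLabels:
      app: sample                # pods with this label receive the sidecar when they are created
  updateStrategy:
    type: RollingUpdate
    maxUnavailable: 1            # assumed value: upgrade the sidecar in one pod at a time
  containers:
  - name: log-agent              # hypothetical sidecar container
    image: busybox:latest
    command: ["sleep", "999d"]
```

Because the sidecar is defined separately from the workload, its image can be changed in the SidecarSet without editing the workloads that own the service containers.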

Notes and Constraints

If you have deployed the community OpenKruise in your cluster, uninstall it and then install the CCE OpenKruise add-on. Otherwise, the add-on may fail to be installed.

Precautions

OpenKruise adds admission webhooks to the cluster, and the community sets the default failure policy of the pod webhook to Fail. This means that if kruise-controller-manager becomes unavailable, operations such as pod creation and deletion are blocked. Before using this add-on, carefully assess the risks and configure HA for kruise-controller-manager so that the webhook server can always handle requests.
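
The failure policy is a field of the webhook configuration. The excerpt below is an illustrative sketch of where it appears, not the add-on's actual manifest: the configuration name is an assumption, the webhook name, Service, and path are taken from the error message in the Troubleshooting section, and all other required fields are omitted.

```yaml
# Illustrative excerpt only; fields not discussed in this document are omitted.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: kruise-mutating-webhook-configuration  # assumed name; it may differ in the add-on
webhooks:
- name: mpod.kb.io
  failurePolicy: Fail            # requests are rejected if the webhook server cannot be reached
  clientConfig:
    service:
      name: kruise-webhook-service
      namespace: kube-system
      path: /mutate-pod
```

To check the configuration that is actually installed in a cluster, you can run kubectl get mutatingwebhookconfigurations -o yaml.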

OpenKruise is an open-source add-on that CCE has selected, adapted, and integrated into its services. CCE offers comprehensive technical support, but is not responsible for any service disruptions caused by defects in the open-source software, nor does it provide compensation or additional services for such disruptions. It is highly recommended that users regularly upgrade their software to address any potential issues.

Installing the Add-on

  1. Log in to the CCE console and click the cluster name to access the cluster console.
  2. In the navigation pane, choose Add-ons. Locate OpenKruise on the right and click Install.
  3. On the Install Add-on page, configure the specifications as needed.

    • If you select Preset, you can choose between Small and Large based on the cluster scale. The system automatically sets the number of add-on pods and resource quotas according to the preset specifications. You can view the configurations on the console.

      The Small specification runs the add-on in one pod and is suitable for clusters with fewer than 50 nodes. The Large specification runs the add-on in two pods and is suitable for clusters with more than 50 nodes.

    • If you select Custom, you can adjust the number of pods and resource quotas as needed. A single pod cannot provide high availability. If the node where the add-on pod runs fails, the add-on becomes unavailable.

  4. Determine whether to enable kruise-daemon.

    kruise-daemon, a new DaemonSet component, has been added to OpenKruise. It provides image warm-up and container restart.

    If you install OpenKruise v1.0.23 or earlier in a cluster of v1.25 or later, kruise-daemon cannot run on a Docker node. In this case, use a containerd node. For details, see Components.

  5. Configure deployment policies for the add-on pods.

    • Scheduling policies do not take effect on add-on pods of the DaemonSet type.
    • When configuring multi-AZ deployment or node affinity, ensure that there are nodes meeting the scheduling policy and that resources are sufficient in the cluster. Otherwise, the add-on cannot run.
    Table 1 Configurations for add-on scheduling

    Parameter

    Description

    Multi-AZ Deployment

    • Preferred: Deployment pods of the add-on will be preferentially scheduled to nodes in different AZs. If all the nodes in the cluster are deployed in the same AZ, the pods will be scheduled to different nodes in that AZ.
    • Equivalent mode: Deployment pods of the add-on are evenly scheduled to the nodes in each AZ of the cluster. If a new AZ is added, you are advised to increase the number of add-on pods for cross-AZ HA deployment. In Equivalent mode, the number of add-on pods in any two AZs differs by at most 1. If resources in one of the AZs are insufficient, pods cannot be scheduled to that AZ.
    • Forcible: Deployment pods of the add-on are forcibly scheduled to nodes in different AZs. There can be at most one pod in each AZ. If nodes in a cluster are not in different AZs, some add-on pods cannot run properly. If a node is faulty, add-on pods on it may fail to be migrated.

    Node Affinity

    • Not configured: Node affinity is disabled for the add-on.
    • Specify node: Specify the nodes where the add-on is deployed. If you do not specify the nodes, the add-on will be randomly scheduled based on the default cluster scheduling policy.
    • Specify node pool: Specify the node pool where the add-on is deployed. If you do not specify the node pools, the add-on will be randomly scheduled based on the default cluster scheduling policy.
    • Customize affinity: Enter the labels of the nodes where the add-on is to be deployed for more flexible scheduling policies. If you do not specify node labels, the add-on will be randomly scheduled based on the default cluster scheduling policy.

      If multiple custom affinity policies are configured, ensure that there are nodes that meet all the affinity policies in the cluster. Otherwise, the add-on cannot run.

    Toleration

    Tolerations allow, but do not force, the add-on Deployment to be scheduled to nodes with matching taints. They also control how the add-on Deployment is evicted after the node where it runs is tainted.

    By default, the add-on adds tolerance policies for the node.kubernetes.io/not-ready and node.kubernetes.io/unreachable taints, each with a tolerance time window of 60s.

    For details, see Configuring Tolerance Policies.

  6. Click Install.
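
The default tolerance policy described in Table 1 corresponds to a pod-level tolerations fragment like the one below. This is a sketch based on standard Kubernetes toleration fields. The NoExecute effect and the Exists operator are assumptions, because the table states only the taint keys and the 60s window.

```yaml
# Fragment of a pod spec; effect and operator are assumed, keys and window come from Table 1.
tolerations:
- key: node.kubernetes.io/not-ready
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 60          # the pod is evicted 60s after the taint is added
- key: node.kubernetes.io/unreachable
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 60
```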

Components

Table 2 Add-on components

Component

Description

Resource Type

kruise-controller-manager

The core component of OpenKruise. It includes the admission webhooks for Kruise CRDs and pods. kruise-controller-manager creates webhook configurations that define which resources are processed and provides Services that kube-apiserver can call.

Deployment

kruise-daemon

Deployed on each node through DaemonSets to provide functions such as image warm-up and container restart.

DaemonSet

Since version 1.24, the Kubernetes community no longer supports Dockershim. In clusters of v1.25 or later, CCE uses cri-dockerd in place of Dockershim so that users can continue to use Docker. However, the OpenKruise community does not support cri-dockerd. For details, see issue. This issue will be solved in later versions.

Therefore, if you install OpenKruise v1.0.3 in a cluster of v1.25 or later, kruise-daemon cannot run on a Docker node. In this case, use a containerd node.

How to Use the Add-on

After the add-on is installed, you can use a workload controller provided by OpenKruise in the cluster. The following describes how to use a CloneSet to deploy a stateless application. For more examples, see the OpenKruise official website.

  1. Create a file named cloneset.yaml.

    File content:
    apiVersion: apps.kruise.io/v1alpha1
    kind: CloneSet
    metadata:
      labels:
        app: sample
      name: sample
    spec:
      replicas: 5
      selector:
        matchLabels:
          app: sample
      template:  # The structure of the CloneSet template is the same as that of the Deployment.
        metadata:
          labels:
            app: sample
        spec:
          containers:
          - name: nginx
            image: nginx:alpine
          imagePullSecrets: 
          - name: default-secret

  2. Create the CloneSet.

    kubectl create -f cloneset.yaml

    Information similar to the following is displayed:

    cloneset.apps.kruise.io/sample created

  3. View the CloneSet.

    kubectl get clone

    Information similar to the following is displayed:

    NAME       DESIRED   UPDATED   UPDATED_READY   READY   TOTAL   AGE
    sample     5         5         5               5       5       23s
    • DESIRED: the number of expected pods
    • UPDATED: the number of pods of the latest version
    • UPDATED_READY: the number of available pods of the latest version
    • READY: the total number of available pods
    • TOTAL: the total number of pods
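
To try the in-place and batch upgrade features listed for CloneSets, an update strategy could be added to the spec in cloneset.yaml. The fragment below is a minimal sketch. The field names follow the apps.kruise.io/v1alpha1 CloneSet API, and the values are illustrative assumptions.

```yaml
# Fragment to merge into the spec of cloneset.yaml; values are illustrative.
spec:
  updateStrategy:
    type: InPlaceIfPossible      # update pods in place when possible (for example, an image change); otherwise re-create them
    maxUnavailable: 20%          # at most 20% of the pods may be unavailable during the update
    partition: 3                 # keep 3 pods on the old version; only the remaining pods are updated
```

With replicas set to 5 and partition set to 3, changing the container image updates two pods first. Lowering partition step by step then rolls the new version out to the remaining pods.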

Troubleshooting

When a workload is being created, the following error occurs:

Error creating: Internal error occurred: failed calling webhook "mpod.kb.io": failed to call webhook: Post "https://kruise-webhook-service.kube-system.svc:443/mutate-pod?timeout=10s": dial tcp 10.247.10.181:443: connect: connection refused

This error occurs because kruise-controller-manager is unavailable. As a result, pod creation, update, and deletion requests are blocked in the affected namespaces, that is, namespaces other than kube-system that do not have the control-plane: openkruise label.

Solution

Restore kruise-controller-manager. Possible causes and solutions are as follows:

  • Resources are insufficient, so kruise-controller-manager cannot be scheduled. You are advised to configure more resources for the add-on.
  • A scheduling or affinity policy configured for kruise-controller-manager prevents its pods from being scheduled. You are advised to check the policy and adjust it so that kruise-controller-manager can be scheduled.

Release History

Table 3 OpenKruise add-on

Add-on Version   Supported Cluster Version                         New Feature                               Community Version
1.0.35           v1.25, v1.27, v1.28, v1.29, v1.30, v1.31, v1.32   CCE clusters v1.32 are supported.         1.5.4
1.0.23           v1.25, v1.27, v1.28, v1.29, v1.30, v1.31          CCE clusters v1.31 are supported.         1.5.4
1.0.12           v1.23, v1.25, v1.27, v1.28, v1.29, v1.30          CCE clusters v1.30 are supported.         1.5.4
1.0.3            v1.23, v1.25, v1.27, v1.28, v1.29                 The OpenKruise add-on is now available.   1.5.4