Updated on 2024-09-11 GMT+08:00

What Can I Do If Cluster Federation Verification Fails to Be Enabled for a Fleet?

Context

After cluster federation is enabled for a fleet, existing clusters and clusters newly added to the fleet will automatically join the federation. In this process, the fleet verifies the network status, cluster version, clusterrole, and clusterrolebinding of the cluster. If the verification fails, clusters cannot join the federation. After the fault is rectified, click Retry to join the cluster federation again.

Symptom 1: A Message Is Displayed Indicating that clusterrole and clusterrolebinding Already Exist

Cause: A cluster cannot join two or more federations at the same time. If this error message is displayed, either the cluster has already joined a federation, or it joined one previously and residual resources from that federation remain in the cluster.

Solution: Manually clear residual resources.

Procedure:

  1. Obtain the kubeconfig file of the faulty cluster, prepare a node on which kubectl is installed, and save the kubeconfig file as /tmp/kubeconfig on that node.
  2. Run the following commands to clear residual resources:

    alias kubectl='kubectl --kubeconfig=/tmp/kubeconfig'

    kubectl delete clusterrolebinding $(kubectl get clusterrolebinding | grep karmada-controller-manager | awk '{print $1}')

    kubectl delete clusterrole $(kubectl get clusterrole | grep karmada-controller-manager | awk '{print $1}')

    kubectl delete namespace $(kubectl get namespace | grep -E 'karmada-[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{12}' | awk '{print $1}')
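
Before deleting anything, it can help to preview which resources the patterns above would match. The following is a minimal sketch, not part of the official procedure. It assumes the kubeconfig file is at /tmp/kubeconfig as in step 1, and the function names match_karmada_namespaces and preview_residual_resources are hypothetical helpers introduced only for illustration.

```shell
# Hypothetical helper: reads resource names on stdin and prints only those
# that are exactly "karmada-" followed by a UUID. The pattern is the one used
# in the cleanup command, anchored here so that partial matches are excluded.
match_karmada_namespaces() {
    grep -E '^karmada-[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{12}$' || true
}

# Hypothetical helper: prints the residual resources without deleting them.
# $1 is the kubeconfig path (assumed to default to /tmp/kubeconfig).
preview_residual_resources() {
    kc=${1:-/tmp/kubeconfig}
    kubectl --kubeconfig="$kc" get clusterrolebinding --no-headers \
        | awk '{print $1}' | grep karmada-controller-manager
    kubectl --kubeconfig="$kc" get clusterrole --no-headers \
        | awk '{print $1}' | grep karmada-controller-manager
    kubectl --kubeconfig="$kc" get namespace --no-headers \
        | awk '{print $1}' | match_karmada_namespaces
}
```

Running preview_residual_resources /tmp/kubeconfig prints the matching names only. If the output is empty, there is probably nothing left to clear.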

Symptom 2: A Message Is Displayed Indicating that an EIP Needs to Be Bound to the CCE Cluster

Cause: After cluster federation is enabled for the fleet, the federation needs an EIP to establish a network connection to the CCE cluster.

Solution: Bind an EIP to the CCE cluster.

Symptom 3: An EIP Has Been Bound to a CCE Cluster, but the Cluster Still Fails to Be Added to a Federation. "network in cluster is stable, please retry it later" Is Displayed

Cause: The federation needs to access the CCE cluster over port 5443, but the inbound rule of the security group on the control plane of the CCE cluster denies access from the source address 94.74.86.108 over port 5443.

Solution: Modify the inbound rule of the security group on the control plane of the CCE cluster to allow 94.74.86.108 (source address) to access the CCE cluster over port 5443.

Symptom 4: A Cluster That Has Been Added to a Federation Is Abnormal. "cluster is not reachable" Is Displayed

Run the following command in the affected member cluster to check whether the ServiceAccount used by the federation exists. Replace {cluster_name} with the name of the member cluster.

    kubectl get sa -A | grep karmada-{cluster_name}.clusterspace.{cluster_name}

If the command returns no output, the ServiceAccount does not exist. In this case, remove the member cluster from the fleet and then add it to the fleet again.
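
The check above can be wrapped so that the result is reported explicitly instead of being inferred from empty output. This is a minimal sketch under the same assumptions as the command above. The function name has_federation_sa and the cluster name my-cluster are hypothetical and used only for illustration.

```shell
# Hypothetical helper: reads `kubectl get sa -A` output on stdin and succeeds
# if it contains the ServiceAccount name for the given member cluster.
# $1 is the member cluster name. The same grep pattern as above is used.
has_federation_sa() {
    cluster_name=$1
    grep -q "karmada-${cluster_name}.clusterspace.${cluster_name}"
}

# Example usage (run against the member cluster; my-cluster is a placeholder):
#   if kubectl get sa -A | has_federation_sa my-cluster; then
#       echo "ServiceAccount found"
#   else
#       echo "ServiceAccount missing: remove the cluster from the fleet and add it again"
#   fi
```

The function only inspects text on standard input, so it can also be tried against saved command output before it is run on a live cluster.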