Updated on 2025-05-09 GMT+08:00

How Do I Locate the Fault When a Cluster Is Unavailable?

Perform the following operations to locate the fault when a cluster becomes unavailable.

Fault Locating

The possible causes are listed in descending order of likelihood.

Check them one by one. If the fault persists after you have ruled out one cause, move on to the next one.

If the fault persists after you have ruled out all of the causes, contact customer service by submitting a service ticket.

Checking Whether the Security Group Has Been Changed

  1. Log in to the Network Console. In the navigation pane, choose Access Control > Security Groups to locate the security group associated with the master nodes.

    The name of the security group associated with the master nodes is in the format of {cluster-name}-cce-control-{random-ID}.

  2. Click the security group name. On the details page that is displayed, check that the security group rules of the master nodes are correct.

    For details about security groups, see Security Group for Master Nodes.
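When many security groups are listed, the naming format above can be used to narrow them down. The following is a minimal sketch of that filtering, not a console or SDK feature. The function name, the cluster name demo, and the sample group names are hypothetical and used only for illustration.

```python
def is_master_security_group(group_name: str, cluster_name: str) -> bool:
    """Return True if group_name follows {cluster-name}-cce-control-{random-ID}."""
    prefix = f"{cluster_name}-cce-control-"
    # The random ID must be non-empty, so the name has to be longer than the prefix.
    return group_name.startswith(prefix) and len(group_name) > len(prefix)


# Hypothetical security group names, as they might appear in the list.
groups = ["demo-cce-control-a1b2c", "demo-cce-node-x9y8z", "default"]
matches = [g for g in groups if is_master_security_group(g, "demo")]
print(matches)
```

Only names that start with the cluster name followed by -cce-control- and a non-empty suffix are kept. The rules of the matching group still have to be reviewed manually on its details page.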

Checking Whether the Cluster Certificate Works

Symptom

If the region where a cluster is located switches between daylight saving time (DST) and standard time (ST), the cluster may be unavailable for a period during the hour that repeats when DST ends. For instance, if you request cluster creation at 02:00 and the clock is set back to 01:00 as DST changes to ST, the cluster may be unavailable for a period.
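The overlap can be illustrated with a short sketch. The UTC offsets (+02:00 for DST, +01:00 for ST) and the date are assumptions chosen for illustration. The actual values depend on the region.

```python
from datetime import datetime, timedelta, timezone

# Assumed offsets: UTC+2 while DST is in effect, UTC+1 in standard time.
DST = timezone(timedelta(hours=2))
ST = timezone(timedelta(hours=1))

# 02:00 on the DST clock is the moment the clock is set back to 01:00 ST.
before_shift = datetime(2024, 10, 27, 2, 0, tzinfo=DST)
after_shift = before_shift.astimezone(ST)
print(after_shift.strftime("%H:%M"))  # 01:00

# The same wall-clock reading (01:30) occurs twice, one hour apart.
first = datetime(2024, 10, 27, 1, 30, tzinfo=DST)
second = datetime(2024, 10, 27, 1, 30, tzinfo=ST)
overlap = second - first
print(overlap)  # 1:00:00
```

Because every wall-clock reading in the repeated hour maps to two different instants, a time taken from the local clock during this window is ambiguous by up to one hour.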

Possible cause

The cluster certificate has not yet taken effect, so the cluster is unavailable. Each Kubernetes component uses a certificate to access kube-apiserver, which verifies the certificate presented with each request. If the verification fails, the request is rejected.
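An X.509 certificate carries a validity window (Not Before and Not After), and a certificate used before its Not Before time fails verification. The following is a minimal sketch of that check. It is not the kube-apiserver implementation, and the function name and timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def certificate_in_effect(not_before: datetime, not_after: datetime, now: datetime) -> bool:
    """Return True if now falls inside the certificate's validity window."""
    return not_before <= now <= not_after


# Hypothetical timestamps: the certificate starts being valid 50 minutes from now.
now = datetime(2024, 10, 27, 0, 10, tzinfo=timezone.utc)
not_before = datetime(2024, 10, 27, 1, 0, tzinfo=timezone.utc)
not_after = not_before + timedelta(days=365)

in_effect = certificate_in_effect(not_before, not_after, now)
print(in_effect)  # False

# Time left until the certificate takes effect (zero if it already has).
remaining = max(not_before - now, timedelta(0))
print(remaining)  # 0:50:00
```

Under these assumed values, requests would be rejected for the next 50 minutes and accepted afterwards without any change to the certificate.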

Solutions

  • Wait until the certificate takes effect, at which point the cluster will automatically become available.
  • Contact customer service by submitting a service ticket.