Updated on 2025-11-17 GMT+08:00

What Can I Do If an On-Premises Cluster Fails to Be Connected?

Symptom

This section describes how to troubleshoot cluster connection exceptions and provides solutions. The following exceptions may occur when a cluster is connected to UCS:

You have registered a cluster to UCS and deployed proxy-agent in the cluster, but the console keeps displaying a message indicating that the cluster is waiting for connection, or that registration failed after the connection timed out.

- If the cluster registration fails, click the icon in the upper right corner of the cluster card to register the cluster again, and locate the fault as guided in Troubleshooting.
- If a connected cluster is in the unavailable state, rectify the fault by referring to Troubleshooting in this section.

Troubleshooting

Table 1 explains the error messages for you to locate faults.

**Table 1** Error messages

| Error Message | Description | Check Item |
| --- | --- | --- |
| "currently no agents available, please make sure the agents are correctly registered" | The proxy-agent in the connected cluster is abnormal or the network is abnormal. | Check Item 1: proxy-agent; Check Item 2: Network Connection Between the Cluster and UCS |
| "please check the health status of kube apiserver: ..." | The kube-apiserver in the cluster cannot be accessed. | Check Item 3: kube-apiserver |
| "cluster responded with non-successful status code: ..." | Rectify the fault based on the status code. For example, status code 401 indicates that the user does not have the access permissions. A possible cause is that the cluster authentication information has expired. | Check Item 4: Cluster Authentication Information Changes |
| "cluster responded with non-successful message: ..." | Rectify the fault based on the returned information. For example, the message Get "https://172.16.0.143:6443/readyz?timeout=32s\": context deadline exceeded indicates that the access to the API server times out. A possible cause is that the API server is faulty. | - |
| "Current cluster version is not supported in UCS service." | This error occurs because the cluster version does not meet requirements. The version of the Kubernetes cluster connected to UCS must be 1.19 or later. | - |

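The lookup in Table 1 can be scripted for quick triage. The following is a minimal sketch, assuming the error text has been copied from the console by hand; the function name ucs_check_item is hypothetical and is not part of any UCS tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a UCS tool): map a console error message to the
# check item listed in Table 1. Patterns match the message text in the table.
ucs_check_item() {
  case "$1" in
    *"currently no agents available"*)
      echo "Check Item 1 (proxy-agent) and Check Item 2 (network connection)" ;;
    *"please check the health status of kube apiserver"*)
      echo "Check Item 3 (kube-apiserver)" ;;
    *"cluster responded with non-successful status"*)
      echo "Check Item 4 (cluster authentication information changes)" ;;
    *"cluster responded with non-successful message"*)
      echo "Rectify the fault based on the returned message" ;;
    *"Current cluster version is not supported"*)
      echo "The cluster version must be Kubernetes 1.19 or later" ;;
    *)
      echo "Unknown error message" ;;
  esac
}

# Example:
#   ucs_check_item "please check the health status of kube apiserver: ..."
```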
Check Item 1: proxy-agent

After a cluster is unregistered from UCS, the authentication information contained in the original proxy-agent configuration file becomes invalid. You need to delete the proxy-agent pods deployed in the cluster. To connect the cluster to UCS again, download the proxy-agent configuration file from the UCS console again and use it for re-deployment.

Log in to a master node in the cluster.
Check the deployment of proxy-agent.

kubectl -n kube-system get pod | grep proxy-agent

Desired output for successful deployment:
```
proxy-agent-*** 1/1 Running 0 9s
```
If proxy-agent is not in the Running state, run the kubectl -n kube-system describe pod proxy-agent-*** command to view the pod events. For details, see "What Can I Do If proxy-agent Fails to Be Deployed?"

By default, proxy-agent is deployed with two pods. It can provide services as long as one pod is running normally. However, one pod cannot ensure high availability.
Print the pod logs of proxy-agent and check whether the agent program can connect to UCS.

kubectl -n kube-system logs proxy-agent-*** | grep "Start serving"

If no "Start serving" log is printed even though the proxy-agent pods are running, continue with the other check items.
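The pod status check above can be summarized with a small script. This is a sketch only, assuming the default column layout of kubectl get pod (NAME, READY, STATUS, RESTARTS, AGE); the helper names count_running_proxy_agents and proxy_agent_verdict are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl -n kube-system get pod` output on stdin
# and count proxy-agent pods whose STATUS column (field 3) is Running.
count_running_proxy_agents() {
  awk '$1 ~ /^proxy-agent-/ && $3 == "Running" { n++ } END { print n + 0 }'
}

# Hypothetical helper: turn the count into a short verdict. proxy-agent can
# serve with one running pod, but one pod does not provide high availability.
proxy_agent_verdict() {
  case "$1" in
    0) echo "no proxy-agent pod is running" ;;
    1) echo "serving, but without high availability" ;;
    *) echo "serving with redundancy" ;;
  esac
}

# Example use on a master node (requires kubectl access to the cluster):
#   running=$(kubectl -n kube-system get pod | count_running_proxy_agents)
#   proxy_agent_verdict "$running"
```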

Check Item 2: Network Connection Between the Cluster and UCS

Public Network Access

Check whether a public IP address is bound to the cluster or a public NAT gateway is configured.
Check whether the cluster security group allows outbound traffic. To restrict outbound traffic to specific destinations, contact technical support to obtain the destination address and port number.
After rectifying the network faults, delete the existing proxy-agent pods and rebuild them. Then check whether the logs of the new pods contain "Start serving".

kubectl -n kube-system logs proxy-agent-*** | grep "Start serving"
If the expected logs are printed, refresh the UCS console page and check whether the cluster is connected.
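Deleting the proxy-agent pods can be done in one pass. The following is a sketch under two assumptions that the original text does not confirm: the pods are managed by a controller (such as a Deployment) that recreates them after deletion, and GNU xargs is available for the -r option. The helper name select_proxy_agent_pods is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl -n kube-system get pod -o name` output
# on stdin and keep only proxy-agent pods (lines like pod/proxy-agent-xxx).
# `|| true` keeps the exit status at 0 when no pod matches.
select_proxy_agent_pods() {
  grep -E '^pod/proxy-agent-' || true
}

# Example use on a master node (requires kubectl access to the cluster):
#   kubectl -n kube-system get pod -o name | select_proxy_agent_pods \
#     | xargs -r kubectl -n kube-system delete
# Assumption: a controller recreates the pods. Once they are Running,
# check the logs of the new pods:
#   kubectl -n kube-system logs proxy-agent-*** | grep "Start serving"
```

If the pods are not recreated automatically, re-deploy proxy-agent from its configuration file instead.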

Private Network Access

Check whether the cluster security group allows outbound traffic. To restrict outbound traffic to specific destinations, contact technical support to obtain the destination address and port number.
Rectify the network connection faults between the cluster and UCS or IDC.
Refer to the following guides based on your network connection type:
- Direct Connect: Troubleshooting
- Virtual Private Network (VPN): Troubleshooting
Rectify the VPC endpoint fault. The VPC endpoint status must be Accepted. If the VPC endpoint is deleted accidentally, create another one. For details, see "How Do I Restore a Deleted VPC Endpoint for a Cluster Connected Through a Private Network?"

Figure 1 Checking the VPC endpoint status
After rectifying the network faults, delete the existing proxy-agent pods and rebuild them. Then check whether the logs of the new pods contain "Start serving".

kubectl -n kube-system logs proxy-agent-*** | grep "Start serving"
If the expected logs are printed, refresh the UCS console page and check whether the cluster is connected.

Check Item 3: kube-apiserver

When a cluster is connected to UCS, the error message shown in Figure 2 may be displayed.

Figure 2 Abnormal kube-apiserver

This indicates that proxy-agent cannot communicate with the API server in the cluster. Because network configurations differ from cluster to cluster, UCS does not provide a unified solution for this fault. You need to rectify it on your own and try again.

Log in to the UCS console. In the navigation pane, choose Fleets.
Log in to a master node of the destination cluster and check whether the proxy-agent pod can access the kube-apiserver of the destination cluster.

Example command:
```
# Open a shell in the proxy-agent pod.
kubectl exec -ti proxy-agent-*** -n kube-system -- /bin/bash
# Inside the pod, access kube-apiserver of the cluster.
curl -kv https://kubernetes.default.svc.cluster.local/readyz
```
If the access fails, rectify the cluster network fault, register the cluster to UCS again, and re-deploy proxy-agent.
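To make the result of the readyz check easier to read, the HTTP status code can be captured and interpreted. This is a minimal sketch; the helper name interpret_readyz_code is hypothetical, and the wording of each verdict is an interpretation rather than official UCS guidance. The curl options used in the comment (-k, -s, -o, -w with %{http_code}) are standard curl options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: interpret the HTTP status code obtained inside the pod with
#   curl -k -s -o /dev/null -w '%{http_code}' https://kubernetes.default.svc.cluster.local/readyz
# curl prints 000 when no HTTP response was received (for example, on a timeout).
interpret_readyz_code() {
  case "$1" in
    200) echo "kube-apiserver is reachable and ready" ;;
    401|403) echo "kube-apiserver is reachable but rejected the unauthenticated request" ;;
    000) echo "kube-apiserver is unreachable from the pod" ;;
    *) echo "kube-apiserver responded with HTTP $1" ;;
  esac
}

# Example:
#   interpret_readyz_code 000
```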

Check Item 4: Cluster Authentication Information Changes

If the error message "cluster responded with non-successful status: [401][Unauthorized]" is displayed, the network connection to IAM may be faulty. To confirm, check /var/paas/sys/log/kubernetes/auth-server.log on each of the three master nodes in the cluster, and ensure that IAM domain name resolution and IAM service connectivity are normal.

The common logs are as follows:

- Failed to authenticate token: *******: dial tcp: lookup iam.myhuaweicloud.com on *.*.*.*:53: no such host
  This log indicates that the node is not capable of resolving iam.myhuaweicloud.com. Configure the corresponding domain name resolution by referring to Preparing for Installation.
- Failed to authenticate token: Get *******: dial tcp *.*.*.*:443: i/o timeout
  This log indicates that the node's access to IAM times out. Ensure that the node can communicate with Huawei Cloud IAM properly.
- currently only supports Agency token
  This log indicates that the request is not initiated by UCS. Currently, on-premises clusters can only be connected to UCS using IAM tokens.
- IAM assumed user has no authorization/iam assumed user should allowed by TEAdmin
  This log indicates that the connection between UCS and the cluster is abnormal. Contact Huawei technical support.
- Failed to authenticate token: token expired, please acquire a new token
  This log indicates that the token has expired. Run the date command to check whether the time difference is too large. If yes, synchronize the time and check whether the cluster is working. If the fault persists for a long time, you may need to reinstall the cluster. In this case, contact Huawei technical support.
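The log patterns listed above can be matched automatically when the log file is long. The following is a sketch only; the helper name classify_auth_log is hypothetical, and the patterns are taken from the sample logs in this section, so other log wordings fall through to the default branch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify one auth-server log line using the
# log patterns and causes described in this section.
classify_auth_log() {
  case "$1" in
    *"no such host"*)
      echo "IAM domain name cannot be resolved" ;;
    *"i/o timeout"*)
      echo "access to IAM timed out" ;;
    *"currently only supports Agency token"*)
      echo "request was not initiated by UCS" ;;
    *"IAM assumed user has no authorization"*|*"iam assumed user should allowed by TEAdmin"*)
      echo "abnormal connection between UCS and the cluster; contact technical support" ;;
    *"token expired"*)
      echo "token expired; check the node time" ;;
    *)
      echo "unrecognized log line" ;;
  esac
}

# Example use on a master node, summarizing how often each cause appears:
#   while IFS= read -r line; do classify_auth_log "$line"; done \
#     < /var/paas/sys/log/kubernetes/auth-server.log | sort | uniq -c
```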

After the preceding problem is resolved, run the crictl ps | grep auth | awk '{print $1}' | xargs crictl stop command to restart the auth-server container.
