What Do I Do If the Cluster Connection Component (ANP-Agent) Failed to Be Deployed?
Cluster Connection Component (ANP-Agent) Installation Failure
Symptom
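Run the following command to check the pods of the cluster connection component:
kubectl get pods -n hss | grep proxy-agent
The pods are not in the Running state, for example:
proxy-agent-5dc5cf6cd7-khdlt 0/1 ImagePullBackOff 0 42h
proxy-agent-5dc5cf6cd7-n56bx 0/1 Pending 0 42h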
Solution
- Log in to a node in the cluster.
- Run the following command to view the details of the abnormal pod:
kubectl describe pod proxy-agent-xxx -n hss
proxy-agent-xxx is the name of the cluster connection component displayed in the command output in "Symptom", for example, proxy-agent-5dc5cf6cd7-khdlt.
- Identify the cause based on the command output.
- Possible cause: The image of the cluster connection component cannot be pulled.
Figure 1 Failed to pull the image of the cluster connection component
Solution: If you selected Non-CCE cluster (Internet access), ensure that the cluster can access the Internet so that it can pull the image from SWR.
- Possible cause: The node does not have enough CPU or memory. The command output contains Insufficient cpu or Insufficient memory.
Figure 2 Insufficient CPU or memory
Solution: Scale up the node, and then try connecting the cluster again.
- Possible cause: There are no nodes matching the scheduling rule.
Figure 3 No nodes matching the scheduling rule
Solution: For high availability, the cluster connection component (ANP-Agent) runs two instances by default and schedules them on different nodes. Ensure that the cluster has at least two available nodes.
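The three causes above can usually be told apart by scanning the kubectl describe pod output for characteristic messages. The following is a minimal sketch, not part of HSS: classify_cause is a hypothetical helper, and the matched strings are assumptions about typical Kubernetes event wording, which can differ between Kubernetes versions.

```shell
#!/bin/sh
# Hypothetical helper (not part of HSS): reads "kubectl describe pod" output
# on stdin and prints a label for the likely cause.
# The patterns are assumptions about common Kubernetes event messages.
classify_cause() {
  desc=$(cat)
  if printf '%s\n' "$desc" | grep -Eqi 'ErrImagePull|ImagePullBackOff|Failed to pull image'; then
    echo "image-pull-failed"        # check Internet access and the SWR image
  elif printf '%s\n' "$desc" | grep -Eqi 'Insufficient (cpu|memory)'; then
    echo "insufficient-resources"   # scale up the node
  elif printf '%s\n' "$desc" | grep -Eqi "didn't match"; then
    echo "no-matching-node"         # ensure at least two available nodes
  else
    echo "unknown"                  # inspect the output manually
  fi
}

# Usage (replace proxy-agent-xxx with the pod name from "Symptom"):
#   kubectl describe pod proxy-agent-xxx -n hss | classify_cause
```

If the helper prints unknown, read the Events section of the command output manually; the sketch only covers the three causes listed in this section.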
Cluster Connection Component (ANP-Agent) Connection Failure
Symptom
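Run the following command to check whether the cluster connection component has connected to HSS:
for a in $(kubectl get pods -n hss | grep proxy-agent | cut -d ' ' -f1); do kubectl -n hss logs "$a" | grep 'Start serving'; done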
The command output is empty, indicating the cluster failed to connect to HSS.
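Because the one-line check prints nothing on failure, it does not show which instance is affected. The sketch below reports a status for each pod instead. It assumes, as the command above implies, that a connected instance logs a line containing Start serving; has_start_serving and report_agents are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the log text on stdin contains "Start serving".
has_start_serving() {
  grep -q 'Start serving'
}

# Print one status line per proxy-agent pod in the hss namespace.
# "kubectl get pods -o name" prints names in the form pod/<name>,
# which "kubectl logs" accepts directly.
report_agents() {
  for pod in $(kubectl get pods -n hss -o name | grep proxy-agent); do
    if kubectl logs "$pod" -n hss | has_start_serving; then
      echo "$pod: connected"
    else
      echo "$pod: not connected"
    fi
  done
}

# Usage:
#   report_agents
```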
Solution
- Log in to a node in the cluster.
- Run the following command to view the logs of the cluster connection component:
kubectl logs proxy-agent-xxx -n hss
- If the command output shown in Figure 4 is displayed, the gRPC connection between the cluster connection component and the HSS server could not be established.
- Perform the following steps to locate and rectify the fault:
Format of the server domain name of the cluster connection component: hss-anp.region_code.myhuaweicloud.com
For details about region codes, see Regions and Endpoints.
- Check whether the cluster security group allows outbound access to port 8091 of the 100.125.0.0/16 CIDR block.
- If the access is allowed, go to the next step.
- If the access is denied, configure the security group to allow outbound access to the port, and then try connecting the cluster again.
- Run the following command to check whether the server domain name of the cluster connection component can be pinged:
ping {{Server_domain_name_of_cluster_connection_component}}
- If it can be pinged, go to the next step.
- If it cannot be pinged, set the DNS server address to the private DNS server address of Huawei Cloud. For more information, see Private DNS Server Address of Huawei Cloud. After the configuration is complete, connect the cluster asset again.
- Run the following command to check whether the specified port of the cluster connection component can be accessed:
telnet {{Server_domain_name_of_cluster_connection_component}} 8091
- If the port can be accessed, go to the next step.
- If the port cannot be accessed, disable the firewall and try again.
- In the upper right corner of the Huawei Cloud console, choose Service Tickets > Create Service Ticket and submit a service ticket.
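The domain name and port checks above can be scripted roughly as follows. This is a sketch under stated assumptions rather than an official tool: anp_domain and check_anp are hypothetical helper names, the node is assumed to have getent and an nc that supports the -z and -w options, and name resolution is checked with getent instead of ping. It does not inspect the security group rules, which still need to be checked on the console.

```shell
#!/bin/sh
# Build the server domain name from a region code, using the format
# hss-anp.region_code.myhuaweicloud.com given above.
anp_domain() {
  echo "hss-anp.$1.myhuaweicloud.com"
}

# Hypothetical helper: run the DNS check and then the port check.
# Assumes getent and an nc that supports -z and -w are installed.
check_anp() {
  host=$(anp_domain "$1")
  if ! getent hosts "$host" >/dev/null; then
    echo "DNS: cannot resolve $host; check the DNS server address"
    return 1
  fi
  if ! nc -z -w 5 "$host" 8091; then
    echo "Port: cannot reach $host:8091; check the security group and firewall"
    return 1
  fi
  echo "OK: $host:8091 is reachable"
}

# Usage, with cn-north-4 as an example region code:
#   check_anp cn-north-4
```

If both checks pass and the connection still fails, submit a service ticket as described in the last step.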