Why Does a Panic Occasionally Occur When I Use Network Policies on a Cluster Node?
Scenario
Cluster version: v1.15.6-r1
Cluster type: CCE cluster
Network model: Container tunnel network
Node operating system: CentOS 7.6
The canal-agent network component on the node is incompatible with the CentOS 7.6 kernel. After a network policy is configured for the cluster, this incompatibility may cause a kernel panic on the node.
Conditions
If any of the following conditions is not met, this issue will not occur:
- The cluster version is v1.15.6-r1 and the container tunnel network model is used.
- The CentOS 7.6 node uses canal-agent version 1.0.RC10.1230.B005 or earlier. (CentOS 7.6 nodes created on or before February 23, 2021 use such a version.)
- You plan to use or have used network policies.
Fault Locating
Quick locating (for pay-per-use nodes)
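On the CCE console, check when your CentOS 7.6 node was created. Nodes created on or before February 23, 2021 use an affected canal-agent version. If the node was created after February 24, 2021, this issue is not expected to occur.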
Accurate locating (general)
If the cluster version is v1.15.6-r1, the network model is container tunnel network, the node OS is CentOS 7.6, and the canal-agent component version is 1.0.RC10.1230.B005.sp1 or later, the problem will not occur. If an earlier version is used (for example, 1.0.RC10.1230.B002), you are advised to reset or delete the node before configuring network policies.
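The version rule above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name canal_agent_is_affected is hypothetical, and the sketch assumes that versions follow the pattern 1.0.RC10.1230.B<build>[.sp<n>], that they are ordered by build number, and that any suffix after the build number (such as .sp1) is newer than the bare build. Anything that does not match the pattern is reported as unknown rather than guessed.

```shell
# Hypothetical helper: classify a canal-agent version string.
# Assumption: versions look like 1.0.RC10.1230.B<build>[.sp<n>] and a suffix
# such as ".sp1" is newer than the bare build number.
canal_agent_is_affected() {
  version="$1"
  case "$version" in
    1.0.RC10.1230.B*) ;;
    *) echo "unknown"; return 0 ;;
  esac
  rest="${version#1.0.RC10.1230.B}"   # e.g. "005" or "005.sp1"
  build="${rest%%.*}"                 # digits before the first dot
  suffix="${rest#"$build"}"           # empty, or e.g. ".sp1"
  case "$build" in
    ''|*[!0-9]*) echo "unknown"; return 0 ;;
  esac
  # Strip leading zeros so the build number compares as a plain integer.
  build_num=$(printf '%s' "$build" | sed 's/^0*//')
  build_num=${build_num:-0}
  # Affected: builds before B005, or B005 itself without a suffix.
  if [ "$build_num" -lt 5 ] || { [ "$build_num" -eq 5 ] && [ -z "$suffix" ]; }; then
    echo "affected"
  else
    echo "not affected"
  fi
}
```

For example, canal_agent_is_affected 1.0.RC10.1230.B002 prints affected, and canal_agent_is_affected 1.0.RC10.1230.B005.sp1 prints not affected.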
Perform the following steps to query the version of the network component on the node:
- Prepare a node where kubectl can be used.
- Run the following command to query the CentOS node list:
for node_item in $(kubectl get nodes --no-headers | awk '{print $1}'); do if kubectl get node "${node_item}" -o yaml | grep -q CentOS; then echo "${node_item} is CentOS node"; fi; done
The command prints one line for each CentOS node, in the format <node name> is CentOS node.
- Assume that the IP address of the target CentOS node is 10.0.50.187. Run the following command to check the canal-agent version:
kubectl get packageversions.version.cce.io 10.0.50.187 -o yaml | grep -A 1 canal-agent
The output contains the canal-agent entry and the line that follows it, which is where the component version is expected to appear.
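The two query steps above can be combined so that every CentOS node is checked in one pass. The following is a minimal sketch, not an official tool: the function name list_centos_canal_agent_versions is hypothetical, and it assumes that the packageversions.version.cce.io object for each node has the same name that kubectl get nodes reports, as in the 10.0.50.187 example.

```shell
# Hypothetical helper: print the canal-agent entry for every CentOS node.
# Assumption: each packageversions.version.cce.io object is named after its node.
list_centos_canal_agent_versions() {
  for node_item in $(kubectl get nodes --no-headers | awk '{print $1}'); do
    # Same check as the node-list command: search the node's YAML for "CentOS".
    if kubectl get node "${node_item}" -o yaml | grep -q CentOS; then
      echo "== ${node_item}"
      # Print the canal-agent line and the line that follows it.
      kubectl get packageversions.version.cce.io "${node_item}" -o yaml | grep -A 1 canal-agent
    fi
  done
}
```

Run list_centos_canal_agent_versions on a machine where kubectl is configured for the cluster, then compare each reported version with 1.0.RC10.1230.B005.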
Solution
If you still want to use the node, reset the CentOS 7.6 nodes in the cluster to upgrade the networking components to the latest version. For details, see Resetting a Node.
If you want to delete the risky node and purchase a new one, see Deleting a Node and Buying a Node.