CCE Node Problem Detector
Add-on Overview
CCE Node Problem Detector (node-problem-detector, NPD) is an add-on that monitors abnormal events on cluster nodes and can connect to a third-party monitoring platform. It runs as a daemon on each node, collects node problems from various daemons, and reports them to the API server. NPD can run as a DaemonSet or as a standalone daemon.
Add-on Parameters
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| basic | No | object | Basic configuration parameters, which do not need to be specified |
| flavor | Yes | Table 2 object | Flavor parameters |
| custom | Yes | Table 3 object | Custom parameters |

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| description | No | String | Add-on description |
| name | Yes | String | Add-on specification name. The value is fixed at Single-instance. |
| replicas | Yes | String | Number of pods. The default value is 1. |
| resources | Yes | resources object | Container resource (CPU and memory) quotas |

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| feature_gate | No | String | Feature gate used to enable beta features |
| multiAZBalance | No | Bool | Multi-AZ deployment |
| multiAZEnabled | No | Bool | Whether to deploy the add-on pods in multiple AZs. The default value is false. If set to true, cross-AZ deployment is enforced. If set to false, cross-AZ deployment is preferred. |
| npc | Yes | Table 5 object | node-problem-controller configuration |
| tolerations | No | List<Object> Table 7 | Tolerations of the add-on |
| node_match_expressions | No | List<Object> Table 7 | Node affinity configuration of the add-on |

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| limitsCpu | Yes | String | CPU size limit (unit: m) |
| limitsMem | Yes | String | Memory size limit (unit: Mi) |
| name | Yes | String | Add-on name. The value is fixed at custom-resources. |
| requestsCpu | Yes | String | Requested CPU size (unit: m) |
| requestsMem | Yes | String | Requested memory size (unit: Mi) |

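The resource fields above are strings that carry a unit suffix, so it can help to check them before sending a request. The following is a minimal sketch, not part of the CCE API: the helper names (`parse_cpu_millicores`, `parse_memory_mib`, `check_resources`) are hypothetical, and it assumes only the `m` and `Mi` suffixes named in the table are used. The rule that a request must not exceed its limit is the usual Kubernetes convention; whether the add-on API enforces it the same way is an assumption.

```python
import re

# Hypothetical helpers for illustration; only the documented suffixes are accepted.
_CPU_RE = re.compile(r"(\d+)m")
_MEM_RE = re.compile(r"(\d+)Mi")

MANDATORY_FIELDS = ("name", "limitsCpu", "limitsMem", "requestsCpu", "requestsMem")


def parse_cpu_millicores(value: str) -> int:
    """Return the millicore count from a string such as "100m"."""
    match = _CPU_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"expected CPU in millicores such as '100m', got {value!r}")
    return int(match.group(1))


def parse_memory_mib(value: str) -> int:
    """Return the MiB count from a string such as "300Mi"."""
    match = _MEM_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"expected memory in Mi such as '300Mi', got {value!r}")
    return int(match.group(1))


def check_resources(entry: dict) -> None:
    """Raise ValueError if a field is missing, malformed, or a request exceeds its limit."""
    for field in MANDATORY_FIELDS:
        if field not in entry:
            raise ValueError(f"missing mandatory field: {field}")
    if parse_cpu_millicores(entry["requestsCpu"]) > parse_cpu_millicores(entry["limitsCpu"]):
        raise ValueError("requestsCpu exceeds limitsCpu")
    if parse_memory_mib(entry["requestsMem"]) > parse_memory_mib(entry["limitsMem"]):
        raise ValueError("requestsMem exceeds limitsMem")


# Values taken from the example request on this page.
sample = {
    "limitsCpu": "100m",
    "limitsMem": "300Mi",
    "name": "node-problem-detector",
    "requestsCpu": "30m",
    "requestsMem": "100Mi",
}
check_resources(sample)  # passes silently
```

Failing early on the client side gives a clearer error than a rejected API call, but the checks here are deliberately stricter than Kubernetes quantity syntax in general.
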
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| maxTaintedNode | Yes | String or Int | Maximum number of nodes that NPC can taint when the same fault occurs on multiple nodes, which limits the impact of the fault. The value can be an integer or a percentage. |

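Because maxTaintedNode accepts either an integer or a percentage, a client may want to see what a given setting means for a cluster of a particular size. The sketch below is illustrative only: `resolve_max_tainted` is a hypothetical helper, and both rounding a percentage down and capping the result at the cluster size are assumptions, since this page does not say how NPC resolves the value.

```python
def resolve_max_tainted(value, total_nodes: int) -> int:
    """Turn a maxTaintedNode setting into an absolute node count (illustrative)."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("maxTaintedNode must be an int or a string")
    if isinstance(value, int):
        count = value
    elif isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        if not 0 <= percent <= 100:
            raise ValueError(f"percentage out of range: {value!r}")
        # Assumption: fractional results are rounded down.
        count = total_nodes * percent // 100
    elif isinstance(value, str):
        count = int(value)
    else:
        raise TypeError("maxTaintedNode must be an int or a string")
    if count < 0:
        raise ValueError("maxTaintedNode must not be negative")
    # Assumption: the cap can never exceed the number of nodes in the cluster.
    return min(count, total_nodes)


print(resolve_max_tainted("10%", 50))  # 10% of 50 nodes
print(resolve_max_tainted(3, 50))      # absolute count
```

With the rounding assumption used here, a small percentage on a small cluster can resolve to zero nodes, which is worth checking against the actual behavior of the add-on before relying on it.
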
| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| key | No | String | Taint key |
| effect | No | String | Taint policy |
| operator | No | String | Operator |
| tolerationSeconds | No | Int | Toleration time window |

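All four toleration fields above are optional, so a small builder that omits unset fields keeps the request body tidy. This is a sketch under stated assumptions: `make_toleration` is a hypothetical helper, not part of any SDK. In Kubernetes, tolerationSeconds only has an effect together with the NoExecute effect and is otherwise ignored, so the sketch rejects that combination outright. Whether the add-on API performs the same check is not stated on this page.

```python
def make_toleration(key=None, operator=None, effect=None, toleration_seconds=None) -> dict:
    """Build one toleration entry, leaving out any field that was not supplied."""
    if toleration_seconds is not None and effect != "NoExecute":
        # Kubernetes ignores tolerationSeconds for other effects; fail loudly instead.
        raise ValueError("tolerationSeconds only applies to the NoExecute effect")
    fields = {
        "key": key,
        "operator": operator,
        "effect": effect,
        "tolerationSeconds": toleration_seconds,
    }
    return {name: value for name, value in fields.items() if value is not None}


# Reproduces the two tolerations used in the example request on this page.
tolerations = [
    make_toleration("node.kubernetes.io/not-ready", "Exists", "NoExecute", 60),
    make_toleration("node.kubernetes.io/unreachable", "Exists", "NoExecute", 60),
]
```
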
Example Request
{
  "kind": "Addon",
  "apiVersion": "v3",
  "metadata": {
    "annotations": {
      "addon.install/type": "install"
    }
  },
  "spec": {
    "clusterID": "b78fb690-b82c-11ee-83cf-0255ac100b0f",
    "version": "1.18.48",
    "addonTemplateName": "npd",
    "values": {
      "basic": {
        "image_version": "1.18.48",
        "swr_addr": "***",
        "swr_user": "***",
        "rbac_enabled": true,
        "cluster_version": "v1.23"
      },
      "flavor": {
        "description": "custom resources",
        "name": "custom-resources",
        "replicas": 2,
        "resources": [
          {
            "limitsCpu": "100m",
            "limitsMem": "300Mi",
            "name": "node-problem-controller",
            "requestsCpu": "30m",
            "requestsMem": "100Mi"
          },
          {
            "limitsCpu": "100m",
            "limitsMem": "300Mi",
            "name": "node-problem-detector",
            "requestsCpu": "30m",
            "requestsMem": "100Mi"
          }
        ],
        "category": [
          "CCE",
          "Turbo"
        ]
      },
      "custom": {
        "annotations": {},
        "common": {},
        "feature_gates": "",
        "multiAZBalance": false,
        "multiAZEnabled": false,
        "node_match_expressions": [],
        "npc": {
          "maxTaintedNode": "10%"
        },
        "tolerations": [
          {
            "key": "node.kubernetes.io/not-ready",
            "operator": "Exists",
            "effect": "NoExecute",
            "tolerationSeconds": 60
          },
          {
            "key": "node.kubernetes.io/unreachable",
            "operator": "Exists",
            "effect": "NoExecute",
            "tolerationSeconds": 60
          }
        ]
      }
    }
  }
}
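The request body above has a fixed envelope (kind, apiVersion, metadata, spec) wrapped around the three `values` objects described in the parameter tables. A minimal sketch of assembling it in Python follows. `build_npd_install_body` is a hypothetical helper, and the field values are copied from the example. No HTTP call is shown, because the endpoint and authentication are not covered on this page.

```python
import json


def build_npd_install_body(cluster_id: str, version: str,
                           basic: dict, flavor: dict, custom: dict) -> dict:
    """Wrap the add-on values in the envelope used by the example request."""
    return {
        "kind": "Addon",
        "apiVersion": "v3",
        "metadata": {"annotations": {"addon.install/type": "install"}},
        "spec": {
            "clusterID": cluster_id,
            "version": version,
            "addonTemplateName": "npd",
            "values": {"basic": basic, "flavor": flavor, "custom": custom},
        },
    }


# One resources entry per component, as in the example request.
component_resources = [
    {"limitsCpu": "100m", "limitsMem": "300Mi", "name": name,
     "requestsCpu": "30m", "requestsMem": "100Mi"}
    for name in ("node-problem-controller", "node-problem-detector")
]

body = build_npd_install_body(
    cluster_id="b78fb690-b82c-11ee-83cf-0255ac100b0f",
    version="1.18.48",
    basic={"image_version": "1.18.48", "rbac_enabled": True, "cluster_version": "v1.23"},
    flavor={"description": "custom resources", "name": "custom-resources",
            "replicas": 2, "resources": component_resources},
    custom={"multiAZBalance": False, "multiAZEnabled": False,
            "npc": {"maxTaintedNode": "10%"}, "tolerations": []},
)

payload = json.dumps(body, indent=2)  # JSON text ready to send as the request body
print(payload)
```

Keeping the envelope in one function means only the `basic`, `flavor`, and `custom` dictionaries change between clusters or add-on versions.
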