npd

Introduction

node-problem-detector (npd for short) is an add-on that monitors abnormal events on cluster nodes and can connect to a third-party monitoring platform. It runs as a daemon on each node, collects node issues from different daemons, and reports them to the API server. The npd add-on can run either as a DaemonSet or as a daemon.

For more information, see node-problem-detector.
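
Because npd reports node issues to the API server, they can be inspected on the Node object. The following is a minimal sketch, assuming the issues appear as entries under status.conditions of the node (as upstream node-problem-detector does for permanent problems). The function name abnormal_conditions and the sample node data are hypothetical and serve only as an illustration; they are not part of the add-on.

```python
import json
from typing import List, Tuple


def abnormal_conditions(node: dict) -> List[Tuple[str, str]]:
    """Return (type, reason) for each node condition that signals a problem."""
    problems = []
    for cond in node.get("status", {}).get("conditions", []):
        ctype = cond.get("type")
        status = cond.get("status")
        # "Ready" is healthy when True; every other condition type
        # signals a problem when its status is True.
        if ctype == "Ready":
            is_problem = status != "True"
        else:
            is_problem = status == "True"
        if is_problem:
            problems.append((ctype, cond.get("reason", "")))
    return problems


# Hypothetical sample, shaped like the output of `kubectl get node <name> -o json`.
SAMPLE_NODE = json.loads("""
{
  "status": {
    "conditions": [
      {"type": "Ready", "status": "True", "reason": "KubeletReady"},
      {"type": "MemoryPressure", "status": "False", "reason": "KubeletHasSufficientMemory"},
      {"type": "KernelDeadlock", "status": "True", "reason": "DockerHung"}
    ]
  }
}
""")

if __name__ == "__main__":
    for ctype, reason in abnormal_conditions(SAMPLE_NODE):
        print(f"{ctype}: {reason}")
```

In practice the input would come from the cluster rather than from a literal string. The condition types that actually appear depend on how npd is configured.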

Permission Description

To monitor kernel logs, the npd add-on needs to read /dev/kmsg on the host, so privileged mode must be enabled. For details, see privileged.

In addition, CCE mitigates the risk by following the principle of least privilege. Only the following capabilities are granted to npd at run time:

  • cap_dac_read_search: permission to access /run/log/journal.
  • cap_sys_admin: permission to access /dev/kmsg.
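
One way to see which Linux capabilities a process holds is to decode the CapEff bitmask from /proc/<pid>/status. The sketch below checks a mask for the two capabilities listed above. The bit numbers (2 for CAP_DAC_READ_SEARCH, 21 for CAP_SYS_ADMIN) come from the Linux capability definitions. The helper names and the sample mask are hypothetical, and the source does not state what mask the npd process actually reports.

```python
from typing import Dict, Optional

# Bit positions as defined in the Linux kernel header linux/capability.h.
REQUIRED_CAPS = {
    "cap_dac_read_search": 2,   # read access to /run/log/journal
    "cap_sys_admin": 21,        # access to /dev/kmsg
}


def has_capability(mask_hex: str, cap_number: int) -> bool:
    """Return True if bit cap_number is set in the hexadecimal mask."""
    return bool((int(mask_hex, 16) >> cap_number) & 1)


def check_required(mask_hex: str) -> Dict[str, bool]:
    """Map each capability listed above to whether the mask contains it."""
    return {name: has_capability(mask_hex, bit) for name, bit in REQUIRED_CAPS.items()}


def read_effective_mask(status_path: str = "/proc/self/status") -> Optional[str]:
    """Read the CapEff field from a /proc status file (Linux only)."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                return line.split()[1]
    return None


# Hypothetical mask with only bits 2 and 21 set: (1 << 2) | (1 << 21) == 0x200004.
SAMPLE_MASK = "0000000000200004"

if __name__ == "__main__":
    print(check_required(SAMPLE_MASK))
```

To inspect a real process on a node, pass the value returned by read_effective_mask("/proc/<pid>/status") to check_required instead of the sample mask.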

Installing the Add-on

  1. Log in to the CCE console and access the cluster details page. Choose Add-ons in the navigation pane, locate npd on the right, and click Install.
  2. Click Install to install the add-on directly. The npd add-on currently has no configurable parameters.