Updated on 2024-12-02 GMT+08:00

Viewing Cluster Risk Items

After a scan task is started, you can view details about cluster risk items in the intelligent O&M list.

Prerequisites

A scan task has been started. For details, see Creating a Scan Task.

Check Items

The following items will be checked and the detected risks will be displayed in the intelligent O&M list:

  • Check the current health status of the cluster. Red: some primary shards are not allocated. Yellow: some replica shards are not allocated. Green: all shards are allocated.
  • Check the number of nodes in the cluster and the number of AZs to evaluate the high availability status of the distributed Elasticsearch cluster.
  • Check whether index replicas are enabled. If replicas are not enabled and a fault occurs, an index may be unavailable, and the data in a cluster using local disks may be lost.
  • Check for .kibana index conflicts in the cluster.
  • Check disk usage. If the disk usage of a node is too high, new index shards may fail to be allocated to the node and the cluster performance may be affected.
  • Check whether the storage usage of cluster data nodes or cold data nodes is balanced. Unbalanced storage distribution may result in unbalanced cluster loads and increase read/write latency.
  • Check whether any node in the current cluster is disconnected or unavailable for 5 consecutive minutes.
  • Check for nodes with too many shards. A large number of shards consumes too many node resources, which increases read/write latency and slows down metadata updates.
  • Check the size of all shards. An oversized shard may degrade performance, occupy too much node memory, and slow down shard recovery during scaling or fault recovery.
  • Check whether a newer version is available for the current cluster.
  • Check for snapshot creation failures and snapshot records in the cluster in the last seven days.
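
The scan runs inside CSS, and its exact logic and thresholds are not described here. As a rough illustration only, the sketch below shows how a few of the checks above (health status, disk usage, and shards per node) could be approximated by hand from the JSON bodies returned by the standard Elasticsearch GET _cluster/health and GET _cat/allocation?format=json APIs. The function names and both threshold values are assumptions made for this example, not values used by CSS.

```python
from typing import Dict, List

# Meaning of each status value, as described in the check item list above.
HEALTH_MEANING = {
    "red": "some primary shards are not allocated",
    "yellow": "some replica shards are not allocated",
    "green": "all shards are allocated",
}

# Hypothetical thresholds for illustration only; not the values CSS uses.
DISK_PERCENT_LIMIT = 85
SHARDS_PER_NODE_LIMIT = 1000


def describe_health(health: Dict) -> str:
    """Summarize the body of a GET _cluster/health response."""
    status = health["status"]
    meaning = HEALTH_MEANING.get(status, "unknown status")
    unassigned = health.get("unassigned_shards", 0)
    return f"{status}: {meaning} ({unassigned} unassigned)"


def find_node_risks(allocation_rows: List[Dict]) -> List[str]:
    """Flag nodes in the rows of GET _cat/allocation?format=json."""
    risks = []
    for row in allocation_rows:
        # The row that counts unassigned shards has no disk figures; skip it.
        if row.get("disk.percent") is None:
            continue
        # With format=json, numeric columns are returned as strings.
        if int(row["disk.percent"]) >= DISK_PERCENT_LIMIT:
            risks.append(f"{row['node']}: disk usage {row['disk.percent']}%")
        if int(row["shards"]) > SHARDS_PER_NODE_LIMIT:
            risks.append(f"{row['node']}: {row['shards']} shards")
    return risks


# Sample data shaped like the API responses (node names are made up).
sample_health = {"status": "yellow", "number_of_nodes": 3, "unassigned_shards": 2}
sample_allocation = [
    {"node": "ess-1", "shards": "120", "disk.percent": "91"},
    {"node": "ess-2", "shards": "1500", "disk.percent": "40"},
    {"node": "UNASSIGNED", "shards": "2", "disk.percent": None},
]

print(describe_health(sample_health))
for risk in find_node_risks(sample_allocation):
    print(risk)
```

The functions take already-parsed response bodies, so they can be tried on saved output without connecting to a cluster. The risk items shown in the intelligent O&M list remain the authoritative result.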

Procedure

  1. Log in to the CSS management console.
  2. On the cluster management page, click a cluster name to go to the basic information page of the cluster.
  3. Choose Intelligent O&M from the navigation pane.
  4. On the intelligent O&M list page, select a started scan task. Click the icon to the left of the task name to view its creation time, abstract, ID, and risk items.

    Click the icon to the left of a risk item to view its details, including the check item, risk description, and risk suggestion.

    Handle the cluster risks promptly based on the suggestions.

    Figure 1 Risk items