Updated on 2024-11-20 GMT+08:00

What Do I Do If "Bulk Reject" Is Displayed in an Elasticsearch Cluster?

Symptom

Occasionally, the cluster write rejection rate increases and the "Bulk Reject" message is displayed. When I perform bulk write operations, an error similar to the following is returned:

[2019-03-01 10:09:58][ERROR]rspItemError: {
    "reason": "rejected execution of org.elasticsearch.transport.TransportService$7@5436e129 on EsThreadPoolExecutor[bulk, queue capacity = 1024, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@6bd77359[Running, pool size = 12, active threads = 12, queued tasks = 2390, completed tasks = 20018208656]]",
    "type": "es_rejected_execution_exception"
}

Issue Analysis

Bulk reject is usually caused by large or unevenly distributed shards. You can use the following methods to locate and analyze the fault:

  1. Check whether the data volume in shards is too large.

    The recommended data size in a single shard is 20 GB to 50 GB. You can run the following command on the Kibana console to view the size of each shard of an index:

    GET _cat/shards?index=index_name&v
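    The 20 GB to 50 GB guideline above can be turned into a quick shard-count estimate. Below is a minimal sketch (the 40 GB target per shard and the function name are illustrative assumptions, not CSS defaults):

    ```python
    import math

    def recommended_shard_count(index_size_gb, target_shard_gb=40):
        """Suggest a shard count that keeps each shard near the
        recommended 20-50 GB range (40 GB is an assumed midpoint)."""
        return max(1, math.ceil(index_size_gb / target_shard_gb))

    # A 600 GB index would be split into 15 shards of roughly 40 GB each
    print(recommended_shard_count(600))  # -> 15
    ```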
  2. Check whether the shards in nodes are unevenly distributed.

    You can check shard allocation in either of the following ways:

    1. Log in to the CSS management console and choose Clusters. Locate the target cluster and click More > View Metric. For details, see Viewing Monitoring Metrics.
    2. On a client where curl is available, check the number of shards on each node in the cluster:
      curl "$host:$port/_cat/shards?index={index_name}&s=node,store:desc" | awk '{print $8}' | sort | uniq -c | sort -n

      In the command output, the first column indicates the number of shards, and the second column indicates the node ID. If, for example, some nodes hold only one shard while others hold eight, shard allocation is uneven.
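      The awk pipeline above can also be reproduced offline on saved output. The sketch below parses `_cat/shards` text (default column order: index, shard, prirep, state, docs, store, ip, node) and counts shards per node; the sample lines and node names are made up for illustration:

      ```python
      from collections import Counter

      # Sample `_cat/shards` output; the node name is the 8th column,
      # matching `awk '{print $8}'` in the pipeline above.
      sample = """\
      myindex 0 p STARTED 1000 2gb 10.0.0.1 node-1
      myindex 1 p STARTED 1000 2gb 10.0.0.2 node-2
      myindex 2 p STARTED 1000 2gb 10.0.0.2 node-2
      myindex 3 p STARTED 1000 2gb 10.0.0.2 node-2
      """

      def shards_per_node(cat_shards_text):
          """Count shards per node, like `awk '{print $8}' | sort | uniq -c`."""
          return Counter(line.split()[7]
                         for line in cat_shards_text.splitlines() if line.strip())

      # node-2 holds three shards while node-1 holds only one: uneven
      print(shards_per_node(sample))
      ```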

Solution

  • If the problem is caused by large shards:

    Set the number_of_shards parameter in the index template to limit the shard size.

    The template takes effect only for indexes created after it is configured. The number of shards of an existing index cannot be changed this way.
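    For example, a template along the following lines fixes the shard count at index creation time (the template name and the log-* pattern are placeholders for illustration, not CSS defaults):

    ```
    PUT _template/shard_size_template
    {
        "order": 0,
        "template": "log-*",
        "settings": {
            "index": {
                "number_of_shards": "15"
            }
        }
    }
    ```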

  • If the problem is caused by uneven shard distribution:

    Workarounds

    1. Run the following command to set the routing.allocation.total_shards_per_node parameter, which dynamically limits the maximum number of shards of the index on each node:
      PUT <index_name>/_settings
      {
          "settings": {
              "index": {
                  "routing": {
                      "allocation": {
                          "total_shards_per_node": "3"
                      }
                  }
              }
          }
      }

      When you configure the total_shards_per_node parameter, reserve some buffer space to avoid shard allocation failures caused by node faults. For example, if there are 10 servers and the index has 20 shards, set total_shards_per_node to a value greater than 2, that is, at least 3.
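      That buffer rule can be sketched as a small calculation (the function name and the one-node failure tolerance are illustrative assumptions):

      ```python
      import math

      def per_node_shard_cap(total_shards, node_count, failure_buffer=1):
          """Smallest total_shards_per_node value that still lets all
          shards be allocated after `failure_buffer` nodes fail."""
          usable_nodes = node_count - failure_buffer
          if usable_nodes < 1:
              raise ValueError("too few nodes for the requested buffer")
          return math.ceil(total_shards / usable_nodes)

      # 20 shards on 10 nodes, tolerating 1 node failure -> cap of 3
      print(per_node_shard_cap(20, 10))  # -> 3
      ```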

    2. Set the number of shards before creating an index.

      Use an index template to set the total number of shards for the index and the maximum number of shards on each node.

      PUT _template/<template_name>
      {
          "order": 0,
          "template": "{index_prefix}*",  //Matches indexes whose names start with the given prefix
          "settings": {
              "index": {
                  "number_of_shards": "30", //Total number of shards for the index, assuming roughly 30 GB of data per shard
                  "routing.allocation.total_shards_per_node": 3 //Maximum number of shards of this index on each node
              }
          },
          "aliases": {}
      }
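      After the template is created, you can confirm it is in place on the Kibana console before creating new indexes (<template_name> is the placeholder used above):

      ```
      GET _template/<template_name>
      ```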