Custom Data Directory

Scenario

  • The data directories of the hosts were not planned based on Elasticsearch requirements.
  • A single Elasticsearch instance requires multiple disks.

Data loss may occur when a data directory is replaced. Therefore, customize data directories before writing any data to them.

Prerequisites

  • The administrator has planned data directories for hosts based on service requirements.
  • If multiple data disks are configured, each EsNodeX role must use the same number of disks.

Procedure

  1. Log in to Manager.
  2. On Manager, choose Cluster > Name of the desired cluster > Services > Elasticsearch > Configurations > All Configurations > Role Name > Data Storage.
  3. Change the value of elasticsearch.data.path to the corresponding data directory.

    • Data directories are configured at the role level. Configure them under the corresponding role.
    • Use commas (,) to separate multiple data directories. Spaces are not allowed.

      For example: /srv/BigData/elasticsearch/esnode1,/srv/BigData/elasticsearch/esnode1_1.

  4. Repeat the preceding steps for the other Elasticsearch roles so that each EsNodeX role uses the same number of disks.
  5. After the modification, click Save in the upper left corner. In the dialog box displayed, click OK.
  6. Choose Cluster > Name of the desired cluster > Services > Elasticsearch > Instance. On the displayed page, select the instances whose Configuration Status is Expired, choose More > Restart Instance, and restart the instances. After the restart, you can verify the data directories as sketched below.
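
After the instances are restarted, you can optionally confirm that every configured data directory is visible to Elasticsearch. The following is a minimal verification sketch in Python; the node address, port, and credentials are placeholders and must be replaced with the actual access address and authentication method of your cluster.

    # Verification sketch: node address, port, and credentials are placeholders.
    import requests

    ES = "http://192.168.0.10:9200"   # placeholder Elasticsearch access address
    AUTH = ("user", "password")       # placeholder credentials (if security is enabled)

    # Per-node disk usage and shard counts; confirms that all instances are back online.
    print(requests.get(f"{ES}/_cat/allocation?v", auth=AUTH).text)

    # Filesystem statistics: each configured data directory should appear under "fs.data"
    # with its own total and available space.
    stats = requests.get(f"{ES}/_nodes/stats/fs", auth=AUTH).json()
    for node in stats["nodes"].values():
        print(node["name"])
        for data_path in node["fs"]["data"]:
            print("  ", data_path["path"], data_path["available_in_bytes"], "bytes available")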

Suggestions in Multi-Disk Scenarios

  • Plan the size of each service index in advance and use the "alias + rollover" pattern to roll indexes over (see the first sketch after this list). Keep each shard between 20 GB and 30 GB to avoid the disk imbalance caused by oversized or undersized shards.
  • Use disks of the same specifications and capacity; otherwise, the smallest disk becomes the capacity bottleneck (the bucket effect).
  • Configure replicas for all indexes. Otherwise, data will be lost if a disk is damaged.
  • When creating an index, set the index.routing.allocation.total_shards_per_node parameter so that the shards of each index are evenly distributed across the instances in the cluster. Both settings are shown in the second sketch after this list.
  • If a disk is faulty, repair or replace it in a timely manner.
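
The first sketch shows one way to implement the alias + rollover pattern through the Elasticsearch REST API. The index name myindex-000001, the write alias myindex_write, the node address, the credentials, and the rollover conditions are all placeholders to be adapted to your cluster and data volume.

    # "Alias + rollover" sketch: index name, alias, address, and credentials are placeholders.
    import requests

    ES = "http://192.168.0.10:9200"   # placeholder Elasticsearch access address
    AUTH = ("user", "password")       # placeholder credentials

    # 1. Bootstrap the first generation of the index and bind the write alias to it.
    requests.put(
        f"{ES}/myindex-000001",
        auth=AUTH,
        json={"aliases": {"myindex_write": {"is_write_index": True}}},
    )

    # 2. Periodically request a rollover. With 6 primary shards, a 150 GB limit keeps
    #    each shard within the recommended 20-30 GB range.
    requests.post(
        f"{ES}/myindex_write/_rollover",
        auth=AUTH,
        json={"conditions": {"max_size": "150gb", "max_age": "1d"}},
    )

When a condition is met, Elasticsearch creates the next generation of the index (for example, myindex-000002) and switches the write alias to it, so applications keep writing to the same alias.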
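
The second sketch shows the replica and shard-distribution settings applied at index creation. The index name and the numeric values are examples only; adjust them to your shard plan and instance count.

    # Index creation sketch: the index name and all values are examples only.
    import requests

    ES = "http://192.168.0.10:9200"   # placeholder Elasticsearch access address
    AUTH = ("user", "password")       # placeholder credentials

    requests.put(
        f"{ES}/myindex-000001",
        auth=AUTH,
        json={
            "settings": {
                "number_of_shards": 6,
                # At least one replica so that a damaged disk does not cause data loss.
                "number_of_replicas": 1,
                # Limit the shards of this index per node so that they spread evenly
                # across the instances in the cluster.
                "index.routing.allocation.total_shards_per_node": 2,
            }
        },
    )

When the rollover pattern is used, these settings are usually placed in an index template instead, so that every new generation of the index inherits them automatically.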