Updated on 2024-11-20 GMT+08:00

Configuring Read/Write Splitting Between Two Elasticsearch Clusters

Scenario

Data written to the primary cluster (Leader) is automatically synchronized to the secondary cluster (Follower). Queries can then be served from the secondary cluster, which improves query performance and reduces the load on the primary cluster. If the primary cluster cannot provide services, you can perform a primary/secondary switchover so that the secondary cluster handles both write and query requests, ensuring service continuity.

Figure 1 Two application scenarios for read/write splitting

Scenario 1 (left in the figure): Data is written to the primary cluster and queried from the secondary cluster, which spreads the load across both clusters.

Scenario 2 (right in the figure): When the primary cluster fails, the secondary cluster takes over to ensure service continuity.

Constraints

  • Only Elasticsearch 7.6.2 and Elasticsearch 7.10.2 clusters support read/write splitting.
  • The primary and secondary clusters must run the same version. Otherwise, errors may occur.

Prerequisites

Two clusters of the same version have been created. One functions as the primary cluster, and the other the secondary cluster. The secondary cluster must be able to access the REST API (default port: 9200) of the primary cluster.

Connecting the Primary and Secondary Clusters

  1. Log in to the CSS management console.
  2. Choose Clusters in the navigation pane. On the Clusters page, locate the secondary cluster and click Access Kibana in the Operation column.
  3. Click Dev Tools in the navigation tree on the left.
  4. Run the following command to configure information about the primary cluster in the secondary cluster:
    PUT /_cluster/settings
    {
      "persistent" : {
        "cluster" : {
          "remote.rest" : {
            "leader1" : {
              "seeds" : [
                "http://10.0.0.1:9200",
                "http://10.0.0.2:9200",
                "http://10.0.0.3:9200"
              ],
              "username": "elastic",
              "password": "*****"
            }
          }
        }
      }
    }

    Table 1 Request body parameters

    Parameter | Description
    leader1 | Name of the primary cluster configuration task, which is user-defined and will be used for configuring read/write splitting later.
    seeds | Addresses for accessing the primary cluster. If HTTPS is enabled for the primary cluster, the URL scheme must be https.
    username | Username of the primary cluster. This parameter is required only when security mode is enabled for the primary cluster.
    password | Password of the primary cluster. This parameter is required only when security mode is enabled for the primary cluster.

    Example response:

    {
      "acknowledged" : true,  //Whether the operation is successful
      "persistent" : {
        "cluster" : {
          "remote" : {
            "rest" : {
              "leader1" : {
                "seeds" : [
                "http://10.0.0.1:9200",
                "http://10.0.0.2:9200",
                "http://10.0.0.3:9200"
                ],
                "username": "elastic",
                "password": "*****"
              }
            }
          }
        }
      },
      "transient" : { }
    }
  5. After the configuration is complete, run the following command in the secondary cluster to check the connection between the secondary and primary clusters:
    GET _remote/rest/info

    Example response:

    {
      "leader1" : {
        "connected" : true  //The two clusters are connected.
      }
    }
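
The two requests in this procedure can also be prepared and checked from a script. The following Python sketch builds the request body for PUT /_cluster/settings and interprets a GET _remote/rest/info response. The function names are hypothetical, and the rule that all seeds must share one scheme is an assumption added for illustration; only the JSON layout comes from the examples above.

```python
import json


def build_remote_settings(name, seeds, username=None, password=None):
    """Build the PUT /_cluster/settings body that registers the primary cluster."""
    if not seeds:
        raise ValueError("at least one seed address is required")
    # Assumption for this sketch: every seed uses the same scheme (http or https).
    schemes = {seed.split("://", 1)[0] for seed in seeds}
    if len(schemes) != 1 or not schemes <= {"http", "https"}:
        raise ValueError("all seeds must use the same scheme, http or https")
    remote = {"seeds": list(seeds)}
    # Credentials are needed only when security mode is enabled on the primary cluster.
    if username is not None:
        remote["username"] = username
        remote["password"] = password
    return {"persistent": {"cluster": {"remote.rest": {name: remote}}}}


def is_connected(info, name):
    """Interpret a GET _remote/rest/info response for one configuration name."""
    return bool(info.get(name, {}).get("connected", False))


body = build_remote_settings(
    "leader1",
    ["http://10.0.0.1:9200", "http://10.0.0.2:9200", "http://10.0.0.3:9200"],
    username="elastic",
    password="*****",
)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into Kibana Dev Tools after PUT /_cluster/settings.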

Index Synchronization

There are two ways to synchronize indexes: specified index synchronization and matching index synchronization.

During synchronization, indexes in the secondary cluster become read-only. Synchronization runs periodically, every 30 seconds by default. To change the interval, see Changing the Synchronization Interval.

Synchronizing Specified Indexes
  • Run the following command in the secondary cluster to synchronize a single index from the primary cluster to the secondary cluster without modifying index settings:
    PUT start_remote_sync
    {
      "remote_cluster": "leader1",
      "remote_index": "data1_leader",
      "local_index": "data1_follower"
    }
  • Run the following command in the secondary cluster to synchronize a single index from the primary cluster to the secondary cluster, modifying some of the index settings and enabling synchronization of index settings:
    PUT start_remote_sync
    {
      "remote_cluster": "leader1",
      "remote_index": "data1_leader",
      "local_index": "data1_follower",
      "settings": {
        "number_of_replicas": 4
      },
      "settings_sync_enable": true,
      "settings_sync_patterns": ["*"],
      "settings_sync_exclude_patterns": ["index.routing.allocation.*"],
      "alias_sync_enable": true,
      "state_sync_enable": true
    }

    The following index configuration items cannot be modified: number_of_shards, version.created, uuid, creation_date, and soft_deletes.enabled.

Table 2 Request body parameters

Parameter | Description
remote_cluster | Name of the primary cluster configuration task, which was set in Connecting the Primary and Secondary Clusters. leader1 was set in our example.
remote_index | Name of the index to be synchronized in the primary cluster
local_index | Index name in the secondary cluster
settings | Index settings to be synchronized
settings_sync_enable | Whether to enable synchronization of index settings in the primary cluster. The default value is false.
settings_sync_patterns | Prefix of primary cluster index settings to be synchronized. The default value is *. This parameter takes effect when settings_sync_enable is set to true. The index settings configured in settings will not be synchronized.
settings_sync_exclude_patterns | Prefix of primary cluster index settings not to be synchronized. The default value is empty. This parameter is valid only when settings_sync_enable is set to true.
alias_sync_enable | Whether to enable index alias synchronization in the primary cluster. The default value is false.
state_sync_enable | Whether to enable index status synchronization in the primary cluster. The default value is false.
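
When these requests are generated by a script, it can help to reject the non-modifiable settings before the request is sent. The following Python sketch assembles a start_remote_sync body from the parameters in Table 2. The function names are hypothetical, and treating a key with an index. prefix the same as the bare key is an assumption of this sketch, not documented plug-in behavior.

```python
# Settings that cannot be modified, as listed in the note above.
IMMUTABLE_SETTINGS = {
    "number_of_shards",
    "version.created",
    "uuid",
    "creation_date",
    "soft_deletes.enabled",
}


def _bare(key):
    """Drop an optional 'index.' prefix so both spellings are checked (assumption)."""
    return key[len("index."):] if key.startswith("index.") else key


def build_start_sync_body(remote_cluster, remote_index, local_index,
                          settings=None, sync_settings=False,
                          exclude_patterns=None):
    """Build the request body for PUT start_remote_sync."""
    body = {
        "remote_cluster": remote_cluster,
        "remote_index": remote_index,
        "local_index": local_index,
    }
    if settings:
        blocked = sorted(k for k in settings if _bare(k) in IMMUTABLE_SETTINGS)
        if blocked:
            raise ValueError("settings cannot be modified: " + ", ".join(blocked))
        body["settings"] = dict(settings)
    if sync_settings:
        body["settings_sync_enable"] = True
        if exclude_patterns:
            body["settings_sync_exclude_patterns"] = list(exclude_patterns)
    return body


request_body = build_start_sync_body(
    "leader1", "data1_leader", "data1_follower",
    settings={"number_of_replicas": 4},
    sync_settings=True,
    exclude_patterns=["index.routing.allocation.*"],
)
print(request_body)
```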

Matching Index Synchronization
  • Run the following command in the secondary cluster to create a pattern-matching index synchronization policy, which synchronizes matched indexes from the primary cluster to the secondary cluster:
    PUT auto_sync/pattern/${PATTERN}
    {
     "remote_cluster": "leader1",
     "remote_index_patterns": "log*",
     "local_index_pattern": "{{remote_index}}-sync",
     "apply_exist_index": true
    }
  • Run the following command in the secondary cluster to create a pattern-matching index synchronization policy that synchronizes matched indexes from the primary cluster to the secondary cluster, modifying some of the index settings and enabling synchronization of index settings:
    PUT auto_sync/pattern/${PATTERN}
    {
     "remote_cluster": "leader1",
     "remote_index_patterns": "log*",
     "local_index_pattern": "{{remote_index}}-sync",
     "apply_exist_index": true,
     "settings": {
       "number_of_replicas": 4
     },
     "settings_sync_enable": true,
     "settings_sync_patterns": ["*"],
     "settings_sync_exclude_patterns": ["index.routing.allocation.*"],
     "alias_sync_enable": true,
     "state_sync_enable": true
    }

    The following index configuration items cannot be modified: number_of_shards, version.created, uuid, creation_date, and soft_deletes.enabled.

Table 3 Request body parameters

Parameter | Description
PATTERN | Name of the pattern for index matching.
remote_cluster | Name of the primary cluster configuration task, which was set in Connecting the Primary and Secondary Clusters. In our example, leader1 is used.
remote_index_patterns | Pattern for matching indexes to be synchronized in the primary cluster. The wildcard (*) is supported.
local_index_pattern | Index pattern in the secondary cluster. Template substitution is supported. For example, if this parameter is set to {{remote_index}}-sync, the index log1 is named log1-sync after synchronization.
apply_exist_index | Whether to synchronize existing indexes in the primary cluster. The default value is true.
settings | Index settings to be synchronized
settings_sync_enable | Whether to enable synchronization of index settings in the primary cluster. The default value is false.
settings_sync_patterns | Prefix of primary cluster index settings to be synchronized. The default value is *. This parameter takes effect when settings_sync_enable is set to true. The index settings configured in settings will not be synchronized.
settings_sync_exclude_patterns | Prefix of primary cluster index settings not to be synchronized. The default value is empty. This parameter is valid only when settings_sync_enable is set to true.
alias_sync_enable | Whether to enable index alias synchronization in the primary cluster. The default value is false.
state_sync_enable | Whether to enable index status synchronization in the primary cluster. The default value is false.
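
To preview which indexes a policy would pick up and what they would be called in the secondary cluster, the matching and naming rules can be approximated locally. The following Python sketch uses fnmatch from the standard library as a stand-in for the plug-in's wildcard matching, which is an assumption, and replaces the {{remote_index}} placeholder as described for local_index_pattern. The function names are hypothetical.

```python
from fnmatch import fnmatchcase


def render_local_index(local_index_pattern, remote_index):
    """Substitute the {{remote_index}} placeholder with the primary index name."""
    return local_index_pattern.replace("{{remote_index}}", remote_index)


def plan_auto_sync(remote_indexes, remote_index_patterns, local_index_pattern):
    """Map each matching primary index to its name in the secondary cluster."""
    if isinstance(remote_index_patterns, str):
        remote_index_patterns = [remote_index_patterns]
    plan = {}
    for index in remote_indexes:
        # fnmatchcase approximates the * wildcard; the plug-in may differ in detail.
        if any(fnmatchcase(index, pattern) for pattern in remote_index_patterns):
            plan[index] = render_local_index(local_index_pattern, index)
    return plan


plan = plan_auto_sync(["log1", "log2", "metrics"], "log*", "{{remote_index}}-sync")
print(plan)  # {'log1': 'log1-sync', 'log2': 'log2-sync'}
```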

Stopping Index Synchronization

Run the following command in the secondary cluster to stop synchronization tasks for specified indexes. Subsequent changes to the indexes in the primary cluster are no longer synchronized to the secondary cluster. The indexes in the secondary cluster are no longer read-only, so new data can be written to them.

PUT log*/stop_remote_sync

In this command, log* indicates the index name. You can specify multiple index names (separated by commas) or use a wildcard. In this example, synchronization tasks for all indexes that start with log are stopped.
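
As a small illustration of the naming rule, the following Python sketch joins one or more index names or wildcards into the request path. The function name and the character checks are hypothetical additions for this sketch.

```python
def stop_sync_path(*index_names):
    """Build the request path for PUT <indexes>/stop_remote_sync."""
    if not index_names:
        raise ValueError("at least one index name or wildcard is required")
    for name in index_names:
        # Commas separate names, so a single name must not contain one.
        if not name or "," in name or "/" in name or " " in name:
            raise ValueError("invalid index name: %r" % (name,))
    return ",".join(index_names) + "/stop_remote_sync"


print(stop_sync_path("log*"))
print(stop_sync_path("data1_follower", "data2_follower"))
```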

Querying and Deleting Created Patterns

  1. Run the following command in the secondary cluster to query created patterns:
    • Query the list of patterns.
      GET auto_sync/pattern
    • Query a specified pattern by name.
      GET auto_sync/pattern/{PATTERN}

    The following is an example of the response:

    {
      "patterns" : [
        {
          "name" : "pattern1",
          "pattern" : {
            "remote_cluster" : "leader",
            "remote_index_patterns" : [
              "log*"
            ],
            "local_index_pattern" : "{{remote_index}}-sync",
            "settings" : { }
          }
        }
      ]
    }
  2. Run the following command in the secondary cluster to delete a specified pattern by name:
    DELETE auto_sync/pattern/{PATTERN}
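
When patterns are cleaned up from a script, the names needed for the DELETE request can be read from the query response. The following Python sketch assumes the response has the shape shown in the example above. The function names are hypothetical.

```python
def pattern_names(response):
    """Return the names of all patterns in a GET auto_sync/pattern response."""
    return [entry["name"] for entry in response.get("patterns", [])]


def delete_requests(response):
    """Return one DELETE request line per configured pattern."""
    return ["DELETE auto_sync/pattern/" + name for name in pattern_names(response)]


example = {
    "patterns": [
        {
            "name": "pattern1",
            "pattern": {
                "remote_index_patterns": ["log*"],
                "local_index_pattern": "{{remote_index}}-sync",
                "settings": {},
            },
        }
    ]
}
print(delete_requests(example))  # ['DELETE auto_sync/pattern/pattern1']
```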

Enabling Forcible Synchronization

  • Enabling forcible synchronization

    By default, the plug-in decides whether to synchronize metadata by checking whether the number of documents in the primary cluster's index has changed. If the primary cluster only updates documents, so that the number of documents stays the same, the updates are not synchronized to the secondary cluster. You can change this behavior by enabling forcible synchronization. After it is enabled, the index metadata of the primary cluster is synchronized to the secondary cluster in every synchronization cycle.

    The following is an example of enabling forcible synchronization:

    PUT _cluster/settings
    {
      "persistent": {
        "remote_sync.force_synchronize": true
      }
    }
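
The effect of this setting can be pictured with a simplified model of the decision described above. The following Python sketch is only an illustration of that description, not the plug-in's actual code, and the function name is hypothetical.

```python
def needs_metadata_sync(previous_doc_count, current_doc_count,
                        force_synchronize=False):
    """Simplified model: synchronize when the count changed or when forced."""
    return force_synchronize or previous_doc_count != current_doc_count


# An update-only workload leaves the document count unchanged.
print(needs_metadata_sync(100, 100))                          # False
print(needs_metadata_sync(100, 100, force_synchronize=True))  # True
print(needs_metadata_sync(100, 101))                          # True
```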

Changing the Synchronization Interval

  • Changing the synchronization interval

    The synchronization interval is 30 seconds by default. It is configured per index through the index settings, so it can be changed for each synchronized index.

    The example request below changes the synchronization interval to 2 seconds:

    PUT {index_name}/_settings
    {
      "index.remote_sync.sync_interval": "2s"
    }

Querying Index Synchronization Status

  • Obtaining the auto synchronization status

    This API is used to obtain the synchronization status of matched indexes.

    An example request is as follows:

    GET auto_sync/stats

    An example response is as follows:

    {
      "success_count" : 3,
      "failed_count" : 0,
      "failed_remote_cluster_state_requests_count" : 0,
      "last_fail_exception" : { },
      "last_fail_remote_cluster_requests_exception" : { }
    }
  • Obtaining the auto synchronization status of a specified index

    An example request is as follows:

    GET {index_name}/sync_stats

    An example response is as follows:

    {
      "indices" : {
        "data1_follower" : {
          "shards" : {
            "0" : [
              {
                "primary" : false,
                "total_synced_times" : 27,
                "total_empty_times" : 25,
                "total_synced_files" : 4,
                "total_synced_bytes" : 3580,
                "total_paused_nanos" : 0,
                "total_paused_times" : 0,
                "current" : {
                  "files_count" : 0,
                  "finished_files_count" : 0,
                  "bytes" : 0,
                  "finished_bytes" : 0
                }
              },
              {
                "primary" : true,
                "total_synced_times" : 28,
                "total_empty_times" : 26,
                "total_synced_files" : 20,
                "total_synced_bytes" : 17547,
                "total_paused_nanos" : 0,
                "total_paused_times" : 0,
                "current" : {
                  "files_count" : 0,
                  "finished_files_count" : 0,
                  "bytes" : 0,
                  "finished_bytes" : 0
                }
              }
            ]
          }
        }
      }
    }
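
For monitoring, the per-shard counters in this response can be rolled up per index. The following Python sketch sums the synced files and bytes across shard copies. Treating a copy as in progress when current.files_count exceeds current.finished_files_count is an assumption about the field semantics, and the function name is hypothetical. The sample is a trimmed version of the response above.

```python
def summarize_sync_stats(response):
    """Aggregate a GET {index_name}/sync_stats response per index."""
    summary = {}
    for index, data in response.get("indices", {}).items():
        copies = [copy for shard in data.get("shards", {}).values() for copy in shard]
        summary[index] = {
            "shard_copies": len(copies),
            "synced_files": sum(c["total_synced_files"] for c in copies),
            "synced_bytes": sum(c["total_synced_bytes"] for c in copies),
            # Assumption: unfinished files in "current" mean a transfer is running.
            "in_progress": any(
                c["current"]["files_count"] > c["current"]["finished_files_count"]
                for c in copies
            ),
        }
    return summary


sample = {"indices": {"data1_follower": {"shards": {"0": [
    {"primary": False, "total_synced_files": 4, "total_synced_bytes": 3580,
     "current": {"files_count": 0, "finished_files_count": 0}},
    {"primary": True, "total_synced_files": 20, "total_synced_bytes": 17547,
     "current": {"files_count": 0, "finished_files_count": 0}},
]}}}}
stats = summarize_sync_stats(sample)
print(stats)
```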

Switching the Roles of the Primary and Secondary Clusters

When the primary cluster becomes faulty, perform a primary/secondary switchover to have the secondary cluster take over services. The steps are as follows:

  1. Determine how indexes are synchronized between the primary and secondary clusters. Check whether pattern-matching index synchronization policies have been configured in the secondary cluster. For the command to use, see Querying and Deleting Created Patterns.
    • If there are no such policies, specified indexes are synchronized between the primary and secondary clusters. In this case, go to step 3.
    • If there are such policies, indexes are synchronized between the primary and secondary clusters based on index patterns. In this case, go to step 2.
  2. Delete the pattern-matching index synchronization policies in the secondary cluster. For the command to use, see Querying and Deleting Created Patterns.
  3. Stop index synchronization in the secondary cluster by following Stopping Index Synchronization, and then redirect read and write traffic to the secondary cluster. If the primary and secondary clusters synchronize indexes based on index patterns, use a wildcard to match indexes when running the command that stops index synchronization.
  4. After the primary cluster recovers, configure information about the secondary cluster in the primary cluster to connect the two clusters again. For details, see Connecting the Primary and Secondary Clusters.
  5. In the primary cluster, perform the operations in Index Synchronization to synchronize data from the secondary cluster to the primary cluster, and then perform a primary/secondary switchover to switch back.
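
The first three steps can be sketched as a small planning function that lists the requests to run in the secondary cluster during a failover. This Python sketch only produces the request lines for review and does not send anything. The function name and its inputs are hypothetical.

```python
def failover_steps(pattern_names, synced_indexes):
    """List the secondary-cluster actions for steps 1 to 3 of the switchover."""
    if not synced_indexes:
        raise ValueError("at least one index name or wildcard is required")
    steps = []
    # Step 2: delete pattern-matching policies, if any were found in step 1.
    for name in pattern_names:
        steps.append("DELETE auto_sync/pattern/" + name)
    # Step 3: stop synchronization, then move traffic to the secondary cluster.
    steps.append("PUT " + ",".join(synced_indexes) + "/stop_remote_sync")
    steps.append("Redirect read and write traffic to the secondary cluster")
    return steps


for step in failover_steps(["pattern1"], ["log*"]):
    print(step)
```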