Updated on 2024-10-26 GMT+08:00

Replacing Specified Nodes for an OpenSearch Cluster

If a node in a cluster becomes faulty, you can create a new node with the same specifications to replace it. Before a specified node is replaced, its data is migrated to other nodes, so no data is lost.

Prerequisites

The target cluster is available and has no tasks in progress.

Constraints

  • Only one node can be replaced at a time.
  • The ID, IP address, specifications, and AZ of the new node will be the same as those of the original one.
  • Configurations that you modified manually are not retained after node replacement. For example, if you manually added a return route to the original node, you need to add it to the new node again after the replacement is complete.
  • If the node you want to replace is a data node or cold data node, pay attention to the following precautions:
    1. Before a data node or cold data node is replaced, its data must be migrated to other nodes. To prevent data loss, make sure that the maximum sum of primary shards and replicas of any index is smaller than the total number of data nodes and cold data nodes in the cluster. How long the replacement takes depends on how fast the data can be migrated.
    2. A cluster whose version is earlier than 7.6.2 must not contain closed indexes. Otherwise, its data nodes or cold data nodes cannot be replaced.
    3. The AZ of the node to be replaced must have two or more data nodes or cold data nodes.
    4. If the cluster where a data node or cold data node needs to be replaced does not have a master node, the total number of available data nodes and cold data nodes in the cluster must be at least 3.
    5. If it is a master or client node that needs to be replaced, the precautions above do not apply.
    6. The precautions above do not apply if you are replacing a faulty node, regardless of its type. Faulty nodes are not included in _cat/nodes.
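The data node precautions above can be expressed as a small pre-check. The following Python sketch is illustrative only and is not part of CSS: the class, function, and field names are hypothetical, and the reading of precaution 1 (the largest replica count of any index, plus one primary copy, must be smaller than the node total) is an assumption. The input values would have to be gathered separately, for example from _cat/nodes and the index settings. The check applies only when a healthy data node or cold data node is replaced.

```python
from dataclasses import dataclass


@dataclass
class ClusterFacts:
    """Hypothetical snapshot of the values the precautions refer to."""
    data_and_cold_nodes: int        # available data + cold data nodes in the cluster
    data_and_cold_nodes_in_az: int  # same count, limited to the AZ of the node to replace
    has_master_nodes: bool          # whether the cluster has master nodes
    max_replicas_per_index: int     # largest replica count of any index
    older_than_7_6_2: bool          # cluster version is earlier than 7.6.2
    closed_indexes: int             # number of closed indexes


def data_node_replacement_blockers(facts: ClusterFacts) -> list[str]:
    """Return the precautions that replacing a healthy data node would violate."""
    blockers = []
    # Precaution 1 (assumed reading): one primary copy plus its replicas
    # must be smaller than the number of data and cold data nodes.
    if facts.max_replicas_per_index + 1 >= facts.data_and_cold_nodes:
        blockers.append("too many shard copies for the number of data nodes")
    # Precaution 2: no closed indexes on clusters earlier than 7.6.2.
    if facts.older_than_7_6_2 and facts.closed_indexes > 0:
        blockers.append("closed indexes on a cluster earlier than 7.6.2")
    # Precaution 3: the AZ needs two or more data or cold data nodes.
    if facts.data_and_cold_nodes_in_az < 2:
        blockers.append("fewer than two data nodes in the AZ")
    # Precaution 4: without a master node, at least three data nodes are needed.
    if not facts.has_master_nodes and facts.data_and_cold_nodes < 3:
        blockers.append("no master node and fewer than three data nodes")
    return blockers
```

For example, `data_node_replacement_blockers(ClusterFacts(3, 2, False, 1, False, 0))` returns an empty list, meaning none of the four checks fails under these assumptions.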

Replacing a Specified Node

  1. Log in to the CSS management console.
  2. In the navigation pane, choose a cluster type. The cluster management page is displayed.
  3. Choose More > Modify Configuration in the Operation column of the target cluster. The Modify Configuration page is displayed.
  4. On the Modify Configuration page, click the Replace Node tab.
  5. On the Replace Node tab, set the parameters as needed.
    Table 1 Replacing a specified node

    Parameter | Description

    Agency

    When a node is deleted, NICs are released. This means you need to have VPC permissions. Select an IAM agency to grant the current account the permission to access and use VPC.

    • This parameter is available only when the new IAM plane is connected.
    • If you are configuring an agency for the first time, click Automatically Create IAM Agency to create css-upgrade-agency.
    • If an IAM agency was automatically created earlier, you can click One-click authorization to delete the VPC Administrator role or the VPC FullAccess system policy and add the following custom policies instead, which gives finer-grained permission control.
      "vpc:subnets:get",
      "vpc:ports:*"
    • To use Automatically Create IAM Agency and One-click authorization, the following minimum permissions are needed:
      "iam:agencies:listAgencies",
      "iam:roles:listRoles",
      "iam:agencies:getAgency",
      "iam:agencies:createAgency",
      "iam:permissions:listRolesForAgency",
      "iam:permissions:grantRoleToAgency",
      "iam:permissions:listRolesForAgencyOnProject",
      "iam:permissions:revokeRoleFromAgency",
      "iam:roles:createRole"
    • To use an IAM agency, the following minimum permissions are needed:
      "iam:agencies:listAgencies",
      "iam:agencies:getAgency",
      "iam:permissions:listRolesForAgencyOnProject",
      "iam:permissions:listRolesForAgency"

    Node Type

    Expand the node type that needs to be changed to show all nodes under it. Select the nodes you want to replace.

  6. Click Submit. In the data migration confirmation dialog box, select the option to migrate data, which helps prevent data loss, and click OK.

    During data migration, the system migrates all data from the nodes to be removed to the remaining nodes, and removes these nodes once the migration is complete. If the data on the nodes to be removed has copies on other nodes, data migration can be skipped and the cluster change completes faster.

  7. Click Back to Cluster List to switch to the Clusters page. The Task Status is Replacing nodes. When Cluster Status changes to Available, the node has been successfully replaced.
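
Step 7 amounts to waiting until Cluster Status returns to Available. If you want to script that wait, a minimal polling sketch in Python could look like the following. It is an illustration under stated assumptions: `get_status` is a hypothetical callable that you would implement yourself to return the current Cluster Status string, and the interval and timeout defaults are arbitrary. This page does not describe an API for reading the status.

```python
import time
from typing import Callable


def wait_until_available(
    get_status: Callable[[], str],
    interval_s: float = 30.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status() until it returns "Available" or the timeout elapses."""
    waited = 0.0
    while True:
        # get_status is supplied by the caller (hypothetical helper).
        if get_status() == "Available":
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s
```

The `sleep` parameter is injectable so the loop can be exercised without real delays. The function returns False on timeout rather than raising, leaving the decision about what to do next to the caller.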