Upgrading the Version of an OpenSearch Cluster

Updated on 2025-01-23 GMT+08:00

OpenSearch clusters support both same-version and cross-version upgrades.

Scenarios

Upgrade scenarios

  • A same-version upgrade updates the kernel patches of a cluster. The cluster is upgraded to the latest image of its current version to fix known issues or optimize performance. For example, if the cluster version is 1.3.6(1.3.6_24.3.3_0102), a same-version upgrade brings the cluster to the latest image of version 1.3.6, such as 1.3.6(1.3.6_24.3.4_0109). (The version numbers used here are examples only.)
  • A cross-version upgrade moves a cluster to the latest image of a target version to gain new functionality or to consolidate versions. For example, if the cluster version is 1.3.6(1.3.6_24.3.3_1224), a cross-version upgrade brings the cluster to the latest image of version 2.17.1, such as 2.17.1(2.17.1_24.3.4_0109). (The version numbers used here are examples only.)
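
The distinction between the two upgrade types can be illustrated with a small sketch. It assumes that an image label always has the layout "<version>(<image name>)", which is inferred only from the examples above; the function names are hypothetical and are not part of any CSS API.

```python
import re

# Assumed label layout, inferred from the examples above: "<version>(<image name>)".
LABEL = re.compile(r"^(?P<version>\d+(?:\.\d+)*)\((?P<image>[^()]+)\)$")


def parse_label(label):
    """Split a label such as '1.3.6(1.3.6_24.3.3_0102)' into (version tuple, image name)."""
    match = LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized image label: {label!r}")
    version = tuple(int(part) for part in match.group("version").split("."))
    return version, match.group("image")


def upgrade_type(current, target):
    """Classify the move from the current image label to the target image label."""
    current_version, current_image = parse_label(current)
    target_version, target_image = parse_label(target)
    if target_version == current_version:
        # Same version: only the image (kernel patch level) may differ.
        return "same-version" if target_image != current_image else "no change"
    if target_version > current_version:
        return "cross-version"
    # Moving to a lower version is not covered by this sketch.
    return "not an upgrade"


print(upgrade_type("1.3.6(1.3.6_24.3.3_0102)", "1.3.6(1.3.6_24.3.4_0109)"))
print(upgrade_type("1.3.6(1.3.6_24.3.3_1224)", "2.17.1(2.17.1_24.3.4_0109)"))
```

Versions are compared as integer tuples rather than strings, so that 2.17.1 is correctly treated as newer than 2.9.0.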

Principle

The nodes in a cluster are upgraded one at a time to prevent service interruption. For each node, the system brings the node offline, migrates its data to another node, creates a new node of the target version, mounts the NIC ports of the offline node to the new node so that the node IP address is reused, and then adds the new node to the cluster. The remaining nodes are upgraded one at a time in the same way. If a cluster holds a large amount of data, the upgrade duration depends on how long the data migration takes.
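
The node-by-node replacement described above can be sketched as a toy simulation. This is an illustration only, not the service's implementation: the Node class, the round-robin data migration, and the node names and addresses are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    ip: str
    version: str
    shards: list = field(default_factory=list)


def rolling_upgrade(nodes, target_version):
    """Replace nodes one at a time; return the new node list and a step log.

    The shard lists of the Node objects passed in are modified in place.
    """
    cluster = list(nodes)
    log = []
    for index in range(len(cluster)):
        old = cluster[index]
        peers = cluster[:index] + cluster[index + 1:]
        if not peers:
            raise ValueError("at least one other node is needed to hold migrated data")
        # Bring the node offline and migrate its data (round-robin is an assumption).
        for position, shard in enumerate(old.shards):
            peers[position % len(peers)].shards.append(shard)
        log.append(f"{old.name}: offline, {len(old.shards)} shard(s) migrated")
        old.shards = []
        # Create a node of the target version that reuses the old node's IP address,
        # then add it to the cluster in place of the old node.
        cluster[index] = Node(old.name, old.ip, target_version)
        log.append(f"{old.name}: rejoined at {old.ip} with version {target_version}")
    return cluster, log


# Hypothetical three-node cluster.
nodes = [
    Node("node-1", "192.168.0.11", "1.3.6", ["s1", "s2"]),
    Node("node-2", "192.168.0.12", "1.3.6", ["s3", "s4"]),
    Node("node-3", "192.168.0.13", "1.3.6", ["s5", "s6"]),
]
upgraded, steps = rolling_upgrade(nodes, "2.17.1")
for step in steps:
    print(step)
```

The sketch does not rebalance shards after a node rejoins, so the last node ends up empty. It is meant only to show why every shard stays on some live node throughout and why the total duration grows with the amount of data to migrate.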

Process

  1. Perform the pre-upgrade check. See Pre-Upgrade Check.

    The pre-upgrade check is mostly automated. A few items need to be checked manually.

  2. Create a snapshot to back up the full index data. See Manually Creating a Snapshot.

    During upgrade configuration, you can have the system check whether the full index data has been backed up using snapshots. This helps prevent data loss if the upgrade fails.

  3. Create an upgrade task and start the upgrade. See Creating an Upgrade Task.

Constraints

  • A maximum of 20 clusters can be upgraded at the same time. You are advised to perform the upgrade during off-peak hours.
  • Clusters that have ongoing tasks cannot be upgraded.
  • Once started, an upgrade task cannot be stopped until it succeeds or fails.
  • During the upgrade, nodes are replaced one by one. Requests sent to a node that is being replaced may fail. To reduce this risk, you are advised to access the cluster through the VPC Endpoint service or a dedicated load balancer.
  • During the upgrade, OpenSearch Dashboards and Cerebro are rebuilt and become inaccessible. Because different OpenSearch Dashboards versions are incompatible with each other, access to OpenSearch Dashboards may also fail while nodes of different versions coexist. OpenSearch Dashboards becomes accessible again once the cluster is successfully upgraded.
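
Because a request can fail while the node that receives it is being replaced, a client may also want to retry transient connection errors. The following is a minimal sketch of retry with exponential backoff. It is not a CSS feature: the function name, the retry count, and the delays are assumptions, and request stands for whatever callable sends your query.

```python
import time


def call_with_retry(request, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call request() until it succeeds, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return request()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(base_delay * (2 ** attempt))


# Hypothetical request that fails twice, as if a node were being replaced.
calls = {"count": 0}


def flaky_search():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("node is being replaced")
    return {"hits": 0}


delays = []
print(call_with_retry(flaky_search, sleep=delays.append))
print(delays)
```

Passing sleep as a parameter keeps the example testable without real waiting. Retrying is safe only for requests that can be repeated without side effects.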

Pre-Upgrade Check

To ensure a successful upgrade, you must check the items listed in the following table before performing an upgrade.

Table 1 Pre-upgrade checklist

Check item: Cluster status
Check method: System check
Description: After an upgrade task is started, the system automatically checks the cluster status. Clusters whose status is green or yellow can work properly and have no unallocated primary shards.
Normal status: The cluster status is Available.

Check item: Node quantity
Check method: System check
Description: During a cluster upgrade, the system automatically checks the number of nodes. To ensure service continuity, the total number of data nodes and cold data nodes in a cluster must be greater than or equal to 3.
Normal status: The total number of data nodes and cold data nodes in a cluster is greater than or equal to 3.

Check item: Disk capacity
Check method: System check
Description: After an upgrade task is started, the system automatically checks the disk capacity. During the upgrade, nodes are brought offline one by one and then new nodes are created. Ensure that the disks of the remaining nodes can hold all data of the node that has been brought offline.
Normal status: After a node is brought offline, the remaining nodes can hold all data of the cluster.

Check item: Data replicas
Check method: System check
Description: Check whether the maximum number of primary and replica shards of indexes in a cluster can be allocated to the remaining data nodes and cold data nodes. This prevents replica allocation failures after a node is brought offline during the upgrade.
Normal status: The maximum number of primary and replica shards of an index plus 1 is less than or equal to the total number of data nodes and cold data nodes before the upgrade.

Check item: Data backup
Check method: System check
Description: Before the upgrade, back up data to prevent data loss caused by upgrade faults. When submitting an upgrade task, you can decide whether to have the system check that all indexes have been backed up.
Normal status: Data has been backed up.

Check item: Resources
Check method: System check
Description: After an upgrade task is started, the system automatically checks resources. Resources will be created during the upgrade. Ensure that resources are available.
Normal status: Resources are available and sufficient.

Check item: Custom plugins
Check method: System and manual check
Description: Perform this check only when custom plugins are installed in the source cluster. If a cluster has custom plugins, upload all plugin packages of the target version on the plugin management page before the upgrade. The uploaded packages are installed on the new nodes during the upgrade. Otherwise, the custom plugins will be lost after the cluster is successfully upgraded. After an upgrade task is started, the system automatically checks whether the custom plugin packages have been uploaded, but you need to check whether the uploaded packages are correct.

NOTE:

If an uploaded plugin package is incorrect or incompatible, it cannot be automatically installed during the upgrade, and the upgrade task fails. To restore the cluster, terminate the upgrade task and restore the node that failed to be upgraded by following Replacing Specified Nodes for an OpenSearch Cluster.

After the upgrade is complete, the status of the custom plugin is reset to Uploaded.

Normal status: The plugin packages of the cluster to be upgraded have been uploaded to the plugin list.

Check item: Custom configurations
Check method: System check
Description: During the upgrade, the system automatically synchronizes the content of the cluster configuration file opensearch.yml.
Normal status: The cluster's custom configurations are not lost after the upgrade.

Check item: Non-standard operations
Check method: Manual check
Description: Check whether non-standard operations have been performed in the cluster. Non-standard operations are manual operations that are not recorded and therefore cannot be carried over automatically during the upgrade, for example, modifications to the opensearch_dashboards.yml configuration file, system settings, and return routes.
Normal status: Some non-standard operations are compatible. For example, modifications to a security plugin can be retained through metadata, and modifications to system configurations can be retained using images. Other non-standard operations, such as modifications to the opensearch_dashboards.yml file, cannot be retained, and you must back up the file in advance.

Check item: Compatibility
Check method: System and manual check
Description: After a cross-version upgrade task is started, the system automatically checks whether the source and target versions have incompatible configurations. If a custom plugin is installed in a cluster, you need to manually check the version compatibility of the custom plugin.
Normal status: Configurations before and after the cross-version upgrade are compatible.

Check item: Cluster load
Check method: System and manual check
Description: If the cluster is heavily loaded, there is a high probability that the upgrade will get stuck or fail. You are advised to check the cluster load before the upgrade and perform the upgrade only during off-peak hours. You can also have the system check the cluster load while configuring upgrade information.
Normal status:

  • nodes.thread_pool.search.queue < 1000: The maximum number of tasks in the search queue is less than 1000.
  • nodes.thread_pool.write.queue < 200: The maximum number of tasks in the write queue is less than 200.
  • nodes.process.cpu.percent < 90: The maximum CPU usage is less than 90%.
  • nodes.os.cpu.load_average / Number of CPU cores < 80%: The ratio of the maximum load average to the number of CPU cores is less than 80%.
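
Several items in Table 1 are simple numeric conditions, so they can be rehearsed before an upgrade task is submitted. The sketch below is a hedged illustration rather than the system's actual check. The function, its parameters, and the dictionary keys are hypothetical, the shard rule is read as "largest number of copies (primary plus replicas) of any shard, plus 1", and the disk rule assumes the worst case in which the node with the largest disk is the one offline.

```python
def pre_upgrade_checks(node_count, max_shard_copies, disk_capacity_gb, disk_used_gb, load):
    """Return a dict mapping each check name to True (passes) or False (fails).

    node_count:        total number of data nodes and cold data nodes
    max_shard_copies:  largest number of copies (primary + replicas) of any shard
    disk_capacity_gb:  per-node disk capacity
    disk_used_gb:      per-node disk usage
    load:              maxima across nodes for the four load metrics
    """
    # Worst case: the node with the largest disk is the one brought offline.
    remaining_capacity = sum(disk_capacity_gb) - max(disk_capacity_gb)
    return {
        "node_quantity": node_count >= 3,
        "shard_copies": max_shard_copies + 1 <= node_count,
        "disk_capacity": sum(disk_used_gb) <= remaining_capacity,
        "search_queue": load["search_queue"] < 1000,
        "write_queue": load["write_queue"] < 200,
        "cpu_percent": load["cpu_percent"] < 90,
        "load_per_core": load["load_average"] / load["cpu_cores"] < 0.8,
    }


# Hypothetical cluster: three nodes, one replica per index, a busy write queue.
report = pre_upgrade_checks(
    node_count=3,
    max_shard_copies=2,
    disk_capacity_gb=[100, 100, 100],
    disk_used_gb=[40, 50, 30],
    load={
        "search_queue": 12,
        "write_queue": 250,
        "cpu_percent": 35,
        "load_average": 2.4,
        "cpu_cores": 8,
    },
)
failed = [name for name, passed in report.items() if not passed]
print(failed)
```

Here only the write queue exceeds its threshold, so the sensible next step is to wait for an off-peak period or reduce write traffic. The disk check ignores any headroom the cluster reserves, so treat a narrow pass as a warning.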

Creating an Upgrade Task

  1. Log in to the CSS management console.
  2. In the navigation pane on the left, choose Clusters. On the cluster list page that is displayed, click the name of a cluster.
  3. On the displayed basic cluster information page, click Version Upgrade.
  4. On the displayed page, set upgrade parameters.
    Table 2 Upgrade parameters

    Parameter: Upgrade Type
    Description:

    • Same-version upgrade: upgrade kernel patches to the latest image within the current cluster version.
    • Cross-version upgrade: upgrade a cluster to the latest image of the target version.

    Parameter: Target Image
    Description: Image of the target version. After you select an image, the image name and target version details are displayed below.

    The supported target versions are displayed in the drop-down list of Target Image. If no target image is available, possible causes are as follows:

    • The current cluster is already of the latest version.
    • The current cluster was created before 2023 and has vector indexes.
    • The new version images have not been added in the current region.
    • The current cluster does not support the upgrade type you have selected.

    Parameter: Agency
    Description: When a node is deleted, its NICs are released, which requires VPC permissions. Select an IAM agency to grant the current account the permissions to access and use VPC.

    • If you are configuring an agency for the first time, click Automatically Create IAM Agency to create css-upgrade-agency.
    • If an IAM agency was automatically created earlier, you can click One-click authorization. The permissions associated with the VPC Administrator role or the VPC FullAccess system policy are then deleted automatically and replaced with the following custom policy actions for finer-grained permission control:
      "vpc:subnets:get",
      "vpc:ports:*"
    • To use Automatically Create IAM Agency and One-click authorization, the following minimum permissions are required:
      "iam:agencies:listAgencies",
      "iam:roles:listRoles",
      "iam:agencies:getAgency",
      "iam:agencies:createAgency",
      "iam:permissions:listRolesForAgency",
      "iam:permissions:grantRoleToAgency",
      "iam:permissions:listRolesForAgencyOnProject",
      "iam:permissions:revokeRoleFromAgency",
      "iam:roles:createRole"
    • To use an IAM agency, the following minimum permissions are required:
      "iam:agencies:listAgencies",
      "iam:agencies:getAgency",
      "iam:permissions:listRolesForAgencyOnProject",
      "iam:permissions:listRolesForAgency"
  5. After setting the parameters, click Submit. Decide whether to enable Check full index snapshot and Perform cluster load detection, and then click OK.

    If a cluster is overloaded, the upgrade task may get stuck or fail. Enabling Cluster load detection helps avoid such failures.

    If any of the following checks fails during the detection, wait for the load to drop or reduce the load. If you urgently need to upgrade the version and you understand the risk of an upgrade failure, you can disable the Cluster load detection function. The cluster load check items are as follows:

    • nodes.thread_pool.search.queue < 1000: The maximum number of tasks in the search queue is less than 1000.
    • nodes.thread_pool.write.queue < 200: The maximum number of tasks in the write queue is less than 200.
    • nodes.process.cpu.percent < 90: The maximum CPU usage is less than 90%.
    • nodes.os.cpu.load_average / Number of CPU cores < 80%: The ratio of the maximum load average to the number of CPU cores is less than 80%.
  6. View the upgrade task in the task list. If the task status is Running, you can expand the task list and click View Progress to view the upgrade progress.

    If the task status is Failed, you can retry or terminate the task.

    • Retry the task: Click Retry in the Operation column.
    • Terminate the task: Click Terminate in the Operation column.
      NOTICE:
      • Same-version upgrade: You can terminate the upgrade task whenever its status is Failed.
      • Cross-version upgrade: You can terminate the upgrade task only when its status is Failed and no node has been upgraded.

      After an upgrade task is terminated, the Task Status of the cluster is rolled back to what it was before the upgrade, and other tasks in the cluster are not affected.
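
The retry and terminate rules in the last step can be summarized as a small decision function. This is a sketch of the rules as stated above, not console or API behavior; the function name and the string values are assumptions chosen for the example.

```python
def allowed_actions(upgrade_type, task_status, upgraded_nodes):
    """Return the operations available for an upgrade task, per the rules above."""
    if upgrade_type not in ("same-version", "cross-version"):
        raise ValueError(f"unknown upgrade type: {upgrade_type!r}")
    if task_status != "Failed":
        # Once started, a task cannot be stopped until it succeeds or fails.
        return []
    actions = ["Retry"]
    # A failed same-version task can always be terminated; a failed
    # cross-version task only if no node has been upgraded yet.
    if upgrade_type == "same-version" or upgraded_nodes == 0:
        actions.append("Terminate")
    return actions


print(allowed_actions("same-version", "Failed", upgraded_nodes=2))
print(allowed_actions("cross-version", "Failed", upgraded_nodes=0))
print(allowed_actions("cross-version", "Failed", upgraded_nodes=1))
print(allowed_actions("cross-version", "Running", upgraded_nodes=1))
```

The third case is the one to plan for: after a cross-version upgrade has replaced at least one node, a failed task can only be retried, which is why the snapshot backup before the upgrade matters.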
