Updated on 2024-09-14 GMT+08:00

Switching Between Hot and Cold Storage for an OpenSearch Cluster

You can configure cold data nodes for Elasticsearch clusters in CSS and switch between cold and hot storage for indexes.

Scenario

You can keep hot data on high-performance servers so that queries return within seconds. Historical data that can tolerate query response times of minutes can be kept as cold data on large-capacity, lower-spec servers. This cuts storage costs while keeping searches on hot data efficient.

Figure 1 How cold/hot storage switchover works

When a cluster is created with cold data nodes, the cold data nodes are tagged cold for cold storage, and the normal data nodes are tagged hot for hot storage. You can then configure specified indexes so that their data is stored on cold data nodes. Normal data nodes are tagged hot only if the cluster has cold data nodes.
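To see how the nodes of a running cluster are tagged, you can list node attributes from Kibana Dev Tools. The request below is a minimal sketch that uses the standard _cat/nodeattrs API. It assumes that the cold and hot tags are exposed as a node attribute named box_type, which is the attribute name used by the allocation settings in the procedure later in this section.

```
GET _cat/nodeattrs?v&h=node,attr,value
```

Under that assumption, cold data nodes should list box_type with the value cold and normal data nodes should list it with the value hot. If no box_type attribute appears, the cluster most likely has no cold data nodes.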

Cold data nodes can be enabled only when a cluster is created. You cannot enable cold data nodes for an existing cluster. If your cluster does not have cold data nodes and yet you want to be able to switch between cold and hot data storage, you can use the service's storage-compute decoupling feature. For details, see Configuring Storage-Compute Decoupling for an Elasticsearch Cluster.

You can scale cold data nodes by adding or reducing nodes or their storage capacity. For details, see Scaling an Elasticsearch Cluster.

Constraints

Only clusters that have cold data nodes support switchover between cold and hot data storage. Cold data nodes can be enabled only when a cluster is created. You cannot enable cold data nodes for an existing cluster.

Procedure

  1. Log in to the CSS management console.
  2. Check whether cold data nodes are enabled in a cluster.
    On the Clusters page, locate the cluster for which you want to switch between cold and hot storage, and click the cluster name to go to the cluster information page. In the Node area, check whether there is information about cold data nodes.
    Figure 2 Cold data node information
    • If there is information about cold data nodes, the cluster has cold data nodes. Go to the next step.
    • Otherwise, the cluster does not have cold data nodes and you cannot switch between cold and hot data storage. If you want to define historical data as cold data to cut storage costs while ensuring search efficiency, you can use storage-compute decoupling instead. For details, see Configuring Storage-Compute Decoupling for an Elasticsearch Cluster.
  3. Click Access Kibana in the Operation column to log in to the Kibana console.
  4. Click Dev Tools in the navigation tree on the left.
  5. In the Dev Tools console, configure an index template that stores index data on cold or hot data nodes.

    For example, run the following command to configure a template that stores indexes whose names start with myindex on cold data nodes:

    • For an Elasticsearch cluster whose version is earlier than 6.x:
      PUT _template/test
      {
          "order": 1,
          "template": "myindex*",
          "settings": {
              "index": {
                  "refresh_interval": "30s",
                  "number_of_shards": "3",
                  "number_of_replicas": "1",
                  "routing.allocation.require.box_type": "cold"
              }
          }
      }
    • For an Elasticsearch cluster whose version is 6.x or later:
      PUT _template/test
      {
        "order": 1,
        "index_patterns": ["myindex*"],
        "settings": {
          "refresh_interval": "30s",
          "number_of_shards": "3",
          "number_of_replicas": "1",
          "routing.allocation.require.box_type": "cold"
        }
      }

    Alternatively, you can specify cold or hot storage directly for an existing index.

    For example, run the following command to store index myindex on cold data nodes:
    PUT myindex/_settings
    {
      "index.routing.allocation.require.box_type": "cold"
    }

    myindex indicates the index name. You can change cold to hot if you need hot storage.

  6. If necessary, run the following command to cancel the cold or hot storage configuration. After that, index data is no longer tied to a node tag and will be randomly and evenly distributed across both cold and hot data nodes.
    PUT myindex/_settings
    {
      "index.routing.allocation.require.box_type": null
    }

    myindex indicates the index name.
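
To confirm that a configuration took effect, you can inspect the index from Dev Tools. The following requests are a sketch that uses standard Elasticsearch APIs, with the index name myindex and the template name test taken from the examples in this procedure. The selected columns are an illustrative choice, not a requirement.

```
# Show which node holds each shard of myindex
GET _cat/shards/myindex?v&h=index,shard,prirep,state,node

# Show the allocation settings currently applied to myindex
GET myindex/_settings/index.routing.allocation.*

# Show the stored index template
GET _template/test
```

After box_type is set to cold, the node column should list only cold data nodes once shard relocation has finished. After the setting is reset to null, the box_type entry should no longer appear in the settings response.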