Elasticsearch Cluster Overload Due to Excessive Logstash Resource Consumption During Data Migration
Symptom
During Elasticsearch data migration, Logstash pipelines can consume excessive CPU and memory resources during the ingestion and output phases. This contention for resources, often due to suboptimal Logstash configuration or aggressive migration throughput targets, can degrade performance or cause outages in the source or destination Elasticsearch clusters. Typical symptoms include:
- Source cluster: increased query response times, growing query queues, delayed indexing operations, overloaded nodes, and a red cluster status.
- Destination cluster: rejected write requests (bulk rejections), a red cluster status, and an interrupted data migration.
Possible Causes
- High Logstash resource consumption: During data migration, Logstash runs its input, filter, and output stages concurrently in a single pipeline. This can drive excessive CPU and memory usage in the source and destination clusters if two key Logstash parameters are misconfigured: pipeline.batch.size (the number of events processed per batch) and pipeline.workers (the number of parallel worker threads).
- Elasticsearch cluster overload: The data migration overloads the clusters due to unmanaged resource contention: the source cluster is overwhelmed by high-concurrency read requests (search/scans) from the Logstash pipeline, while the destination cluster is overwhelmed by a write throughput that exceeds its indexing capacity.
- Inappropriate migration policy: The migration uses a high-throughput policy that does not adapt to real-time cluster load. Instead of migrating data in controlled, sequential batches to smooth resource consumption, it pushes data too fast and in too large volumes, overwhelming the processing capacity of both clusters.
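As a rough rule of thumb, the load the pipeline places on both clusters scales with the number of in-flight events, approximately pipeline.workers × pipeline.batch.size. The sketch below shows the relevant logstash.yml settings; the parameter names are standard Logstash settings, but the worker count of 2 is an illustrative assumption, not a recommendation from this article:

```yaml
# logstash.yml -- throughput settings that determine cluster load.
# In-flight events ≈ pipeline.workers × pipeline.batch.size.
pipeline.workers: 2        # illustrative value; default is the number of vCPUs
pipeline.batch.size: 50    # events per batch; default is 125
pipeline.batch.delay: 500  # ms to wait for a batch to fill; default is 50
```

Lowering workers and batch size reduces concurrent scroll reads on the source and bulk-write pressure on the destination, at the cost of migration speed.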
Solutions
- Stop Logstash processes to release resources in the source and destination clusters.
  - Go to the Configuration Center page.
    - Log in to the CSS management console.
    - In the navigation pane on the left, choose Clusters > Logstash.
    - In the cluster list, click the name of the target cluster. The cluster information page is displayed.
    - Click the Configuration Center tab.
  - Select the target pipeline and click Hot Stop above the pipeline list.
  - In the displayed dialog box, click OK.
    If the hot stop is successful, the task is removed from the pipeline list and data migration stops.
- Restore the source and destination clusters.
  If the node load is too high, restart the heavily loaded nodes or wait for the clusters to recover on their own. Wait 5 to 10 minutes before checking the cluster status again.
- Adjust the Logstash configuration to reduce resource contention.
  - Select the target Logstash cluster and go to its Configuration Center page.
  - In the configuration file list, locate the target configuration file and click Edit in the Operation column.
    Modify the configuration file content to narrow the migration task from multiple indexes to a single index or indexes of the same type, and migrate data in batches based on workload priority.
  - Click Next to modify the runtime parameters.
    - Reduce pipeline.workers to lower request concurrency. The default value is the number of vCPUs; set it based on available resources.
    - Reduce pipeline.batch.size to shrink the amount of data processed per batch. The default value is 125; a value of 50 is recommended to smooth the load.
    - Increase pipeline.batch.delay to wait longer for each batch to fill, reducing sudden resource spikes. The default value is 50 ms; you can set it to 500 ms.
  - Click Save.
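The narrowed migration task edited above can be sketched as a Logstash pipeline configuration. The elasticsearch input and output plugins and their options shown here are standard Logstash plugins, but the hostnames and index name are placeholders (assumptions), and the @metadata field paths can differ between Logstash versions:

```conf
# Hypothetical pipeline: migrate one index instead of all indexes at once.
input {
  elasticsearch {
    hosts   => ["http://source-es:9200"]   # placeholder source address
    index   => "orders-2024"               # a single index or same-type pattern
    docinfo => true                        # keep _index/_id in @metadata
    size    => 50                          # scroll page size, kept small
  }
}
output {
  elasticsearch {
    hosts       => ["http://dest-es:9200"]      # placeholder destination address
    index       => "%{[@metadata][_index]}"     # field path may vary by version
    document_id => "%{[@metadata][_id]}"        # preserve original document IDs
  }
}
```

Running one such file per index, in priority order, migrates data in controlled batches instead of all at once.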
- Restart Logstash processes.
  - In the configuration file list, select the target configuration file and click Start Logstash.
  - In the Start Logstash dialog box, select Keepalive if necessary.
  - Click OK to start the configuration file.
    You can check the started configuration file in the pipeline list.
- Check whether the data migration succeeded by comparing the document counts of the source and destination clusters.
  - If the counts match, the migration succeeded and this issue is fixed.
  - Otherwise, contact technical support.