CSS Enhancements Over Open-Source Elasticsearch
In enterprise-grade search and analytics scenarios, open-source Elasticsearch, despite its powerful features, often struggles with demands such as searching tens of millions of high-dimensional vectors, managing petabyte-scale data storage, handling out-of-memory (OOM) exceptions caused by large queries, and staying stable during traffic peaks. To overcome these challenges, CSS offers a range of enhancements over open-source Elasticsearch, such as decoupled storage and compute, large query isolation, and enhanced ingestion performance. These enhancements improve not only performance but also stability in production environments. With simple configurations, your cluster can support high concurrency and cost-effective storage of massive datasets, allowing you to focus on your core applications.
| Category | Feature | Description | Supported Version | Details |
|---|---|---|---|---|
| Cost optimization | Storage-compute decoupling | With decoupled storage and compute, frequently accessed hot data is stored on high-performance storage media, while infrequently accessed cold data is migrated to low-cost storage, namely Object Storage Service (OBS). This ensures real-time query performance for hot data while reducing long-term storage costs. | 7.6.2, 7.10.2 | |
| Cost optimization | Switching between hot and cold storage tiers | Data is allocated to nodes of different performance standards based on data temperature (that is, how often data is accessed). The goal is to achieve optimal storage costs and query performance. | All versions that support cold data nodes | |
| Enhanced stability | Read/write splitting | This feature relies on collaboration between the leader and follower clusters. Writes are directed to the leader and queries to the follower, ensuring optimal write performance while supporting high-concurrency, scalable queries. This eliminates resource contention between reads and writes and reduces peak loads. | 7.6.2, 7.10.2 | Configuring Read/Write Splitting Between Leader and Follower |
| Enhanced stability | Flow control | This feature protects clusters from overload through flow control policies, such as client request throttling, shard indexing backpressure, and traffic pattern analysis, ensuring proper resource allocation and cluster stability. NOTE: Elasticsearch 7.6.2 and 7.10.2 clusters created after February 2023 support Flow Control 2.0 only, while clusters created before that support Flow Control 1.0 only. | 7.6.2, 7.10.2 | |
| Enhanced stability | Query traffic isolation | You can configure specific nodes as low-priority nodes or isolated nodes and define an index whitelist for exceptions. For shard query requests to indexes that are not on this whitelist, the system preferentially or completely bypasses these nodes, ensuring that faulty nodes do not affect services. | 7.10.2 | |
| Enhanced stability | Large query isolation | Large query isolation can be configured to manage queries that have high memory usage or take too long to complete. This helps improve the stability of Elasticsearch clusters and prevent out-of-memory (OOM) exceptions. | 7.6.2, 7.10.2 | |
| Enhanced performance | Enhanced data ingestion performance | CSS provides several ways to enhance ingestion performance: bulk routing, bulk aggregation, text indexing acceleration, and merge optimization. They can help reduce performance overhead and significantly improve write stability and throughput in heavy-load scenarios. | 7.6.2, 7.10.2 | |
| Enhanced performance | Enhanced aggregation performance | CSS enhances aggregation performance in the face of large data volumes by leveraging vectorization and optimized clustering techniques, enabling faster analytics and decision-making in complex situations. | 7.10.2 | |
| Enhanced operations | Index recycle bin | To prevent data loss caused by accidental deletion, CSS provides an index recycle bin. When it is enabled, deleted indexes are temporarily stored in the recycle bin, allowing for recovery before they are permanently removed. This feature improves data reliability and operational security. | 7.10.2 | |
| Enhanced operations | Query resource tracker | O&M personnel can call an API to obtain top queries with the highest latency, CPU usage, or memory consumption, filter the queries by time range, and quickly identify problematic queries. This can significantly improve troubleshooting efficiency and accuracy. | 7.10.2 | |
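
To make the hot/cold tier switching entry in the table more concrete, the following is a minimal sketch of how an index might be pinned to a storage tier using the shard allocation filter setting from open-source Elasticsearch (`index.routing.allocation.require.<attribute>`). The node attribute name `box_type`, the tier values `hot` and `cold`, and the index name are assumptions made for illustration; the helper functions are hypothetical and only build the request, they do not send it. Check the feature's own guide for the names your cluster actually uses.

```python
import json
from typing import Dict, Tuple

# Assumption: nodes carry a custom attribute named "box_type" whose value
# is "hot" or "cold". The attribute name is illustrative, not confirmed.
TIER_ATTRIBUTE = "box_type"
VALID_TIERS = ("hot", "cold")


def tier_settings(tier: str) -> Dict[str, str]:
    """Build an index-settings body that requires shards to sit on one tier."""
    if tier not in VALID_TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    # index.routing.allocation.require.<attribute> is the open-source
    # Elasticsearch index-level shard allocation filter.
    return {f"index.routing.allocation.require.{TIER_ATTRIBUTE}": tier}


def tier_request(index: str, tier: str) -> Tuple[str, str, str]:
    """Return (method, path, body) for the settings update; nothing is sent."""
    return ("PUT", f"/{index}/_settings", json.dumps(tier_settings(tier)))


if __name__ == "__main__":
    # Hypothetical index name: move an aged log index to cold data nodes.
    method, path, body = tier_request("logs-2024.01", "cold")
    print(method, path)
    print(body)
```

Keeping the request construction separate from any HTTP client makes the sketch easy to inspect and lets you send the resulting body with whichever client or console you already use.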