Updated on 2026-04-30 GMT+08:00

Optimizing Query and Write Performance

In large-scale vector search scenarios, clusters may hit performance bottlenecks as vector dimensionality and data volume grow, such as low write throughput and severe jitter in P99 query latency. Unlike traditional keyword-based search, vector search combines compute-intensive index building with memory-intensive retrieval. Standard Elasticsearch and OpenSearch configurations may struggle with the complex topology computations involved. To address these challenges, the CSS vector database supports full-stack cluster performance tuning that covers both writes and queries. It helps you balance system performance against cost while maintaining high recall.

Optimizing Write Performance

Vector data ingestion involves three major overheads: replica synchronization, index refresh, and segment merging. During real-time ingestion, frequent index refreshes generate a large number of small segments. This triggers frequent vector index builds and merges, which consume excessive CPU and I/O resources. You can optimize write performance by reducing these overheads.
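As an illustration of reducing the replica and refresh overheads, the following minimal sketch builds index settings payloads for a bulk import and for steady-state operation. It relies only on the standard Elasticsearch/OpenSearch index settings index.number_of_replicas and index.refresh_interval. The index name my_vectors, the helper function names, and the concrete values (0 replicas, refresh disabled, then 1 replica and a 30s refresh interval) are assumptions for illustration, not recommendations from this guide. Validate them against your own durability and freshness requirements.

```python
import json


def bulk_load_settings():
    """Settings to apply before a large import (assumed values).

    Zero replicas avoids replica synchronization during the import, and
    a refresh interval of -1 disables periodic refresh so that fewer
    small segments are created.
    """
    return {"index": {"number_of_replicas": 0, "refresh_interval": "-1"}}


def steady_state_settings(replicas=1, refresh_interval="30s"):
    """Settings to restore after the import (assumed values)."""
    return {
        "index": {
            "number_of_replicas": replicas,
            "refresh_interval": refresh_interval,
        }
    }


def settings_request(index_name, settings):
    """Return (method, path, body) for an update-index-settings call."""
    return ("PUT", f"/{index_name}/_settings", json.dumps(settings))


if __name__ == "__main__":
    # "my_vectors" is a hypothetical index name.
    for payload in (bulk_load_settings(), steady_state_settings()):
        print(*settings_request("my_vectors", payload))
```

The sketch only constructs the requests. Sending them with your preferred HTTP client, and restoring the replica count once the import finishes, is left to the caller.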

Optimizing Query Performance

Query performance is affected by three factors: the number of segments, the memory circuit breaker mechanism, and field recall. Too many segments reduce search efficiency. When off-heap memory is insufficient, vector index data is frequently swapped in and out of memory. Recalling all fields increases the load during the fetch phase. You can optimize query performance by addressing these factors.
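Two of these factors, segment count and field recall, can be sketched with standard Elasticsearch/OpenSearch features: the force merge API with max_num_segments, and _source filtering in the search body. The helper names, the index name my_vectors, and the placeholder match_all query below are hypothetical and used only for illustration. The query clause is passed through unchanged, so substitute your own vector query. The memory circuit breaker is not covered here because its settings are specific to the CSS vector plugin.

```python
import copy


def force_merge_request(index_name, max_num_segments=1):
    """Return (method, path, params) for a force merge call.

    Merging down to a small number of segments reduces the number of
    per-segment vector indexes that each query has to search.
    """
    return (
        "POST",
        f"/{index_name}/_forcemerge",
        {"max_num_segments": max_num_segments},
    )


def trim_fetch_phase(search_body, fields=None):
    """Return a copy of search_body that limits the fields recalled.

    With no fields given, _source is disabled entirely, so hits carry
    only metadata such as _id and _score. Otherwise only the listed
    fields are returned from _source.
    """
    body = copy.deepcopy(search_body)
    body["_source"] = list(fields) if fields else False
    return body


if __name__ == "__main__":
    # Placeholder query; replace it with your actual vector query clause.
    base = {"size": 10, "query": {"match_all": {}}}
    print(force_merge_request("my_vectors"))
    print(trim_fetch_phase(base))
    print(trim_fetch_phase(base, fields=["title"]))
```

Force merging is typically run during off-peak hours on indexes that are no longer being written heavily, because the merge itself consumes CPU and I/O resources.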