Creating a Vector Index
A vector index is created by defining the vector dimensions, indexing algorithm (such as HNSW or IVF), and similarity measurement method (for example, cosine distance or Euclidean distance) in the mapping, establishing an optimal structure for indexing feature-rich data. Vector search mitigates the curse of dimensionality in large-scale data processing. Leveraging techniques such as quantization-based compression and multi-layer graph navigation, it delivers high recall accuracy and low latency even on datasets with tens of millions or even billions of records.
Indexing Algorithms
Table 1 lists the indexing algorithms supported by CSS vector databases. Select an algorithm based on your service requirements.
| Algorithm | Description | When to Use | Cluster Version |
|---|---|---|---|
| FLAT | Full brute-force indexing. This method does not build a complex data structure; instead, it performs an exhaustive search by sequentially calculating the distance between the query vector and every vector in the database. It achieves a 100% recall rate (zero precision loss), but the computational load grows linearly with data volume. | Datasets with fewer than 10,000 records, or scenarios requiring maximum recall accuracy | Elasticsearch: 7.6.2 or 7.10.2 OpenSearch: 1.3.6 or 2.19.0 |
| GRAPH | Hierarchical Navigable Small World (HNSW), a graph-based indexing algorithm for efficient approximate nearest neighbor (ANN) search. A multi-layer graph navigation structure accelerates retrieval. It delivers extremely fast retrieval with high recall accuracy, but it is also highly memory-intensive (resident memory is required). | Datasets with up to 100 million records where millisecond-level latency and high precision are required | Elasticsearch: 7.6.2 or 7.10.2 OpenSearch: 1.3.6 or 2.19.0 |
| GRAPH_PQ | A combination of HNSW with product quantization (PQ). By splitting and encoding vectors, this approach significantly reduces storage and memory overheads. The compression rate can reach 1/16 or even higher. It offers a low memory footprint, though precision decreases as the compression ratio increases. | Datasets with up to a billion records | Elasticsearch: 7.6.2 or 7.10.2 OpenSearch: 1.3.6 or 2.19.0 |
| GRAPH_SQ8 | A combination of HNSW with 8-bit scalar quantization. 32-bit floating-point numbers are compressed into 8-bit integers. The compression rate is 1/4. The memory footprint is significantly reduced, with a small precision loss. | Datasets with up to a billion records | Elasticsearch: 7.10.2 OpenSearch: 2.19.0 |
| GRAPH_SQ4 | A combination of HNSW with 4-bit scalar quantization. 32-bit floating-point numbers are compressed into 4-bit integers. The compression rate is 1/8. The memory footprint is significantly reduced, but recall also decreases significantly. GRAPH_SQ4 is more computationally efficient than GRAPH_SQ8. | Datasets with up to a billion records | Elasticsearch: 7.10.2 OpenSearch: 2.19.0 |
| IVF_GRAPH | A combination of Inverted File Index (IVF) with HNSW. The vector space is partitioned into multiple clustering subspaces, each represented by a centroid. During retrieval, only relevant subspaces are scanned. This significantly accelerates retrieval speeds but leads to a slight precision loss. | High write throughput is required, and the operational complexity associated with centroid pre-building is acceptable. | Elasticsearch: 7.6.2 or 7.10.2 OpenSearch: 1.3.6 or 2.19.0 |
| IVF_GRAPH_PQ | A combination of IVF, HNSW, and PQ. System capacity is further increased through compression, and system overheads are reduced. | High write throughput is required, and the operational complexity associated with centroid pre-building is acceptable. | Elasticsearch: 7.6.2 or 7.10.2 OpenSearch: 1.3.6 or 2.19.0 |
If you select IVF_GRAPH or IVF_GRAPH_PQ, you need to perform (Optional) Pre-Building and Registering Centroid Vectors before creating a vector index.
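As a rough illustration of how the compression ratios above (1/4 for GRAPH_SQ8, 1/8 for GRAPH_SQ4, 1/16 or higher for GRAPH_PQ) affect raw vector storage, the following sketch counts only the vectors themselves, not graph edges or other index overhead; the dataset size and dimension are hypothetical:

```python
def raw_vector_bytes(num_vectors: int, dim: int) -> int:
    """Storage for uncompressed vectors: one 32-bit (4-byte) float per dimension."""
    return num_vectors * dim * 4

def quantized_bytes(num_vectors: int, dim: int, ratio: float) -> int:
    """Approximate vector storage after quantization at the given compression ratio."""
    return int(raw_vector_bytes(num_vectors, dim) * ratio)

# Hypothetical dataset: 10 million 768-dimensional float32 vectors
raw = raw_vector_bytes(10_000_000, 768)        # 30_720_000_000 bytes, about 30.7 GB
sq8 = quantized_bytes(10_000_000, 768, 1 / 4)  # GRAPH_SQ8: about 7.7 GB
sq4 = quantized_bytes(10_000_000, 768, 1 / 8)  # GRAPH_SQ4: about 3.8 GB
pq = quantized_bytes(10_000_000, 768, 1 / 16)  # GRAPH_PQ: about 1.9 GB
```

Estimates like this help decide whether a pure GRAPH index fits in memory or whether a quantized variant is needed.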
Logging in to Dev Tools
Log in to Dev Tools to run DSL commands.
- For an Elasticsearch cluster, log in to Kibana
- Log in to the CSS management console.
- In the navigation pane on the left, choose Clusters > Elasticsearch.
- In the cluster list, find the target cluster, and click Kibana in the Operation column to log in to the Kibana console.
- In the left navigation pane, choose Dev Tools.
The left part of the console is the command input box, and the triangle icon in its upper-right corner is the execution button. The right part shows the execution result.
- For an OpenSearch cluster, log in to Dashboards
- Log in to the CSS management console.
- In the navigation pane on the left, choose Clusters > OpenSearch.
- In the cluster list, find the target cluster, and click Dashboards in the Operation column to log in to OpenSearch Dashboards.
- In the left navigation pane, choose Dev Tools.
The left part of the console is the command input box, and the triangle icon in its upper-right corner is the execution button. The right part shows the execution result.
Creating a Vector Index
Define the index mappings and specify the indexing algorithm parameters.
For example, create an index named my_index. This index contains a vector field named my_vector and a text field named my_label. A graph-based index is created for the vector field, and Euclidean distance is used for similarity measurement.
PUT my_index
{
"settings": {
"index": {
"vector": true,
"number_of_shards": 1,
"number_of_replicas": 1
}
},
"mappings": {
"properties": {
"my_vector": {
"type": "vector",
"dimension": 2,
"indexing": true,
"algorithm": "GRAPH",
"metric": "euclidean"
},
"my_label": {
"type": "keyword"
}
}
}
}

| Parameter | Mandatory | Type | Default Value | Description |
|---|---|---|---|---|
| index.vector | Yes | Boolean | N/A | Whether to enable vector indexes. Set this parameter to true. Otherwise, vector indexes cannot be created. |
| index.number_of_shards | No | Integer | 1 | Number of index shards. This value should be divisible by the number of cluster nodes. Value range: 1–1024 |
| index.number_of_replicas | No | Integer | 1 | Number of index replicas. Replicas improve data availability. Value range: 0 to the number of nodes minus 1 |
| index.vector.exact_search_threshold | No | Integer | null (no switching) | Threshold for automatically switching from pre-filtering search to brute-force search. When the size of the intermediate result set in a segment is lower than this threshold, a brute-force search is performed. Value range: null (disables auto switchover from pre-filtering search to brute-force search) or a positive integer |
| index.vector.search.concurrency.enabled | No | Boolean | false | Whether to enable concurrent vector searches across segments. In an Elasticsearch cluster, each index shard consists of multiple segments. By default, segments are searched sequentially. Enabling concurrent searches across segments reduces query latency but does not increase the cluster's maximum query throughput, and it may increase the average CPU usage of cluster nodes. Constraints: This parameter is available only for Elasticsearch clusters whose image version is no earlier than 7.10.2_25.3.0_xxx. The value can be: true (search the segments within a shard concurrently) or false (default; search segments sequentially). |
| Parameter | Mandatory | Type | Default Value | Description |
|---|---|---|---|---|
| type | Yes | String | N/A | Data type of a field. Set this parameter to vector to indicate vector fields. |
| dimension | Yes | Integer | N/A | Number of vector dimensions. Value range: 1–4096 |
| indexing | No | Boolean | false | Whether to enable vector index acceleration. The value can be: true (build a vector index to accelerate queries) or false (default; vector index acceleration is disabled). |
| lazy_indexing | No | Boolean | false | Whether to enable lazy vector indexing. When enabled, the system prioritizes data ingestion speed by delaying the construction of vector indexes. Instead of building the index in real time, the system simply persists the raw data. After ingestion is complete, offline index building must be performed manually. This parameter balances write throughput against indexing overhead. Use it for large-scale offline data migrations where high ingestion speed is critical and real-time search is not required during the ingestion phase. Constraints: This parameter takes effect only when indexing is set to true in the mapping. The value can be: true (delay index construction until it is triggered manually) or false (default; build the index in real time). |
| algorithm | No | String | GRAPH | Vector indexing algorithm. Constraints: This parameter takes effect only when indexing is set to true in the mapping. Value range: FLAT, GRAPH, GRAPH_PQ, GRAPH_SQ8, GRAPH_SQ4, IVF_GRAPH, and IVF_GRAPH_PQ. For how to select an algorithm and the cluster version required to run it, see Indexing Algorithms. |
| dim_type | No | String | float | Vector data type. The value can be: float (default; single-precision floating-point values) or binary (binary vector data). |
| metric | No | String | euclidean | Vector distance metric, which measures the similarity or distance between vectors. The value can be: euclidean (default; Euclidean, or L2, distance), inner_product (inner product distance), cosine (cosine distance), or hamming (Hamming distance, available only when dim_type is set to binary). |
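The choice of metric changes what "closest" means, so it should match how the embedding model was trained. The following plain-Python sketch restates the standard formulas for the common metrics (it is not CSS-specific code):

```python
import math

def euclidean(a, b):
    """L2 distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    """Dot product: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine similarity: dot product of the normalized vectors, in [-1, 1]."""
    return inner_product(a, b) / (math.hypot(*a) * math.hypot(*b))

a, b = [1.0, 2.0], [2.0, 4.0]
print(euclidean(a, b))          # about 2.236 (the vectors differ in magnitude)
print(cosine_similarity(a, b))  # 1.0 (the vectors point in the same direction)
```

Note that for vectors normalized to unit length, inner product and cosine rankings coincide.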
| Parameter | Mandatory | Type | Default Value | Description |
|---|---|---|---|---|
| neighbors | No | Integer | 64 | Maximum number of neighbors for each vector in the graph index. A larger value indicates denser connectivity of the graph and leads to higher retrieval precision (recall), but it also increases the index file size and slows down index building and query speeds. Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping. Value range: 20–255 |
| shrink | No | Float | 1 | The aggressiveness of redundant edge removal (pruning) during graph construction. This setting directly controls the final graph density. A smaller value indicates more aggressive pruning, a sparser graph, and faster retrieval speeds, but may increase precision loss. Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping. Value range: 0.1–10 |
| scaling | No | Integer | 50 | Scaling ratio for the number of nodes in the upper layers of the HNSW graph. This setting affects the layers and node distribution per layer in the HNSW graph. A proper scaling ratio ensures optimal cross-layer navigation efficiency during retrieval. Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping. Value range: 0–128 |
| efc | No | Integer | 200 | Number of nearest neighbors to explore when inserting a new vector into the HNSW graph. This parameter controls the search depth during index building. A larger value results in a higher-quality graph structure and better query accuracy but slower index building. Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping. Value range: 0–100000. You are advised to increase the value for large-scale datasets. |
| max_scan_num | No | Integer | 10000 | Maximum number of nodes to scan during a single query. This parameter limits the search depth. A larger value results in higher query accuracy but increased latency. Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping. Value range: 0–1000000 |
| Parameter | Mandatory | Type | Default Value | Description |
|---|---|---|---|---|
| centroid_num | No | Integer | 255 (corresponding to 8-bit quantization) | Number of centroids within each subspace for the PQ algorithm. This parameter determines the encoding precision of the vectors after quantization. A larger value indicates more precise representation of the original vectors and higher recall, but also a slight increase in computational overhead and memory footprint. Constraints: This parameter takes effect only when indexing is set to true and algorithm is set to GRAPH_PQ in the mapping. Value range: 0–65535 |
| fragment_num | No | Integer | 0 (the plug-in automatically sets the number of fragments based on the vector dimensions) | Number of fragments (M) each high-dimensional vector is split into. It affects the quantization granularity. A larger value results in compressed vectors that more closely approximate the original vectors and hence higher retrieval precision, but it also leads to higher storage consumption. Constraints: This parameter takes effect only when indexing is set to true and algorithm is set to GRAPH_PQ in the mapping. Value range: 0–4096. When this parameter is set to 0, the system automatically calculates the number of fragments from the vector dimensions (dim): dim/4 if dim <= 256; dim/8 if 256 < dim <= 512; otherwise 64. |
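The automatic fragment_num selection rule in the table above can be restated directly in code (this mirrors the rule as documented; it is not a CSS API call):

```python
def auto_fragment_num(dim: int) -> int:
    """Number of PQ fragments the plug-in chooses when fragment_num is 0."""
    if dim <= 256:
        return dim // 4
    elif dim <= 512:
        return dim // 8
    else:
        return 64

print(auto_fragment_num(128))  # 32
print(auto_fragment_num(384))  # 48
print(auto_fragment_num(768))  # 64
```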
(Optional) Pre-Building and Registering Centroid Vectors
When the indexing algorithm is set to IVF_GRAPH or IVF_GRAPH_PQ, you must first pre-build and register the centroid vectors before creating a vector index.
IVF_GRAPH and IVF_GRAPH_PQ help accelerate indexing and queries in ultra-large-scale datasets with more than 1 billion records. They narrow down the query scope by dividing the vector space into subspaces through clustering or random sampling. Before pre-building, obtain all centroid vectors through clustering or random sampling. The centroid vectors are pre-built into a GRAPH or GRAPH_PQ index and then registered with the CSS vector database. Multiple nodes can share this index. Reusing the centroid index among shards effectively reduces training overhead and the number of centroid index queries, improving write and query performance.
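As a minimal sketch of obtaining centroid vectors by random sampling (clustering such as k-means usually yields better-balanced subspaces; the dataset, sample size, and function names here are illustrative, not part of the CSS API):

```python
import random

def sample_centroids(vectors, num_centroids, seed=42):
    """Pick centroid vectors by uniform random sampling of the dataset.

    For better subspace balance, replace this with k-means cluster centers.
    """
    rng = random.Random(seed)
    return rng.sample(vectors, num_centroids)

# Illustrative 2-dimensional dataset, matching the dimension of the
# my_dict centroid index example below
dataset = [[random.random(), random.random()] for _ in range(10_000)]
centroids = sample_centroids(dataset, 100)
# Each centroid is then written as a document to the centroid index
# (step 2 below), e.g. one {"my_vector": [...]} document per centroid.
```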
- Create a centroid index.
For example, run the following command to create a centroid index named my_dict:
PUT my_dict
{
  "settings": {
    "index": {
      "vector": true
    },
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "vector",
        "dimension": 2,
        "indexing": true,
        "algorithm": "GRAPH",
        "metric": "euclidean"
      }
    }
  }
}

For detailed parameter configuration, see Creating a Vector Index. Pay attention to the following mandatory parameters:
- index.number_of_shards: The number of index shards must be set to 1. Otherwise, the centroid index cannot be registered.
- indexing: This parameter must be set to true to enable vector index acceleration.
- algorithm: Set the indexing algorithm: GRAPH for the IVF_GRAPH algorithm, or GRAPH_PQ for the IVF_GRAPH_PQ algorithm.
- Write centroid vectors to the created index. Write the centroid vectors obtained through sampling or clustering into the newly created index my_dict.
- Call the registration API.
For example, run the following command to register the centroid index as a Dict object with a globally unique name (dict_name):
PUT _vector/register/my_dict
{
  "dict_name": "my_dict"
}

- Create an IVF_GRAPH or IVF_GRAPH_PQ vector index.
When creating the vector index, you do not need to specify dimension or metric. Rather, you specify the registered Dict object. Table 6 describes key parameters for specifying a Dict object.
For example, run the following command to create an IVF_GRAPH vector index:
PUT my_index
{
  "settings": {
    "index": {
      "vector": true,
      "sort.field": "my_vector.centroid"  # Set the centroid subfield of each vector field as a sorting field.
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "vector",
        "indexing": true,
        "algorithm": "IVF_GRAPH",
        "dict_name": "my_dict",
        "offload_ivf": true
      }
    }
  }
}

Table 6 Key parameters for specifying a Dict object

| Parameter | Mandatory | Type | Default Value | Description |
|---|---|---|---|---|
| dict_name | Yes | String | N/A | Name of the centroid Dict object, for example, my_dict. The vector dimensions and similarity measurement method of the index are the same as those of the Dict object, so there is no need to configure them again. |
| offload_ivf | Yes | Boolean | false | Whether to offload the IVF inverted index to the Elasticsearch/OpenSearch engine layer. The value can be: true (offload the IVF inverted index to Elasticsearch/OpenSearch for physical storage management, which significantly reduces the off-heap memory usage of the vector search engine and also reduces CPU and memory overhead when writing and merging data at high throughput) or false (default; keep the IVF inverted index entirely within the dedicated memory buffer of the vector search engine). When processing datasets of hundreds of millions of records or more, set this parameter to true to optimize the cluster's memory usage while maintaining retrieval performance. |