Creating a Vector Index

To create a vector index, define the vector dimension, the indexing algorithm (such as HNSW or IVF), and the similarity metric (for example, cosine distance or Euclidean distance) in the index mapping. This gives feature-rich data an index structure suited to similarity search. Vector search mitigates the curse of dimensionality in large-scale data processing. Using techniques such as quantization-based compression and multi-layer graph navigation, it delivers high recall and low latency even on datasets with tens of millions to billions of records.

Indexing Algorithms

Table 1 lists the indexing algorithms supported by CSS vector databases. Select an algorithm based on your service requirements.

**Table 1** Indexing algorithms for CSS vector databases
Algorithm	Description	When to Use	Cluster Version
FLAT	Brute-force search. No index structure is built; each query computes the distance between the query vector and every vector in the database, one by one. Recall is 100% (no precision loss), but the computational cost grows linearly with the data volume.	Datasets with fewer than 10,000 records, or scenarios that require maximum recall	Elasticsearch: 7.6.2 or 7.10.2; OpenSearch: 1.3.6 or 2.19.0
GRAPH	Hierarchical Navigable Small Worlds (HNSW), a graph-based indexing algorithm for efficient approximate nearest neighbor (ANN) search. A multi-layer graph navigation structure accelerates retrieval. It delivers very fast retrieval with high recall, but it is memory-intensive because the index must reside in memory.	Datasets with up to 100 million records that require millisecond-level latency and high precision	Elasticsearch: 7.6.2 or 7.10.2; OpenSearch: 1.3.6 or 2.19.0
GRAPH_PQ	A combination of HNSW with product quantization (PQ). Splitting and encoding vectors significantly reduces storage and memory overhead; vectors can be compressed to 1/16 of their original size or less. The memory footprint is low, but precision decreases as the compression ratio increases.	Datasets with up to a billion records	Elasticsearch: 7.6.2 or 7.10.2; OpenSearch: 1.3.6 or 2.19.0
GRAPH_SQ8	A combination of HNSW with 8-bit scalar quantization. 32-bit floating-point numbers are compressed into 8-bit integers, so vectors shrink to 1/4 of their original size. The memory footprint is significantly reduced, with a small precision loss.	Datasets with up to a billion records	Elasticsearch: 7.10.2; OpenSearch: 2.19.0
GRAPH_SQ4	A combination of HNSW with 4-bit scalar quantization. 32-bit floating-point numbers are compressed into 4-bit integers, so vectors shrink to 1/8 of their original size. The memory footprint is significantly reduced, but recall also decreases significantly. GRAPH_SQ4 is more computationally efficient than GRAPH_SQ8.	Datasets with up to a billion records	Elasticsearch: 7.10.2; OpenSearch: 2.19.0
IVF_GRAPH	A combination of Inverted File Index (IVF) with HNSW. The vector space is partitioned into multiple clustering subspaces, each represented by a centroid. During retrieval, only the relevant subspaces are scanned. This significantly accelerates retrieval but causes a slight precision loss.	Scenarios that require high write throughput and can accept the operational complexity of pre-building centroids	Elasticsearch: 7.6.2 or 7.10.2; OpenSearch: 1.3.6 or 2.19.0
IVF_GRAPH_PQ	A combination of IVF, HNSW, and PQ. Compression further increases system capacity and reduces system overhead.	Scenarios that require high write throughput and can accept the operational complexity of pre-building centroids	Elasticsearch: 7.6.2 or 7.10.2; OpenSearch: 1.3.6 or 2.19.0

If you select IVF_GRAPH or IVF_GRAPH_PQ, complete the steps in (Optional) Pre-Building and Registering Centroid Vectors before creating a vector index.
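The compression figures in Table 1 can be turned into a rough sizing estimate. The following is a minimal sketch, not a CSS tool: the function name and the lookup table are hypothetical, it assumes float vectors stored as 32-bit values, and it counts raw vector data only (graph links, metadata, and PQ codebooks are ignored).

```python
# Bytes needed per vector component under the encodings described in Table 1.
# Assumption: FLAT and GRAPH keep 32-bit floats (4 bytes); SQ8 stores 1 byte
# and SQ4 stores half a byte per component.
BYTES_PER_DIMENSION = {
    "FLAT": 4.0,
    "GRAPH": 4.0,
    "GRAPH_SQ8": 1.0,
    "GRAPH_SQ4": 0.5,
}


def estimate_vector_bytes(algorithm: str, dimension: int, num_vectors: int) -> float:
    """Estimate raw vector storage only; graph links and metadata are ignored."""
    return BYTES_PER_DIMENSION[algorithm] * dimension * num_vectors


if __name__ == "__main__":
    # Compare the encodings for 100 million 128-dimensional vectors.
    for name in BYTES_PER_DIMENSION:
        size = estimate_vector_bytes(name, dimension=128, num_vectors=100_000_000)
        print(f"{name:10s} ~{size / 2**30:.1f} GiB")
```

Treat the result as a lower bound when comparing algorithms; actual memory usage depends on graph parameters such as neighbors.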

Logging in to Dev Tools

Log in to Dev Tools to run DSL commands.

For an Elasticsearch cluster, log in to Kibana:
1. Log in to the CSS management console.
2. In the navigation pane on the left, choose Clusters > Elasticsearch.
3. In the cluster list, find the target cluster, and click Kibana in the Operation column to log in to the Kibana console.
4. In the left navigation pane, choose Dev Tools.
  The left part of the console is the command input box, and the triangle icon in its upper-right corner is the execution button. The right part shows the execution result.
For an OpenSearch cluster, log in to OpenSearch Dashboards:
1. Log in to the CSS management console.
2. In the navigation pane on the left, choose Clusters > OpenSearch.
3. In the cluster list, find the target cluster, and click Dashboards in the Operation column to log in to OpenSearch Dashboards.
4. In the left navigation pane, choose Dev Tools.
  The left part of the console is the command input box, and the triangle icon in its upper-right corner is the execution button. The right part shows the execution result.

Creating a Vector Index

Define the index mappings and specify the indexing algorithm parameters.

For example, create an index named my_index. This index contains a vector field named my_vector and a keyword field named my_label. A graph-based index (GRAPH) is created for the vector field, and Euclidean distance is used as the similarity metric.

PUT my_index 
{
  "settings": {
    "index": {
      "vector": true,
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "vector",
        "dimension": 2,
        "indexing": true,
        "algorithm": "GRAPH",
        "metric": "euclidean"
      },
      "my_label": {
        "type": "keyword"
      }
    }
  }
}
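If you generate index definitions from scripts rather than typing them into Dev Tools, the same request body can be assembled programmatically. The following is a minimal sketch; the helper name build_vector_index_body is hypothetical, and the field names my_vector and my_label are taken from the example above.

```python
import json


def build_vector_index_body(dimension, algorithm="GRAPH", metric="euclidean",
                            shards=1, replicas=1):
    """Return the create-index body for one vector field and one keyword field."""
    return {
        "settings": {
            "index": {
                "vector": True,  # required, otherwise vector indexes cannot be created
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        },
        "mappings": {
            "properties": {
                "my_vector": {
                    "type": "vector",
                    "dimension": dimension,
                    "indexing": True,
                    "algorithm": algorithm,
                    "metric": metric,
                },
                "my_label": {"type": "keyword"},
            }
        },
    }


# Print a request that can be pasted into Dev Tools.
print("PUT my_index")
print(json.dumps(build_vector_index_body(dimension=2), indent=2))
```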

**Table 2** settings parameters
Parameter	Mandatory	Type	Default Value	Description
index.vector	Yes	Boolean	N/A	Whether to enable vector indexes. Set this parameter to true. Otherwise, vector indexes cannot be created.
index.number_of_shards	No	Integer	1	Number of index shards. This value should be divisible by the number of cluster nodes. Value range: 1–1024
index.number_of_replicas	No	Integer	1	Number of index replicas. Replicas improve data availability. Value range: 0 to the number of nodes minus 1
index.vector.exact_search_threshold	No	Integer	null (no switching)	Threshold for automatically switching from pre-filtering search to brute-force search. When the size of the intermediate result set in a segment is lower than this threshold, a brute-force search is performed. Value range: null (disables auto switchover from pre-filtering search to brute-force search) or a positive integer
index.vector.search.concurrency.enabled	No	Boolean	false	Whether to search the segments of a shard concurrently during vector search. In an Elasticsearch cluster, each index shard consists of multiple segments, and by default the segments are searched one after another. Searching segments concurrently reduces query latency but does not increase the cluster's maximum query throughput, and it may increase the average CPU usage of cluster nodes. Constraints: This parameter is available only for Elasticsearch clusters whose image version is 7.10.2_25.3.0_xxx or later. Values: true (enables concurrent search) or false (performs serial search).
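The switching rule described for index.vector.exact_search_threshold can be pictured as follows. This is only an illustration of the documented behavior, not the plug-in's code; the function name and the returned labels are hypothetical.

```python
from typing import Optional


def choose_search_mode(filtered_count: int,
                       exact_search_threshold: Optional[int]) -> str:
    """Return the search mode a segment would use under the documented rule."""
    if exact_search_threshold is None:
        # null disables automatic switching to brute-force search
        return "pre-filtering"
    if filtered_count < exact_search_threshold:
        # the intermediate result set is below the threshold
        return "brute-force"
    return "pre-filtering"
```

For example, with a threshold of 1000, a segment whose filter leaves 200 candidate documents would be scanned by brute force, while a segment with 5000 candidates would keep using pre-filtering search.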

**Table 3** mappings parameters
Parameter	Mandatory	Type	Default Value	Description
type	Yes	String	N/A	Data type of a field. Set this parameter to vector to indicate vector fields.
dimension	Yes	Integer	N/A	Number of vector dimensions. Value range: 1–4096
indexing	No	Boolean	false	Whether to enable vector index acceleration. Values: true (an extra vector index is built using the algorithm specified by the algorithm field, and the index supports vector search) or false (vector data is written only to docvalues, and only ScriptScore and Rescore can be used for vector queries).
lazy_indexing	No	Boolean	false	Whether to enable lazy vector indexing. When enabled, the system prioritizes ingestion speed by deferring construction of the vector index: it persists only the raw data during ingestion, and you must trigger offline index building manually after ingestion completes. This trades real-time searchability for write throughput. Use it for large-scale offline data migrations where ingestion speed is critical and real-time search is not required during ingestion. Constraints: This parameter takes effect only when indexing is set to true in the mapping. For Elasticsearch clusters, the image version must be 7.10.2_24.3.3_xxx or later. For OpenSearch clusters, the version must be 2.19.0. Values: true (enables lazy indexing; write throughput improves, but data cannot be queried via VectorQuery until offline index building has been manually triggered and completed) or false (builds the index in real time, so data can be queried immediately after it is written).
algorithm	No	String	GRAPH	Vector indexing algorithm. Constraints: This parameter takes effect only when indexing is set to true in the mapping. If you select IVF_GRAPH or IVF_GRAPH_PQ, you must complete (Optional) Pre-Building and Registering Centroid Vectors before creating a vector index. Value range: FLAT, GRAPH, GRAPH_PQ, GRAPH_SQ8, GRAPH_SQ4, IVF_GRAPH, and IVF_GRAPH_PQ. For how to select an algorithm and the cluster version needed to run it, see Indexing Algorithms. When a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) is selected, you can tune the graph structure and construction quality using the parameters in Table 4. When GRAPH_PQ is selected, you can control precision using the parameters in Table 5.
dim_type	No	String	float	Vector data type. Values: binary (binary values) or float (floating-point numbers).
metric	No	String	euclidean	Vector distance metric, which measures the similarity or distance between vectors. Values: euclidean (Euclidean distance), inner_product (inner product distance), cosine (cosine distance), or hamming (Hamming distance, which can be used only when dim_type is set to binary).
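The constraints in Table 3 can be checked before a mapping is submitted. The following is a minimal sketch of such a check; check_vector_field is a hypothetical helper, it covers only the rules stated in Table 3, and it is meant for fields that declare their own dimension (not fields that reference a Dict object through dict_name).

```python
ALGORITHMS = {"FLAT", "GRAPH", "GRAPH_PQ", "GRAPH_SQ8", "GRAPH_SQ4",
              "IVF_GRAPH", "IVF_GRAPH_PQ"}
METRICS = {"euclidean", "inner_product", "cosine", "hamming"}
DIM_TYPES = {"binary", "float"}


def check_vector_field(field: dict) -> list:
    """Return a list of problems found in a vector field mapping (empty if none)."""
    problems = []
    if field.get("type") != "vector":
        problems.append("type must be 'vector'")
    dimension = field.get("dimension")
    valid_int = isinstance(dimension, int) and not isinstance(dimension, bool)
    if not valid_int or not 1 <= dimension <= 4096:
        problems.append("dimension must be an integer in the range 1-4096")
    # Defaults from Table 3 apply when a parameter is omitted.
    if field.get("algorithm", "GRAPH") not in ALGORITHMS:
        problems.append("unsupported algorithm")
    dim_type = field.get("dim_type", "float")
    if dim_type not in DIM_TYPES:
        problems.append("dim_type must be 'binary' or 'float'")
    metric = field.get("metric", "euclidean")
    if metric not in METRICS:
        problems.append("unsupported metric")
    elif metric == "hamming" and dim_type != "binary":
        problems.append("hamming requires dim_type 'binary'")
    return problems
```

An empty list means the field passes these checks; the cluster may still reject it for reasons not covered here, such as an algorithm that the cluster version does not support.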

**Table 4** Optional parameters for the GRAPH indexing algorithm
Parameter	Mandatory	Type	Default Value	Description
neighbors	No	Integer	64	Maximum number of neighbors for each vector in the graph index. A larger value indicates denser connectivity of the graph and leads to higher retrieval precision (recall), but it also increases the index file size and slows down index building and query speeds. Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping. Value range: 20–255
shrink	No	Float	1	The aggressiveness of redundant edge removal (pruning) during graph construction. This setting directly controls the final graph density. A smaller value indicates more aggressive pruning, a sparser graph, and faster retrieval speeds, but may increase precision loss. Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping. Value range: 0.1–10
scaling	No	Integer	50	Scaling ratio for the number of nodes in the upper layers of the HNSW graph. This setting affects the layers and node distribution per layer in the HNSW graph. A proper scaling ratio ensures optimal cross-layer navigation efficiency during retrieval. Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping. Value range: 0–128
efc	No	Integer	200	Number of nearest neighbors to explore when inserting a new vector into the HNSW graph. This parameter controls the search depth during index building. A larger value produces a higher-quality graph and better query accuracy but slows down index building. Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping. Value range: 0–100000. You are advised to increase the value for large-scale datasets.
max_scan_num	No	Integer	10000	Maximum number of nodes to scan during a single query. This parameter limits the search depth. A larger value results in higher query accuracy but increased latency. Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping. Value range: 0–1000000
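The defaults and value ranges in Table 4 can be captured in a small helper that rejects out-of-range values before they reach the cluster. This is a sketch under stated assumptions: graph_params is a hypothetical name, and the returned dictionary is assumed to be merged into the vector field mapping alongside algorithm.

```python
# Value ranges and defaults copied from Table 4.
GRAPH_PARAM_RANGES = {
    "neighbors": (20, 255),
    "shrink": (0.1, 10),
    "scaling": (0, 128),
    "efc": (0, 100000),
    "max_scan_num": (0, 1000000),
}
GRAPH_PARAM_DEFAULTS = {
    "neighbors": 64,
    "shrink": 1,
    "scaling": 50,
    "efc": 200,
    "max_scan_num": 10000,
}


def graph_params(**overrides):
    """Merge overrides into the Table 4 defaults, rejecting out-of-range values."""
    params = dict(GRAPH_PARAM_DEFAULTS)
    for name, value in overrides.items():
        if name not in GRAPH_PARAM_RANGES:
            raise KeyError(f"unknown GRAPH parameter: {name}")
        low, high = GRAPH_PARAM_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside the range {low}-{high}")
        params[name] = value
    return params
```

For example, graph_params(neighbors=128, efc=400) returns the defaults with a denser graph and a deeper build-time search, which favors recall over build speed.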

**Table 5** Optional parameters for the GRAPH_PQ indexing algorithm
Parameter	Mandatory	Type	Default Value	Description
centroid_num	No	Integer	255 (corresponding to 8-bit quantization)	Number of centroids within each subspace for the PQ algorithm. This parameter determines the encoding precision of the vectors after quantization. A larger value indicates more precise representation of the original vectors and higher recall, but also a slight increase in computational overhead and memory footprint. Constraints: This parameter takes effect only when indexing is set to true and algorithm is set to GRAPH_PQ in the mapping. Value range: 0–65535
fragment_num	No	Integer	0 (the plug-in sets the number of fragments automatically based on the vector dimension)	Number of fragments (M) that each high-dimensional vector is split into. It determines the quantization granularity. A larger value makes the compressed vectors approximate the original vectors more closely and improves retrieval precision, but it also increases storage consumption. Constraints: This parameter takes effect only when indexing is set to true and algorithm is set to GRAPH_PQ in the mapping. Value range: 0–4096. When this parameter is set to 0, the system calculates the number of fragments from the vector dimension dim: dim / 4 if dim <= 256; dim / 8 if 256 < dim <= 512; otherwise 64.
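The automatic rule that applies when fragment_num is 0 can be written as a small function. This is a sketch of the rule as stated in Table 5; the function name is hypothetical, and integer division is an assumption because the source does not say how non-divisible dimensions are rounded.

```python
def auto_fragment_num(dim: int) -> int:
    """Number of PQ fragments chosen automatically when fragment_num is 0."""
    if dim <= 256:
        return dim // 4   # assumption: integer division
    if dim <= 512:
        return dim // 8
    return 64
```

Under this rule a 128-dimensional vector is split into 32 fragments, and any vector with more than 512 dimensions is split into 64 fragments.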

(Optional) Pre-Building and Registering Centroid Vectors

When the indexing algorithm is set to IVF_GRAPH or IVF_GRAPH_PQ, you must pre-build and register the centroid vectors before creating a vector index.

IVF_GRAPH and IVF_GRAPH_PQ accelerate indexing and queries on ultra-large-scale datasets with more than 1 billion records. They narrow the query scope by dividing the vector space into subspaces through clustering or random sampling. Before pre-building, obtain all centroid vectors through clustering or random sampling. The centroid vectors are then pre-built into a GRAPH or GRAPH_PQ index and registered with the CSS vector database. Multiple nodes can share this index. Reusing the centroid index across shards reduces the training overhead and the number of centroid index queries, which improves write and query performance.
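The source does not prescribe how to obtain the centroid vectors beyond "clustering or random sampling". As one possible approach, the following minimal sketch runs plain k-means (Lloyd's algorithm) on an in-memory sample, starting from randomly sampled vectors. The function names are hypothetical, and a production workflow would more likely use a dedicated clustering library on a sample of the dataset.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_centroids(vectors, k, iterations=10, seed=0):
    """Plain Lloyd's k-means; returns k centroid vectors as lists of floats."""
    rng = random.Random(seed)
    # Start from k randomly sampled vectors (random sampling alone is the
    # simplest way to pick centroids; the loop below refines them).
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: squared_distance(v, centroids[i]))
            clusters[nearest].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


if __name__ == "__main__":
    sample = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    print(kmeans_centroids(sample, k=2))
```

The resulting centroids are the vectors you would write to the centroid index in the procedure that follows.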

1. Create a centroid index.
For example, run the following command to create a centroid index named my_dict:
```
PUT my_dict
{
  "settings": {
    "index": {
      "vector": true
    },
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "vector",
        "dimension": 2,
        "indexing": true,
        "algorithm": "GRAPH",
        "metric": "euclidean"
      }
    }
  }
}
```
For detailed parameter configuration, see Creating a Vector Index. Pay attention to the following mandatory parameters:
- index.number_of_shards: The number of index shards must be set to 1. Otherwise, the centroid index cannot be registered.
- indexing: This parameter must be set to true to enable vector index acceleration.
- algorithm: Set this parameter to GRAPH if the final index uses IVF_GRAPH, or to GRAPH_PQ if it uses IVF_GRAPH_PQ.
2. Write the centroid vectors obtained through sampling or clustering into the newly created index my_dict.
3. Call the registration API.
For example, run the following command to register the centroid index as a Dict object with a globally unique name (dict_name):
```
PUT _vector/register/my_dict
{
  "dict_name": "my_dict"
}
```

4. Create an IVF_GRAPH or IVF_GRAPH_PQ vector index.

When creating the vector index, you do not need to specify dimension or metric. Rather, you specify the registered Dict object. Table 6 describes key parameters for specifying a Dict object.

For example, run the following command to create an IVF_GRAPH vector index. The sort.field setting makes the centroid subfield of the vector field (my_vector.centroid) the sorting field of the index.

PUT my_index
{
  "settings": {
    "index": {
      "vector": true,
      "sort.field": "my_vector.centroid"
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "vector",
        "indexing": true,
        "algorithm": "IVF_GRAPH",
        "dict_name": "my_dict",
        "offload_ivf": true
      }
    }
  }
}

**Table 6** Key parameters for specifying a Dict object
Parameter	Mandatory	Type	Default Value	Description
dict_name	Yes	String	N/A	Name of the centroid Dict object, for example, my_dict. The vector dimensions and similarity measurement method of the index are the same as those of the Dict object. There is no need to configure them again.
offload_ivf	Yes	Boolean	false	Whether to offload the IVF inverted index to the Elasticsearch/OpenSearch engine layer. Values: true (offloads the IVF inverted index to Elasticsearch/OpenSearch for physical storage management, which significantly reduces the off-heap memory usage of the vector search engine and also reduces CPU and memory overhead when data is written and merged at high throughput) or false (keeps the IVF inverted index entirely within the dedicated memory buffer of the vector search engine). For datasets with hundreds of millions of records or more, set this parameter to true to reduce memory usage while maintaining retrieval performance.
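The step that writes centroid vectors into my_dict is not shown as a command in this section. One way to prepare that write is to build a standard Elasticsearch/OpenSearch _bulk request body, as in the following minimal sketch. The helper name build_bulk_body is hypothetical, and the sketch assumes that a vector field accepts a JSON array of floats as its value.

```python
import json


def build_bulk_body(index_name, field_name, centroids):
    """Return an NDJSON _bulk body that indexes one document per centroid."""
    lines = []
    for centroid in centroids:
        # Action line, then the document itself (standard _bulk layout).
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Assumption: the vector field takes a JSON array of floats.
        lines.append(json.dumps({field_name: centroid}))
    return "\n".join(lines) + "\n"  # a _bulk body must end with a newline


if __name__ == "__main__":
    print("POST _bulk")
    print(build_bulk_body("my_dict", "my_vector", [[0.5, 0.5], [10.0, 10.5]]), end="")
```

The printed request can be pasted into Dev Tools before you call the registration API.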
