Updated on 2026-04-30 GMT+08:00

Creating a Vector Index

To create a vector index, define the vector dimensions, the indexing algorithm (such as HNSW or IVF), and the similarity metric (for example, cosine distance or Euclidean distance) in the index mapping. This gives feature-rich data an index structure suited to similarity search. Vector search mitigates the curse of dimensionality in large-scale data processing. By leveraging techniques such as quantization-based compression and multi-layer graph navigation, it delivers high recall and low latency even on datasets with tens of millions or billions of records.

Indexing Algorithms

Table 1 lists the indexing algorithms supported by CSS vector databases. Select an algorithm based on your service requirements.

Table 1 Indexing algorithms for CSS vector databases

| Algorithm | Description | When to Use | Cluster Version |
| --- | --- | --- | --- |
| FLAT | Full brute-force indexing. This method does not build a complex data structure; instead, it performs an exhaustive search by sequentially calculating the distance between the query vector and every vector in the database. It achieves a 100% recall rate (zero precision loss), but the computational load increases linearly as data volume grows. | Datasets with fewer than 10,000 records, or scenarios requiring maximum recall accuracy | Elasticsearch: 7.6.2 or 7.10.2; OpenSearch: 1.3.6 or 2.19.0 |
| GRAPH | Hierarchical Navigable Small Worlds (HNSW), a graph-based indexing algorithm for efficient approximate nearest neighbor (ANN) search. A multi-layer graph navigation structure accelerates retrieval. It delivers extremely fast retrieval with high recall accuracy, but it is also highly memory-intensive (resident memory is required). | Datasets with up to 100 million records that require millisecond-level latency and high precision | Elasticsearch: 7.6.2 or 7.10.2; OpenSearch: 1.3.6 or 2.19.0 |
| GRAPH_PQ | A combination of HNSW with product quantization (PQ). By splitting and encoding vectors, this approach significantly reduces storage and memory overheads. The compression rate can reach 1/16 or even higher. It offers a low memory footprint, though precision decreases as the compression ratio increases. | Datasets with up to a billion records | Elasticsearch: 7.6.2 or 7.10.2; OpenSearch: 1.3.6 or 2.19.0 |
| GRAPH_SQ8 | A combination of HNSW with 8-bit scalar quantization. 32-bit floating-point numbers are compressed into 8-bit integers. The compression rate is 1/4. The memory footprint is significantly reduced, with a small precision loss. | Datasets with up to a billion records | Elasticsearch: 7.10.2; OpenSearch: 2.19.0 |
| GRAPH_SQ4 | A combination of HNSW with 4-bit scalar quantization. 32-bit floating-point numbers are compressed into 4-bit integers. The compression rate is 1/8. The memory footprint is significantly reduced, but recall also decreases significantly. GRAPH_SQ4 is more computationally efficient than GRAPH_SQ8. | Datasets with up to a billion records | Elasticsearch: 7.10.2; OpenSearch: 2.19.0 |
| IVF_GRAPH | A combination of Inverted File Index (IVF) with HNSW. The vector space is partitioned into multiple clustering subspaces, each represented by a centroid. During retrieval, only relevant subspaces are scanned. This significantly accelerates retrieval but leads to a slight precision loss. | High write throughput is required, and the operational complexity associated with centroid pre-building is acceptable. | Elasticsearch: 7.6.2 or 7.10.2; OpenSearch: 1.3.6 or 2.19.0 |
| IVF_GRAPH_PQ | A combination of IVF, HNSW, and PQ. System capacity is further increased through compression, and system overheads are reduced. | High write throughput is required, and the operational complexity associated with centroid pre-building is acceptable. | Elasticsearch: 7.6.2 or 7.10.2; OpenSearch: 1.3.6 or 2.19.0 |

If you select IVF_GRAPH or IVF_GRAPH_PQ, you must complete the steps in (Optional) Pre-Building and Registering Centroid Vectors before creating the vector index.
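To make the compression rates in Table 1 concrete, the following sketch estimates the size of the vector payload under each GRAPH variant. It assumes 32-bit floats (4 bytes per dimension) and ignores graph links and other index overhead, so the results are rough lower bounds rather than CSS sizing guidance. The function name is hypothetical, and 1/16 is used for GRAPH_PQ only because Table 1 quotes it as a reachable rate.

```python
# Rough estimate of vector payload size per algorithm, using the
# compression rates quoted in Table 1. Graph links and other index
# overhead are deliberately ignored (assumption).

BYTES_PER_FLOAT32 = 4

# Fraction of the original float32 size kept after compression.
COMPRESSION_RATE = {
    "FLAT": 1.0,
    "GRAPH": 1.0,
    "GRAPH_SQ8": 1 / 4,
    "GRAPH_SQ4": 1 / 8,
    "GRAPH_PQ": 1 / 16,  # "1/16 or even higher" in Table 1; 1/16 assumed here
}


def estimate_vector_bytes(num_vectors: int, dimension: int, algorithm: str) -> int:
    """Return the approximate number of bytes needed to hold the vectors."""
    raw = num_vectors * dimension * BYTES_PER_FLOAT32
    return int(raw * COMPRESSION_RATE[algorithm])


if __name__ == "__main__":
    # Example: 100 million 128-dimensional vectors.
    for name in COMPRESSION_RATE:
        size_gib = estimate_vector_bytes(100_000_000, 128, name) / 2**30
        print(f"{name:10s} {size_gib:8.2f} GiB")
```

Comparing the printed figures against the memory available on the cluster nodes gives a first indication of whether an uncompressed GRAPH index is feasible or a quantized variant is worth the precision loss.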

Logging in to Dev Tools

Log in to Dev Tools to run DSL commands.

  • For an Elasticsearch cluster, log in to Kibana:
    1. Log in to the CSS management console.
    2. In the navigation pane on the left, choose Clusters > Elasticsearch.
    3. In the cluster list, find the target cluster, and click Kibana in the Operation column to log in to the Kibana console.
    4. In the left navigation pane, choose Dev Tools.

      The left part of the console is the command input box, and the triangle icon in its upper-right corner is the execution button. The right part shows the execution result.

  • For an OpenSearch cluster, log in to OpenSearch Dashboards:
    1. Log in to the CSS management console.
    2. In the navigation pane on the left, choose Clusters > OpenSearch.
    3. In the cluster list, find the target cluster, and click Dashboards in the Operation column to log in to OpenSearch Dashboards.
    4. In the left navigation pane, choose Dev Tools.

      The left part of the console is the command input box, and the triangle icon in its upper-right corner is the execution button. The right part shows the execution result.

Creating a Vector Index

Define the index mappings and specify the indexing algorithm parameters.

For example, create an index named my_index. This index contains a vector field named my_vector and a text field named my_label. A graph-based index is created for the vector field, and Euclidean distance is used for similarity measurement.

PUT my_index 
{
  "settings": {
    "index": {
      "vector": true,
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "vector",
        "dimension": 2,
        "indexing": true,
        "algorithm": "GRAPH",
        "metric": "euclidean"
      },
      "my_label": {
        "type": "keyword"
      }
    }
  }
}
Table 2 settings parameters

Parameter

Mandatory

Type

Default Value

Description

index.vector

Yes

Boolean

N/A

Whether to enable vector indexes.

Set this parameter to true. Otherwise, vector indexes cannot be created.

index.number_of_shards

No

Integer

1

Number of index shards. This value should be divisible by the number of cluster nodes.

Value range: 1–1024

index.number_of_replicas

No

Integer

1

Number of index replicas. Replicas improve data availability.

Value range: 0 to the number of nodes minus 1

index.vector.exact_search_threshold

No

Integer

null (no switching)

Threshold for automatically switching from pre-filtering search to brute-force search. When the size of the intermediate result set in a segment is lower than this threshold, a brute-force search is performed.

Value range: null (disables auto switchover from pre-filtering search to brute-force search) or a positive integer

index.vector.search.concurrency.enabled

No

Boolean

false

Whether to enable concurrent vector searches across segments. In an Elasticsearch cluster, each index shard consists of multiple segments. By default, each segment is searched in sequence. Enabling concurrent searches across segments reduces query latency but does not increase the cluster's maximum query throughput. Furthermore, it may increase the average CPU usage of cluster nodes.

Constraints: This parameter is available only for Elasticsearch clusters whose image version is no earlier than 7.10.2_25.3.0_xxx.

The value can be:
  • true: Enables concurrent search.
  • false: Performs serial search.
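The constraints in Table 2 can be checked before a request is sent. The following is a minimal sketch, not part of the CSS API: build_vector_settings and its num_nodes argument are hypothetical names, and the settings are emitted as flat dotted keys that match the parameter names in Table 2.

```python
from typing import Optional


def build_vector_settings(
    num_nodes: int,
    number_of_shards: int = 1,
    number_of_replicas: int = 1,
    exact_search_threshold: Optional[int] = None,
) -> dict:
    """Build the "settings" object for a vector index, checking Table 2 ranges."""
    if not 1 <= number_of_shards <= 1024:
        raise ValueError("number_of_shards must be in the range 1-1024")
    if not 0 <= number_of_replicas <= num_nodes - 1:
        raise ValueError("number_of_replicas must be between 0 and num_nodes - 1")
    if exact_search_threshold is not None and exact_search_threshold <= 0:
        raise ValueError("exact_search_threshold must be null or a positive integer")

    settings = {
        "index.vector": True,  # mandatory: vector indexes cannot be created without it
        "index.number_of_shards": number_of_shards,
        "index.number_of_replicas": number_of_replicas,
    }
    if exact_search_threshold is not None:
        settings["index.vector.exact_search_threshold"] = exact_search_threshold
    return settings
```

For a three-node cluster, build_vector_settings(num_nodes=3, number_of_shards=3) returns a dictionary that can be used as the "settings" value of the PUT request body.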
Table 3 mappings parameters

Parameter

Mandatory

Type

Default Value

Description

type

Yes

String

N/A

Data type of a field.

Set this parameter to vector to indicate vector fields.

dimension

Yes

Integer

N/A

Number of vector dimensions.

Value range: 1–4096

indexing

No

Boolean

false

Whether to enable vector index acceleration.

The value can be:
  • true: Enables vector index acceleration. When this parameter is set to true, an extra vector index is created. The indexing algorithm is specified by the algorithm field and the index supports vector search.
  • false: Disables vector index acceleration. Vector data is written only to docvalues, and only ScriptScore and Rescore can be used for vector queries.

lazy_indexing

No

Boolean

false

Whether to enable lazy vector indexing. When enabled, the system prioritizes data ingestion speed by delaying the construction of vector indexes. Instead of building the index in real time, it only persists the raw data. After ingestion is complete, you must manually trigger offline index building. This parameter trades indexing overhead for write throughput. Use it for large-scale offline data migrations where high ingestion speed is critical and real-time search is not required during ingestion.

Constraints:
  • This parameter takes effect only when indexing is set to true in the mapping.
  • For Elasticsearch clusters, the image version must be 7.10.2_24.3.3_xxx or later.
  • For OpenSearch clusters, the version must be 2.19.0.
The value can be:
  • true: Enables lazy indexing. Write throughput is improved. However, before offline index building is manually triggered and completed, data cannot be queried via VectorQuery.
  • false: Enables real-time index building. Vectors are indexed upon ingestion, so data can be queried immediately after it is written.

algorithm

No

String

GRAPH

Vector indexing algorithm.

Constraints:

Value range: FLAT, GRAPH, GRAPH_PQ, GRAPH_SQ8, GRAPH_SQ4, IVF_GRAPH, and IVF_GRAPH_PQ.

For how to select an algorithm and the cluster version needed to run it, see Indexing Algorithms.

  • When a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) is selected, you can tune the graph index structure and construction quality using the parameters in Table 4.
  • When GRAPH_PQ is selected, you can control precision using the parameters in Table 5.

dim_type

No

String

float

Vector data type.

The value can be:
  • binary: binary value
  • float: floating-point number

metric

No

String

euclidean

Vector distance metric, which measures the similarity or distance between vectors.

The value can be:

  • euclidean: Euclidean distance
  • inner_product: inner product distance
  • cosine: cosine distance
  • hamming: Hamming distance, which can be used only when dim_type is set to binary.
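As an illustration of what the four metric values measure, the following sketch computes each quantity in plain Python. These are the textbook definitions only. How CSS converts them into a relevance score is not covered here, and representing binary vectors as bytes objects is an assumption made for the example.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line (L2) distance between two float vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; larger values mean more similar vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 minus cosine similarity; 0 means the vectors point the same way."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equally long binary vectors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

Cosine distance ignores vector length, so it suits embeddings whose magnitude carries no meaning, whereas Euclidean distance and inner product are both sensitive to magnitude.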
Table 4 Optional parameters for the GRAPH indexing algorithm

Parameter

Mandatory

Type

Default Value

Description

neighbors

No

Integer

64

Maximum number of neighbors for each vector in the graph index. A larger value makes the graph more densely connected and improves retrieval precision (recall), but it also increases the index file size and slows down both index building and queries.

Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping.

Value range: 20–255

shrink

No

Float

1

The aggressiveness of redundant edge removal (pruning) during graph construction. This setting directly controls the final graph density. A smaller value indicates more aggressive pruning, a sparser graph, and faster retrieval speeds, but may increase precision loss.

Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping.

Value range: 0.1–10

scaling

No

Integer

50

Scaling ratio for the number of nodes in the upper layers of the HNSW graph. This setting affects the layers and node distribution per layer in the HNSW graph. A proper scaling ratio ensures optimal cross-layer navigation efficiency during retrieval.

Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping.

Value range: 0–128

efc

No

Integer

200

How many nearest neighbors to explore when inserting a new vector into the HNSW graph. This parameter controls the search depth during index building. A larger value results in higher-quality graph structure and query accuracy but slower index building.

Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping.

Value range: 0–100000

You are advised to increase the value for large-scale datasets.

max_scan_num

No

Integer

10000

Maximum number of nodes to scan during a single query. This parameter limits the search depth. A larger value results in higher query accuracy but increased latency.

Constraints: This parameter takes effect only when indexing is set to true and algorithm is a GRAPH variant (GRAPH, GRAPH_PQ, GRAPH_SQ8, or GRAPH_SQ4) in the mapping.

Value range: 0–1000000
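The value ranges in Table 4 can be enforced when a field mapping is assembled. The sketch below is illustrative only: build_graph_field is a hypothetical helper, and it assumes that the Table 4 parameters are placed next to algorithm in the vector field mapping, which this section implies but does not show in an example.

```python
GRAPH_ALGORITHMS = {"GRAPH", "GRAPH_PQ", "GRAPH_SQ8", "GRAPH_SQ4"}

# Inclusive value ranges copied from Table 4.
GRAPH_PARAM_RANGES = {
    "neighbors": (20, 255),
    "shrink": (0.1, 10),
    "scaling": (0, 128),
    "efc": (0, 100000),
    "max_scan_num": (0, 1000000),
}


def build_graph_field(dimension: int, algorithm: str = "GRAPH",
                      metric: str = "euclidean", **graph_params) -> dict:
    """Build a vector field mapping for a GRAPH variant with checked parameters."""
    if algorithm not in GRAPH_ALGORITHMS:
        raise ValueError(f"{algorithm} is not a GRAPH variant")
    if not 1 <= dimension <= 4096:
        raise ValueError("dimension must be in the range 1-4096")

    field = {
        "type": "vector",
        "dimension": dimension,
        "indexing": True,  # Table 4 parameters only take effect when indexing is true
        "algorithm": algorithm,
        "metric": metric,
    }
    for name, value in graph_params.items():
        if name not in GRAPH_PARAM_RANGES:
            raise ValueError(f"unknown graph parameter: {name}")
        low, high = GRAPH_PARAM_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be in the range {low}-{high}")
        # Assumption: the parameter sits beside "algorithm" in the field mapping.
        field[name] = value
    return field
```

For example, build_graph_field(128, neighbors=32, efc=400) yields a mapping for a 128-dimensional GRAPH field with a sparser graph and a deeper build-time search than the defaults.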

Table 5 Optional parameters for the GRAPH_PQ indexing algorithm

Parameter

Mandatory

Type

Default Value

Description

centroid_num

No

Integer

255 (corresponding to 8-bit quantization)

Number of centroids within each subspace for the PQ algorithm. This parameter determines the encoding precision of the vectors after quantization. A larger value indicates more precise representation of the original vectors and higher recall, but also a slight increase in computational overhead and memory footprint.

Constraints: This parameter takes effect only when indexing is set to true and algorithm is set to GRAPH_PQ in the mapping.

Value range: 0–65535

fragment_num

No

Integer

0: The plug-in automatically sets the number of fragments based on the vector dimensions.

Number of fragments (M) each high-dimensional vector is split into. It affects the quantization granularity. A larger value results in compressed vectors that more closely approximate the original vectors and hence higher retrieval precision. However, it also leads to higher storage consumption.

Constraints: This parameter takes effect only when indexing is set to true and algorithm is set to GRAPH_PQ in the mapping.

Value range: 0–4096

When this parameter is set to 0, the system automatically calculates the number of fragments from the vector dimension (dim):

if dim <= 256:
    fragment_num = dim / 4
elif dim <= 512:
    fragment_num = dim / 8
else:
    fragment_num = 64
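The rule above can be written as a runnable function. This sketch assumes integer division, because the number of fragments has to be a whole number; how the plug-in rounds when dim is not a multiple of 4 or 8 is not stated here, and auto_fragment_num is a hypothetical name.

```python
def auto_fragment_num(dim: int) -> int:
    """Mirror the documented rule that applies when fragment_num is set to 0."""
    if dim <= 256:
        return dim // 4  # integer division is an assumption
    if dim <= 512:
        return dim // 8
    return 64


if __name__ == "__main__":
    for dim in (128, 256, 384, 512, 768):
        print(dim, auto_fragment_num(dim))
```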

(Optional) Pre-Building and Registering Centroid Vectors

When the indexing algorithm is set to IVF_GRAPH or IVF_GRAPH_PQ, you must first pre-build and register the centroid vectors before creating a vector index.

IVF_GRAPH and IVF_GRAPH_PQ accelerate indexing and queries on ultra-large datasets with more than 1 billion records. They narrow the query scope by dividing the vector space into subspaces, each represented by a centroid vector. Before pre-building, obtain all centroid vectors through clustering or random sampling. The centroid vectors are then pre-built into a GRAPH or GRAPH_PQ index and registered with the CSS vector database. Multiple nodes can share this index. Reusing the centroid index across shards reduces the training overhead and the number of centroid index queries, which improves write and query performance.
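As a minimal sketch of the random sampling option, the following code picks centroid vectors from a dataset and shows how a vector maps to its nearest centroid, which is the partitioning idea behind IVF. It runs entirely on the client side with the standard library. The function names and the fixed seed are illustrative assumptions, and a clustering algorithm such as k-means could be substituted for the sampling step.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def sample_centroids(vectors: List[Vector], num_centroids: int, seed: int = 0) -> List[Vector]:
    """Pick centroid vectors by uniform random sampling without replacement."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(vectors, num_centroids)


def nearest_centroid(vector: Vector, centroids: List[Vector]) -> int:
    """Return the index of the centroid closest to the vector (squared Euclidean)."""
    def squared_distance(centroid: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(vector, centroid))

    return min(range(len(centroids)), key=lambda i: squared_distance(centroids[i]))


if __name__ == "__main__":
    data = [[float(i), float(i % 7)] for i in range(1000)]
    centroids = sample_centroids(data, 16)
    print(nearest_centroid([500.0, 3.0], centroids))
```

The sampled vectors are what would be written to the centroid index in the steps that follow.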

  1. Create a centroid index.

    For example, run the following command to create a centroid index named my_dict:

    PUT my_dict 
     { 
       "settings": { 
         "index": { 
           "vector": true 
         }, 
         "number_of_shards": 1, 
         "number_of_replicas": 0 
       }, 
       "mappings": { 
         "properties": { 
           "my_vector": { 
             "type": "vector", 
             "dimension": 2, 
             "indexing": true, 
             "algorithm": "GRAPH", 
             "metric": "euclidean" 
           } 
         } 
       } 
     }
    For detailed parameter configuration, see Creating a Vector Index. Pay attention to the following mandatory parameters:
    • index.number_of_shards: The number of index shards must be set to 1. Otherwise, the centroid index cannot be registered.
    • indexing: This parameter must be set to true to enable vector index acceleration.
    • algorithm: Set this parameter to GRAPH if the vector index will use IVF_GRAPH, or to GRAPH_PQ if it will use IVF_GRAPH_PQ.
  2. Write centroid vectors to the created index.

    Write the centroid vectors obtained through sampling or clustering into the newly created index my_dict.
  3. Call the registration API.

    For example, run the following command to register the centroid index as a Dict object with a globally unique name (dict_name):

    PUT _vector/register/my_dict 
     { 
       "dict_name": "my_dict" 
     }
  4. Create an IVF_GRAPH or IVF_GRAPH_PQ vector index.

    When creating the vector index, you do not need to specify dimension or metric. Rather, you specify the registered Dict object. Table 6 describes key parameters for specifying a Dict object.

    For example, run the following command to create an IVF_GRAPH vector index. The sort.field setting uses the centroid subfield of the vector field (my_vector.centroid) as the sorting field.

    PUT my_index 
     { 
       "settings": { 
         "index": { 
           "vector": true,
           "sort.field": "my_vector.centroid"
         } 
       }, 
       "mappings": { 
         "properties": { 
           "my_vector": { 
             "type": "vector", 
             "indexing": true, 
             "algorithm": "IVF_GRAPH", 
             "dict_name": "my_dict", 
             "offload_ivf": true 
           } 
         } 
       } 
     }
    Table 6 Key parameters for specifying a Dict object

    Parameter

    Mandatory

    Type

    Default Value

    Description

    dict_name

    Yes

    String

    N/A

    Name of the centroid Dict object, for example, my_dict. The vector dimensions and similarity measurement method of the index are the same as those of the Dict object. There is no need to configure them again.

    offload_ivf

    Yes

    Boolean

    false

    Whether to offload the IVF inverted index to the Elasticsearch/OpenSearch engine layer.

    The value can be:
    • true: Offloads the IVF inverted index to Elasticsearch/OpenSearch for physical storage management. This significantly reduces the off-heap memory usage of the vector search engine. It also reduces CPU and memory overhead during high-throughput writes and merges.
    • false: Keeps the IVF inverted index entirely within the dedicated memory buffer of the vector search engine.

    When processing datasets with hundreds of millions of records or more, set this parameter to true to reduce the cluster's memory usage while maintaining retrieval performance.
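The four steps above can be sketched as a sequence of request bodies generated in Python. This example only builds the requests and does not send them. The function name ivf_graph_requests is hypothetical, the centroid documents are written through the standard _bulk API, and it is an assumption that a float vector is written as a plain JSON array in the vector field.

```python
import json
from typing import List, Sequence, Tuple

Request = Tuple[str, str, str]  # (HTTP method, path, request body)


def ivf_graph_requests(
    centroids: Sequence[Sequence[float]],
    dict_index: str = "my_dict",
    dict_name: str = "my_dict",
    data_index: str = "my_index",
    field: str = "my_vector",
) -> List[Request]:
    """Return the four requests of the IVF_GRAPH workflow, in order."""
    # Step 1: centroid index with exactly one shard and a GRAPH index.
    create_dict = {
        "settings": {
            "index": {"vector": True},
            "number_of_shards": 1,
            "number_of_replicas": 0,
        },
        "mappings": {"properties": {field: {
            "type": "vector",
            "dimension": len(centroids[0]),
            "indexing": True,
            "algorithm": "GRAPH",
            "metric": "euclidean",
        }}},
    }
    # Step 2: one action line and one source line per centroid (NDJSON).
    # Assumption: a float vector is written as a plain JSON array.
    lines = []
    for centroid in centroids:
        lines.append(json.dumps({"index": {"_index": dict_index}}))
        lines.append(json.dumps({field: list(centroid)}))
    bulk_body = "\n".join(lines) + "\n"
    # Step 3: register the centroid index as a Dict object.
    register = {"dict_name": dict_name}
    # Step 4: the vector index references the Dict object instead of
    # repeating dimension and metric.
    create_data = {
        "settings": {"index": {"vector": True, "sort.field": f"{field}.centroid"}},
        "mappings": {"properties": {field: {
            "type": "vector",
            "indexing": True,
            "algorithm": "IVF_GRAPH",
            "dict_name": dict_name,
            "offload_ivf": True,
        }}},
    }
    return [
        ("PUT", dict_index, json.dumps(create_dict)),
        ("POST", "_bulk", bulk_body),
        ("PUT", f"_vector/register/{dict_index}", json.dumps(register)),
        ("PUT", data_index, json.dumps(create_data)),
    ]


if __name__ == "__main__":
    for method, path, body in ivf_graph_requests([[0.0, 0.0], [1.0, 1.0]]):
        print(method, path)
        print(body)
```

Each printed request can be pasted into Dev Tools or sent with any HTTP client. The order matters because the centroid index must be populated and registered before the IVF_GRAPH index is created.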