Creating an Index

Prerequisites

If you use vector retrieval in Ranger authentication mode, you have added the vector permission to the all-index policy. For details, see Authentication Based on Ranger.
In ACL authentication mode, the elasticsearch user has permission on all vector retrieval interfaces except the unregister interface by default.

The unregister interface is used to delete dictionaries. You can delete only dictionaries that you created.

Procedure

The following example creates an index named my_index that contains the my_vector field. A graph index is built on this field, and Euclidean distance is used to measure similarity. For details about the parameters, see Table 1.

PUT my_index
{
 "settings": {
  "index": {
   "vector": true
  }
 },
 "mappings": {
  "properties": {
   "my_vector": {
    "type": "vector",
    "dimension": 2,
    "indexing": true,
    "algorithm": "GRAPH",
    "metric": "euclidean"
   }
  }
 }
}
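
The same request can also be prepared from a client program. The following is a minimal Python sketch, not an official client: the helper names build_vector_index_body and build_create_index_request are hypothetical, the address https://localhost:9200 is a placeholder, and authentication and TLS settings are deliberately left out because they depend on your cluster.

```python
import json
import urllib.request


def build_vector_index_body(field="my_vector", dimension=2,
                            algorithm="GRAPH", metric="euclidean"):
    """Return a request body with the same structure as the example above."""
    return {
        "settings": {"index": {"vector": True}},
        "mappings": {
            "properties": {
                field: {
                    "type": "vector",
                    "dimension": dimension,
                    "indexing": True,
                    "algorithm": algorithm,
                    "metric": metric,
                }
            }
        },
    }


def build_create_index_request(base_url, index_name, body):
    """Prepare (but do not send) the PUT request that creates the index."""
    return urllib.request.Request(
        url=f"{base_url}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Placeholder address (assumption): replace it with your cluster endpoint.
    request = build_create_index_request(
        "https://localhost:9200", "my_index", build_vector_index_body()
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

The sketch only builds and prints the request. To send it, pass the request object to urllib.request.urlopen after configuring the credentials and certificates that your authentication mode requires.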

**Table 1** Parameter description
Parameter	Sub-Parameter	Remarks
settings	vector	If vector index acceleration is required, set this parameter to true.
mappings	type	Field type. If this parameter is set to vector, the field is a vector.
	dimension	Vector data dimension. The value ranges from 0 to 4096.
	indexing	Whether to enable index acceleration. Options: true (index acceleration is enabled) and false (index acceleration is disabled). The default value is false.
	algorithm	Index algorithm. This parameter is valid only when indexing is set to true. The default value is GRAPH. Options: FLAT: brute-force algorithm that calculates the distance between the target vector and every other vector in sequence. It is computationally expensive, but its recall rate is 100%, so it suits scenarios that require the highest recall. GRAPH: graph index based on a self-developed HNSW algorithm. It is mainly used in scenarios that require high performance and precision and hold less than 10 million data records. GRAPH_PQ: algorithm that combines HNSW with a product quantization (PQ) index. It reduces the storage overhead of the original vectors, enabling HNSW to support searches over hundreds of millions of data records.
	metric	Metric for measuring the distance between vectors. The default value is euclidean. Options: euclidean, inner_product, cosine, and hamming.
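
To make the metric options and the FLAT description in Table 1 more concrete, the following Python sketch shows the textbook definitions of the four measures and a brute-force search that compares the query with every stored vector. It is for illustration only: the function names are hypothetical, and the service's internal implementation and score normalization may differ.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine of the angle between two non-zero vectors."""
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    return inner_product(a, b) / norm


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def flat_search(query, vectors, k=1):
    """Brute force: score every vector, return the positions of the k nearest
    by Euclidean distance."""
    scored = sorted((euclidean(query, v), i) for i, v in enumerate(vectors))
    return [i for _, i in scored[:k]]
```

Because flat_search visits every vector, its cost grows linearly with the amount of data. This is why the FLAT option reaches full recall but carries a heavy calculation workload, and why the graph-based options are preferred for larger data sets.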