Querying a Vector
- Basic query
The basic query provides a dedicated vector query syntax for vector fields that have a vector index. In the following example, the outer vector key indicates that the query type is VectorQuery, and my_vector is the name of the vector field to query. The inner vector key specifies the query vector, which can be an array or a Base64-encoded string. The value of topk is generally the same as that of size. The query returns the n records closest to the query vector, where n is specified by size/topk (2 in the following example).
POST my_index/_search
{
  "size": 2,
  "query": {
    "vector": {
      "my_vector": {
        "vector": [1.0, 2.0],
        "topk": 2
      }
    }
  }
}
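The nesting of the two vector keys is easy to get wrong when the request body is assembled in code. The following is a minimal sketch that builds the same body as a Python dictionary; the helper name build_vector_query is hypothetical and not part of any CSS or Elasticsearch client, and the sketch only constructs the JSON without sending it.

```python
import json


def build_vector_query(field, vector, topk, size=None):
    """Build the request body of a basic vector query (hypothetical helper)."""
    # The text notes that topk is generally the same as size,
    # so size falls back to topk when it is not given.
    if size is None:
        size = topk
    return {
        "size": size,
        "query": {
            "vector": {                      # outer key: the query type
                field: {                     # name of the vector field
                    "vector": list(vector),  # inner key: the query vector
                    "topk": topk,
                }
            }
        },
    }


body = build_vector_query("my_vector", [1.0, 2.0], topk=2)
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the body of POST my_index/_search.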
- Compound query
Vector search can be combined with other Elasticsearch subqueries, such as Boolean queries and post-filtering, to form compound queries.
- Example of Boolean query
POST my_index/_search
{
  "size": 10,
  "query": {
    "bool": {
      "must": {
        "vector": {
          "my_vector": {
            "vector": [1.0, 2.0],
            "topk": 10
          }
        }
      },
      "filter": {
        "term": { "tags": "production" }
      },
      "must_not": {
        "range": {
          "age": { "gte": 10, "lte": 20 }
        }
      }
    }
  }
}
In this example, the topk (10) results closest to the query vector are retrieved first. The filter clause then keeps only the results whose tags field is production. The range clause is wrapped in must_not, so results whose age is between 10 and 20 (inclusive) are removed from the final result. Because the filters are applied to the topk results, the final number of records may be less than the value specified by topk.
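To see why the final count can fall below topk, the behavior described above can be imitated with a brute-force sketch in plain Python. This is only a simulation of the described semantics on made-up sample data, not how CSS executes the query internally; the function name topk_then_filter and the documents are assumptions for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def topk_then_filter(docs, query, topk):
    """Take the topk nearest documents first, then apply the filters."""
    nearest = sorted(docs, key=lambda d: euclidean(d["my_vector"], query))[:topk]
    return [
        d for d in nearest
        if d["tags"] == "production"      # filter: term on tags
        and not (10 <= d["age"] <= 20)    # must_not: range on age
    ]


# Made-up sample data for illustration only.
docs = [
    {"id": 1, "my_vector": [1.0, 2.0], "tags": "production", "age": 30},
    {"id": 2, "my_vector": [1.1, 2.0], "tags": "staging", "age": 30},
    {"id": 3, "my_vector": [1.0, 2.2], "tags": "production", "age": 15},
    {"id": 4, "my_vector": [5.0, 5.0], "tags": "production", "age": 40},
]

hits = topk_then_filter(docs, [1.0, 2.0], topk=3)
print([d["id"] for d in hits])  # prints [1]
```

Document 4 satisfies both filter conditions but is not among the three nearest neighbors, so only one record remains even though topk is 3.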
- Example of post-filtering
GET my_index/_search
{
  "size": 10,
  "query": {
    "vector": {
      "my_vector": {
        "vector": [1.0, 2.0],
        "topk": 10
      }
    }
  },
  "post_filter": {
    "term": { "tags": "production" }
  }
}
- Query scoring
When GRAPH_PQ is used, query results are sorted by the asymmetric distance calculated by PQ (product quantization). CSS supports re-scoring and re-sorting of query results to improve the recall rate. Assuming that my_index is a PQ index, the following example rescores the query results:
GET my_index/_search
{
  "size": 10,
  "query": {
    "vector": {
      "my_vector": {
        "vector": [1.0, 2.0],
        "topk": 100
      }
    }
  },
  "rescore": {
    "window_size": 100,
    "vector_rescore": {
      "field": "my_vector",
      "vector": [1.0, 2.0],
      "metric": "euclidean"
    }
  }
}
Table 1 Rescore parameter description

| Parameter | Remarks |
| --- | --- |
| window_size | The vector query returns topk results; only the first window_size of them are re-scored and re-sorted. |
| field | Name of the vector field. |
| vector | Query vector. |
| metric | Metric used to measure the distance between vectors. The default value is euclidean. Options: euclidean, inner_product, cosine, hamming. |
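The effect of rescoring can be illustrated with a small two-stage sketch in plain Python. It is a simplified model under stated assumptions: the coarse function is a hypothetical stand-in that rounds each component and is not real product quantization, the sample vectors are made up, and the code does not reflect the internal CSS implementation. The first stage ranks by an approximate distance; the second stage re-sorts only the first window_size hits by the exact Euclidean distance.

```python
import math


def euclidean(a, b):
    """Exact Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def coarse(v):
    """Hypothetical stand-in for PQ compression: round each component."""
    return [float(round(x)) for x in v]


def search(docs, query, topk):
    """Stage 1: rank by approximate distance (exact query vs. compressed vector)."""
    ranked = sorted(docs, key=lambda d: euclidean(coarse(d[1]), query))
    return ranked[:topk]


def rescore(hits, query, window_size):
    """Stage 2: re-sort only the first window_size hits by exact distance."""
    head = sorted(hits[:window_size], key=lambda d: euclidean(d[1], query))
    return head + hits[window_size:]


# Made-up (name, vector) pairs for illustration only.
docs = [("a", [1.4, 2.4]), ("b", [1.05, 2.0]), ("c", [3.0, 3.0])]
query = [1.0, 2.0]

first_pass = search(docs, query, topk=3)
final = rescore(first_pass, query, window_size=2)
print([name for name, _ in first_pass])  # prints ['a', 'b', 'c']
print([name for name, _ in final])       # prints ['b', 'a', 'c']
```

In the first pass, a and b compress to the same coarse vector and tie, so their order is arbitrary; the exact distance used during rescoring places b ahead of a. Hits beyond window_size (c here) keep their first-pass order.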