Milvus indexing refers to the process of organizing and structuring high-dimensional data in a way that enables efficient similarity search operations.
What Is a Vector in Milvus?
A Milvus vector is a numerical representation of an entity or data point in a high-dimensional space. It is a fundamental concept that is used in similarity search and vector databases.
A vector is represented as an array of numerical values where each value corresponds to an entity’s specific feature or attribute.
Milvus allows you to efficiently store, index, and search high-dimensional vectors by leveraging advanced indexing techniques, such as approximate nearest neighbor (ANN) search algorithms, to achieve fast and accurate similarity matching.
Milvus Vector Index
Vector indexes in Milvus are an organizational unit of metadata used to accelerate vector similarity search. Without a vector index, Milvus performs a brute-force search over the specified data for every search operation.
Let us learn how to create a vector index to speed up search operations on a vector field.
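To make the cost of a brute-force search concrete, the following is a minimal sketch in plain Python. It is not how Milvus implements search internally; the function names and the sample vectors are hypothetical and exist only for illustration. The point is that every stored vector is compared against the query, so the work grows with the size of the collection.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_search(query, vectors, top_k=2):
    """Compare the query against every stored vector and return the
    top_k closest ones as (position, distance) pairs."""
    scored = [(i, l2_distance(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:top_k]

# Hypothetical 3-dimensional vectors, for illustration only.
stored = [
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
    [0.1, 0.0, 0.0],
    [5.0, 5.0, 5.0],
]
results = brute_force_search([0.0, 0.0, 0.0], stored, top_k=2)
print(results)
```

An index avoids scanning every vector by narrowing the search to a subset of candidates, usually at the cost of some accuracy.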
Prepare the Index Parameters
Unlike a scalar index, we need to specify the index parameters for vector indexes. The following are the supported parameters for a vector index in Milvus.
- metric_type – This parameter specifies the metric that is used to determine the similarity of vectors. The supported values include:
  - L2 – Euclidean distance
  - IP – Inner product
  - JACCARD – Jaccard distance
  - TANIMOTO – Tanimoto distance
  - HAMMING – Hamming distance
  - SUPERSTRUCTURE – Superstructure distance
  - SUBSTRUCTURE – Substructure distance
- index_type – This specifies the index type that is used to accelerate the vector search. The accepted values include:
  - FLAT
  - IVF_FLAT
  - IVF_SQ8
  - IVF_PQ
  - HNSW
  - ANNOY
  - DISKANN*
  - BIN_FLAT
  - BIN_IVF_FLAT
- params – This specifies the parameters that are specific to the chosen index type.
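As a rough illustration of what some of the metric types listed above measure, the following sketch implements the textbook definitions of four of them in plain Python. These are not Milvus's internal implementations, and the function names are hypothetical. L2 and IP operate on floating-point vectors, while Hamming and Jaccard operate on binary vectors, represented here as lists of 0s and 1s.

```python
import math

def l2(a, b):
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    """Inner product: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def hamming(a, b):
    """Number of positions at which two binary vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def jaccard_distance(a, b):
    """1 - |intersection| / |union| over the set bits of two binary vectors."""
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - intersection / union if union else 0.0

print(l2([1, 2], [4, 6]))                              # 5.0
print(inner_product([1, 2], [4, 6]))                   # 16
print(hamming([1, 0, 1, 1], [1, 1, 0, 1]))             # 2
print(jaccard_distance([1, 0, 1, 1], [1, 1, 0, 1]))    # 0.5
```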
Create a Vector Index in the Milvus CLI
The following demonstrates how to create a vector index on the film collection using the create index command in the Milvus CLI:
milvus_cli > create index
Once you run the command, Milvus prompts you step by step for the index settings, as shown in the following prompts:
The name of the field to create an index for (vector): vector
Index type (FLAT, IVF_FLAT, IVF_SQ8, IVF_PQ, RNSG, HNSW, ANNOY): IVF_FLAT
Index metric type (L2, IP, HAMMING, TANIMOTO): L2
Index params nlist: 1024
Timeout []:
With the previous prompts, we create a vector index on the vector field of the “film” collection.
The index is of type IVF_FLAT, uses the L2 metric, and sets nlist to 1024.
The resulting output is as follows:
+--------------------------+-------------+
| Corresponding Collection | film        |
+--------------------------+-------------+
| Corresponding Field      | vector      |
+--------------------------+-------------+
| Index Type               | IVF_FLAT    |
+--------------------------+-------------+
| Metric Type              | L2          |
+--------------------------+-------------+
| Params                   | nlist: 1024 |
+--------------------------+-------------+
Create index successfully!
Create a Vector Index in Milvus Using Python
To create a vector index on a Milvus collection in Python, we can use the PyMilvus package and the create_index method.
The code is as follows:
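# import the PyMilvus classes used below
from pymilvus import Collection, connections, utility

# connect to the Milvus server (host and port are assumptions; adjust to your deployment)
connections.connect(alias="default", host="localhost", port="19530")

# configure index params
index_params = {
    "metric_type": "L2",
    "index_type": "IVF_FLAT",
    "params": {"nlist": 1024}
}

# build the index on the "vector" field of the "film" collection
collection = Collection("film")
collection.create_index(
    field_name="vector",
    index_params=index_params
)

# check the index building progress
utility.index_building_progress("film")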
This should set up a similar vector index on the server.
NOTE: Ensure that another vector index does not already exist on the target field of the film collection. If one does, Milvus fails to create the new index and returns an error.
Create a Vector Index in Milvus Using cURL
You can also use cURL to create a vector index on a target collection as shown in the following command:
curl -X 'POST' \
  'http://localhost:9091/api/v1/index' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "collection_name": "film",
    "field_name": "vector",
    "extra_params":[
      {"key": "metric_type", "value": "L2"},
      {"key": "index_type", "value": "IVF_FLAT"},
      {"key": "params", "value": "{\"nlist\":1024}"}
    ]
  }'
In this case, we use cURL to send a POST request to the index endpoint and pass the settings of the vector index as JSON data.
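Note that the value of the params key in this request is itself a JSON document encoded as a string, which is why its inner quotes need escaping on the command line. The following sketch builds the same request body in Python with only the standard library, so that the nested encoding is handled by json.dumps. The helper name build_index_request is hypothetical, and the URL is the one from the cURL example. The request is only constructed here, not sent.

```python
import json
import urllib.request

def build_index_request(collection_name, field_name, metric_type, index_type,
                        params, base_url="http://localhost:9091"):
    """Build (but do not send) the POST request for creating a vector index."""
    payload = {
        "collection_name": collection_name,
        "field_name": field_name,
        "extra_params": [
            {"key": "metric_type", "value": metric_type},
            {"key": "index_type", "value": index_type},
            # The index params travel as a JSON string nested inside the JSON body.
            {"key": "params", "value": json.dumps(params)},
        ],
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/index",
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )

request = build_index_request("film", "vector", "L2", "IVF_FLAT", {"nlist": 1024})
body = json.loads(request.data)
print(request.get_method(), request.full_url)
print(body["extra_params"][2]["value"])
```

To actually send it against a running server, you could pass the request object to urllib.request.urlopen.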
Conclusion
We learned how we can use the Milvus CLI, cURL, and Python to create a vector index in simple steps.