Create a Vector Index in Milvus

Milvus is an open-source vector database that stores and retrieves large-scale vectors, or embeddings, which are numerical representations of complex objects such as images, text, or audio.

Indexing in Milvus is the process of organizing high-dimensional data in a way that enables efficient similarity search.

What Is a Vector in Milvus?

A Milvus vector is a numerical representation of an entity or data point in a high-dimensional space. It is a fundamental concept that is used in similarity search and vector databases.

A vector is represented as an array of numerical values where each value corresponds to an entity’s specific feature or attribute.
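As a minimal sketch of this idea, the following plain-Python snippet stores two vectors as lists and measures how far apart they are. The film_a and film_b values are hypothetical four-dimensional embeddings made up for illustration; real embeddings usually have hundreds of dimensions.

```python
import math

# Hypothetical 4-dimensional embeddings, one value per feature.
film_a = [0.12, 0.85, 0.33, 0.47]
film_b = [0.10, 0.80, 0.35, 0.50]


def euclidean_distance(u, v):
    """Return the Euclidean (L2) distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


distance = euclidean_distance(film_a, film_b)
print(f"dimension: {len(film_a)}, L2 distance: {distance:.4f}")
```

The smaller the distance, the more similar the two entities are considered to be.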

Milvus allows you to efficiently store, index, and search high-dimensional vectors. It relies on indexing techniques such as approximate nearest neighbor (ANN) search algorithms to achieve fast and accurate similarity matching.

Milvus Vector Index

A vector index in Milvus is an organizational structure built over the vector data to speed up vector similarity search. Without a vector index, Milvus performs a brute-force search over the specified data for every search operation.
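To illustrate what a brute-force search means, here is a rough sketch in plain Python. It is not how Milvus implements the search internally; the stored dictionary and the brute_force_search helper are hypothetical names used only for this example.

```python
import math


def brute_force_search(query, vectors, top_k=2):
    """Compare the query with every stored vector and return the top_k closest (id, distance) pairs."""
    scored = [(vec_id, math.dist(query, vec)) for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:top_k]


# Hypothetical stored vectors keyed by an entity ID.
stored = {
    "film_1": [0.0, 0.0],
    "film_2": [1.0, 1.0],
    "film_3": [5.0, 5.0],
}

print(brute_force_search([0.9, 1.2], stored, top_k=2))
```

Every query touches every stored vector, so the cost grows linearly with the size of the collection. This is the work that an index tries to avoid.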

Let us learn how to create a vector index to speed up search operations on a vector field.

Prepare the Index Parameters

Unlike a scalar index, we need to specify the index parameters for vector indexes. The following are the supported parameters for a vector index in Milvus.

  1. metric_type – This parameter specifies the metric that is used to determine the similarity of vectors. The supported values include:
    1. L2 – Euclidean Distance
    2. IP – Inner Product
    3. JACCARD – Jaccard Distance
    4. TANIMOTO – Tanimoto Distance
    5. HAMMING – Hamming Distance
    6. SUPERSTRUCTURE – Superstructure Distance
    7. SUBSTRUCTURE – Substructure Distance
  2. index_type – This parameter specifies the index type that is used to accelerate the vector search. The accepted values include:
    1. FLAT
    2. IVF_FLAT
    3. IVF_SQ8
    4. IVF_PQ
    5. HNSW
    6. ANNOY
    7. DISKANN
    8. BIN_FLAT
    9. BIN_IVF_FLAT
  3. params – This parameter holds the settings that are specific to the chosen index type, such as nlist for IVF_FLAT.
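To make a few of the metric types listed above more concrete, the following sketch computes the inner product on float vectors and the Hamming and Jaccard distances on binary vectors. It uses the textbook definitions in plain Python and toy data chosen for this example; it is not taken from the Milvus source code.

```python
def inner_product(u, v):
    """IP: sum of element-wise products; larger means more similar."""
    return sum(a * b for a, b in zip(u, v))


def hamming_distance(u, v):
    """HAMMING: number of positions where two binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def jaccard_distance(u, v):
    """JACCARD: 1 minus (bits set in both) / (bits set in either)."""
    intersection = sum(1 for a, b in zip(u, v) if a and b)
    union = sum(1 for a, b in zip(u, v) if a or b)
    return 1.0 - intersection / union if union else 0.0


# Toy data for illustration only.
float_a, float_b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
bits_a = [1, 0, 1, 1, 0, 0, 1, 0]
bits_b = [1, 1, 0, 1, 0, 0, 0, 0]

print("IP:", inner_product(float_a, float_b))
print("HAMMING:", hamming_distance(bits_a, bits_b))
print("JACCARD:", round(jaccard_distance(bits_a, bits_b), 4))
```

L2 and IP operate on float vectors, while Hamming and Jaccard compare binary vectors bit by bit.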

Create a Vector Index in the Milvus CLI

The following demonstrates how to create a vector index on the film collection using the create index command in the Milvus CLI.

milvus_cli > create index

Once you run the command, the Milvus CLI prompts you step by step for the details of the vector index, as shown in the following prompts:

Collection name (articles, images, film, car): film

The name of the field to create an index for (vector): vector

Index name: vector_data_index

Index type (FLAT, IVF_FLAT, IVF_SQ8, IVF_PQ, RNSG, HNSW, ANNOY): IVF_FLAT

Index metric type (L2, IP, HAMMING, TANIMOTO): L2

Index params nlist: 1024

Timeout []:

With the previous prompts, we create a vector index on the vector field of the “film” collection.

We name the index vector_data_index, set its type to IVF_FLAT, and use the L2 metric.

The resulting output is as follows:

+--------------------------+-------------+
| Corresponding Collection | film        |
+--------------------------+-------------+
| Corresponding Field      | vector      |
+--------------------------+-------------+
| Index Type               | IVF_FLAT    |
+--------------------------+-------------+
| Metric Type              | L2          |
+--------------------------+-------------+
| Params                   | nlist: 1024 |
+--------------------------+-------------+
Create index successfully!
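The nlist value in the output above sets the number of clusters that an IVF index partitions the vectors into. The following toy sketch shows the general idea with nlist = 2. It is a simplification written for this article: the centroids are picked by hand rather than trained, and the nprobe argument (the number of clusters to scan per query) and all helper names are assumptions made for illustration.

```python
import math


def build_ivf(vectors, centroids):
    """Assign every vector ID to the bucket of its nearest centroid."""
    buckets = {i: [] for i in range(len(centroids))}
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))
        buckets[nearest].append(vec_id)
    return buckets


def ivf_search(query, vectors, centroids, buckets, nprobe=1, top_k=1):
    """Scan only the nprobe buckets whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    candidates = [vec_id for i in order[:nprobe] for vec_id in buckets[i]]
    candidates.sort(key=lambda vec_id: math.dist(query, vectors[vec_id]))
    return candidates[:top_k]


vectors = {"a": [0.1, 0.2], "b": [0.3, 0.1], "c": [9.8, 9.9], "d": [10.2, 10.1]}
centroids = [[0.0, 0.0], [10.0, 10.0]]  # nlist = 2, hand-picked for the example

buckets = build_ivf(vectors, centroids)
print(buckets)
print(ivf_search([9.9, 10.0], vectors, centroids, buckets))
```

Because only the closest buckets are scanned, the search skips most of the stored vectors, at the cost of possibly missing a neighbor that sits in a bucket that was not scanned.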

Create a Vector Index in Milvus Using Python

To create a vector index on a Milvus collection in Python, we can use the PyMilvus package and the create_index method.

The code is as follows:

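from pymilvus import Collection, connections, utility

# Connect to the Milvus server (default local host and port assumed)
connections.connect(alias="default", host="localhost", port="19530")

# Configure the index parameters
index_params = {
    "metric_type": "L2",
    "index_type": "IVF_FLAT",
    "params": {"nlist": 1024},
}

# Create the index on the "vector" field of the "film" collection
collection = Collection("film")
collection.create_index(
    field_name="vector",
    index_params=index_params,
)

# Check the progress of the index build
utility.index_building_progress("film")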

This should set up a similar vector index on the server.

NOTE: Ensure that another vector index does not already exist on the vector field of the film collection. If one does, Milvus fails to create the new index and returns an error.
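One way to guard against that error is to check for an existing index before creating a new one. The helper below is a sketch, not part of PyMilvus: create_index_if_missing is a hypothetical name, and it assumes that the collection object exposes an indexes list whose items carry a field_name attribute, as the PyMilvus Collection class does in the versions I have seen. Verify this against your installed version.

```python
def create_index_if_missing(collection, field_name, index_params):
    """Create an index on field_name unless one is already attached to it.

    `collection` is expected to behave like a pymilvus Collection: it has an
    `indexes` list whose items expose `field_name`, and a `create_index`
    method that accepts `field_name` and `index_params`.
    Returns True if a new index was requested, False if one already existed.
    """
    for index in collection.indexes:
        if index.field_name == field_name:
            return False
    collection.create_index(field_name=field_name, index_params=index_params)
    return True


# Usage against a live server (requires pymilvus and an open connection):
# from pymilvus import Collection
# created = create_index_if_missing(Collection("film"), "vector", index_params)
```

If the function returns False, you can inspect the existing index and decide whether to keep it or drop it before building a different one.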

Create a Vector Index in Milvus Using cURL

You can also use cURL to create a vector index on a target collection as shown in the following command:

curl -X 'POST' \
  'http://localhost:9091/api/v1/index' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "collection_name": "film",
    "field_name": "vector",
    "extra_params": [
      {"key": "metric_type", "value": "L2"},
      {"key": "index_type", "value": "IVF_FLAT"},
      {"key": "params", "value": "{\"nlist\":1024}"}
    ]
  }'

In this case, we use cURL to send a POST request to the index endpoint and pass the definition of the vector index as JSON data.
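The params value is itself a JSON string nested inside the JSON body, which makes the quoting easy to get wrong by hand. As a sketch, the same request can be assembled with the Python standard library so that the escaping is handled for you. The build_index_request helper is a hypothetical name, and the endpoint and field names simply mirror the cURL command above.

```python
import json
import urllib.request


def build_index_request(base_url, collection_name, field_name,
                        metric_type, index_type, params):
    """Build (but do not send) the POST request for the index endpoint."""
    body = {
        "collection_name": collection_name,
        "field_name": field_name,
        "extra_params": [
            {"key": "metric_type", "value": metric_type},
            {"key": "index_type", "value": index_type},
            # params travels as a JSON string nested inside the JSON body
            {"key": "params", "value": json.dumps(params, separators=(",", ":"))},
        ],
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/index",
        data=json.dumps(body).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


request = build_index_request(
    "http://localhost:9091", "film", "vector", "L2", "IVF_FLAT", {"nlist": 1024}
)
print(request.data.decode("utf-8"))

# Sending is left out on purpose; with a server running it would be:
# urllib.request.urlopen(request)
```

The printed body shows the inner quotes of the params value escaped with backslashes.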

Conclusion

We learned how to use the Milvus CLI, Python, and cURL to create a vector index in a few simple steps.

About the author

John Otieno

My name is John and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to the LinuxHint mailing list.