PyMilvus Load()

Milvus is an open-source vector database built to power similarity search and AI applications. It provides efficient tools and features to store, manage, and search unstructured data as vectors, which makes it a good fit for AI and machine learning applications such as recommendation systems.

Milvus also provides PyMilvus, a Python SDK for interacting with the Milvus database.

One function in this SDK is load(). It loads the data of a collection or partition into memory, which Milvus requires before it can run any vector similarity search on that data.
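Because loaded data occupies memory on the server, it can help to pair every load() with a release(). The following is a minimal sketch of that idea, not part of PyMilvus itself: the helper name loaded is hypothetical, and it assumes the object passed in behaves like a pymilvus Collection, whose load() accepts an optional partition_names argument and which also offers release().

```python
from contextlib import contextmanager


@contextmanager
def loaded(collection, partition_names=None):
    """Keep a collection (or some of its partitions) in memory for the
    duration of a `with` block, then release it.

    `collection` is expected to be a pymilvus.Collection; any object with
    compatible load() and release() methods also works.
    """
    if partition_names:
        # Load only the named partitions instead of the whole collection.
        collection.load(partition_names=list(partition_names))
    else:
        collection.load()
    try:
        yield collection
    finally:
        # Free the memory once searching is finished, even after an error.
        collection.release()
```

With a real collection, a search would then sit inside a block such as "with loaded(collection): ...".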

In this tutorial, we will walk you through the steps of working with the load() function in PyMilvus.

Install Milvus and PyMilvus

Before you start, you need to install the Milvus server and PyMilvus SDK. You can check our tutorial on how to install Milvus on your system.

To install PyMilvus, you can use pip:

pip install pymilvus

Import the Required Libraries

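Once installed, we can import the PyMilvus classes that allow us to interact with Milvus. NumPy, which we use to generate the vector data, is imported later in the insert step.

from pymilvus import connections, DataType, CollectionSchema, FieldSchema, Collection

Connect to the Milvus Server

Next, we need to connect to the server before interacting with its stored data. The Collection class used in this tutorial belongs to the PyMilvus 2.x API, where connections are managed through the connections module rather than a Milvus client object:

connections.connect(alias="default", host='localhost', port='19530')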
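If the server does not run on localhost, the host and port can be read from the environment instead of being hard-coded. This is only a sketch under assumptions: the variable names MILVUS_HOST and MILVUS_PORT and both helper functions are made up for illustration, and the connection call assumes the PyMilvus 2.x connections.connect() API.

```python
import os


def milvus_endpoint(env=None):
    """Build connection arguments, falling back to the local defaults.

    MILVUS_HOST and MILVUS_PORT are hypothetical variable names chosen
    for this example; they are not defined by Milvus itself.
    """
    env = os.environ if env is None else env
    return {
        "alias": "default",
        "host": env.get("MILVUS_HOST", "localhost"),
        "port": env.get("MILVUS_PORT", "19530"),
    }


def connect_to_milvus(env=None):
    """Open the default connection using the settings above."""
    # Imported here so the settings helper works without PyMilvus installed.
    from pymilvus import connections

    connections.connect(**milvus_endpoint(env))
```

Calling connect_to_milvus() with no arguments then behaves like the localhost example unless the two variables are set.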

Define the Collection

Next, we can create a collection to store the vector data. We create a simple collection for demonstration purposes as illustrated in the following code:

dim = 128

default_fields = [
    FieldSchema(name="count", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="random_vector", dtype=DataType.FLOAT_VECTOR, dim=dim),
]

default_schema = CollectionSchema(fields=default_fields, description="simple collection")

This code defines the schema of the collection and the respective fields.

Finally, we can create the collection by instantiating the Collection class:

collection_name = "simple_collection"
collection = Collection(name=collection_name, schema=default_schema)

Insert the Data into the Collection

Next, we can insert some random vectors into the collection:

import numpy as np

# generate random vectors
vectors = np.random.random([1000, dim]).tolist()
collection.insert([vectors])

Then, call flush() to ensure that all the data is persisted to Milvus:

collection.flush()
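For larger datasets, sending everything in a single insert() call may not be practical. The sketch below splits the vectors into batches and flushes once at the end. The helper names and the default batch size of 500 are arbitrary choices for illustration, and the code assumes the single-vector-field schema with an auto-generated primary key that is used in this tutorial.

```python
def batches(rows, batch_size):
    """Yield consecutive slices of `rows`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def insert_in_batches(collection, vectors, batch_size=500):
    """Insert `vectors` batch by batch, then flush once at the end.

    Returns the number of vectors that were sent.
    """
    sent = 0
    for batch in batches(vectors, batch_size):
        # One column only (the vector field); the primary key is auto-generated.
        collection.insert([batch])
        sent += len(batch)
    collection.flush()
    return sent
```

With the data from this tutorial, insert_in_batches(collection, vectors) would send the 1000 vectors in two batches of 500.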

Load the Collection into the Memory

Before searching the vectors in Milvus, we must load the collection into memory:

collection.load()
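In recent Milvus releases (2.2 and later, to our knowledge), load() fails if the vector field has no index. If you hit that error, an index can be created first. The sketch below is one way to do it, with several assumptions: the helper name index_then_load is hypothetical, IVF_FLAT with nlist=128 is an arbitrary choice that matches the L2 metric and the nprobe search parameter used later, and the collection is assumed to expose has_index(), create_index(), and load() as the PyMilvus 2.x Collection class does.

```python
def index_then_load(collection, field_name="random_vector", nlist=128):
    """Create an IVF_FLAT index on the vector field if none exists,
    then load the collection into memory.

    Returns the index parameters that were (or would have been) used.
    """
    index_params = {
        "index_type": "IVF_FLAT",   # assumption: any supported index type works here
        "metric_type": "L2",        # same metric as the search parameters
        "params": {"nlist": nlist}, # number of clusters; 128 is an arbitrary choice
    }
    if not collection.has_index():
        collection.create_index(field_name=field_name, index_params=index_params)
    collection.load()
    return index_params
```

In the tutorial flow, index_then_load(collection) would take the place of the plain collection.load() call.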

Perform the Vector Similarity Search

Finally, we can perform a vector similarity search. Let’s take a random vector and search the top 10 most similar vectors in the collection:

search_params = {"metric_type": "L2", "params": {"nprobe": 10}}
topK = 10
query_vectors = np.random.random([1, dim]).tolist()

results = collection.search(query_vectors, "random_vector", search_params, topK)

# results[0] holds the hits for the first (and only) query vector
for hit in results[0]:
    print(hit.id, hit.distance)
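To get a feel for what the L2 metric ranks by, the same kind of search can be reproduced by brute force on a small list of vectors. This is a plain-Python sketch for illustration only, with hypothetical helper names. It returns list positions rather than Milvus IDs, and Milvus may report distances on a different scale (for example, squared), so compare the ordering rather than the raw values. An index-based search is also approximate, so the two rankings are not guaranteed to match exactly.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_top_k(query, vectors, k):
    """Return (position, distance) pairs for the k vectors closest to `query`,
    nearest first."""
    scored = [(i, l2_distance(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]
```

For example, brute_force_top_k(query_vectors[0], vectors, 10) ranks the 1000 inserted vectors against the query used in the search.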

You now know how to work with the load() function in PyMilvus.

Conclusion

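This tutorial demonstrated the basic usage of the load() function in the PyMilvus library: connecting to the server, creating a collection, inserting data, loading the collection into memory, and searching it.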

About the author

John Otieno

My name is John and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks.