Requirements:
To follow along with this post, ensure that you have the following:
- A Milvus server installed and running on your machine
- Python 3.10 or later
- The PyMilvus package
PyMilvus Collection
The Collection class in PyMilvus provides a constructor that either creates a collection with the given schema or fetches an existing collection by name.
The syntax is as follows:
Collection(name, schema=None, using="default", **kwargs)
The parameters, including those passed as keyword arguments, are as follows:
- name – The name of the collection to create or fetch.
- schema – The schema of the collection to create.
- properties – A dictionary of properties that modify the collection.
- using – The alias of the Milvus connection used to create the collection.
- shards_num – The number of shards for the collection to create.
- num_partitions – The number of logical partitions allocated for the collection.
- kwargs: consistency_level – The consistency level of the collection to create.
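As a rough sketch of how these parameters fit together, the helper below gathers the optional settings into keyword arguments and passes them to the constructor. The helper names (collection_options and open_collection) are hypothetical and not part of PyMilvus. The sketch assumes that PyMilvus is installed and that a connection with the given alias is already open.

```python
def collection_options(shards_num=None, num_partitions=None,
                       consistency_level=None, properties=None):
    """Return only the optional settings that were explicitly provided.

    Hypothetical helper (not part of PyMilvus): leaving a setting out
    lets the Collection constructor fall back to its own default.
    """
    candidates = {
        "shards_num": shards_num,
        "num_partitions": num_partitions,
        "consistency_level": consistency_level,
        "properties": properties,
    }
    return {key: value for key, value in candidates.items() if value is not None}


def open_collection(name, schema=None, using="default", **options):
    """Pass a schema to create the collection, or omit it to fetch one by name.

    Assumes a connection registered under the alias `using` already exists.
    """
    # Imported here so the helper above can be used without PyMilvus installed.
    from pymilvus import Collection

    return Collection(name=name, schema=schema, using=using, **options)


options = collection_options(shards_num=2, consistency_level="Strong")
print(options)
```

With a schema in hand, a call such as open_collection("film", schema, **options) would then create the collection, while open_collection("film") would fetch it by name.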
Return Value
The constructor returns a Collection object: either a new collection created with the specified schema, or the existing collection with the given name.
Example Usage:
In the following example, we use the Collection constructor to create a collection that stores film information. The schema defines the name and data type of each field, including a vector field.
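from pymilvus import Collection, CollectionSchema, DataType, FieldSchema, connections

# Connect to the Milvus server (assumes a local server on the default port)
connections.connect(alias="default", host="localhost", port="19530")

# Primary key field
film_id = FieldSchema(
    name="film_id",
    dtype=DataType.INT64,
    is_primary=True,
)
title = FieldSchema(
    name="title",
    dtype=DataType.VARCHAR,
    max_length=200,  # VARCHAR fields require a maximum length; 200 is an example value
)
release_year = FieldSchema(
    name="release_year",
    dtype=DataType.INT64,
)
# Vector field: a 10-dimensional FLOAT_VECTOR
genres = FieldSchema(
    name="genres",
    dtype=DataType.FLOAT_VECTOR,
    dim=10,
)
schema = CollectionSchema(
    fields=[film_id, title, release_year, genres],
    description="Film information",
)
collection_name = "film"
collection = Collection(
    name=collection_name,
    schema=schema,
    using="default",
    shards_num=2,
    consistency_level="Strong",
)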
After the collection is created, you can inspect its attributes. Each expression below is followed by the value it returns:
collection.schema
{
    auto_id: False,
    description: "Film information",
    fields: [
        {
            name: film_id,
            description: "",
            type: 5,
            is_primary: True,
            auto_id: False,
        },
        {
            name: title,
            description: "",
            type: 21,
        },
        {
            name: release_year,
            description: "",
            type: 5,
        },
        {
            name: genres,
            description: "",
            type: 101,
            params: {"dim": 10},
        },
    ],
}
collection.description
'Film information'
collection.name
'film'
collection.is_empty
True
collection.primary_field
{
    name: film_id,
    description: "",
    type: 5,
    is_primary: True,
    auto_id: False,
}
Add Entities in Milvus
Let us now insert entities into the film collection. The first step is to prepare the data to insert. For this, we use Python's random module to generate random data.
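import random
num_documents = 100  # number of films to generate (example value)
# Sample unique IDs, because film_id is the primary key
film_ids = random.sample(range(1, 1001), num_documents)
documents = []
for film_id in film_ids:
    document = {
        "film_id": film_id,
        "title": "Film " + str(random.randint(1, 100)),
        "release_year": random.randint(1900, 2023),
        # genres is a 10-dimensional vector field, so store 10 floats
        "genres": [random.random() for _ in range(10)],
    }
    documents.append(document)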
Once you have the data ready, you can use the collection.insert() method to add the entities to the collection.
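A minimal sketch of the insert step follows. It assumes that collection is the film collection created earlier and that the data is a list of dictionaries, one per film. The names to_columns and insert_documents are hypothetical helpers. The rows are rearranged into one list per field, in schema order, which is a layout that insert() accepts.

```python
# Field names in the same order as the collection schema
FIELD_ORDER = ["film_id", "title", "release_year", "genres"]


def to_columns(documents, field_order=FIELD_ORDER):
    """Turn a list of row dictionaries into one list of values per field."""
    return [[doc[field] for doc in documents] for field in field_order]


def insert_documents(collection, documents):
    """Insert the documents and return the result reported by Milvus."""
    return collection.insert(to_columns(documents))


# Two hand-written rows; genres holds 10 values per film, matching dim=10
sample = [
    {"film_id": 1, "title": "Film 1", "release_year": 1999, "genres": [0.1] * 10},
    {"film_id": 2, "title": "Film 2", "release_year": 2005, "genres": [0.2] * 10},
]
columns = to_columns(sample)
print(columns[0])  # the film_id column
```

With a live collection, insert_documents(collection, documents) should return a result object whose insert_count attribute reports how many rows were written.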
You can then verify the insert by reading the collection's num_entities attribute.
This should return the total number of entities in the collection.
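One way to run this check is sketched below. It assumes that collection is the film collection used above, and count_entities is a hypothetical helper name. The call to flush() is included because newly inserted rows may not be reflected in num_entities until they have been flushed.

```python
def count_entities(collection):
    """Flush pending inserts, then return the collection's entity count."""
    collection.flush()  # seal pending data so the count includes recent inserts
    return collection.num_entities
```

For example, print(count_entities(collection)) should print the number of films inserted, provided the collection was empty beforehand.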
Conclusion
In this post, we covered how to create a collection in Milvus with PyMilvus and how to insert randomly generated entities into the film collection.