You can think of a Milvus collection as a single organizational unit within the database in which vectors with related characteristics are stored.
When creating a Milvus collection, we need to specify its schema, which defines the structure of the stored data. The schema lists the fields of the collection, including the vector field, together with their data types and any additional attributes.
In this tutorial, we will learn how to use the “Collection” constructor in PyMilvus to create a collection with a specified schema or to fetch an existing collection by name.
Requirements:
To follow along with this post, ensure that you have the following:
- A Milvus server installed and running on your machine
- Python 3.10 or above
- The PyMilvus package (for example, installed with pip install pymilvus)
PyMilvus Collection
“Collection” is a class in PyMilvus. Calling its constructor either creates a collection with the given schema or, if a collection with the provided name already exists, returns an object that refers to it.
The method syntax is as follows:
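As a rough sketch, and assuming a recent PyMilvus 2.x release, the constructor has approximately the shape shown below. The CollectionStub class is a hypothetical stand-in written for illustration (it is not the library's source code), and the exact signature and defaults may differ between versions, so treat it as a guide to how the arguments are passed rather than as a reference.

```python
from typing import Any, Optional


class CollectionStub:
    """Illustrative stand-in that mirrors the assumed shape of pymilvus.Collection."""

    def __init__(
        self,
        name: str,
        schema: Optional[Any] = None,  # a CollectionSchema in real code
        using: str = "default",        # alias of the Milvus connection to use
        **kwargs: Any,  # e.g. shards_num, num_partitions, consistency_level, properties
    ) -> None:
        # The stub only records its arguments; the real class talks to the server.
        self.name = name
        self.schema = schema
        self.using = using
        self.options = kwargs


stub = CollectionStub("film", shards_num=2, consistency_level="Strong")
print(stub.name, stub.using, stub.options)
```

The point of the sketch is that only the name is required; the schema is needed when creating a collection, and the remaining settings travel as keyword arguments.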
The parameters are as follows:
- name – This specifies the name of the collection that you wish to create or get.
- schema – This defines the schema of the collection to create. It is not needed when fetching an existing collection.
- properties – This is a dictionary of collection properties to apply to the collection.
- using – This is the alias of the Milvus connection that is used to create or fetch the collection.
- shards_num – This sets the number of shards of the collection to create.
- num_partitions – This sets the number of logical partitions that are allocated for the collection.
- consistency_level (passed through kwargs) – This sets the consistency level of the collection to create, for example "Strong".
Return Value
The constructor returns a Collection object: either a new collection that is created with the specified schema or the existing collection with the given name.
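The two outcomes described above can be made explicit with a small helper. This is a minimal sketch rather than an official recipe: get_or_create_collection is a hypothetical name, and it assumes that a connection under the given alias has already been opened. It relies on utility.has_collection from PyMilvus to check whether the collection exists.

```python
def get_or_create_collection(name, schema, using="default"):
    """Return the existing collection called `name`, or create it from `schema`."""
    # Imported inside the function so that this module can be loaded
    # even on a machine where PyMilvus is not installed.
    from pymilvus import Collection, utility

    if utility.has_collection(name, using=using):
        # No schema is passed when the collection already exists.
        return Collection(name=name, using=using)
    # Otherwise create the collection with the provided schema.
    return Collection(name=name, schema=schema, using=using)
```

Checking for existence first makes the intent visible in the code and avoids passing a schema that might not match the one already stored on the server.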
Example Usage
In the following example code, we demonstrate how to use the “Collection” constructor to create a collection that stores film information. The schema defines a primary key, two scalar fields, and one vector field, each with its data type.
from pymilvus import Collection, CollectionSchema, DataType, FieldSchema, connections

# Connect to a local Milvus server; the host and port are assumed defaults.
connections.connect(alias="default", host="localhost", port="19530")

film_id = FieldSchema(
    name="film_id",
    dtype=DataType.INT64,
    is_primary=True,
)
title = FieldSchema(
    name="title",
    dtype=DataType.VARCHAR,
    max_length=200,  # VARCHAR fields require a maximum length
)
release_year = FieldSchema(
    name="release_year",
    dtype=DataType.INT64,
)
# PyMilvus has no STRING_VECTOR type, so the genres are stored as a float vector.
genres = FieldSchema(
    name="genres",
    dtype=DataType.FLOAT_VECTOR,
    dim=10,
)
schema = CollectionSchema(
    fields=[film_id, title, release_year, genres],
    description="Film information",
)
collection_name = "film"
collection = Collection(
    name=collection_name,
    schema=schema,
    using="default",
    shards_num=2,
    consistency_level="Strong",
)
Inspecting the attributes of the collection in an interactive session gives output similar to the following:
collection.schema
{
    auto_id: False,
    description: "Film information",
    fields: [
        {
            name: film_id,
            description: "",
            type: 5,
            is_primary: True,
            auto_id: False,
        },
        {
            name: title,
            description: "",
            type: 21,
            params: {"max_length": 200},
        },
        {
            name: release_year,
            description: "",
            type: 5,
        },
        {
            name: genres,
            description: "",
            type: 101,
            params: {"dim": 10},
        },
    ],
}
collection.description
'Film information'
collection.name
'film'
collection.is_empty
True
collection.primary_field
{
    name: film_id,
    description: "",
    type: 5,
    is_primary: True,
    auto_id: False,
}
The previous example demonstrates the basic usage of the “Collection” constructor: it sets up a new collection with the provided schema and then reads back its schema, description, name, and primary field.
The following are some properties to define when creating a new collection:
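As one hedged example, Milvus documents a time-to-live property named collection.ttl.seconds, which asks the server to expire data after the given number of seconds. The sketch below assumes that your PyMilvus version accepts a properties dictionary at creation time, as the parameter list earlier describes. The helper names ttl_properties and create_with_ttl are hypothetical.

```python
def ttl_properties(seconds):
    """Build a properties dict that asks Milvus to expire data after `seconds`."""
    if seconds < 0:
        raise ValueError("TTL must be non-negative")
    return {"collection.ttl.seconds": seconds}


def create_with_ttl(name, schema, seconds, using="default"):
    """Create a collection whose data expires after `seconds` (assumed behavior)."""
    # Imported inside the function so that ttl_properties can be used
    # even where PyMilvus is not installed.
    from pymilvus import Collection

    return Collection(
        name=name,
        schema=schema,
        using=using,
        properties=ttl_properties(seconds),
    )
```

For instance, create_with_ttl("film", schema, 1800) would request a collection whose entities expire after 30 minutes, provided that the server honors the property.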
Conclusion
We explored how to use the “Collection” constructor to create a new collection or to get an existing collection by name. Feel free to explore the PyMilvus documentation for more information.