A Milvus field schema defines the structure and properties of a specific field within a collection.
A field represents a particular attribute of the entities stored in a collection. For example, in an image search application, fields could hold attributes such as color, shape, or texture alongside the feature vector of each image.
Each field has its own field schema, which specifies how the data in that field is typed and stored.
Anatomy of a Milvus Field Schema
The field schema in Milvus is composed of the following properties:
Field name – A unique identifier for the field within a collection.
Data type – The type of data stored in the field, such as an integer, a float, or a binary vector.
Dimension – The dimensionality of a vector field. For example, if the field stores an image feature vector, the dimension is the number of elements in that vector.
Index type – The type of index built on the field. Milvus supports various index types optimized for different scenarios, such as FLAT, IVF (Inverted File), and HNSW (Hierarchical Navigable Small World). In PyMilvus, the index is not passed to FieldSchema; it is created on the field afterwards with the collection's create_index method.
Description – An optional string that describes the field.
is_primary – Specifies whether the field is the primary key field.
Extra parameters – Depending on the data type and the chosen index type, you can specify additional parameters, such as max_length for a VARCHAR field or nlist for an IVF index. Index parameters let you tune the index for performance and accuracy.
Supported Data Types:
The following are the supported data types for the field schema:
Primary Key Field Data Types
- INT64: numpy.int64
- VARCHAR: variable-length string (requires a max_length)
Scalar Field Data Types
- BOOL: Boolean (true or false)
- INT8: numpy.int8
- INT16: numpy.int16
- INT32: numpy.int32
- INT64: numpy.int64
- FLOAT: numpy.float32
- DOUBLE: numpy.double
- VARCHAR: variable-length string (requires a max_length)
Vector Field Data Types
- BINARY_VECTOR: vector of bits, stored as bytes
- FLOAT_VECTOR: vector of 32-bit floats
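To make the BINARY_VECTOR type concrete: a binary vector of dimension d is passed to Milvus as d / 8 bytes, so d must be a multiple of 8. The following standard-library sketch packs a list of bits into bytes. The helper name pack_bits is hypothetical, and the most-significant-bit-first order is an assumption that mirrors the default of numpy.packbits.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into bytes, most significant bit first."""
    if len(bits) % 8 != 0:
        raise ValueError("binary vector dimension must be a multiple of 8")
    packed = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            if bit not in (0, 1):
                raise ValueError("bits must be 0 or 1")
            # Shift the accumulated bits left and append the next one
            byte = (byte << 1) | bit
        packed.append(byte)
    return bytes(packed)


dim = 2048
vector = pack_bits([1, 0] * (dim // 2))
print(len(vector))  # 256 bytes for a 2048-dimensional binary vector
```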
Let us cover the basics of configuring a field schema in Milvus for an image search application using Python.
Requirements:
To follow this tutorial, ensure that you have the following:
- Python 3 installed
- The Milvus Python client (PyMilvus)
- A running Milvus database server
With these requirements met, we can proceed to the next step.
Create a Field Schema in Milvus
The following example demonstrates how we can define a field schema using the PyMilvus client:
from pymilvus import DataType, FieldSchema

# Primary key field
id_field = FieldSchema(name="id", dtype=DataType.INT64,
                       is_primary=True, description="primary key")
# Binary vector field; the index is built later with Collection.create_index()
image_field = FieldSchema(name="image", dtype=DataType.BINARY_VECTOR,
                          dim=2048)
The given code defines two field schemas using the “FieldSchema” class from the PyMilvus library.
The first field schema is the “id” field. This field represents a primary key field with the following properties:
- Name – id
- Data type – INT64 (64-bit integer)
- is_primary – True (indicates that the id is used as the primary key field)
- Description – primary key (an optional description for the field)
The second field schema is the “image” field, which stores the image data as a vector. It has the following properties:
- Name – image
- Data type – BINARY_VECTOR (binary data representing a vector)
- Dimension – 2048 (the dimensionality of the image feature vector, passed as dim)
The index type and index parameters are not arguments of FieldSchema. They are supplied when the index is built on the field with the collection's create_index method, for example {“nlist”: 2048} to set the number of clusters (lists) used by an IVF index. Note that IVF_SQ8 (Inverted File with 8-bit scalar quantization) applies to float vectors; a binary vector field uses a binary index such as BIN_FLAT or BIN_IVF_FLAT.
We define the image field as a binary vector field, which suits image features encoded as binary codes such as hashes. Real-valued feature vectors would use FLOAT_VECTOR instead.
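Index settings such as nlist take effect when the index is built on the field. The following is a minimal sketch, assuming that `collection` is a connected PyMilvus Collection containing the image field. The helper name build_image_index is hypothetical, and BIN_IVF_FLAT with the HAMMING metric is swapped in for IVF_SQ8 because the field is a binary vector.

```python
def build_image_index(collection, field_name="image", nlist=2048):
    """Build an IVF index on a binary vector field.

    `collection` is expected to be a connected pymilvus Collection
    (assumption: it already contains the field named `field_name`).
    """
    index_params = {
        "index_type": "BIN_IVF_FLAT",  # IVF variant for binary vectors
        "metric_type": "HAMMING",      # distance metric for binary vectors
        "params": {"nlist": nlist},    # number of clusters (lists)
    }
    collection.create_index(field_name=field_name, index_params=index_params)
    return index_params
```

With a running server, this would typically be called after connecting with connections.connect and opening the collection by name, for example build_image_index(Collection("images")), where the collection name is a placeholder.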
Conclusion
In this tutorial, you learned how to use the PyMilvus package to define field schemas in Milvus.