Create a Field Schema in Milvus

Milvus is an open-source vector database designed to store and query large-scale vector data, which is often used for tasks such as similarity search and recommendation systems.

A Milvus field schema defines the structure and properties of a specific field within a collection.

A field represents a particular attribute of the entities that are stored in a collection. For example, in an image search application, the fields could hold attributes like color, shape, or texture alongside the image feature vector itself.

Each field has its own field schema which specifies how the data in that field should be treated and indexed.

Anatomy of a Milvus Field Schema

The field schema in Milvus is composed of the following properties:

Field name – A unique identifier for the field within a collection.

Data type – It defines the type of data that is stored in the field, such as an integer, a float, or a binary vector.

Dimension – It sets the number of elements in each vector and is required for vector fields. For example, a field that stores 2048-element image feature vectors has a dimension of 2048. In PyMilvus, this value is passed as “dim”.

Index type – This property defines the type of index to be used for a vector field. Milvus supports various index types that are optimized for different scenarios, such as FLAT, IVF (Inverted File), or HNSW (Hierarchical Navigable Small World). In PyMilvus 2.x, the index is not set in the field schema itself; it is built on the field after the collection is created.

Description – An optional string that describes the field.

is_primary – It specifies whether the field is the primary key field of the collection.

Extra parameters – Depending on the chosen index type, you can specify additional parameters such as “nlist” for the IVF indexes. These parameters are passed together with the index type when the index is built, and they allow you to fine-tune the index behavior for optimal performance and accuracy.

Supported Data Types

The following are the supported data types for the field schema:

Primary Key Field Data Types

  • INT64: numpy.int64
  • VARCHAR: VARCHAR

Scalar Field Data Types

  • BOOL: Boolean (true or false)
  • INT8: numpy.int8
  • INT16: numpy.int16
  • INT32: numpy.int32
  • INT64: numpy.int64
  • FLOAT: numpy.float32
  • DOUBLE: numpy.double
  • VARCHAR: VARCHAR

Vector Field Data Types

  • BINARY_VECTOR: Binary vector
  • FLOAT_VECTOR: Float vector

Let us cover the basics of configuring a field schema in Milvus for an image search application using Python.

Requirements:

To use this tutorial, ensure that you have the following:

  1. Python 3 installed
  2. The Milvus Python client (PyMilvus)
  3. A running Milvus database server

With these requirements met, we can proceed to the next step.
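One way to confirm the second requirement is to ask Python which version of the client is installed. The following sketch uses only the standard library; the helper name “installed_version” is made up for this example, and it does not check whether a Milvus server is reachable.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("pymilvus")
if version is None:
    print("PyMilvus is not installed; install it with: pip install pymilvus")
else:
    print(f"PyMilvus {version} is installed")
```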

Create a Field Schema in Milvus

The following example demonstrates how we can define a field schema using the PyMilvus client:

from pymilvus import DataType, FieldSchema

# Primary key field: a 64-bit integer that uniquely identifies each entity.
id_field = FieldSchema(name="id", dtype=DataType.INT64,
                       is_primary=True, description="primary key")

# Vector field: a 2048-dimensional binary vector that holds the image features.
image_field = FieldSchema(name="image", dtype=DataType.BINARY_VECTOR, dim=2048)

The given code defines two field schemas using the “FieldSchema” class from the PyMilvus library.

The first field schema is the “id” field. This field represents a primary key field with the following properties:

  1. Name – id
  2. Data type – INT64 (64-bit integer)
  3. is_primary – True (indicates that the id is used as the primary key field)
  4. Description – primary key (an optional description for the field)

The second field schema is the “image” field which stores the image feature vector. It has the following properties:

  1. Name – image
  2. Data type – BINARY_VECTOR (binary data representing a vector)
  3. Dimension – 2048 (the dimensionality of the image feature vector, passed as “dim”)

The index type and the index parameters are not arguments of “FieldSchema”. In PyMilvus 2.x, the index is built on the vector field after the collection is created, and settings such as {“nlist”: 2048} (the number of clusters or lists used by an IVF index) are supplied at that point. Note that IVF_SQ8 (Inverted File with 8-bit scalar quantization) applies to float vector fields; a binary vector field uses a binary index type such as BIN_FLAT or BIN_IVF_FLAT.

We define the image field as a binary vector field, which is suitable for image features that are encoded as binary vectors.

Conclusion

You learned how to use the PyMilvus package to define a primary key field schema and a vector field schema in Milvus.

About the author

John Otieno
