Pinecone Insert Metadata Into an Index

Index metadata in Pinecone refers to the additional information or attributes associated with the vectors that are stored in an index. Pinecone allows us to attach key-value pairs to vectors as metadata. This enables us to provide contextual information or descriptive attributes for each vector.

One use of vector metadata is vector filtering. For example, you can specify a filter expression when querying a Pinecone index so that the query considers only the vectors whose metadata matches the filter.

Such filter expressions allow us to limit vector searches based on specific metadata criteria. By including a filter expression in a query, we retrieve only the nearest-neighbor results that match the specified metadata filters.

This capability enables more precise and targeted searches, as we can leverage the metadata to narrow the search space and retrieve only the results that meet specific criteria.
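To make the idea concrete, here is a minimal in-memory sketch of a filtered nearest-neighbor search. It is only an illustration of the concept and not how Pinecone works internally; the `BOOKS` data and the `filtered_nearest` function are hypothetical names invented for this example.

```python
import math

# Toy in-memory "index": id -> (vector, metadata). Hypothetical sample data.
BOOKS = {
    "A": ([0.1, 0.1], {"genre": "comedy", "year": 2020}),
    "B": ([0.2, 0.2], {"genre": "mystery", "year": 2019}),
    "C": ([0.3, 0.3], {"genre": "comedy", "year": 2019}),
}


def filtered_nearest(query, required, top_k=1):
    """Return the ids of the top_k closest vectors whose metadata
    contains every key-value pair in `required`."""
    candidates = [
        (math.dist(query, vector), vec_id)
        for vec_id, (vector, metadata) in BOOKS.items()
        if all(metadata.get(key) == value for key, value in required.items())
    ]
    # Sort by Euclidean distance, closest first, and keep the ids only.
    return [vec_id for _, vec_id in sorted(candidates)[:top_k]]


print(filtered_nearest([0.12, 0.12], {}))              # no filter
print(filtered_nearest([0.12, 0.12], {"year": 2019}))  # only books from 2019
```

Without a filter, the closest vector overall is returned. With the filter, the vectors that fail the metadata check are never considered, so a different vector becomes the nearest match.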

This tutorial teaches us how to insert vectors with metadata into a given index. We will also learn how to use that metadata to filter queries for more granular searches.

Requirements:

To follow along with this tutorial, ensure that you have the following:

  1. Python 3.10 or above installed
  2. Basic Python programming knowledge

Installing the Pinecone Client

Before interacting with the Pinecone server using Python, we need to install the Pinecone client on our machine. Luckily, we can do this with a simple “pip” command as follows:

$ pip3 install pinecone-client

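The previous command should download the latest stable version of the Pinecone client and install it in your Python environment. Note that the examples in this tutorial use the module-level pinecone.init() interface of the 2.x client. Version 3.0 and later replaced it with a Pinecone class, so you may need to pin an older release (for example, pip3 install "pinecone-client<3") to run the code exactly as written.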
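If you want to confirm which version ended up installed, a small check with the standard library can help. This is a minimal sketch; `installed_version` is a hypothetical helper name, and it assumes the package was installed under the name pinecone-client as in the command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string, or None if the package is not installed.
print(installed_version("pinecone-client"))
```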

Creating a Sample Index

The first step is to set up a basic index which we will use for demonstration purposes. In this case, we create a basic index that stores the book information.

import pinecone

# Initialize the Pinecone configuration
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")

# Create a basic index
pinecone.create_index("book", dimension=8, metric="euclidean", pod_type="p1", pods=1, replicas=1)

# Connect to the index
index = pinecone.Index("book")

The previous code initializes the Pinecone client, creates an index named book with a dimension of 8 and the Euclidean distance metric, and then connects to that index.
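Running the creation code a second time fails if the index already exists, so it can be handy to create the index only when it is missing. The following is a minimal sketch, assuming the same 2.x module-level interface used in this tutorial (list_indexes, create_index, and Index); `ensure_index` is a hypothetical helper name, and the client is passed in as a parameter so that the function does not depend on a live connection.

```python
def ensure_index(client, name, dimension, metric="euclidean"):
    """Create the index `name` only if it is missing, then return a handle to it.

    `client` is assumed to be the `pinecone` module after `pinecone.init(...)`
    has been called, or any object that offers the same three functions.
    """
    if name not in client.list_indexes():
        client.create_index(name, dimension=dimension, metric=metric)
    return client.Index(name)
```

With the client initialized as shown earlier, the call would look like index = ensure_index(pinecone, "book", dimension=8).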

Inserting Vectors with Metadata

Once we have an index created, we can use the upsert operation to insert the vectors with metadata as shown in the following example code:

index.upsert([
    ("A", [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1], {"genre": "comedy", "year": 2020, "title": "Book A"}),
    ("B", [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2], {"genre": "mystery", "year": 2019, "title": "Book B"}),
    ("C", [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3], {"genre": "comedy", "year": 2019, "title": "Book C"}),
    ("D", [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4], {"genre": "drama", "title": "Book D"}),
    ("E", [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5], {"genre": "romance", "title": "Book E"})
])

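The previous code inserts five vectors that represent the book information. Each vector is passed as a tuple of an ID, the vector values, and a metadata dictionary with fields such as the genre, year, and title. Books D and E omit the year field, which shows that vectors in the same index do not need to share the same metadata keys.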
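In practice, the book data often starts out as a list of dictionaries rather than hand-written tuples. The following minimal sketch converts such records into the (id, values, metadata) tuples used above and checks each vector's length against the index dimension first. The `to_upsert_tuples` helper and the record layout (with "id" and "values" keys) are assumptions made for this example.

```python
DIMENSION = 8  # must match the dimension used when the index was created


def to_upsert_tuples(books, dimension=DIMENSION):
    """Convert book dictionaries into (id, values, metadata) tuples."""
    records = []
    for book in books:
        values = book["values"]
        if len(values) != dimension:
            raise ValueError(
                f"vector {book['id']!r} has {len(values)} values, expected {dimension}"
            )
        # Everything except the id and the vector is treated as metadata.
        metadata = {k: v for k, v in book.items() if k not in ("id", "values")}
        records.append((book["id"], values, metadata))
    return records


books = [
    {"id": "A", "values": [0.1] * 8, "genre": "comedy", "year": 2020, "title": "Book A"},
    {"id": "D", "values": [0.4] * 8, "genre": "drama", "title": "Book D"},
]
records = to_upsert_tuples(books)
print(records[0])
# index.upsert(records)  # `index` is the handle created earlier in the tutorial
```

Validating the dimension locally gives a clearer error message than waiting for the server to reject a vector of the wrong length.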

Filtering with Metadata

Once we insert the vectors with metadata, we can use this information to perform granular filtering as shown in the following example code:

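index.query(
    vector=[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
    filter={
        "genre": {"$eq": "comedy"},
        "year": 2020
    },
    top_k=1,                # return only the single closest match
    include_metadata=True   # include each match's metadata in the response
)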

The previous code should query the index and return only the vectors where the genre is equal to comedy and the year is equal to 2020.
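The query returns a response that contains a list of matches, each with an ID, a score, and (if requested) its metadata. The following minimal sketch shows one way to pull out the interesting fields. It assumes that the response can be read like a dictionary with a "matches" key; the `summarize_matches` helper is hypothetical, and the sample response is hand-written for illustration rather than real output from Pinecone.

```python
def summarize_matches(response):
    """Return an (id, score, title) tuple for each match in a query response."""
    summary = []
    for match in response["matches"]:
        # Metadata is only present when the query asked for it.
        metadata = match.get("metadata") or {}
        summary.append((match["id"], match["score"], metadata.get("title")))
    return summary


# Hand-written stand-in for a response; the score is illustrative only.
sample_response = {
    "matches": [
        {
            "id": "A",
            "score": 0.0,
            "metadata": {"genre": "comedy", "year": 2020, "title": "Book A"},
        },
    ]
}
print(summarize_matches(sample_response))
```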

You can combine the metadata filters using the $and and $or operators. The supported comparison operators are as follows:

  • $eq – Equal to (number, string, boolean)
  • $ne – Not equal to (number, string, boolean)
  • $gt – Greater than (number)
  • $gte – Greater than or equal to (number)
  • $lt – Less than (number)
  • $lte – Less than or equal to (number)
  • $in – In array (string or number)
  • $nin – Not in the array (string or number)
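To get a feel for how these operators behave, the following minimal sketch evaluates a filter expression against a plain metadata dictionary in Python. It is a local approximation for learning purposes and not Pinecone's implementation. In particular, the `matches` function is a hypothetical helper, and treating a missing field as "never matches" is an assumption that may differ from the server's behavior for operators such as $ne and $nin.

```python
import operator

# Comparison operators from the list above, mapped to Python equivalents.
COMPARATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}


def matches(metadata, flt):
    """Return True if `metadata` satisfies the filter expression `flt`."""
    for key, condition in flt.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        else:
            if key not in metadata:
                return False  # assumption: a missing field never matches
            if not isinstance(condition, dict):
                condition = {"$eq": condition}  # shorthand such as {"year": 2020}
            for op, operand in condition.items():
                if not COMPARATORS[op](metadata[key], operand):
                    return False
    return True


BOOKS = {
    "A": {"genre": "comedy", "year": 2020},
    "B": {"genre": "mystery", "year": 2019},
    "C": {"genre": "comedy", "year": 2019},
    "D": {"genre": "drama"},
    "E": {"genre": "romance"},
}


def select(flt):
    """Return the ids of the sample books whose metadata passes the filter."""
    return [book_id for book_id, meta in BOOKS.items() if matches(meta, flt)]


print(select({"genre": {"$eq": "comedy"}, "year": 2020}))
print(select({"$or": [{"genre": "drama"}, {"year": {"$lt": 2020}}]}))
print(select({"genre": {"$in": ["comedy", "romance"]}}))
```

The first filter is the same one used in the query example. The second and third show how $or and $in widen the set of vectors that pass the filter.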

There you have it!

Conclusion

You learned about the concept of vector metadata in Pinecone, how to insert the vectors with metadata, and how to use the vector metadata to filter the vectors that match the specified criteria.

About the author

John Otieno

My name is John, and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks.