It works by organizing the embeddings to speed up nearest-neighbor queries, which allows you to find the most similar vectors quickly. As a result, a Pinecone index can handle large-scale vector datasets with fast insert, update, and delete operations while keeping searches efficient.
What Is an Update in Pinecone?
In Pinecone, an update is an operation that allows us to modify the vector data that is already stored in a given index. Unlike an upsert operation, which inserts a vector if its ID is new and overwrites it otherwise, an update operation only changes the existing vector data.
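To make the difference concrete, here is a minimal sketch that models the two operations with a plain Python dictionary. It is a toy illustration only, not Pinecone's implementation: the names store, upsert, and update are hypothetical, and returning False for an unknown ID is a choice made for this sketch rather than documented service behavior.

```python
# Toy in-memory model of the two operations; NOT Pinecone's implementation.
store = {}  # maps vector ID -> list of floats


def upsert(vector_id, values):
    """Insert the vector, or overwrite it if the ID already exists."""
    store[vector_id] = list(values)


def update(vector_id, values):
    """Overwrite an existing vector; leave the store unchanged otherwise.

    Returning False for an unknown ID is a choice made for this sketch.
    """
    if vector_id not in store:
        return False
    store[vector_id] = list(values)
    return True


upsert("a", [0.1, 0.2])            # "a" does not exist yet, so it is inserted
upsert("a", [0.3, 0.4])            # "a" exists now, so it is overwritten
changed = update("a", [0.5, 0.6])  # existing ID: values are replaced
ignored = update("b", [0.7, 0.8])  # unknown ID: nothing is added
```

After these calls, the store holds only the vector "a" with its latest values, because the update for "b" never created a new entry.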
In this tutorial, we will learn how to carry out an update operation in Pinecone using the Pinecone client for Python.
Requirements:
To follow along with this tutorial, ensure that you have the following:
- Python 3.10 or later installed
- Basic Python programming knowledge
Installing the Pinecone Client
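Before interacting with the Pinecone server using Python, we need to install the Pinecone client on our machine. Luckily, we can do this with a simple “pip” command as follows:
pip install pinecone-client
The previous command should download the Pinecone client and install it in your project. Note that the examples in this tutorial use the pinecone.init() interface of the 2.x client; releases from 3.0 onward replaced it with a Pinecone class.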
Creating an Index and Inserting Vectors
Once we install the Pinecone client, we can create an index to store the vector data.
We can do this using the create_index() method as shown in the following example code:
import pinecone
import numpy as np
# init pinecone configuration
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")
# create a basic index; the dimension must match the length of the vectors stored in it
pinecone.create_index("sample", dimension=8, metric="euclidean", pod_type="p1", pods=1, replicas=1)
# connect to the index
index = pinecone.Index("sample")
# create 15 random 8-dimensional vectors
vectors_a = np.random.rand(15, 8).tolist()
# create the matching string IDs "0" through "14"
ids_a = [str(i) for i in range(15)]
# insert the (id, vector) pairs into a single namespace
index.upsert(vectors=list(zip(ids_a, vectors_a)), namespace='linuxhint_namespace_a')
The previous code starts by initializing Pinecone. It then creates a basic index named “sample” with specified parameters and establishes a connection to the index.
Next, we generate a set of 15 eight-dimensional vectors and the corresponding IDs.
Finally, it inserts the vectors into the “linuxhint_namespace_a” namespace of the “sample” index using the upsert functionality.
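Since every vector stored in an index must have the same length as the index dimension, it can help to check the data before sending it. The following is a minimal sketch of that idea using only the standard library instead of NumPy. The helper names make_records and check_dimension are hypothetical and are not part of the Pinecone client.

```python
import random


def make_records(count, dimension):
    """Return a list of (id, vector) tuples with random float values."""
    ids = [str(i) for i in range(count)]
    vectors = [[random.random() for _ in range(dimension)] for _ in range(count)]
    return list(zip(ids, vectors))


def check_dimension(records, dimension):
    """Raise ValueError if any vector's length differs from the index dimension."""
    for vector_id, values in records:
        if len(values) != dimension:
            raise ValueError(
                f"vector {vector_id!r} has {len(values)} values, expected {dimension}"
            )


records = make_records(count=15, dimension=8)
check_dimension(records, dimension=8)  # passes silently for matching sizes
```

The resulting records list has the same (id, vector) shape that the upsert call in the tutorial receives, so a check like this can run just before the upsert.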
Pinecone Index.update()
This method updates a vector in a namespace. If a value is included, it overwrites the previous value.
The function syntax is as follows:
index.update(id, values, set_metadata, namespace)
The function accepts the parameters as follows:
- id – The unique ID of the vector to update.
- values – The new vector data.
- set_metadata – The metadata to set for the vector.
- namespace – The namespace that contains the vector.
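Because only the ID is needed to identify a vector, an update can change the values, the metadata, or both. The sketch below shows one way to assemble just the arguments you want to send. The helper build_update_kwargs is hypothetical and exists only for illustration; it does not contact Pinecone.

```python
def build_update_kwargs(vector_id, values=None, set_metadata=None, namespace=None):
    """Collect only the arguments that were actually supplied."""
    kwargs = {"id": vector_id}
    if values is not None:
        kwargs["values"] = list(values)
    if set_metadata is not None:
        kwargs["set_metadata"] = dict(set_metadata)
    if namespace is not None:
        kwargs["namespace"] = namespace
    return kwargs


# Metadata-only update: the stored values are not mentioned at all.
meta_only = build_update_kwargs(
    "0", set_metadata={"info": "numpy random"}, namespace="linuxhint_namespace_a"
)
# With a connected index this would be sent as: index.update(**meta_only)
```

Leaving a field out of the dictionary keeps the call focused on the part of the vector you intend to change.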
Function Usage
The following example demonstrates how to update a vector in the sample index that we created earlier:
# update() targets one vector per call, identified by its ID
index.update(
    id="0",
    values=np.random.rand(8).tolist(),
    set_metadata={'info': 'numpy random'},
    namespace='linuxhint_namespace_a'
)
The previous code replaces the values of the vector with the ID “0” with new values that are generated by the NumPy random module. We also add the metadata to the vector.
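The update method addresses one vector per call, so changing several vectors means calling it in a loop. Here is a minimal sketch of such a loop. The helper update_many and the FakeIndex class are hypothetical: FakeIndex only records calls so that the sketch runs without a Pinecone account, and a real index object would take its place.

```python
class FakeIndex:
    """Stand-in for a Pinecone index that only records update() calls."""

    def __init__(self):
        self.calls = []

    def update(self, id, values=None, set_metadata=None, namespace=None):
        self.calls.append((id, values, set_metadata, namespace))


def update_many(index, ids, vectors, set_metadata=None, namespace=None):
    """Call index.update() once per (id, vector) pair and return the count."""
    count = 0
    for vector_id, values in zip(ids, vectors):
        index.update(id=vector_id, values=values,
                     set_metadata=set_metadata, namespace=namespace)
        count += 1
    return count


fake = FakeIndex()
done = update_many(
    fake,
    ids=["0", "1", "2"],
    vectors=[[0.0] * 8, [0.5] * 8, [1.0] * 8],
    set_metadata={"info": "numpy random"},
    namespace="linuxhint_namespace_a",
)
```

Passing the index in as a parameter keeps the loop easy to test, since any object with a compatible update() method can be used.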
Conclusion
In this tutorial, you learned how to use the update operation in Pinecone to overwrite an existing vector with new values or to set its metadata.