It works by organizing the embeddings to optimize nearest-neighbor queries, which lets you find the most similar vectors quickly. As a result, a Pinecone index can handle large-scale vector datasets with fast insert, update, and delete operations while maintaining high search efficiency.
What Is Upsert in Pinecone?
In Pinecone, an upsert is an operation that updates or inserts vector embeddings in an existing index. It combines updating existing vectors and inserting new ones in a single operation.
When you invoke an upsert, Pinecone checks whether a vector with the specified ID already exists in the index. If it does, Pinecone overwrites it with the new vector values. If it does not, Pinecone creates a new vector in the index.
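The update-or-insert decision described above can be sketched with a plain JavaScript Map. This is only an in-memory illustration of the semantics, not how Pinecone stores vectors, and the upsert function and store variable are hypothetical names introduced for this example.

```javascript
// Illustrative in-memory model of upsert semantics; not Pinecone's implementation.
const store = new Map(); // vector ID -> vector values

// Hypothetical helper: inserts the vector if the ID is new, overwrites it otherwise.
function upsert(id, values) {
  const existed = store.has(id);
  store.set(id, values); // Map.set adds a new key or replaces the existing value
  return existed ? "updated" : "inserted";
}

console.log(upsert("vec1", [0.1, 0.2])); // inserted
console.log(upsert("vec1", [0.9, 0.8])); // updated
console.log(store.get("vec1"));          // [ 0.9, 0.8 ]
```

The second call reuses the same ID, so the stored values are replaced rather than duplicated.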
Upsert is useful when you have a dynamic dataset and need to continuously update or add vectors without rebuilding the entire index.
It makes real-time updates to the index possible and keeps the index synchronized with changes in the underlying data. This is particularly valuable in applications such as recommendation systems, where user preferences or item features change over time.
In this tutorial, we will learn how to perform an upsert operation in Pinecone using the Pinecone client for Node.js.
Requirements:
To follow along with this post, ensure that you have the following:
- Node.js 17 or later installed
- A configured Pinecone cluster
Installing the Pinecone Client
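The first step is to make sure that the Pinecone client for Node.js is installed on the machine. Note that the PineconeClient class used in this tutorial is exported by the 0.x releases of the client package; newer major versions expose a different client API. We can install the package by running the following command:
npm install @pinecone-database/pinecone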
Once installed, we can proceed and learn how to configure Pinecone with Node.js.
Initializing the Client
Before interacting with the Pinecone database, we must create a client and initialize it with the API key and the environment of our project.
The following code shows how to use the “PineconeClient” class and its init() method:
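import { PineconeClient } from "@pinecone-database/pinecone";

// Create the client, then initialize it with your own environment and API key.
const pinecone = new PineconeClient();
await pinecone.init({
  environment: "us-west1-gcp-free",
  apiKey: "0f57b6af-ea59-4fd3-a0ce-3c7f0c1d419f"
});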
In this case, we initialize a new Pinecone client using the provided environment and API Key.
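Hard-coding an API key in source files is easy to leak, so one option is to read the same two settings from environment variables. The following is a minimal sketch under that assumption. The buildPineconeConfig helper and the variable names PINECONE_API_KEY and PINECONE_ENVIRONMENT are hypothetical and are not defined by the Pinecone client.

```javascript
// Hypothetical helper: builds the options object for init() from environment variables.
// The names PINECONE_API_KEY and PINECONE_ENVIRONMENT are assumptions for this sketch.
function buildPineconeConfig(env) {
  const apiKey = env.PINECONE_API_KEY;
  const environment = env.PINECONE_ENVIRONMENT;
  if (!apiKey || !environment) {
    throw new Error("Set PINECONE_API_KEY and PINECONE_ENVIRONMENT first");
  }
  return { environment, apiKey };
}

// Example with a stand-in object; in a real script, pass process.env instead.
const config = buildPineconeConfig({
  PINECONE_API_KEY: "your-api-key",
  PINECONE_ENVIRONMENT: "us-west1-gcp-free",
});
console.log(config.environment); // us-west1-gcp-free
```

The returned object has the same two properties that the init() call in this section receives, so it could be passed to init() in place of the literal values.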
Create an Index in Pinecone Using Node.js
Once connected to the server, we can proceed and create an index to store the target data. The Node.js client provides the createIndex() method, which lets us quickly configure a new index as shown in the following example code:
await pinecone.createIndex({
  createRequest: {
    name: "sample-index",
    dimension: 8,
    metric: "cosine"
  }
});
In the given example, we use the createIndex() method to create an index called “sample-index” with a dimension of 8 and a cosine distance metric.
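To make the cosine metric more concrete, here is a small sketch of how the cosine similarity between two 8-dimensional vectors can be computed by hand. The cosineSimilarity function is a hypothetical helper written for this example. Pinecone computes the metric on the server, so this code is not needed to use the index.

```javascript
// Hypothetical helper: cosine similarity = dot(a, b) / (|a| * |b|).
// Returns NaN if either vector contains only zeros.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const a = [1, 0, 0, 0, 0, 0, 0, 0];
const b = [0, 1, 0, 0, 0, 0, 0, 0];
console.log(cosineSimilarity(a, a)); // 1 (same direction)
console.log(cosineSimilarity(a, b)); // 0 (orthogonal)
```

The dimension check mirrors the index setting: every vector compared in an index with a dimension of 8 must contain exactly eight values.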
Pinecone Node.js Upsert
In the Pinecone Node.js client, we have access to the index.upsert() method, which writes vectors into a namespace. As mentioned, an upsert combines the insert and update operations in a single request. Hence, if you upsert a new value for an existing vector ID, the operation overwrites the previous value.
The method accepts the following parameters:
- requestParameters – The wrapper object for the upsert operation.
- upsertRequest – The property of the wrapper that holds the actual upsert request.
The upsert request is composed of the following values:
- vectors – An array that contains the vectors that you wish to upsert. Each vector has the following fields:
  - id – The unique ID of the vector that you wish to upsert.
  - values – The vector values.
  - metadata – An object that defines the metadata of the vector.
- namespace – The namespace into which you wish to insert the data. If the provided namespace does not exist, Pinecone creates one automatically.
The method returns a response object whose upsertedCount field denotes the number of records that were upserted into the index.
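Because every vector must match the dimension of the target index, it can help to check the vectors locally before sending them. The following sketch shows one way to do that. The validateVectors helper and its error messages are hypothetical and are not part of the Pinecone client.

```javascript
// Hypothetical pre-flight check; not part of the Pinecone client.
// Returns a list of human-readable problems (empty when everything looks fine).
function validateVectors(vectors, dimension) {
  const errors = [];
  vectors.forEach((vector, position) => {
    if (typeof vector.id !== "string" || vector.id.length === 0) {
      errors.push(`vector at position ${position}: id must be a non-empty string`);
    }
    if (!Array.isArray(vector.values) || vector.values.length !== dimension) {
      errors.push(`vector at position ${position}: expected ${dimension} values`);
    }
  });
  return errors;
}

const problems = validateVectors(
  [
    { id: "vec1", values: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8] },
    { id: "vec2", values: [0.1, 0.2, 0.3] }, // too short for an 8-dimensional index
  ],
  8
);
console.log(problems); // one error, reported for the vector at position 1
```

Running such a check first means that a malformed vector is reported locally instead of causing the upsert request to be rejected.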
The following code uses the upsert() method to add data to a given Pinecone index:
// Connect to the index created earlier.
const index = pinecone.Index("sample-index");

// Upsert two 8-dimensional vectors, each with its own metadata.
const upsertResponse = await index.upsert({
  upsertRequest: {
    vectors: [
      {
        id: "vec1",
        values: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
        metadata: {
          active: true,
        },
      },
      {
        id: "vec2",
        values: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
        metadata: {
          active: false,
        },
      },
    ],
    namespace: "linuxhint-namespace",
  },
});
console.log(upsertResponse);
In the given example, we connect to the index where we wish to insert the data.
Next, we define the upsert request wrapper and pass the upsert request with the vector data that we wish to insert.
We also provide the target namespace that we wish to use. Since the namespace does not exist, Pinecone creates it automatically before storing the data.
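The automatic creation of a namespace can be pictured with a small in-memory model. This sketch only mimics the behavior described above and says nothing about Pinecone's internals. The upsertInto function, the namespaces variable, and the shape of the returned object (a single upsertedCount field) are assumptions made for illustration.

```javascript
// Illustrative in-memory model; it mimics the described behavior, not Pinecone's internals.
const namespaces = new Map(); // namespace name -> Map of vector ID -> record

// Hypothetical helper: creates the namespace on first use, then stores the vectors.
function upsertInto(namespace, vectors) {
  if (!namespaces.has(namespace)) {
    namespaces.set(namespace, new Map()); // namespace did not exist yet
  }
  const records = namespaces.get(namespace);
  for (const vector of vectors) {
    records.set(vector.id, vector); // overwrite when the ID already exists
  }
  return { upsertedCount: vectors.length };
}

const result = upsertInto("linuxhint-namespace", [
  { id: "vec1", values: [0.1, 0.2], metadata: { active: true } },
  { id: "vec2", values: [0.3, 0.4], metadata: { active: false } },
]);
console.log(result.upsertedCount);                       // 2
console.log(namespaces.get("linuxhint-namespace").size); // 2
```

Vectors stored under different namespace names end up in separate maps, which is the isolation that a namespace provides inside a single index.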
Conclusion
We explored how an upsert operation works in Pinecone and used the Node.js client to add sample data to an existing Pinecone index.