What Is a Pinecone Namespace?
In Pinecone, a namespace is a logical partition within an index. It lets you organize vectors into separate groups that each match a specific criterion.
Think of a namespace as a way of grouping related vectors into their own partition, which makes searching and matching related data easier.
A namespace in Pinecone is a distinct entity that encapsulates a specific collection of vectors and defines a separate space for indexing, retrieval, and management operations. This, in turn, helps maintain data isolation and provides a layer of abstraction for the different applications or data sources that may coexist in the same index.
Once you define a namespace, you can restrict operations such as queries to it, so that the search covers only that subset of the index.
Every index in the Pinecone database comprises one or more namespaces, while every vector exists in exactly one namespace.
You target a namespace by its name, which must be unique within the index. If you do not specify one, Pinecone uses the default namespace, which is identified by an empty string.
Let us explore how to create namespaces in Pinecone.
Create a Namespace in Pinecone
In Pinecone, you create a namespace during a vector upsert. A vector upsert inserts or updates a vector within a namespace of the index. Think of it as a combination of the update and insert operations: if the vector ID already exists in the namespace, its values are overwritten; otherwise, a new vector is inserted.
If the destination namespace that is specified during the upsert operation does not exist, Pinecone creates the specified namespace automatically.
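The behavior described above can be pictured with a small in-memory model. This is only an illustrative sketch: the names toy_index and toy_upsert are hypothetical, and the code does not call Pinecone or reflect how Pinecone is implemented. It only mimics two rules: a namespace comes into existence on first upsert, and an existing ID is overwritten.

```python
from collections import defaultdict

# Toy stand-in for an index: namespace name -> {vector id -> values}.
# Illustrative model only, not Pinecone's implementation.
toy_index = defaultdict(dict)


def toy_upsert(vectors, namespace=""):
    """Insert new ids and overwrite existing ones in the given namespace."""
    # Looking up a missing key in the defaultdict creates the namespace.
    bucket = toy_index[namespace]
    for vector_id, values in vectors:
        bucket[vector_id] = list(values)
    return len(vectors)


toy_upsert([("A", [0.1] * 8)], namespace="linuxhint")  # creates "linuxhint"
toy_upsert([("A", [0.9] * 8)], namespace="linuxhint")  # overwrites id "A"
toy_upsert([("B", [0.5] * 8)])                         # default namespace ""

print(sorted(toy_index))               # ['', 'linuxhint']
print(toy_index["linuxhint"]["A"][0])  # 0.9
```

After the second call, the “linuxhint” namespace still holds a single vector because the ID “A” was updated instead of duplicated.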
Example:
Let us cover a basic example that demonstrates how to use the upsert feature and name the namespace that you wish to create.
NOTE: This tutorial assumes that you have a Pinecone project configured in the cloud. If not, you can check the Pinecone Cloud to learn how to configure it.
Install the Pinecone Client and cURL
For this tutorial, we use both Python and cURL to demonstrate how to create a namespace if it does not exist yet.
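Install the Pinecone client using pip. The examples in this tutorial use the pinecone.init() interface, which is available in pinecone-client releases before 3.0:
pip install "pinecone-client<3"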
If you wish to use cURL instead of Python, you can install cURL with the “apt” command:
sudo apt install curl
Create a Namespace in Pinecone Using Python
The following code demonstrates how to create a namespace called “linuxhint” during an upsert operation of generated vectors:
import pinecone

# Initialize the Pinecone configuration
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_PROJECT_ENVIRONMENT")
# Create a basic index
pinecone.create_index("sample", dimension=8, metric="euclidean")
# Connect to the index
index = pinecone.Index("sample")
# Upsert a vector; the "linuxhint" namespace is created if it does not exist
index.upsert(
    vectors=[("A", [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1])],
    namespace="linuxhint",
)
# Query the data in that namespace
result = index.query(
    vector=[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
    namespace="linuxhint",
    top_k=1,
)
print(result)
In the previous code, we start by importing the Pinecone client. We then initialize the Pinecone configuration by providing the API key and specifying the project environment. You can get these values from the Pinecone console.
Next, we create a new index called “sample” with a dimension of 8 and the Euclidean distance as its metric.
Once created, we connect to the sample index using the pinecone.Index() method.
We then perform an upsert operation in a new namespace called “linuxhint” with a vector ID of “A” and some sample values.
NOTE: Since no namespace called “linuxhint” exists yet, Pinecone automatically creates one during the upsert operation.
Finally, we retrieve the single most similar vector (top_k=1) for the query vector, limiting the search to the “linuxhint” namespace.
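To make the idea of a namespace-scoped query more concrete, here is a minimal sketch that ranks vectors by Euclidean distance inside one namespace only. The namespaces dictionary and the toy_query function are hypothetical stand-ins and not part of the Pinecone client. The scores that Pinecone itself reports may be computed differently, so treat this as an illustration of the ranking idea only.

```python
import math

# Hypothetical in-memory data: namespace -> {id: values}. Not a Pinecone API.
namespaces = {
    "linuxhint": {"A": [0.1] * 8, "B": [0.4] * 8},
    "": {"C": [0.1] * 8},
}


def toy_query(vector, namespace="", top_k=1):
    """Return the top_k (id, distance) pairs closest to `vector` in one namespace."""
    candidates = namespaces.get(namespace, {})
    scored = [(vid, math.dist(vector, values)) for vid, values in candidates.items()]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = more similar
    return scored[:top_k]


matches = toy_query([0.1] * 8, namespace="linuxhint", top_k=1)
print(matches)  # [('A', 0.0)]
```

Vector “C” is just as close to the query as “A”, but it lives in the default namespace, so a search limited to “linuxhint” never considers it.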
Create a Namespace in Pinecone Using cURL
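We can also use cURL to create a namespace by sending the upsert request directly to the index, as shown in the following command. Here, YOUR_INDEX_HOST is a placeholder for the host of your index, which you can find in the Pinecone console:
curl -X POST 'https://YOUR_INDEX_HOST/vectors/upsert' \
  -H 'Api-Key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "vectors": [
      {
        "id": "A",
        "values": [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
      }
    ],
    "namespace": "linuxhint"
  }'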
Similarly, this request performs an upsert operation against the specified index and creates the provided namespace if it does not exist.
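For readers who prefer to see the request as data, the following sketch builds the same headers and JSON body with the Python standard library and prints them without sending anything. INDEX_HOST and API_KEY are placeholders, and the /vectors/upsert path is an assumption about the endpoint of your index, so verify it against the Pinecone console or documentation before sending the request.

```python
import json
import urllib.request

# Placeholders, not real values: substitute your own index host and API key.
INDEX_HOST = "YOUR_INDEX_HOST"
API_KEY = "YOUR_API_KEY"

# Same body as the cURL command: one vector plus the target namespace.
payload = {
    "vectors": [{"id": "A", "values": [0.1] * 8}],
    "namespace": "linuxhint",
}

request = urllib.request.Request(
    url=f"https://{INDEX_HOST}/vectors/upsert",  # assumed upsert path
    data=json.dumps(payload).encode("utf-8"),
    headers={"Api-Key": API_KEY, "Content-Type": "application/json"},
    method="POST",
)

# Inspect the request without sending it.
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To send it, fill in the placeholders and uncomment the next line:
# urllib.request.urlopen(request)
```

The namespace travels in the request body next to the vectors, which is why no separate "create namespace" call appears anywhere in the request.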
Create Multiple Namespaces
You can also create more than one namespace as demonstrated in the following example code:
import numpy as np
import pinecone

# Initialize the Pinecone configuration
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_PROJECT_ENVIRONMENT")
# Create a basic index, unless it already exists from the previous example
if "sample" not in pinecone.list_indexes():
    pinecone.create_index("sample", dimension=8, metric="euclidean")
# Connect to the index
index = pinecone.Index("sample")
# Create three sets of 8-dimensional vectors
vectors_a = np.random.rand(100, 8).tolist()
vectors_b = np.random.rand(200, 8).tolist()
vectors_c = np.random.rand(300, 8).tolist()
# Create IDs
ids_a = [str(i) for i in range(100)]
ids_b = [str(i) for i in range(200)]
ids_c = [str(i) for i in range(300)]
# Insert into separate namespaces
index.upsert(vectors=list(zip(ids_a, vectors_a)), namespace="linuxhint_namespace_a")
index.upsert(vectors=list(zip(ids_b, vectors_b)), namespace="linuxhint_namespace_b")
# Use the default namespace
index.upsert(vectors=list(zip(ids_c, vectors_c)))
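In this case, the first two upsert operations create the “linuxhint_namespace_a” and “linuxhint_namespace_b” namespaces, while the third omits the namespace argument and therefore writes to the default namespace.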
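One detail in the example above is easy to miss: the IDs “0” to “99” are reused in all three upserts. Because every vector lives in exactly one namespace, a vector is effectively addressed by its namespace together with its ID, so the repeated IDs do not overwrite one another. The sketch below checks that bookkeeping locally with the standard library only. The make_records helper is hypothetical, and nothing is sent to Pinecone.

```python
import random


def make_records(count, dimension=8):
    """Build (id, values) pairs with ids "0".."count-1" and random values."""
    return [
        (str(i), [random.random() for _ in range(dimension)])
        for i in range(count)
    ]


# Same sizes and namespace names as in the example above.
batches = {
    "linuxhint_namespace_a": make_records(100),
    "linuxhint_namespace_b": make_records(200),
    "": make_records(300),  # empty string = default namespace
}

# A vector is addressed by (namespace, id), so repeated ids do not clash.
keys = {(ns, vid) for ns, records in batches.items() for vid, _ in records}
shared_ids = set.intersection(
    *({vid for vid, _ in records} for records in batches.values())
)

print(len(keys))        # 600
print(len(shared_ids))  # 100
```

All 600 vectors keep distinct addresses even though 100 of the IDs appear in every namespace.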
Conclusion
We learned about the role of namespaces in a Pinecone database. We also learned how to create a namespace in a given index during an upsert operation using Python and cURL. Feel free to explore the docs for more information.