Create a Namespace in Pinecone

Pinecone is a scalable, distributed vector database designed for efficient storage, retrieval, and similarity search of high-dimensional vectors. Each vector is a numerical representation of an object, which makes Pinecone well suited to data that can be encoded as vectors, such as images, text documents, and audio.

What Is a Pinecone Namespace?

In Pinecone, a namespace is a logical partition within an index that lets you organize vectors into separate groups according to specific criteria.

Think of a namespace as a way to group related vectors into their own partition, which makes searching and matching related data easier.

Each namespace holds a specific collection of vectors and defines a separate space for indexing, retrieval, and management operations. This keeps the data isolated and provides a layer of abstraction for different applications or data sources that share the same index.

Once a namespace exists, you can restrict operations such as queries to it, so that a search only considers that subset of the index.

Every index in the Pinecone database comprises one or more namespaces, while every vector exists in exactly one namespace.

You target a namespace by its name, which must be unique within the index. Pinecone also provides a default namespace, identified by an empty string, which is used when you do not specify one.

Let us explore how to create namespaces in Pinecone.
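As a mental model only, an index can be pictured as a dictionary that maps each namespace name to its own collection of vectors. The following sketch is not how Pinecone is implemented and does not use the Pinecone client; `ToyIndex` and its methods are hypothetical names that exist only for illustration.

```python
from collections import defaultdict

DEFAULT_NAMESPACE = ""  # the default namespace is identified by an empty string


class ToyIndex:
    """Hypothetical in-memory stand-in for an index; not part of the Pinecone client."""

    def __init__(self):
        # namespace name -> {vector id -> vector values}
        self._namespaces = defaultdict(dict)

    def add(self, vector_id, values, namespace=DEFAULT_NAMESPACE):
        # Each vector is stored in exactly one namespace.
        self._namespaces[namespace][vector_id] = list(values)

    def namespaces(self):
        return sorted(self._namespaces)

    def ids(self, namespace=DEFAULT_NAMESPACE):
        # .get() avoids creating a namespace just by looking it up.
        return sorted(self._namespaces.get(namespace, {}))


toy = ToyIndex()
toy.add("A", [0.1, 0.2], namespace="images")
toy.add("A", [0.9, 0.8], namespace="documents")  # same id, separate partition
toy.add("B", [0.5, 0.5])                          # no namespace given: default

print(toy.namespaces())   # ['', 'documents', 'images']
print(toy.ids("images"))  # ['A']
```

In this toy model, the two vectors with the id "A" do not collide because each one lives in its own partition.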

Create a Namespace in Pinecone

In Pinecone, you can create a namespace during a vector upsert. A vector upsert inserts a vector into a namespace or, if a vector with the same ID already exists there, updates it. Think of it as a combination of the insert and update operations in a single call.

If the destination namespace that is specified during the upsert operation does not exist, Pinecone creates the specified namespace automatically.
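The behavior described above (insert or update, plus creating the namespace on first use) can be sketched in plain Python. This is a simplified illustration under the assumption that an index behaves like a nested dictionary; `toy_upsert` is a hypothetical helper, not a Pinecone function.

```python
def toy_upsert(store, vectors, namespace=""):
    """Insert or overwrite (id, values) pairs, creating the namespace if needed.

    `store` is a plain dict standing in for an index: {namespace: {id: values}}.
    Returns the number of vectors written.
    """
    bucket = store.setdefault(namespace, {})  # create the namespace on first use
    written = 0
    for vector_id, values in vectors:
        bucket[vector_id] = list(values)      # insert if new, overwrite if present
        written += 1
    return written


store = {}

# First call: the "linuxhint" namespace does not exist yet, so it is created.
toy_upsert(store, [("A", [0.1, 0.1])], namespace="linuxhint")

# Second call: "A" already exists and is updated, "B" is new and is inserted.
toy_upsert(store, [("A", [0.2, 0.2]), ("B", [0.3, 0.3])], namespace="linuxhint")

print(store)  # {'linuxhint': {'A': [0.2, 0.2], 'B': [0.3, 0.3]}}
```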

Example:

Let us cover a basic example that demonstrates how to use the upsert operation and specify the namespace that you wish to create.

NOTE: This tutorial assumes that you have a Pinecone project configured in the cloud. If not, you can check the Pinecone documentation to learn how to configure it.

Install the Pinecone Client and cURL

For this tutorial, we use both Python and cURL to demonstrate how to create a namespace if it does not exist yet.

Install the pinecone client using pip:

$ pip install pinecone-client

If you wish to use cURL instead of Python, you can install cURL on Debian-based systems with the “apt-get” command:

$ sudo apt-get install curl -y

Create a Namespace in Pinecone Using Python

The following code demonstrates how to create a namespace called “linuxhint” during an upsert operation of generated vectors:

import pinecone

# Initialize the Pinecone configuration
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_PROJECT_ENVIRONMENT")

# Create a basic index
pinecone.create_index("sample", dimension=8, metric="euclidean")

# Connect to the index
index = pinecone.Index("sample")

# Upsert a vector; the "linuxhint" namespace is created if it does not exist
index.upsert(
    vectors=[("A", [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1])],
    namespace="linuxhint",
)

# Query the data in that namespace
index.query(vector=[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1], namespace="linuxhint", top_k=1)

In the previous code, we start by importing the Pinecone client. We then initialize the Pinecone configuration by providing the API key and specifying the project environment. You can get these values from the Pinecone console.

Next, we create a new index called “sample” with a dimension of 8 and the Euclidean distance metric.

Once created, we connect to the sample index using the pinecone.Index() method.

Then, we perform an upsert operation in a new namespace called “linuxhint” with a vector ID of “A” and some sample values.

NOTE: Since no namespace is called “linuxhint”, Pinecone automatically creates one during the upsert operation.

Finally, we retrieve the top 1 most similar vector based on the search criteria. However, we limit the search to the linuxhint namespace.
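To make the effect of the namespace argument concrete, here is a minimal sketch of a namespace-restricted nearest-neighbor search in plain Python. It assumes a brute-force scan and plain Euclidean distance as the score, which may differ from how the service computes and reports scores; `toy_query` and the `store` dictionary are hypothetical and are not part of the Pinecone client.

```python
import math


def toy_query(store, vector, top_k=1, namespace=""):
    """Return the top_k (id, distance) pairs from one namespace only."""
    candidates = store.get(namespace, {})  # vectors outside the namespace are ignored
    scored = [
        (vector_id, math.dist(vector, values))  # Euclidean distance
        for vector_id, values in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1])       # smallest distance first
    return scored[:top_k]


store = {
    "linuxhint": {"A": [0.1] * 8, "B": [0.9] * 8},
    "": {"C": [0.1] * 8},  # identical to the query, but in the default namespace
}

matches = toy_query(store, [0.1] * 8, top_k=1, namespace="linuxhint")
print(matches)  # [('A', 0.0)]
```

Although "C" matches the query vector exactly, it is never considered because the search is limited to the linuxhint namespace.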

Create a Namespace in Pinecone Using cURL

We can also use cURL to send the same upsert through the REST API, as shown in the following command. Replace YOUR_INDEX with the name of the index (“sample” in this tutorial):

curl -i -X POST https://YOUR_INDEX-YOUR_PROJECT.svc.YOUR_ENVIRONMENT.pinecone.io/vectors/upsert \
-H 'Api-Key: YOUR_API_KEY' \
-H 'Content-Type: application/json' \
-d '{
"vectors": [
{
"id": "A",
"values": [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
}
],
"namespace": "linuxhint"
}'

As in the Python example, this request performs an upsert operation on the specified index and creates the provided namespace if it does not exist.
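If you prefer to assemble the same request from Python without the Pinecone client, the standard library is enough to build it. The sketch below only constructs the request and does not send it; `build_upsert_request` is a hypothetical helper, and the URL is a placeholder that you would replace with the endpoint used in the cURL command.

```python
import json
from urllib.request import Request


def build_upsert_request(url, api_key, vectors, namespace):
    """Build (but do not send) a POST request with the same body as the cURL call."""
    payload = {
        "vectors": [{"id": vector_id, "values": values} for vector_id, values in vectors],
        "namespace": namespace,
    }
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL: substitute the upsert endpoint of your own index.
req = build_upsert_request(
    "https://example.invalid/upsert",
    "YOUR_API_KEY",
    [("A", [0.1] * 8)],
    "linuxhint",
)

print(req.get_method())             # POST
print(json.loads(req.data)["namespace"])  # linuxhint
```

Passing the resulting object to urllib.request.urlopen() would send it, which requires a valid endpoint and API key.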

Create Multiple Namespaces

You can also create more than one namespace as demonstrated in the following example code:

import numpy as np
import pinecone

# Initialize the Pinecone configuration
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_PROJECT_ENVIRONMENT")

# Create the index only if it does not already exist
if "sample" not in pinecone.list_indexes():
    pinecone.create_index("sample", dimension=8, metric="euclidean")

# Connect to the index
index = pinecone.Index("sample")

# Create three sets of 8-dimensional vectors
vectors_a = np.random.rand(100, 8).tolist()
vectors_b = np.random.rand(200, 8).tolist()
vectors_c = np.random.rand(300, 8).tolist()

# Create string IDs for each set
ids_a = [str(i) for i in range(100)]
ids_b = [str(i) for i in range(200)]
ids_c = [str(i) for i in range(300)]

# Insert into separate namespaces
index.upsert(vectors=list(zip(ids_a, vectors_a)), namespace="linuxhint_namespace_a")
index.upsert(vectors=list(zip(ids_b, vectors_b)), namespace="linuxhint_namespace_b")

# Use the default namespace
index.upsert(vectors=list(zip(ids_c, vectors_c)))

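This example performs three upsert operations. The first two create the “linuxhint_namespace_a” and “linuxhint_namespace_b” namespaces, while the third writes to the default namespace because no namespace is specified.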
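When the vectors arrive as one mixed stream rather than as separate lists, it can help to group them by target namespace before upserting. The following is a small sketch using only the standard library; `group_by_namespace` and the record layout (namespace, id, values) are assumptions made for this illustration, not part of the Pinecone client.

```python
import random
from collections import defaultdict


def group_by_namespace(records, default_namespace=""):
    """Group (namespace, id, values) records into {namespace: [(id, values), ...]}."""
    batches = defaultdict(list)
    for namespace, vector_id, values in records:
        # Records without a namespace label go to the default namespace.
        batches[namespace or default_namespace].append((vector_id, values))
    return dict(batches)


rng = random.Random(0)  # seeded so that the sample data is reproducible
records = [
    ("linuxhint_namespace_a", "0", [rng.random() for _ in range(8)]),
    ("linuxhint_namespace_b", "0", [rng.random() for _ in range(8)]),
    ("linuxhint_namespace_a", "1", [rng.random() for _ in range(8)]),
    (None, "0", [rng.random() for _ in range(8)]),
]

batches = group_by_namespace(records)
for namespace, vectors in batches.items():
    # With a connected index, each batch would map to one call such as:
    # index.upsert(vectors=vectors, namespace=namespace)
    print(repr(namespace), [vector_id for vector_id, _ in vectors])
```

Each batch is a real list of (id, values) tuples, so it can be inspected or reused, unlike a zip or map object, which is consumed after a single pass.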

Conclusion

We learned about the role of namespaces in a Pinecone database. We also learned how to create a namespace in a given index during an upsert operation using Python and cURL. Feel free to explore the docs for more information.

About the author

John Otieno

My name is John and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to the LinuxHint mailing list.