An index runs on one or more pods. It is the central entity that accepts vectors, serves queries, and performs other vector operations on its contents.
What Are Pinecone Pods?
In Pinecone, a pod is a pre-configured unit of hardware that runs the Pinecone service. Each index runs on one or more pods. The number of pods affects the index's storage capacity, latency, throughput, and more.
Hence, you can scale a Pinecone index horizontally by adding pods. In general, more pods mean more storage capacity, lower latency, and higher throughput.
Types of Pinecone Pods
Pinecone indexes support the following pod types:
S1 Pods
The first type of pod in Pinecone is the s1 pod. These pods are storage-optimized: they provide a large storage capacity at the cost of higher query latency. They suit indexes that hold a very large number of vectors and can tolerate slower queries.
An s1 pod typically holds around 5 million vectors of 768 dimensions.
P1 Pods
The next type is the p1 pod. These pods are optimized for performance: they provide very low query latency at the expense of vector capacity. They suit latency-sensitive vector applications.
A p1 pod typically holds around 1 million vectors of 768 dimensions.
P2 Pods
Finally, there are p2 pods. These pods provide higher query throughput together with lower latency. A p2 pod can store around 1 million vectors of 768 dimensions, although the actual capacity varies with the vector dimension.
Pod Size and Performance
The performance of pods in Pinecone varies with several factors. To gauge how your workload performs on a specific pod type, test it with your own data.
For each pod type, Pinecone offers four pod sizes: x1, x2, x4, and x8. Each step up in size doubles the index's storage and compute capacity.
The default pod size is x1, but you can increase the pod size after creating the index. This lets you scale up your resources as your requirements grow.
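The doubling rule can be turned into a rough sizing estimate. The following is a minimal sketch, not an official formula: the base capacities are the approximate 768-dimension figures quoted in this article, and the helper names pod_capacity and pods_needed are hypothetical. Real capacity also depends on vector dimension and metadata, so treat the result as a starting point for your own experiment.

```python
import math

# Approximate vectors per x1 pod at 768 dimensions, as quoted in this article.
BASE_CAPACITY_768D = {"s1": 5_000_000, "p1": 1_000_000, "p2": 1_000_000}

# Each size step doubles storage and compute capacity.
SIZE_MULTIPLIER = {"x1": 1, "x2": 2, "x4": 4, "x8": 8}


def pod_capacity(pod_type: str) -> int:
    """Estimate how many 768-dimension vectors one pod of this type holds."""
    family, size = pod_type.split(".")  # e.g. "s1.x2" -> ("s1", "x2")
    return BASE_CAPACITY_768D[family] * SIZE_MULTIPLIER[size]


def pods_needed(num_vectors: int, pod_type: str) -> int:
    """Estimate the number of pods required; an index uses at least one pod."""
    return max(1, math.ceil(num_vectors / pod_capacity(pod_type)))


if __name__ == "__main__":
    for pod_type in ("s1.x1", "p1.x1", "p1.x4"):
        print(pod_type, pods_needed(12_000_000, pod_type))
```

Comparing the estimates for several pod types makes the trade-off concrete: fewer large storage-optimized pods versus more performance-optimized ones.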
Configure the Pod Size in Pinecone
Let us now learn how to increase your pod’s size after creation.
NOTE: Changing the pod size for your index does not result in downtime. Hence, all read and write operations can continue uninterrupted during scaling.
We can use the configure_index() method from the Pinecone Python client to increase the pod size after creation.
An example is as follows:
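A minimal sketch of that call, assuming the legacy module-level Pinecone Python client (pinecone-client v2). The index name sample-index matches the output shown later in this article, the API key and environment values are placeholders, and next_pod_size is a hypothetical helper added only for illustration:

```python
POD_SIZES = ["x1", "x2", "x4", "x8"]


def next_pod_size(pod_type: str) -> str:
    """Return the next size up for a pod type such as 's1.x1' (hypothetical helper)."""
    family, size = pod_type.split(".")
    idx = POD_SIZES.index(size)
    if idx == len(POD_SIZES) - 1:
        raise ValueError(f"{pod_type} is already the largest pod size")
    return f"{family}.{POD_SIZES[idx + 1]}"


def resize_index(index_name: str, pod_type: str) -> None:
    """Resize an existing pod-based index to the given pod type and size."""
    # Imported here so next_pod_size() works without the client installed.
    import pinecone  # assumption: legacy pinecone-client v2 API

    # Placeholder credentials; replace with your own values.
    pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")
    pinecone.configure_index(index_name, pod_type=pod_type)


# Example (requires valid credentials and an existing index):
# resize_index("sample-index", next_pod_size("s1.x1"))  # resizes to s1.x2
```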
The previous code resizes the pods of the sample index to s1.x2, which doubles the capacity of the default s1.x1 pod.
You can also perform the same action by sending an HTTP request to the API. An example command is as follows, where YOUR_INDEX_URL is a placeholder for the URL of your index:
curl -i -X PATCH 'YOUR_INDEX_URL' \
  -H 'Api-Key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "pod_type": "s1.x2"
  }'
In the previous request, you can change the URL to your target Pinecone index.
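The same request can also be built with Python's standard library. This is a minimal sketch, assuming the endpoint accepts a PATCH request with the JSON body and headers shown above. The function name build_resize_request is hypothetical, index_url is a placeholder for your index's URL, and the request is only constructed here, not sent:

```python
import json
import urllib.request


def build_resize_request(index_url: str, api_key: str, pod_type: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP request that changes the pod size."""
    body = json.dumps({"pod_type": pod_type}).encode("utf-8")
    return urllib.request.Request(
        index_url,
        data=body,
        method="PATCH",  # assumption: the API expects PATCH for index updates
        headers={
            "Api-Key": api_key,
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder URL and key; replace with your own values before sending.
    req = build_resize_request("https://example.invalid/your-index", "YOUR_API_KEY", "s1.x2")
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

To send the request, pass it to urllib.request.urlopen() once the URL and key are real.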
Check the Scaling Status
Once initiated, scaling usually takes up to 10 minutes. You can check its progress by describing the index, for example with the describe_index() method of the Python client or a GET request to the index URL. A response looks as follows:
Output:
{
  "database": {
    "name": "sample-index",
    "dimensions": "1024",
    "metric": "euclidean",
    "pods": 6,
    "replicas": 2,
    "shards": 3,
    "pod_type": "s1.x2",
    "index_config": {},
    "status": {
      "ready": true,
      "state": "ScalingUp"
    }
  }
}
The status field contains the key-value pair "state": "ScalingUp", which indicates that resizing is still in progress.
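If you want to wait for the resize to finish, you can poll that field. The following is a minimal sketch that assumes the response has the shape of the sample output above. The names is_scaling and wait_until_scaled are hypothetical, and describe stands for any callable of your own that returns the parsed response as a dictionary:

```python
import time
from typing import Callable, Mapping


def is_scaling(description: Mapping) -> bool:
    """Return True while the describe response reports an ongoing resize."""
    status = description.get("database", {}).get("status", {})
    return status.get("state") == "ScalingUp"


def wait_until_scaled(
    describe: Callable[[], Mapping],
    timeout_s: float = 600,   # the article cites up to about 10 minutes
    interval_s: float = 15,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll describe() until scaling ends; return False if the timeout is hit."""
    waited = 0.0
    while is_scaling(describe()):
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s
    return True
```

Passing the describe call in as a parameter keeps the polling logic independent of the client or HTTP library you use, and makes it easy to test with canned responses.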
Conclusion
You now understand how pods work in Pinecone, what role they play, and how the pod type and size affect your index's performance. Explore the docs to learn more about pod scaling.