Like any other data storage system or database, Elasticsearch will sooner or later require you to determine how much disk space a cluster or index uses. This information helps you plan your cluster layout and node capacity.
In this tutorial, you will learn various methods and techniques for determining the disk usage for your cluster or Elasticsearch index.
Let’s dive in.
Method 1 – Per Shard Disk Stats
Using the cat shards API, you can view the disk usage of each shard in the cluster. The API also returns details about each shard, such as the node it is allocated to, the number of documents, and the disk usage.
We can use this API to show disk usage per shard, as shown in the query below.
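A minimal sketch of such a request in Python, assuming an unsecured cluster at http://localhost:9200 (an assumption, adjust it for your setup). The helper names cat_shards_url and fetch_text are made up for illustration. The v and h parameters ask the cat shards API for a header row and a chosen set of columns, including store.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: an unsecured local cluster; change this for your environment.
ES_URL = "http://localhost:9200"


def cat_shards_url(base_url=ES_URL, index=None):
    """Build a cat shards URL that includes the store (disk usage) column."""
    path = "/_cat/shards" + (f"/{index}" if index else "")
    params = {
        "v": "true",  # include a header row
        "h": "index,shard,prirep,state,docs,store,node",  # columns to return
    }
    # Keep commas unescaped so the column list stays readable.
    return f"{base_url}{path}?{urlencode(params, safe=',')}"


def fetch_text(url):
    """Send a GET request and return the plain-text response body."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")
```

Against a running cluster, print(fetch_text(cat_shards_url())) would print one row per shard; passing index="earthquake" limits the rows to that index.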
The request above returns information on a per-shard basis. The disk usage of each shard appears in the store column.
An example output is as shown:
The output above shows the disk usage of each shard in a human-readable format.
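If you need totals rather than a table to read, one option is to request JSON with sizes in plain bytes (the format=json and bytes=b parameters) and sum the store column per index. The sketch below does this on hypothetical rows that are made up for illustration and are not real cluster output; the function name store_bytes_per_index is also an assumption.

```python
from collections import defaultdict


def store_bytes_per_index(rows):
    """Sum the store column (in bytes) for each index.

    rows is the parsed JSON list returned by a request such as
    GET /_cat/shards?format=json&bytes=b&h=index,shard,prirep,store
    """
    totals = defaultdict(int)
    for row in rows:
        store = row.get("store")
        if store is None:  # unassigned shards report no store size
            continue
        totals[row["index"]] += int(store)  # cat JSON values are strings
    return dict(totals)


# Hypothetical rows, made up for illustration only.
sample_rows = [
    {"index": "earthquake", "shard": "0", "prirep": "p", "store": "2048"},
    {"index": "earthquake", "shard": "0", "prirep": "r", "store": "2048"},
    {"index": "logs", "shard": "0", "prirep": "p", "store": "512"},
    {"index": "logs", "shard": "0", "prirep": "r", "store": None},
]

print(store_bytes_per_index(sample_rows))
```

Note that the totals include replica shards, so they reflect the space used across the cluster rather than the size of the primary data alone.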
Method 2 – Disk Usage per Node
We can also retrieve disk usage information for each node using the cat allocation API. An example command is as shown:
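A minimal Python sketch of such a command, again assuming an unsecured cluster at http://localhost:9200. The helper names are hypothetical. The h parameter selects the node, shard count, and disk columns, and human is included because the tutorial mentions it.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: an unsecured local cluster; change this for your environment.
ES_URL = "http://localhost:9200"


def cat_allocation_url(base_url=ES_URL):
    """Build a cat allocation URL showing shard counts and disk columns."""
    params = {
        "v": "true",  # include a header row
        "human": "true",  # human-readable values, as mentioned in the text
        "h": "node,shards,disk.used,disk.avail,disk.total,disk.percent",
    }
    # Keep commas unescaped so the column list stays readable.
    return f"{base_url}/_cat/allocation?{urlencode(params, safe=',')}"


def fetch_text(url):
    """Send a GET request and return the plain-text response body."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")
```

Calling print(fetch_text(cat_allocation_url())) against a running cluster would print one row per data node.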
The command returns information such as the number of shards on each node, the disk space used, the disk space available, and the total disk space. Using the human parameter produces the disk usage in a human-readable format.
An example output:
You can also use the nodes stats API. An example command is as shown:
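As a rough sketch in Python, the request can be limited to the fs metric, which carries the file system (disk) figures for each node. The base URL and helper names below are assumptions for illustration.

```python
import json
from urllib.request import urlopen

# Assumption: an unsecured local cluster; change this for your environment.
ES_URL = "http://localhost:9200"


def nodes_fs_stats_url(base_url=ES_URL):
    """Build a nodes stats URL restricted to file system statistics."""
    # human=true adds readable sizes next to the raw byte counts.
    return f"{base_url}/_nodes/stats/fs?human=true"


def fetch_json(url):
    """Send a GET request and return the parsed JSON response."""
    with urlopen(url) as response:
        return json.load(response)
```

Against a running cluster, fetch_json(nodes_fs_stats_url()) would return a dictionary with one entry per node under the nodes key.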
The command returns the node information, including disk usage, as shown:
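To turn that response into a compact per-node summary, a sketch along these lines may help. It assumes the usual response shape, where each node reports fs.total.total_in_bytes and fs.total.free_in_bytes. The sample response is trimmed and hypothetical, and the function name is made up for illustration.

```python
def summarize_node_disks(stats):
    """Return total, used, and percent-used disk figures for each node."""
    summary = {}
    for node in stats.get("nodes", {}).values():
        total = node["fs"]["total"]
        size = total["total_in_bytes"]
        used = size - total["free_in_bytes"]
        summary[node["name"]] = {
            "total_bytes": size,
            "used_bytes": used,
            "used_percent": round(100 * used / size, 1) if size else 0.0,
        }
    return summary


# Hypothetical, trimmed response used only for illustration.
sample_stats = {
    "nodes": {
        "node-id-1": {
            "name": "node-1",
            "fs": {"total": {"total_in_bytes": 1000, "free_in_bytes": 250}},
        }
    }
}

print(summarize_node_disks(sample_stats))
```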
Method 3 – Disk Usage Information for an Index (Experimental)
At the time of writing, Elasticsearch has an experimental disk usage API. You can use this API to get the disk usage information of a specific index.
The syntax is as shown:
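In general form, the request is a POST to the index's _disk_usage endpoint with run_expensive_tasks=true. A minimal Python sketch that builds such a request follows; the base URL and the helper name disk_usage_request are assumptions for illustration.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: an unsecured local cluster; change this for your environment.
ES_URL = "http://localhost:9200"


def disk_usage_request(index, base_url=ES_URL, run_expensive_tasks=True):
    """Build a POST request for the disk usage API of one index."""
    # The API expects the literal strings "true" or "false".
    params = {"run_expensive_tasks": str(run_expensive_tasks).lower()}
    url = f"{base_url}/{quote(index, safe='')}/_disk_usage?{urlencode(params)}"
    return Request(url, method="POST")
```

The returned object can be passed to urllib.request.urlopen to send it to a running cluster.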
The query above requires the run_expensive_tasks parameter to be set to true. This is because the disk usage API is regarded as a resource-intensive operation.
Otherwise, the request fails with an error.
For example, we can get the disk usage information of an index called earthquake:
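A short Python sketch of that request, assuming an unsecured cluster at http://localhost:9200 and an existing index named earthquake. The helper names are hypothetical.

```python
import json
from urllib.request import Request, urlopen

# Assumption: an unsecured local cluster; change this for your environment.
ES_URL = "http://localhost:9200"


def earthquake_disk_usage_request(base_url=ES_URL):
    """Build the disk usage request for the earthquake index."""
    url = f"{base_url}/earthquake/_disk_usage?run_expensive_tasks=true"
    return Request(url, method="POST")


def send(request):
    """Send the request and return the parsed JSON response."""
    with urlopen(request) as response:
        return json.load(response)
```

Against a running cluster, send(earthquake_disk_usage_request()) would return the analysis as a dictionary.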
The disk usage information is as shown:
The response reports the disk usage of the specified index. Note that it also lists each field and its corresponding size.
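Because the response lists every field, it can be handy to rank fields by size. The sketch below assumes the response keeps per-field totals under the index name, in fields.<name>.total_in_bytes. The sample response, including the field names place, mag, and time, is hypothetical and trimmed for illustration.

```python
def largest_fields(response, index, top=3):
    """Return the top fields of an index ranked by total size in bytes."""
    fields = response[index]["fields"]
    ranked = sorted(
        fields.items(),
        key=lambda item: item[1]["total_in_bytes"],
        reverse=True,  # largest first
    )
    return [(name, info["total_in_bytes"]) for name, info in ranked[:top]]


# Hypothetical, trimmed response used only for illustration.
sample_response = {
    "earthquake": {
        "store_size_in_bytes": 9000,
        "fields": {
            "_id": {"total_in_bytes": 3000},
            "place": {"total_in_bytes": 4000},
            "mag": {"total_in_bytes": 1500},
            "time": {"total_in_bytes": 500},
        },
    }
}

print(largest_fields(sample_response, "earthquake"))
```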
Closing
In this tutorial, you learned various methods and techniques for fetching disk usage information for an Elasticsearch cluster, its nodes, and its indices.
Thanks for reading!!