Elasticsearch Clone Index

In this post, we will learn how to clone an existing index in an Elasticsearch cluster. Cloning lets you make modifications to a copy of an index without altering the original. It also lets you copy the existing data into an index with a new name using a single request.

Let’s dive in.

Request Syntax

The following snippet shows the syntax for the clone index API:

POST /<index>/_clone/<target-index>
PUT /<index>/_clone/<target-index>

 
The request clones the source index into a new target index, where each primary shard of the original is copied into a new primary shard in the target index.

Keep in mind that Elasticsearch does not apply index templates to the clone and does not copy index metadata from the source index. Such metadata includes aliases and CCR follower information, among others.
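To make the syntax concrete, here is a minimal sketch that assembles the clone endpoint for a source and target index. The clone_url helper and the ES_URL variable are hypothetical names introduced for illustration. The request is printed rather than sent, so the sketch runs without a cluster.

```shell
#!/bin/sh
# ES_URL is an assumed variable; it defaults to the local address used in this post.
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: builds the clone endpoint URL.
# $1 = source index, $2 = target index
clone_url() {
  printf '%s/%s/_clone/%s\n' "$ES_URL" "$1" "$2"
}

# Print the request instead of sending it.
echo "POST $(clone_url netflix netflix_copy)"
```

The same URL can be passed to curl with either -XPOST or -XPUT, since the API accepts both methods.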

How Index Cloning Works

Elasticsearch performs the following actions when cloning an index:

    1. Elasticsearch creates a new target index with the same definition as the source index.
    2. It then hard-links the segments of the source index into the target index.
    3. Lastly, Elasticsearch recovers the target index as though it were a closed index that had just been reopened.
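The hard-linking step is the reason cloning is usually fast. The following sketch shows the idea with ordinary temporary files, which stand in for segment files and are not real Lucene segments. A hard link gives the same data a second name, so nothing is copied. If the file system does not support hard links, Elasticsearch copies the segments instead, which takes longer.

```shell
#!/bin/sh
# Illustration only: plain temporary files, not actual index segments.
workdir="$(mktemp -d)"
printf 'segment data\n' > "$workdir/source_segment"

# A hard link adds a second name for the same data; no bytes are copied.
ln "$workdir/source_segment" "$workdir/clone_segment"

# Both names should report the same inode number.
src_inode="$(ls -i "$workdir/source_segment" | awk '{print $1}')"
dst_inode="$(ls -i "$workdir/clone_segment" | awk '{print $1}')"
echo "source inode: $src_inode, clone inode: $dst_inode"

# Removing the original name leaves the data reachable through the other name.
rm "$workdir/source_segment"
clone_content="$(cat "$workdir/clone_segment")"
echo "clone still contains: $clone_content"

rm -r "$workdir"
```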

Conditions for Index Cloning

The following conditions are necessary for cloning an index:

    1. An index with the same name as the target must not already exist in the cluster.
    2. The source index must be read-only.
    3. The source and target indices must have the same number of primary shards.
    4. The node on which the clone process is performed must have sufficient disk space to accommodate the new index.
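The conditions above can be sketched as a simple pre-flight check. The can_clone function is a hypothetical helper, not part of Elasticsearch. It assumes the caller has already gathered the values (for example, from the cluster APIs) and it does not contact a cluster itself.

```shell
#!/bin/sh
# Hypothetical pre-flight check mirroring the four conditions listed above.
# $1 = target index exists (true/false)
# $2 = source index is read-only (true/false)
# $3 = source primary shards,  $4 = target primary shards
# $5 = free disk space (bytes), $6 = source index size (bytes)
can_clone() {
  [ "$1" = "false" ] || { echo "target index already exists"; return 1; }
  [ "$2" = "true" ] || { echo "source index is not read-only"; return 1; }
  [ "$3" -eq "$4" ] || { echo "primary shard counts differ"; return 1; }
  [ "$5" -ge "$6" ] || { echo "not enough free disk space"; return 1; }
  echo "ok"
}

# Example values only; they do not come from a real cluster.
can_clone false true 1 1 5000000 1000000
can_clone false false 1 1 5000000 1000000 || true
```

The function stops at the first failed condition and reports it, which keeps the output easy to act on.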

Example Illustration

Let us look at an example of how to use the Elasticsearch clone API to clone an existing index.

Suppose we have an index called “netflix” that we want to clone.

We start by setting the index to read-only, since Elasticsearch will not clone an index that still accepts writes:

curl -XPUT "http://localhost:9200/netflix/_block/read_only" -H "kbn-xsrf: reporting"

 
This should return a response similar to the following:

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "indices": [
    {
      "name": "netflix",
      "blocked": true
    }
  ]
}

 
Next, we can clone the index with the following request:

curl -XPOST "http://localhost:9200/netflix/_clone/netflix_copy" -H "kbn-xsrf: reporting"

 
Running the previous request should create a clone of the index and return output such as:

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "netflix_copy"
}
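In a script, we may want to confirm that the request was acknowledged before moving on. The following is a rough sketch using a hypothetical is_acknowledged helper. It matches plain text with grep, which is crude. A JSON parser such as jq would be sturdier. The sample response is copied from the output above, so no cluster is needed to try it.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the JSON on stdin contains "acknowledged": true.
# The leading quote in the pattern keeps it from matching "shards_acknowledged".
is_acknowledged() {
  grep -Eq '"acknowledged"[[:space:]]*:[[:space:]]*true'
}

# Sample response copied from the output above.
response='{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "netflix_copy"
}'

if printf '%s\n' "$response" | is_acknowledged; then
  echo "clone request accepted"
else
  echo "clone request was not acknowledged"
fi
```

In practice, the response variable would hold the output of the curl command instead of a hard-coded string.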

 
If you wish to specify index settings and aliases for the clone, you can include them in the request body:

curl -XPOST "http://localhost:9200/netflix/_clone/netflix_cp" -H "kbn-xsrf: reporting" -H "Content-Type: application/json" -d'
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 3
  },
  "aliases": {
    "netflix_alias": {}
  }
}'

 
Keep in mind that the number of primary shards of the clone index must be the same as the number of primary shards of the source index.

Output:

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "netflix_cp"
}

 
NOTE: Elasticsearch returns an acknowledged: true status immediately. The request does not wait for the cloning process to complete.
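Since the request returns before cloning finishes, one option is to poll the cluster health API for the new index. The sketch below builds such a request with the wait_for_status and timeout parameters. The health_url helper and the ES_URL variable are hypothetical names, and the yellow status and 30s timeout are example values. The command is printed rather than executed, so the sketch runs without a cluster.

```shell
#!/bin/sh
# ES_URL is an assumed variable; it defaults to the local address used in this post.
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: builds a cluster health URL for one index.
# $1 = index name, $2 = status to wait for, $3 = timeout (for example 30s)
health_url() {
  printf '%s/_cluster/health/%s?wait_for_status=%s&timeout=%s\n' "$ES_URL" "$1" "$2" "$3"
}

# Print the command instead of running it.
echo "curl -XGET \"$(health_url netflix_cp yellow 30s)\""
```

Waiting for yellow means all primary shards have been allocated. Waiting for green would also require every replica to be assigned, which may never happen on a small cluster with three replicas configured.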

Conclusion

In this post, we discussed the fundamentals of working with the Elasticsearch cloning API. This allows you to create a copy of an existing index in your cluster.

Thanks for reading!

About the author

John Otieno

My name is John, and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to the LinuxHint mailing list.