
How to Use Elasticsearch Bulk API

Elasticsearch lets you execute multiple CRUD operations in a single API request through the bulk API. Batching operations this way reduces per-request overhead and can increase indexing speed. When you need to perform many CRUD operations in a row, it is better to use the bulk API than to send a separate request for each operation.

This short article illustrates how to use the bulk API to carry out multiple CRUD operations in a single API request.

Elasticsearch Bulk API Basics

We can use the bulk API by sending an HTTP POST request to the _bulk API endpoint. The request body lists the operations to perform, such as indexing, updating, or deleting a document.
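As a rough sketch of what such a request looks like from a script, the following Python snippet prepares (but does not send) a POST to the _bulk endpoint using only the standard library. The address http://localhost:9200 and the helper name make_bulk_request are assumptions made for illustration; the application/x-ndjson content type is what the bulk API expects for its body.

```python
import urllib.request

# Assumed address of a local, unsecured Elasticsearch node; adjust for your cluster.
ES_URL = "http://localhost:9200"


def make_bulk_request(ndjson_body: str, base_url: str = ES_URL) -> urllib.request.Request:
    """Prepare a POST request for the _bulk endpoint (hypothetical helper)."""
    return urllib.request.Request(
        url=f"{base_url}/_bulk",
        data=ndjson_body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


# A one-action body: delete document 1 from test-index-1.
body = '{"delete": {"_index": "test-index-1", "_id": 1}}\n'
request = make_bulk_request(body)

# Sending is left commented out so the sketch runs without a cluster:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before anything reaches the cluster.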

Consider the following request.

POST _bulk
{"index": {"_index": "test-index-1", "_id": 1}}
{"field1": "value1"}
{"update": {"_id": 1, "_index": "test-index-1"}}
{"doc": {"field2": "value2"}}
{"delete": {"_index": "test-index-1", "_id": 1}}

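The example request above performs three consecutive actions in one call. The first action indexes a document with the ID 1 into test-index-1; with default settings, Elasticsearch creates the index if it does not already exist.

Next, we update the document by adding field2 and then delete it.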

Elasticsearch returns a response containing one result per action, listed in the same order as the actions in the request.

Explanation

As mentioned, the bulk API allows you to execute multiple actions such as index, create, update, and delete in a single request.

Each action is specified in the request body using newline delimited JSON format.

Both the index and create operations require you to specify the document source on the line following the action. The index action adds a document or replaces an existing one in the specified index. It is good to note that the create operation, by contrast, will fail if a document with the same ID already exists in the target.

An update operation, on the other hand, requires a partial document on the following line, while a delete operation takes no source line at all.

Understanding the Request Body

The bulk API accepts the operations to execute in the request body. The entries in the body use newline-delimited JSON (NDJSON) format, and the final line must also end with a newline character.

Each action takes one line holding the action and its metadata, followed where required by a second line holding the related data.
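To make the body format concrete, here is a minimal Python sketch that serializes a list of dictionaries into a bulk body. The helper name to_ndjson is an assumption for illustration; the entries mirror the example request shown earlier in this article.

```python
import json


def to_ndjson(entries):
    """Serialize dicts to newline-delimited JSON, one compact object per line."""
    # Every line, including the last one, is terminated by a newline.
    return "".join(json.dumps(entry) + "\n" for entry in entries)


entries = [
    {"index": {"_index": "test-index-1", "_id": 1}},   # action line
    {"field1": "value1"},                              # source for the index action
    {"update": {"_index": "test-index-1", "_id": 1}},  # action line
    {"doc": {"field2": "value2"}},                     # partial document for the update
    {"delete": {"_index": "test-index-1", "_id": 1}},  # delete has no source line
]

body = to_ndjson(entries)
print(body, end="")
```

Because json.dumps emits each object on a single line by default, no entry can accidentally span two lines, which would break the format.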

Let us break down the operations you can specify in the request body and the accepted parameters:

Create

The create operation will index a specified document if the document does not exist. Essential parameters for the create operation include:

_index – Sets the name of the index or index alias on which to execute the operation. This parameter is required if you don’t specify a target in the request path.

_id – The id of the document to index. If you have no value specified, Elasticsearch will generate the document ID automatically.
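The two parameters above can be sketched in Python as follows. The helper create_action is hypothetical and only builds the text of the action; it shows that _id is simply left out of the metadata when Elasticsearch should generate the ID.

```python
import json


def create_action(index, source, doc_id=None):
    """Return the two NDJSON lines for a create action (hypothetical helper)."""
    meta = {"_index": index}
    if doc_id is not None:
        # Omitting _id lets Elasticsearch generate the document ID.
        meta["_id"] = doc_id
    return json.dumps({"create": meta}) + "\n" + json.dumps(source) + "\n"


with_id = create_action("test-index-1", {"field1": "value1"}, doc_id="1")
without_id = create_action("test-index-1", {"field1": "value1"})

print(with_id, end="")
print(without_id, end="")
```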

Update

The update operation will carry out a partial document update. Must-know parameters for the update operation include:

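_index – Specifies the name of the index or index alias on which to carry out the update operation.

_id – The ID of the document to update. Unlike with create and index, this parameter is required.

doc – The partial document to merge into the existing document. It goes on the line after the action rather than in the action metadata.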
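As an illustration of an update entry, the sketch below builds the action line and the line carrying the partial document. The helper update_action is hypothetical, and the dictionary merge is only a local, simplified stand-in for what the update does on the server: a shallow merge is used here, whereas Elasticsearch also merges nested objects.

```python
import json


def update_action(index, doc_id, partial_doc):
    """Return the two NDJSON lines for an update action (hypothetical helper)."""
    action = {"update": {"_index": index, "_id": doc_id}}
    return json.dumps(action) + "\n" + json.dumps({"doc": partial_doc}) + "\n"


# Local illustration only: a shallow merge standing in for the server-side update.
existing = {"field1": "value1"}
partial = {"field2": "value2"}
merged = {**existing, **partial}

lines = update_action("test-index-1", "1", partial)
print(lines, end="")
print(merged)
```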

Index

The index operation indexes a specified document. If the specified document exists, the index operation will replace the document and increment its version. The essential parameters for this operation include:

_index – Sets the name of the index or index alias to index on.

_id – The ID of the document. If you specify no value, Elasticsearch generates the document ID automatically.

Delete

The delete operation deletes a document from the index. Must-know parameters for this operation include:

_index – sets the name or alias of the index.

_id – The id of the document to remove from the index.
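Since delete is the only one of the four operations without a data line, a body mixing several operations has to be read with that in mind. The following Python sketch walks a bulk body and pairs each action with its data line where one is expected. The function name split_actions is an assumption for illustration, and the sketch presumes a well-formed body.

```python
import json

# Actions that are followed by a second line carrying document data.
ACTIONS_WITH_SOURCE = {"index", "create", "update"}


def split_actions(ndjson_body):
    """Pair each action line with its data line; delete has none."""
    lines = [json.loads(line) for line in ndjson_body.splitlines() if line.strip()]
    pairs = []
    i = 0
    while i < len(lines):
        action_name = next(iter(lines[i]))  # the single top-level key
        metadata = lines[i][action_name]
        if action_name in ACTIONS_WITH_SOURCE:
            pairs.append((action_name, metadata, lines[i + 1]))
            i += 2
        else:
            pairs.append((action_name, metadata, None))
            i += 1
    return pairs


body = (
    '{"index": {"_index": "test-index-1", "_id": 1}}\n'
    '{"field1": "value1"}\n'
    '{"delete": {"_index": "test-index-1", "_id": 1}}\n'
)
pairs = split_actions(body)
for pair in pairs:
    print(pair)
```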

NOTE: Pay attention to the response from the bulk API. It reports the outcome of each operation individually, so a request can succeed overall while some of its operations fail.
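One way to act on this note is to scan the response for failed entries. The Python sketch below assumes the response has already been parsed into a dictionary. The sample response is hand-written in the shape of a bulk response (a top-level errors flag and an items list), not output captured from a real cluster, and failed_items is a hypothetical helper.

```python
import json

# Hand-written sample in the shape of a bulk response; not output from a real cluster.
sample_response = json.loads("""
{
  "took": 5,
  "errors": true,
  "items": [
    {"index":  {"_index": "test-index-1", "_id": "1", "status": 201, "result": "created"}},
    {"create": {"_index": "test-index-1", "_id": "1", "status": 409,
                "error": {"type": "version_conflict_engine_exception",
                          "reason": "document already exists"}}}
  ]
}
""")


def failed_items(response):
    """Return (position, action, status, error type) for each failed operation."""
    if not response.get("errors"):
        return []  # the top-level flag says every operation succeeded
    failures = []
    for position, item in enumerate(response["items"]):
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append((position, action, result["status"], result["error"]["type"]))
    return failures


failures = failed_items(sample_response)
for failure in failures:
    print(failure)
```

Because results come back in request order, the position of a failure identifies which entry of the request body to retry or inspect.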

Conclusion

The bulk API in Elasticsearch can save time by reducing the number of requests you make and improving indexing performance. This guide covers the basics of working with the API to perform multiple operations in a single request.

To learn more about the bulk API, check out the documentation.

About the author

John Otieno

My name is John, and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks.