This short article shows how to use the Elasticsearch bulk API to carry out multiple CRUD operations in a single API request.
Elasticsearch bulk API Basics
We can use the bulk API by sending an HTTP POST request to the _bulk API endpoint. The request body lists the operations to perform, such as indexing, updating, or deleting a document.
Consider the following request.
POST _bulk
{"index": {"_index": "test-index-1", "_id": 1}}
{"field1": "value1"}
{"update": {"_id": 1, "_index": "test-index-1"}}
{"doc": {"field2": "value2"}}
{"delete": {"_index": "test-index-1", "_id": 1}}
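The above example request performs three consecutive actions in one call. It first indexes a document with ID 1 into test-index-1, creating the index if it does not already exist (the default behavior).
Next, it updates that document by adding field2 and then deletes it.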
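Outside the Kibana console, the same request can be prepared from any HTTP client. The following is a minimal sketch using only the Python standard library. It assumes an unsecured cluster reachable at a base URL such as http://localhost:9200; the names BULK_BODY, build_bulk_request, and send_bulk are made up for this illustration, and authentication and TLS are left out.

```python
import urllib.request

# The three actions from the example request, as newline-delimited JSON.
# Note that the body ends with a newline character.
BULK_BODY = (
    '{"index": {"_index": "test-index-1", "_id": 1}}\n'
    '{"field1": "value1"}\n'
    '{"update": {"_id": 1, "_index": "test-index-1"}}\n'
    '{"doc": {"field2": "value2"}}\n'
    '{"delete": {"_index": "test-index-1", "_id": 1}}\n'
)


def build_bulk_request(base_url: str, body: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to the _bulk endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


def send_bulk(base_url: str, body: str) -> bytes:
    """Send the prepared request and return the raw response body."""
    with urllib.request.urlopen(build_bulk_request(base_url, body)) as response:
        return response.read()
```

Calling send_bulk("http://localhost:9200", BULK_BODY) against a running cluster would submit all three actions in one round trip.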
The response should report one result per action, in the same order as the actions appear in the request.
Explanation
As mentioned, the bulk API allows you to execute multiple actions, such as index, create, update, and delete, in a single request.
Each action is specified in the request body using newline-delimited JSON (NDJSON) format.
Both the index and create operations require you to specify the document source on the next line. The index action adds a document or replaces an existing document that has the same ID. The create operation, by contrast, fails if a document with the same ID already exists in the target index.
An update operation, on the other hand, requires a partial document (doc) on the next line. A delete operation takes no source line.
Understanding the Request Body
The bulk API accepts the operations to execute in the request body, formatted as newline-delimited JSON.
Each entry consists of an action line, followed where required by a line holding the related data for that operation. The last line of the body must end with a newline character.
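To make the body layout concrete, here is a small sketch that serializes a list of operations into newline-delimited JSON. The helper name to_ndjson and the (action, data) tuple shape are assumptions chosen for this example, not part of Elasticsearch.

```python
import json
from typing import Iterable, Optional, Tuple

# One operation: the action line, plus an optional data line (None for delete).
Operation = Tuple[dict, Optional[dict]]


def to_ndjson(operations: Iterable[Operation]) -> str:
    """Serialize operations as one JSON object per line."""
    lines = []
    for action, data in operations:
        lines.append(json.dumps(action))
        if data is not None:
            lines.append(json.dumps(data))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


operations = [
    ({"index": {"_index": "test-index-1", "_id": "1"}}, {"field1": "value1"}),
    ({"update": {"_index": "test-index-1", "_id": "1"}}, {"doc": {"field2": "value2"}}),
    ({"delete": {"_index": "test-index-1", "_id": "1"}}, None),
]
body = to_ndjson(operations)
```

Because json.dumps escapes newlines inside string values, each JSON object stays on a single line, which is what the format requires.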
Let us break down the operations you can specify in the request body and the accepted parameters:
Create
The create operation indexes the specified document only if a document with that ID does not already exist. Essential parameters for the create operation include:
_index – Sets the name of the index or index alias on which to execute the operation. This parameter is required if you don’t specify a target in the request path.
_id – The ID of the document to index. If you do not specify a value, Elasticsearch generates the document ID automatically.
Update
The update operation carries out a partial document update. Must-know parameters for the update operation include:
_index – Specifies the name of the index or index alias on which to carry out the update operation.
_id – The ID of the document to update. This parameter is required, since an update targets an existing document.
doc – Holds the partial document, that is, the fields to add or change. It goes on the line after the action line.
Index
The index operation indexes the specified document. If a document with the same ID already exists, the index operation replaces it and increments its version. The essential parameters for this operation include:
_index – Sets the name of the index or index alias to index into.
_id – The ID of the document. If you do not specify a value, Elasticsearch generates the document ID automatically.
Delete
The delete operation deletes a document from the index. Must-know parameters for this operation include:
_index – Sets the name or alias of the index.
_id – The ID of the document to remove from the index.
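The four operations and their parameters can be sketched as small builder functions, each returning the lines that the operation contributes to the request body. The function names are hypothetical. The sketch assumes that _index may be left out only when the target is given in the request path, and it treats _id as mandatory for update and delete.

```python
from typing import List, Optional


def _meta(index: Optional[str], doc_id: Optional[str]) -> dict:
    """Build the action metadata, skipping parameters that were not given."""
    meta = {}
    if index is not None:
        meta["_index"] = index
    if doc_id is not None:
        meta["_id"] = doc_id
    return meta


def create_action(source: dict, index: Optional[str] = None,
                  doc_id: Optional[str] = None) -> List[dict]:
    # Action line followed by the document source.
    return [{"create": _meta(index, doc_id)}, source]


def index_action(source: dict, index: Optional[str] = None,
                 doc_id: Optional[str] = None) -> List[dict]:
    # Same shape as create, but replaces an existing document with that ID.
    return [{"index": _meta(index, doc_id)}, source]


def update_action(partial_doc: dict, doc_id: str,
                  index: Optional[str] = None) -> List[dict]:
    # The partial document is wrapped in a "doc" object on the data line.
    return [{"update": _meta(index, doc_id)}, {"doc": partial_doc}]


def delete_action(doc_id: str, index: Optional[str] = None) -> List[dict]:
    # Delete has an action line only, with no data line.
    return [{"delete": _meta(index, doc_id)}]
```

For example, update_action({"field2": "value2"}, doc_id="1", index="test-index-1") yields the same two objects as the update entry in the example request.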
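NOTE: Pay attention to the response from the bulk API. The request as a whole can return a 200 status even when individual operations fail, so check the top-level errors flag and the status of each entry in items to determine which operations succeeded and which failed.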
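As an illustration of checking the response, the sketch below collects the failed entries from a parsed bulk response. The function name failed_items is made up, and sample_response is a hand-written, abbreviated example of the response shape rather than output captured from a real cluster.

```python
from typing import List


def failed_items(response: dict) -> List[dict]:
    """Return one summary per failed operation in a parsed bulk response."""
    if not response.get("errors"):
        return []
    failures = []
    for position, item in enumerate(response.get("items", [])):
        # Each item is keyed by its action type: index, create, update, or delete.
        for action, result in item.items():
            if "error" in result:
                failures.append({
                    "position": position,
                    "action": action,
                    "id": result.get("_id"),
                    "status": result.get("status"),
                    "error": result["error"],
                })
    return failures


# Hand-written, abbreviated sample: the second operation fails because
# a document with the same ID already exists.
sample_response = {
    "took": 30,
    "errors": True,
    "items": [
        {"index": {"_index": "test-index-1", "_id": "1", "status": 201}},
        {"create": {"_index": "test-index-1", "_id": "1", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
    ],
}

failures = failed_items(sample_response)
```

The position field records where the failed operation sat in the request, which makes it possible to retry or log only the operations that failed.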
Conclusion
The bulk API in Elasticsearch can save time by reducing the number of requests you make and by improving indexing performance. This guide covers the basics of working with the API to perform multiple operations in one request.
To learn more about the bulk API, check out the documentation.