How Do I Use Elasticsearch in Python?

Elasticsearch is a free and open-source, highly available search and analytics engine built on the Apache Lucene project. Elasticsearch stores its data in JSON format, making it very easy to use.

It provides a simple and powerful REST API for a range of tasks, from creating documents to monitoring cluster health.

Python is one of the most popular programming languages, and it tends to complement Elasticsearch very well.

In this guide, we will look at how to use the Elasticsearch Python client to interact with an Elasticsearch cluster.

Environment Setup

Before connecting the Elasticsearch Python client, make sure the environment is configured.

Step 1: Installing Elasticsearch

The first step is to install and set up the Elasticsearch cluster on our system. In this guide, we will use an Ubuntu server.

Start by updating your repositories:

sudo apt-get update

Import the Elasticsearch PGP key.

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

Install the required apt-transport-https package:

sudo apt-get install apt-transport-https

Save the repository.

echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-7.x.list

Update the package index and install Elasticsearch:

sudo apt update
sudo apt install elasticsearch

Enable and start the service:

sudo /bin/systemctl enable elasticsearch.service
sudo systemctl start elasticsearch.service

Once the service is up and running, perform a curl to the Elasticsearch endpoint:

curl http://localhost:9200

If the service is running, you should see an output as shown below:

{
 "name" : "ubuntu2004",
 "cluster_name" : "elasticsearch",
 "cluster_uuid" : "lUk9qSQtSaSfZXMsyxQdyg",
 "version" : {
   "number" : "7.15.0",
   "build_flavor" : "default",
   "build_type" : "deb",
   "build_hash" : "79d65f6e357953a5b3cbcc5e2c7c21073d89aa29",
   "build_date" : "2021-09-16T03:05:29.143308416Z",
   "build_snapshot" : false,
   "lucene_version" : "8.9.0",
   "minimum_wire_compatibility_version" : "6.8.0",
   "minimum_index_compatibility_version" : "6.0.0-beta1"
 },
 "tagline" : "You Know, for Search"
}
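
The same response can also be inspected from Python, for example to read the server version before choosing a client version. The following is a minimal sketch using only the standard library. The function name parse_cluster_info is hypothetical, and the sample text is a shortened copy of the output above.

```python
import json


def parse_cluster_info(text):
    """Return (cluster_name, version_number, major_version) from the root response."""
    info = json.loads(text)
    number = info["version"]["number"]
    # The major version is the part before the first dot, e.g. 7 for "7.15.0".
    major = int(number.split(".")[0])
    return info["cluster_name"], number, major


# Shortened copy of the curl output shown above.
sample = """
{
  "name": "ubuntu2004",
  "cluster_name": "elasticsearch",
  "version": {"number": "7.15.0", "lucene_version": "8.9.0"},
  "tagline": "You Know, for Search"
}
"""

name, number, major = parse_cluster_info(sample)
print(name, number, major)  # elasticsearch 7.15.0 7
```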

Step 2: Installing Python

The next step is to install Python. On Ubuntu/Debian, open the terminal and enter the command below to check the installed Python version:

python3 --version

If you have Python 3 installed, you should see an output similar to the one shown below:

Python 3.10.0

If not, install Python 3 using the command:

sudo apt-get install python3

Step 3: Installing Elasticsearch Client

The final step is installing the Elasticsearch client, which we can do with the pip utility.

Start by installing pip:

sudo apt-get install python3-pip

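Finally, install the Elasticsearch client. The major version of the client should match the major version of the cluster, which is 7.x in this guide:

pip3 install "elasticsearch>=7,<8"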
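
To confirm which client version pip actually installed, one option is to ask the standard library's importlib.metadata module (available from Python 3.8). This is a small sketch, and installed_version is a hypothetical helper name, not part of the Elasticsearch client.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("elasticsearch")
if version is None:
    print("The elasticsearch package is not installed.")
else:
    print("elasticsearch client version:", version)
```

The first number of the reported version is the client's major version, which can be compared with the version number returned by the cluster.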

Connecting the Elasticsearch Client

Once our environment is set up and configured, we can interact with Elasticsearch using the Python client.

Start by creating a Python file.

touch elastic.py
vim elastic.py

Ensure the cluster is up and running

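Before interacting with the Elasticsearch cluster, use the requests module to check that the service is up and running. If requests is not available, install it with pip3 install requests.

import requests

# Tagline returned by the Elasticsearch root endpoint.
substring = "You Know, for Search".encode()
try:
    response = requests.get("http://127.0.0.1:9200")
    is_up = substring in response.content
except requests.exceptions.ConnectionError:
    is_up = False  # nothing is listening on port 9200
if is_up:
    print("Elasticsearch is up and running!")
else:
    print("Something went wrong, ensure the cluster is up!")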

Save and run the file:

python3 elastic.py

Output:

Elasticsearch is up and running!
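
The check above runs only once. Right after the service starts, Elasticsearch may need a little time before it answers, so a retry loop can be useful. The sketch below is one way to do that. It uses the standard library's urllib instead of requests purely to avoid an extra dependency, and the names fetch_root and wait_for_cluster, as well as the attempts and delay parameters, are hypothetical.

```python
import json
import time
import urllib.error
import urllib.request

TAGLINE = "You Know, for Search"


def fetch_root(url, timeout=2.0):
    """Fetch the cluster's root endpoint and return the parsed JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_cluster(url="http://127.0.0.1:9200", attempts=5, delay=1.0,
                     fetch=fetch_root):
    """Return True as soon as the cluster answers with its tagline, else False."""
    for attempt in range(attempts):
        try:
            if fetch(url).get("tagline") == TAGLINE:
                return True
        except (urllib.error.URLError, OSError, ValueError):
            # Connection refused, timeout, or a body that is not valid JSON.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling wait_for_cluster() with the defaults tries the local endpoint up to five times, one second apart, and returns False if none of the attempts succeeds.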

Connect to the Elasticsearch cluster

To connect to the Elasticsearch cluster, we can use the following simple script:

import requests
from elasticsearch import Elasticsearch

substring = "You Know, for Search".encode()
response = requests.get("http://127.0.0.1:9200")
if substring in response.content:
    # Create the client only once the cluster has answered.
    es = Elasticsearch([{"host": "localhost", "port": 9200}])
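
Creating the client object does not by itself show that the client can reach the cluster. The client provides a ping() method that returns True or False, so a quick check could look like the sketch below. The helper names make_client, is_reachable, and main are hypothetical, and the sketch assumes the installed client version accepts the host dictionary form used in this guide. The import sits inside the function only so that the helpers can be loaded and tried with a stand-in object.

```python
def make_client(host="localhost", port=9200):
    """Create a client for a local cluster, using the same arguments as the script above."""
    from elasticsearch import Elasticsearch
    return Elasticsearch([{"host": host, "port": port}])


def is_reachable(es):
    """Return True if the cluster answers the client's ping request."""
    return bool(es.ping())


def main():
    es = make_client()
    if is_reachable(es):
        print("Connected to Elasticsearch.")
    else:
        print("Could not reach the cluster on localhost:9200.")
```

Running main() from elastic.py prints one of the two messages, depending on whether the cluster responds.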

Get document with Python

To get a document using the Python client, you can do:

res = es.get(index="index-name", id=1)
print(res['_source'])

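The above example prints the stored fields (the _source) of the queried document. If the index or the document does not exist, the client raises a NotFoundError instead.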
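
When a missing document is an expected case rather than an error, one option is to check for it first with the client's exists() method. The following is a minimal sketch. get_source_or_none is a hypothetical helper, and es is assumed to be a connected client such as the one created earlier.

```python
def get_source_or_none(es, index, doc_id):
    """Return the document's _source, or None if the document does not exist."""
    # exists() sends a HEAD request and evaluates as true or false.
    if not es.exists(index=index, id=doc_id):
        return None
    return es.get(index=index, id=doc_id)["_source"]
```

For example, get_source_or_none(es, "index-name", 1) returns a dictionary or None. Note that this makes two requests, so a document deleted between them would still cause get() to raise an error.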

Indexing a Document

To index a document, use the code:

from datetime import datetime
from elasticsearch import Elasticsearch
es = Elasticsearch([{"host": "localhost", "port": 9200}])
doc = {
    "author": "document-author",
    "text": "A text document",
    "timestamp": datetime.now(),
}
res = es.index(index="sample-index", id=2, body=doc)
print(res['result'])
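
The result field printed above is "created" the first time an ID is indexed and "updated" when the same ID is indexed again. The sketch below wraps the same call in two hypothetical helpers, build_doc and index_doc. As an assumption that differs from the example above, it stores the timestamp as a timezone-aware UTC string, since datetime.now() yields a local time without any offset.

```python
from datetime import datetime, timezone


def build_doc(author, text):
    """Build a document like the one above, with a UTC timestamp in ISO 8601 form."""
    return {
        "author": author,
        "text": text,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def index_doc(es, index, doc_id, doc):
    """Index the document and return the result field of the response."""
    res = es.index(index=index, id=doc_id, body=doc)
    return res["result"]


doc = build_doc("document-author", "A text document")
print(doc["timestamp"])
```

With a connected client, index_doc(es, "sample-index", 2, doc) would then return "created" or "updated".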

Deleting a document

To delete a document:

res = es.delete(index="index-name", id=1)
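
As with indexing, the response carries a result field, which is "deleted" on success. A small sketch of checking it follows. delete_doc is a hypothetical helper and es is assumed to be a connected client. Deleting a document that does not exist is expected to raise an exception from the client, which this sketch deliberately lets propagate.

```python
def delete_doc(es, index, doc_id):
    """Delete a document and return True if the response reports it as deleted."""
    res = es.delete(index=index, id=doc_id)
    return res["result"] == "deleted"
```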

Closing

This guide discussed how to set up Elasticsearch and use it from Python with the Elasticsearch Python client.

To learn how to use the full functionality of the Elasticsearch library, consult the official documentation.

About the author

John Otieno
