
Hugging Face Inference API with Python

Hugging Face is an open-source AI community that offers a vast variety of frameworks, tools, architectures, and models for building and interacting with AI and natural language processing models. Among its offerings is an application programming interface called the "Inference API". This inference API is used to deploy machine learning and AI models for decision-making and real-time predictions. The API allows developers to use pre-trained NLP models to make predictions on new data.

Syntax:

Hugging Face provides a variety of services, but one of its most widely used is the API, which lets different applications interact with pre-trained AI and large language models. Hugging Face provides APIs for different kinds of models, as listed in the following (an example of the endpoint pattern is sketched after this list):

  • Text generation models
  • Translation models
  • Sentiment analysis models
  • Models for building virtual agents (intelligent chatbots)
  • Classification and regression models
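
Each of these tasks is served through the same endpoint pattern: a base URL followed by a model identifier. The following is a minimal sketch of that pattern; the model IDs here are illustrative examples of publicly hosted models, not an exhaustive list.

# Every hosted model is reachable through one URL pattern:
#   https://api-inference.huggingface.co/models/<model-id>
# The model IDs below are illustrative examples only.
BASE_URL = "https://api-inference.huggingface.co/models/"

text_generation_url = BASE_URL + "gpt2"                                # text generation
translation_url = BASE_URL + "Helsinki-NLP/opus-mt-en-fr"              # translation
sentiment_url = BASE_URL + "cardiffnlp/twitter-roberta-base-sentiment" # sentiment analysis
summarization_url = BASE_URL + "facebook/bart-large-cnn"               # summarization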

Let's now walk through how to get a personalized inference API token from Hugging Face. To do so, we first register on the official Hugging Face website. Join the Hugging Face community by signing up with your credentials.

Once we have a Hugging Face account, we can request the inference API. To do so, go to the account settings and select "Access Tokens". A new window opens. Select the "New Token" option, then generate the token by providing a name for it and setting its role to "WRITE". A new token is generated; save this token. At this point, we have our token from Hugging Face. In the next example, we will see how to use this token with the inference API.
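
Since the token works like a password, it is good practice to keep it out of the script itself. The following is a minimal sketch of one way to do this in Python, assuming we export the token under an environment variable that we name "HF_TOKEN" (a name chosen for this sketch, not one that Hugging Face mandates):

import os

# Read the token from an environment variable set beforehand, e.g.:
#   export HF_TOKEN="hf_..."   (Linux/macOS)
hf_token = os.environ.get("HF_TOKEN")
if hf_token is None:
  raise RuntimeError("Set the HF_TOKEN environment variable first.")

# This header authenticates every request to the Inference API
headers = {"Authorization": f"Bearer {hf_token}"}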

Example 1: How to Prototype with Hugging Face Inference API

So far, we discussed how to get started with Hugging Face and generated a token. This example shows how to use this newly generated token to get an inference API for a specific machine learning model and make predictions through it. From the Hugging Face homepage, select any model that is relevant to your problem. Let's say we want to work with a text classification or sentiment analysis model, as shown in the following snippet of the model list:

We choose the sentiment analysis model from this list.

After selecting the model, its model card appears. This model card contains information about the model's training details and characteristics. Our model is a RoBERTa-base model that was trained on 58M tweets for sentiment analysis. It has three class labels and categorizes each input into the relevant one.

After selecting the model, click the "Deploy" button in the top right corner of the window; it opens a drop-down menu. From this menu, select the "Inference API" option.

The inference API page then explains how to use this specific model and lets us quickly create a prototype for the AI model. The inference API window displays ready-to-run Python code.

We copy this code and execute it in any Python IDE; we use Google Colab here. After the code runs, it returns an output with a score and a label prediction for each class. The label and score reflect our input, since we chose the sentiment analysis model. The input that we give to the model is a positive sentence, and the model was pre-trained on three class labels: label 0 means negative, label 1 means neutral, and label 2 means positive. Since our input is a positive sentence, the score for the positive label is higher than for the other two, which means the model predicted the sentence as positive.

import requests

# Endpoint of the sentiment analysis model chosen from the model hub
API_URL = "https://api-inference.huggingface.co/models/cardiffnlp/twitter-roberta-base-sentiment"

# Paste your own token here (or load it from the environment as sketched
# earlier); never publish a real token
headers = {"Authorization": "Bearer hf_YOUR_TOKEN_HERE"}

def query(payload):
  # Send the input to the Inference API and decode the JSON response
  response = requests.post(API_URL, headers=headers, json=payload)
  return response.json()

output = query({
  "inputs": "I feel good when you're with me",
})
print(output)
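
To make the raw response easier to interpret, we can map the class labels to readable names and pick the highest-scoring one. This is a small sketch based on the label convention described above (label 0 negative, label 1 neutral, label 2 positive); the parsing assumes the list of label/score candidates that the API returns for this model:

# Human-readable names for the model's three class labels
label_names = {"LABEL_0": "negative", "LABEL_1": "neutral", "LABEL_2": "positive"}

# The API may nest the candidate list one level deep; flatten if needed
predictions = output[0] if isinstance(output[0], list) else output

# Select the label with the highest score
best = max(predictions, key=lambda item: item["score"])
print(f"Predicted sentiment: {label_names[best['label']]} ({best['score']:.3f})")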

Output:

Example 2: Summarization Model through the Inference API

We follow the same steps as in the previous example and prototype a summarization model by using its inference API from Hugging Face. A summarization model is a pre-trained model that summarizes the text that we give to it as input. Go to the Hugging Face account, click on "Models" in the top menu bar, choose a model that is relevant to summarization, select it, and read its model card carefully.

The model that we chose is a pre-trained BART model that is fine-tuned on the CNN/Daily Mail dataset. BART is a sequence-to-sequence model with a bidirectional encoder (similar to BERT) and an autoregressive decoder. This model is effective when it is fine-tuned for comprehension, summarization, translation, and text-generation tasks.

Then, click the "Deploy" button in the top right corner and select "Inference API" from the drop-down menu. This opens another window that contains the code and directions for using the model with the inference API.

Copy this code and execute it in a Python shell.
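
Since the exact script depends on the chosen model, here is a minimal sketch of what it looks like, assuming the facebook/bart-large-cnn checkpoint (a BART model fine-tuned on CNN/Daily Mail) and the same placeholder token as before:

import requests

# Endpoint of the assumed summarization checkpoint
API_URL = "https://api-inference.huggingface.co/models/facebook/bart-large-cnn"

# Paste your own token here
headers = {"Authorization": "Bearer hf_YOUR_TOKEN_HERE"}

def query(payload):
  response = requests.post(API_URL, headers=headers, json=payload)
  return response.json()

output = query({
  "inputs": "Hugging Face hosts thousands of pre-trained models for natural "
            "language processing. Its Inference API lets developers send text "
            "to these models over HTTP and receive predictions without "
            "downloading or running anything locally.",
})

# The summarization task returns a list with one dictionary holding the summary
print(output[0]["summary_text"])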

The model returns the output, which is a summary of the input that we fed to it.

Conclusion

We worked with the Hugging Face Inference API and learned how to use this application programming interface to work with pre-trained language models. The two examples in this article were based on NLP models. The Hugging Face API can work wonders when we want to develop a fast prototype, since it provides quick integration of AI models into our applications. In short, Hugging Face offers models for a wide range of problems, from reinforcement learning to computer vision.

About the author

Omar Farooq

Hello readers, I am Omar and I have been writing technical articles for the last decade. You can check out my writing pieces.