How to Use Select by Similarity in LangChain?

LangChain is a framework for building chatbots and other applications on top of Large Language Models (LLMs), so users can interact with a model in natural language to get useful information. These models generate text after interpreting the prompt, and a few well-chosen examples in the prompt often improve the result. Example selectors pick those examples using different strategies, such as similarity search, so the examples included in the prompt match the user's query.

This blog will illustrate how to use the select-by-similarity search in LangChain.

How to Use Select by Similarity in LangChain?

The select-by-similarity example selector generates embeddings for the input and for every example in the set, then compares them using cosine similarity. The examples whose embeddings lie closest to the input's embedding are selected and placed in the prompt.
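To make the idea of cosine similarity concrete, here is a minimal sketch in plain Python. The three-dimensional vectors are made up for illustration; real embeddings come from an embedding model and have many more dimensions. The names cosine_similarity, example_vectors, and query_vector are hypothetical and not part of LangChain.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings"; real ones come from an embedding model.
example_vectors = {
    "tiny": [0.9, 0.1, 0.0],
    "hate": [0.1, 0.9, 0.2],
    "soft": [0.2, 0.1, 0.9],
}
query_vector = [0.2, 0.8, 0.3]  # stands in for the embedded user input

# Pick the example whose vector has the highest cosine similarity with the query.
best_match = max(
    example_vectors,
    key=lambda name: cosine_similarity(example_vectors[name], query_vector),
)
print(best_match)
```

With these toy vectors the query points in almost the same direction as "hate", so that example is selected. A vector store performs the same kind of nearest-neighbor comparison over the stored embeddings.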

Follow the steps below to use the select-by-similarity example selector in LangChain:

Step 1: Install Modules
First, install the LangChain framework along with its dependencies:

pip install langchain

Next, install the OpenAI module, which provides the embedding model that converts text into numerical vectors:

pip install openai

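FAISS is a similarity-search library that can also serve as a vector store for fetching the most similar examples. The code in this guide uses Chroma instead, so this install is optional; on a machine without a GPU, the faiss-cpu package can be installed in its place: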

pip install faiss-gpu

The tiktoken tokenizer is used by the OpenAI embeddings to split text into tokens before it is sent to the embedding model:

pip install tiktoken

Finally, install chromadb, the vector store used to hold the embeddings generated by OpenAIEmbeddings():

pip install chromadb

Import the “os” and “getpass” libraries, then set the OpenAI API key as an environment variable:

import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
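If the key may already be exported in the shell, a slightly more defensive variant avoids prompting on every run. This is only a sketch: ensure_api_key and its prompt_fn parameter are hypothetical helpers introduced here, not part of LangChain or the OpenAI module.

```python
import os
import getpass

def ensure_api_key(var_name="OPENAI_API_KEY", prompt_fn=getpass.getpass):
    """Prompt for the key only when the environment variable is missing or empty."""
    if not os.environ.get(var_name):
        os.environ[var_name] = prompt_fn(f"{var_name}: ")
    return os.environ[var_name]

# Usage in a notebook or script (prompts only if the key is not already set):
# ensure_api_key()
```

Passing the prompt function as a parameter keeps the helper easy to exercise without typing a real key.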

Step 2: Importing Libraries
After installing the required modules, import Chroma, SemanticSimilarityExampleSelector, OpenAIEmbeddings, and the prompt template classes from LangChain. Then define a prompt template that formats a single example, followed by the example set itself, in which every example has an input and an output field:

from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import FewShotPromptTemplate, PromptTemplate

# Template used to render each selected example
example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)

# Example set: each entry pairs a word with its antonym
examples = [
    {"input": "tiny", "output": "large"},
    {"input": "hate", "output": "love"},
    {"input": "ill", "output": "well"},
    {"input": "shrink", "output": "grow"},
    {"input": "soft", "output": "hard"},
]

Step 3: Building an Example Selector
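Next, build the example selector by calling SemanticSimilarityExampleSelector.from_examples() with the example set, the OpenAIEmbeddings() model, the Chroma vector store class, and k, the number of examples to select. Then pass the selector to a FewShotPromptTemplate together with a prefix and a suffix: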

example_selector = SemanticSimilarityExampleSelector.from_examples(
    examples,            # example set to choose from
    OpenAIEmbeddings(),  # embedding model used to measure similarity
    Chroma,              # vector store class that holds the embeddings
    k=1,                 # number of examples to select
)
similar_prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)
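To see what the formatted prompt is expected to look like, the following plain-Python sketch approximates how the prefix, the selected examples, and the suffix are joined. It does not call LangChain; build_prompt and selected are hypothetical names, the selection shown is hard-coded rather than produced by the selector, and the sketch assumes the parts are separated by a blank line.

```python
# Same strings as in the guide; `selected` stands in for what the selector would return.
prefix = "Give the antonym of every input"
suffix = "Input: {adjective}\nOutput:"
example_template = "Input: {input}\nOutput: {output}"
selected = [{"input": "hate", "output": "love"}]  # hypothetical selection

def build_prompt(adjective, separator="\n\n"):
    """Join the prefix, the formatted examples, and the filled-in suffix."""
    example_blocks = [example_template.format(**ex) for ex in selected]
    pieces = [prefix, *example_blocks, suffix.format(adjective=adjective)]
    return separator.join(pieces)

print(build_prompt("worried"))
```

The resulting string starts with the instruction, shows one worked example, and ends with the new input and an empty "Output:" for the model to complete.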

Step 4: Testing the Example Selector
Once the example selector is configured, test it by formatting the prompt with an input and printing the result:

print(similar_prompt.format(adjective="worried"))

The input describes a feeling, so the selector retrieves the example from the set whose input is semantically closest to it and places that example in the prompt between the prefix and the suffix.

Now add a new example to the selector with add_example() and format the prompt with another input:

similar_prompt.example_selector.add_example({"input": "enthusiastic", "output": "apathetic"})
print(similar_prompt.format(adjective="joyful"))
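The effect of adding an example can be sketched without any API calls. The class below is a toy stand-in, not LangChain's implementation: ToySimilaritySelector, cosine, and the hand-made two-dimensional vectors in toy_vectors are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class ToySimilaritySelector:
    """Hypothetical stand-in for a similarity selector; not LangChain's class."""

    def __init__(self, embed, k=1):
        self.embed = embed   # function mapping text -> vector
        self.k = k           # number of examples to return
        self.examples = []   # list of (vector, example dict) pairs

    def add_example(self, example):
        self.examples.append((self.embed(example["input"]), example))

    def select_examples(self, text):
        query = self.embed(text)
        ranked = sorted(self.examples, key=lambda pair: cosine(pair[0], query), reverse=True)
        return [example for _, example in ranked[: self.k]]

# Hand-made 2-D vectors: (size-related, mood-related). Purely illustrative.
toy_vectors = {
    "tiny": [1.0, 0.0],
    "hate": [0.1, 0.6],
    "enthusiastic": [0.0, 1.0],
    "joyful": [0.05, 0.95],
}
selector = ToySimilaritySelector(embed=toy_vectors.__getitem__)
selector.add_example({"input": "tiny", "output": "large"})
selector.add_example({"input": "hate", "output": "love"})
before = selector.select_examples("joyful")

selector.add_example({"input": "enthusiastic", "output": "apathetic"})
after = selector.select_examples("joyful")
print(before, after)
```

In this toy setup, "joyful" first matches "hate" because it is the only mood-related example; once "enthusiastic" is added, it becomes the closest match. Whether the real selector behaves the same way for these words depends on the embeddings the model produces.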

That is all about using the select by similarity example selector in LangChain.

Conclusion

To use the select-by-similarity example selector in LangChain, first install the required modules. Then import OpenAIEmbeddings, Chroma, SemanticSimilarityExampleSelector, and the prompt template classes, and define an example set. After that, build the example selector, pass it to a few-shot prompt template, and test it with an input so it fetches the most similar example from the set. This guide has illustrated the process of using the select-by-similarity example selector in LangChain.

About the author

Talha Mahmood