This blog illustrates how to use the select-by-similarity example selector in LangChain.
How to Use Select by Similarity in LangChain?
The select-by-similarity example selector generates embeddings for the input and for every example in the set, then compares them using cosine similarity. The examples whose embeddings are most similar to the input are selected and added to the prompt.
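To make the idea of cosine similarity concrete, here is a minimal sketch in plain Python. The three-dimensional vectors are made-up stand-ins for real embeddings (which come from an embedding model and have many more dimensions), and the helper names cosine_similarity and most_similar are hypothetical, not part of LangChain:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings"; real ones are produced by an embedding model
toy_embeddings = {
    "hate": [0.9, 0.1, 0.0],
    "tiny": [0.0, 0.2, 0.9],
    "ill": [0.3, 0.9, 0.1],
}

def most_similar(query_vector, embeddings):
    """Return the key whose vector has the highest cosine similarity to the query."""
    return max(
        embeddings,
        key=lambda name: cosine_similarity(query_vector, embeddings[name]),
    )

# Made-up query vector that points roughly in the same direction as "hate"
query = [0.8, 0.3, 0.1]
best = most_similar(query, toy_embeddings)
print(best)
```

The selector described in this guide follows the same principle, except that the vectors are generated by an embedding model and the search is delegated to a vector store.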
To learn how to use the select-by-similarity example selector in LangChain, go through the following steps:
Step 1: Install Modules
Firstly, install the LangChain framework, which contains the dependencies required for this process:
pip install langchain
After that, install the OpenAI module, which is used to generate embeddings that convert the text into numerical form:
pip install openai
FAISS is a vector store library that can fetch the most similar example from the set using the input provided by the user (the code in this guide uses Chroma for that purpose):
pip install faiss-cpu
The tiktoken tokenizer is required for splitting text into tokens, which OpenAIEmbeddings relies on when embedding longer inputs:
pip install tiktoken
Installing chromadb is another requirement; it is the vector store used to hold the embeddings generated by OpenAIEmbeddings:
pip install chromadb
Set up the environment for OpenAI by importing the “os” and “getpass” libraries and providing the API key:
import os
import getpass
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
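If the key may already be present in the environment, a small helper can avoid prompting for it again. This is only a sketch; ensure_api_key is a hypothetical helper name, and its env and prompt parameters exist purely so the function can be exercised without a real terminal:

```python
import os
import getpass

def ensure_api_key(env=os.environ, prompt=getpass.getpass):
    """Return the OpenAI key, prompting only when it is not already set."""
    if not env.get("OPENAI_API_KEY"):
        # getpass hides the typed key so it does not appear on the screen
        env["OPENAI_API_KEY"] = prompt("OpenAI API Key:")
    return env["OPENAI_API_KEY"]

# Usage in a notebook or script (prompts only if the variable is missing):
# ensure_api_key()
```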
Step 2: Importing Libraries
After installing the required modules, import Chroma, OpenAIEmbeddings, SemanticSimilarityExampleSelector, and the prompt templates from LangChain. After that, define the prompt template used to format each example and configure the example set, giving every example an input and an output:
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)
examples = [
    {"input": "tiny", "output": "large"},
    {"input": "hate", "output": "love"},
    {"input": "ill", "output": "well"},
    {"input": "shrink", "output": "grow"},
    {"input": "soft", "output": "hard"},
]
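To see what each example looks like once the template is filled in, the following sketch renders the example set with plain Python string formatting. Here str.format stands in for the prompt template, so no LangChain installation or API key is needed; render_example is a hypothetical helper name:

```python
# Same template string and example set as in the step above
template = "Input: {input}\nOutput: {output}"

examples = [
    {"input": "tiny", "output": "large"},
    {"input": "hate", "output": "love"},
    {"input": "ill", "output": "well"},
    {"input": "shrink", "output": "grow"},
    {"input": "soft", "output": "hard"},
]

def render_example(example):
    """Fill the {input} and {output} placeholders for one example."""
    return template.format(**example)

rendered = [render_example(example) for example in examples]
print(rendered[0])
```

Each rendered string is a two-line "Input / Output" pair, which is the form in which a selected example appears inside the final prompt.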
Step 3: Building an Example Selector
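Now, the next step is to build an example selector using the SemanticSimilarityExampleSelector.from_examples() method, which takes the example set, the OpenAIEmbeddings() embedding model, the Chroma vector store class, and the number of examples to select (k). The selector is then passed to a FewShotPromptTemplate along with the prefix and suffix of the prompt:
example_selector = SemanticSimilarityExampleSelector.from_examples(
    # Example set to select from
    examples,
    # Embedding model used to measure semantic similarity
    OpenAIEmbeddings(),
    # Vector store class that stores the embeddings and runs the similarity search
    Chroma,
    # Number of examples to select
    k=1
)
similar_prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)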
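The way the few-shot prompt is put together can be sketched without LangChain: the prefix, the formatted examples, and the suffix are joined with a blank line between them. In the sketch below, build_few_shot_prompt is a hypothetical helper, and the hate/love pair is only an assumed selection, since the real choice depends on the embeddings:

```python
def build_few_shot_prompt(prefix, selected_examples, suffix, adjective, separator="\n\n"):
    """Join prefix, formatted examples, and suffix into one prompt string."""
    example_template = "Input: {input}\nOutput: {output}"
    parts = [prefix]
    parts += [example_template.format(**example) for example in selected_examples]
    # The suffix carries the user's input through the {adjective} placeholder
    parts.append(suffix.format(adjective=adjective))
    return separator.join(parts)

prompt = build_few_shot_prompt(
    prefix="Give the antonym of every input",
    # Assumed selection for illustration; the selector decides this at runtime
    selected_examples=[{"input": "hate", "output": "love"}],
    suffix="Input: {adjective}\nOutput:",
    adjective="joyful",
)
print(prompt)
```

With k=1 only one example sits between the prefix and the suffix; a larger k would add more "Input / Output" pairs in the same position.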
Step 4: Testing the Example Selector
Once the example selector is configured successfully, test it by passing an input as the adjective argument of similar_prompt.format() and printing the result on the screen.
When the input is a feeling, the selector fetches the most semantically similar example from the set and includes it in the printed prompt.
Try another input using the similarity example selector to print the output on the screen:
print(similar_prompt.format(adjective="joyful"))
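To check which example was picked without printing the whole prompt, the selector can also be queried directly through its select_examples() method, which takes a dictionary of input variables and returns a list of example dictionaries. The sketch below wraps that call in a hypothetical show_selected helper and exercises it with FixedSelector, a made-up stand-in class that avoids any API calls; with the real example_selector from Step 3, the returned example depends on the embeddings:

```python
def show_selected(selector, adjective):
    """Print and return the examples a selector picks for one input."""
    selected = selector.select_examples({"adjective": adjective})
    for example in selected:
        print(f"{example['input']} -> {example['output']}")
    return selected

class FixedSelector:
    """Hypothetical stand-in that always returns its first example."""

    def __init__(self, examples):
        self.examples = examples

    def select_examples(self, input_variables):
        # A real selector would rank the examples by similarity to the input
        return self.examples[:1]

demo = show_selected(FixedSelector([{"input": "hate", "output": "love"}]), "joyful")

# With the selector built in Step 3 (requires an OpenAI API key):
# show_selected(example_selector, "joyful")
```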
That is all about using the select by similarity example selector in LangChain.
Conclusion
To use the select-by-similarity example selector in LangChain, first install the required modules. Once the installations are done, import libraries like OpenAIEmbeddings, Chroma, SemanticSimilarityExampleSelector, etc., and build an example set. After that, create an example selector, pass it to a few-shot prompt template, and test it with an input so it can fetch the most similar example from the set. This guide has illustrated that process step by step.