LangChain
What is LangChain?
LangChain is an open source framework that lets software developers working with artificial intelligence (AI) and its machine learning subset combine large language models (LLMs) with other external components to develop LLM-powered applications. The goal of LangChain is to link powerful LLMs, such as OpenAI's GPT-3.5 and GPT-4, to an array of external data sources so developers can create and reap the benefits of natural language processing (NLP) applications.
Developers, software engineers and data scientists with experience in the Python, JavaScript or TypeScript programming languages can use the LangChain packages offered in those languages. LangChain was launched as an open source project by co-founders Harrison Chase and Ankush Gola in 2022, and its initial version was released that same year.
Why is LangChain important?
LangChain is a framework that simplifies the process of creating generative AI application interfaces. Developers working on these types of interfaces use various tools to create advanced NLP apps; LangChain streamlines this process. For example, LLMs must access large volumes of data, so LangChain organizes that data so it can be accessed with ease.
In addition, GPT (Generative Pre-trained Transformer) models are generally trained only on data available up to a cutoff date before their public release. For instance, ChatGPT was released to the public near the end of 2022, but its knowledge base was limited to data from 2021 and earlier. LangChain can connect AI models to external data sources to give them knowledge of more recent information without retraining.
What are the features of LangChain?
LangChain is made up of the following modules that ensure the multiple components needed to make an effective NLP app can run smoothly:
- Model interaction. Also called model I/O, this module lets LangChain interact with any language model and perform tasks such as managing inputs to the model and extracting information from its outputs.
- Data connection and retrieval. Data that LLMs access can be transformed, stored in databases and retrieved from those databases through queries with this module.
- Chains. More complex apps might require other components, or even more than one LLM. This module links LLMs to one another or to other components in a sequence referred to as an LLM chain; a minimal chain sketch follows this list.
- Agents. The agent module lets LLMs decide the best steps or actions to take to solve problems. It does so by orchestrating a series of complex commands to LLMs and other tools to get them to respond to specific requests.
- Memory. The memory module helps an LLM remember the context of its interactions with users. Both short-term memory and long-term memory can be added to a model, depending on the specific use.
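To make the chain concept concrete, here is a minimal Python sketch. It assumes the classic langchain package layout used later in this article, an installed openai package and an OpenAI API key set in the environment; treat it as an illustration rather than the only way to build a chain.

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Model I/O layer: an OpenAI LLM (reads OPENAI_API_KEY from the environment).
llm = OpenAI(temperature=0)

# A reusable prompt template with one input variable.
prompt = PromptTemplate(
    input_variables=["product"],
    template="Suggest one catchy name for a company that makes {product}.",
)

# The chain links the prompt template to the model.
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run(product="solar-powered backpacks"))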
What are the integrations of LangChain?
LangChain typically builds applications using integrations with LLM providers and external sources where data can be found and stored. For example, LangChain can build chatbots or question-answering systems by integrating an LLM -- such as those from Hugging Face, Cohere and OpenAI -- with data sources or stores such as Apify Actors, Google Search and Wikipedia. This enables an app to take user-input text, process it and retrieve the best answers from any of these sources. In this sense, LangChain integrations make use of the most up-to-date NLP technology to build effective apps.
Other potential integrations include cloud storage platforms, such as Amazon Web Services, Google Cloud and Microsoft Azure, as well as vector databases. A vector database can store large volumes of high-dimensional data -- such as videos, images and long-form text -- as mathematical representations that make it easier for an application to query and search for those data elements. Pinecone is an example vector database that can be integrated with LangChain.
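As a rough illustration of such an integration, the sketch below stores a few text snippets in FAISS, an open source vector store that LangChain also supports. It assumes the faiss-cpu and openai packages are installed and an OpenAI API key is configured; the sample texts and query are made up for the example.

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Embed a handful of documents and store the vectors for similarity search.
texts = [
    "LangChain links language models to external data sources.",
    "Vector databases store embeddings of text, images and video.",
    "Pinecone and FAISS are common vector store integrations.",
]
vectorstore = FAISS.from_texts(texts, OpenAIEmbeddings())

# Retrieve the snippets most similar to a natural language query.
results = vectorstore.similarity_search("How does LangChain use vector stores?", k=2)
for doc in results:
    print(doc.page_content)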
How to create prompts in LangChain
Prompts serve as input to the LLM, instructing it to return a response, which is often an answer to a query. This response is also referred to as an output. A prompt must be designed and executed correctly to increase the likelihood of a well-written and accurate response from a language model. That is why prompt engineering is an emerging discipline that has received more attention in recent years.
Prompts can be generated easily in LangChain implementations using a prompt template, which will be used as instructions for the underlying LLM. Prompt templates can vary in specificity. They can be designed to pose simple questions to a language model. They can also be used to provide a set of explicit instructions to a language model with enough detail and examples to retrieve a high-quality response.
In Python, LangChain offers a premade prompt template that takes the form of structured text. The following steps are required to use it:
- Installing Python. A recent version of Python must be installed. Then, from a terminal or command prompt, enter the following command to install just the bare minimum requirements of LangChain for this example.
pip install langchain
- Adding integrations. LangChain typically requires at least one integration, and OpenAI is a prime example. To use OpenAI's LLM application programming interfaces (APIs), a developer must create an account on the OpenAI website and retrieve the API access key. Then install OpenAI's Python package from the command line and, in Python, supply the key for access to the APIs, as shown in the following snippet.
pip install openai

from langchain.llms import OpenAI

llm = OpenAI(openai_api_key="...")  # replace "..." with the API key retrieved from the OpenAI account
- Importing the prompt template. Once these basic steps are complete, LangChain's prompt template method must then be imported. The code snippet shown below does this.
from langchain.prompts import PromptTemplate

prompt_template = PromptTemplate.from_template(
    "Tell me an {adjective} fact about {content}."
)
prompt_template.format(adjective="interesting", content="zebras")
# Returns: "Tell me an interesting fact about zebras."
In this scenario, the prompt template takes two input variables -- the adjective and the content -- and the formatted prompt asks the language model to produce an interesting fact about zebras as its output.
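To complete the example, the formatted string can be handed to the OpenAI LLM created earlier. The short sketch below uses the predict call from the classic LangChain interface; newer releases expose an equivalent invoke method, so treat this as one illustrative option.

prompt = prompt_template.format(adjective="interesting", content="zebras")
response = llm.predict(prompt)  # sends the rendered prompt to the model
print(response)                 # e.g., a one-sentence fact about zebras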
How to develop applications in LangChain
LangChain is built to develop apps powered by language model functionality. There are different ways to do this, but the process typically entails some key steps:
- Define the application. An application developer must first define a specific use case for the application. This also means determining its scope, including requirements such as any needed integrations, components and LLMs.
- Build functionality. Developers use prompts to build the functionality or logic of the intended app; a minimal sketch of this step follows this list.
- Customize functionality. LangChain lets developers modify its code to create customized functionality that meets the needs of the use case and shapes the application's behavior.
- Fine-tune LLMs. It's important to choose the appropriate LLM for the job and to fine-tune it to meet the needs of the use case.
- Cleanse data. Data cleansing techniques ensure clean and accurate data sets. Security measures should also be implemented to protect sensitive data.
- Test. Regularly testing LangChain apps ensures they continue to run smoothly.
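As a hedged sketch of the build-functionality step, the snippet below reuses the llm and vectorstore objects from the earlier examples and wires them into a question-answering chain with LangChain's RetrievalQA class; the question is made up for illustration.

from langchain.chains import RetrievalQA

# Reuse the `llm` and `vectorstore` objects created in the earlier snippets.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",  # "stuff" simply inserts the retrieved text into the prompt
    retriever=vectorstore.as_retriever(),
)
print(qa_chain.run("What does LangChain connect language models to?"))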
Examples and use cases for LangChain
The LLM-based applications LangChain is capable of building can be applied to multiple advanced use cases within various industries and vertical markets, such as the following:
- Customer service chatbots. The most obvious use case is customer support chatbots. LangChain enables chat applications that are advanced enough to handle complex questions and even transactions from users. These applications can understand and maintain a user's context throughout a conversation in the same way ChatGPT can; a minimal chatbot sketch follows this list. AI is already being widely used to enhance customer experience and service.
- Coding assistants. It's possible to build coding assistants with the help of LangChain. Using LangChain and OpenAI's API, developers can create a tool to assist those in the tech sector with enhancing their coding skills and improving productivity.
- Healthcare. AI has made its way into healthcare in several ways. LLM-centric LangChain applications are helping doctors make diagnoses. They're also automating rote, repetitive administrative tasks, such as scheduling patient appointments, enabling healthcare workers to focus on more important work.
- Marketing and e-commerce. Businesses use e-commerce platforms with LLM functionality to better engage customers and expand their customer base. An application that can understand consumer purchasing patterns and product descriptions can generate product recommendations and compelling descriptions for potential customers.
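For the chatbot use case, a minimal sketch might pair the llm object from the earlier examples with LangChain's ConversationChain and ConversationBufferMemory so the model keeps context across turns; the customer messages are placeholders.

from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

# The memory module keeps earlier turns so follow-up questions stay in context.
chatbot = ConversationChain(llm=llm, memory=ConversationBufferMemory())

print(chatbot.predict(input="My order #1234 arrived damaged."))
print(chatbot.predict(input="What are my options for a replacement?"))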
Reaping the benefits of NLP is a key reason why LangChain is important. For a more in-depth understanding of NLP, there are two important subtopics to start with: natural language understanding (NLU) and natural language generation (NLG). The goal of NLU is to process a user's intended meaning, while the goal of NLG is to express an AI system's structured data in human-readable language.