Introduction
Generative AI, particularly generative Large Language Models, has taken the world by storm since its arrival. This was only possible because these models can integrate with many different applications, from generating working code to powering fully AI-managed chat support systems. But most Large Language Models in the Generative AI space have been closed to the public; most were not open-sourced. While a few open-source models do exist, they are nowhere near the closed-source Large Language Models. Recently, however, FalconAI released an LLM that topped the OpenLLM leaderboard and was made open source. In this guide, we will use this model to create a chat application with Falcon AI, LangChain, and Chainlit.
Learning Objectives
- To leverage the Falcon model in Generative AI applications
- To build a UI for Large Language Models with Chainlit
- To work with the Inference API to access pre-trained models on Hugging Face
- To chain Large Language Models and Prompt Templates with LangChain
- To integrate LangChain chains with Chainlit for building UI applications
This article was published as a part of the Data Science Blogathon.
What is Falcon AI?
In the Generative AI field, Falcon AI is one of the recently introduced Large Language Models, known for taking first place on the OpenLLM Leaderboard. Falcon AI was introduced by the UAE's Technology Innovation Institute (TII). Falcon AI's architecture is designed to be optimized for inference. When it was first introduced, Falcon AI topped the OpenLLM Leaderboard, moving ahead of state-of-the-art models like Llama, Anthropic, DeepMind, etc. The model was trained on the AWS Cloud with 384 GPUs attached continuously for two months.
At present, it consists of two models: Falcon 40B (40 billion parameters) and Falcon 7B (7 billion parameters). The main point is that the Falcon AI makers have stated that the model will be open source, thus allowing developers to work with it for commercial use without restrictions. Falcon AI even provides Instruct models, Falcon-7B-Instruct and Falcon-40B-Instruct, with which we can quickly get started building chat applications. In this guide, we will work with the Falcon-7B-Instruct model.
What is Chainlit?
The Chainlit library is similar to Python's Streamlit library, but the intended purpose of Chainlit is to build chat applications with Large Language Models quickly, i.e., to create a UI similar to ChatGPT. Developing conversational chat applications within minutes is possible with the Chainlit package. The library integrates seamlessly with LangFlow and LangChain (the library for building applications with Large Language Models), which we will do later in this guide.
Chainlit even allows for visualizing multi-step reasoning; it lets us see the intermediate results to understand how the Large Language Model reached its output for a question. So you can clearly see the model's chain of thoughts through the UI itself to understand how the LLM arrived at its conclusion. Chainlit is not restricted to text conversations; it also allows sending and receiving images to and from the respective Generative AI models. It even lets us update the Prompt Template in the UI instead of going back to the code and changing it.
Generating a HuggingFace Inference API Token
There are two ways to work with the Falcon-7B-Instruct model. One is the traditional way, where we download the model to the local machine and then use it directly. But because this is a Large Language Model, it needs high GPU memory to run. Hence we go with the other option: calling the model directly through the Inference API. The Inference API is a HuggingFace API token with which we can access all the transformer models in HuggingFace.
To access this token, we need to create an account on HuggingFace, which we can do by going to the official HuggingFace website. After logging in/signing up with your details, go to your profile and click on the Settings section. The process from there will be:
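To make the Inference API concrete, here is a minimal sketch of what a raw HTTP call to the hosted Inference API looks like. The URL pattern and payload shape follow HuggingFace's hosted Inference API conventions; the token `"hf_xxx"` is a placeholder, and in this guide we will actually use LangChain's `HuggingFaceHub` wrapper instead of calling the endpoint by hand:

```python
import json

# Hosted Inference API endpoint for the Falcon-7B-Instruct model
API_URL = "https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct"

def build_request(token: str, prompt: str):
    """Assemble the headers and JSON payload for one inference call."""
    headers = {"Authorization": f"Bearer {token}"}
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": 200}}
    return headers, payload

headers, payload = build_request("hf_xxx", "What is Falcon AI?")
# The actual call (needs network access and a real token) would then be:
#   requests.post(API_URL, headers=headers, json=payload)
print(json.dumps(payload))
```

This is only to show what the wrapper does for us behind the scenes; the rest of the guide stays at the LangChain level.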



In Settings, go to Access Tokens. You will create a new token, which we need in order to work with the Falcon-7B-Instruct model. Click on New Token to create a new token. Enter a name for the token and set the Role option to Write. Now click on Generate to generate the new token. With this token, we can access the Falcon-7B-Instruct model and build applications.
Preparing the Environment
Before we dive into our application, we will prepare an environment for the code to work in. For this, we need to install the necessary Python libraries. First, we will install the libraries that support the model, with a pip install of the libraries below.
$ pip install huggingface_hub
$ pip install transformers
These commands will install the HuggingFace Hub and Transformers libraries. These libraries call the Falcon-7B-Instruct model, which resides in HuggingFace. Next, we will install the LangChain library for Python.
$ pip install langchain
This installs the LangChain package for Python, which we will use to create our chat application with the Falcon Large Language Model. Finally, a conversational application is not complete without a UI, so we will download the chainlit library.
$ pip install chainlit
This installs the Chainlit library for Python. With the help of this library, we will build the UI for our conversational chat application. After installing chainlit, we need to test the package. For this, use the command below in the terminal.
chainlit hello

After entering this command, a new window with the address localhost and port 8000 will appear, and the UI will be visible. This tells us that the chainlit library is installed properly and ready to work with the other Python libraries.
Creating the Chat Application
In this section, we will start building our application. We have all the necessary libraries to go ahead and build our very own conversational chat application. The first thing we will do is import the libraries and store the HuggingFace Inference API token in an environment variable.
import os
import chainlit as cl
from langchain import HuggingFaceHub, PromptTemplate, LLMChain
os.environ['API_KEY'] = 'Your API Key'
- We start by importing the os, chainlit, and langchain libraries.
- From langchain, we import HuggingFaceHub. HuggingFaceHub will let us call the Falcon-7B-Instruct model through the Inference API and receive the responses generated by the model.
- The PromptTemplate is one of the components of LangChain, necessary for building applications based on a Large Language Model. It defines how the model should interpret the user's questions and in what context it should answer them.
- Finally, we import LLMChain from LangChain. LLMChain is the module that chains different LangChain components together. Here we will chain our Falcon-7B-Instruct Large Language Model with the PromptTemplate.
- Then we store our HuggingFace Inference API token in an environment variable, that is, os.environ['API_KEY'].
Instruct the Falcon Model
Now we will infer the Falcon Instruct model through the HuggingFaceHub module. For this, we must first provide the path to the model on Hugging Face. The code for this is:
model_id = 'tiiuae/falcon-7b-instruct'
falcon_llm = HuggingFaceHub(huggingfacehub_api_token=os.environ['API_KEY'],
                            repo_id=model_id,
                            model_kwargs={"temperature":0.8,"max_new_tokens":2000})
- First, we must give the id of the model we will work with. For us, it will be the Falcon-7B-Instruct model. The id of this model can be found directly on the HuggingFace website: 'tiiuae/falcon-7b-instruct'.
- Now we call the HuggingFaceHub module, where we pass the API token (assigned to an environment variable) and the repo_id, i.e., the id of the model we will work with.
- We also provide the model parameters, like the temperature and the maximum number of new tokens. Temperature controls how creative the model should be, where 1 means maximum creativity and 0 means none.
Now we have clearly defined what model we will work with, and the HuggingFace API will let us connect to this model and run our queries to start building our application.
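To build some intuition for the temperature parameter, here is a small self-contained sketch. The toy logits are made up purely for illustration (this is not Falcon's actual sampling code): lower temperature sharpens the next-token distribution toward the most likely token, while higher temperature flattens it, making sampling more varied.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw next-token scores into probabilities, scaled by temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
low_temp = softmax_with_temperature(toy_logits, 0.2)   # near-deterministic
high_temp = softmax_with_temperature(toy_logits, 0.8)  # flatter, more "creative"
print(low_temp[0], high_temp[0])
```

With temperature 0.2, almost all probability mass lands on the top token; with 0.8 (the value we passed in model_kwargs), the other candidates keep a meaningful share.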
Prompt Template
After the model selection, the next step is defining the Prompt Template. The Prompt Template tells the model how it should behave: how it should interpret the question provided by the user, and how it should arrive at the output for the user's query. The code for defining our Prompt Template is:
template = """
You're an AI assistant that gives useful solutions to consumer queries.
{query}
"""
prompt = PromptTemplate(template=template, input_variables=['question'])
The template variable above defines and sets the context of the Prompt Template for the Falcon model. The context here is simple: the AI needs to provide helpful answers to user queries, followed by the input variable {question}. This template, along with the variables defined in it, is then given to the PromptTemplate function and assigned to a variable. That variable is now the Prompt Template, which will later be chained together with the model.
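Under the hood, PromptTemplate is essentially doing string substitution. Here is a plain-Python sketch of what happens when the chain formats a question (this mirrors, rather than reproduces, LangChain's internals):

```python
template = """
You are an AI assistant that provides helpful answers to user queries.
{question}
"""

# str.format substitutes the input variable, just as prompt.format(question=...) does:
formatted = template.format(question="What are the colours in the Rainbow?")
print(formatted)
```

The formatted string, context plus question, is what actually gets sent to the model.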
Chain Both Models
Now we have both the Falcon LLM and the Prompt Template ready. The final part is chaining them together. We will use the LLMChain object from the LangChain library for this. The code is:
falcon_chain = LLMChain(llm=falcon_llm,
                        prompt=prompt,
                        verbose=True)
With the help of LLMChain, we have chained the Falcon-7B-Instruct model with our very own PromptTemplate. We have also set verbose=True, which is helpful for understanding what happens when the code runs. Now let's test the model by giving it a query:
print(falcon_chain.run("What are the colours in the Rainbow?"))

Here, we asked the model what the rainbow colours are. The rainbow contains the VIBGYOR colours (Violet, Indigo, Blue, Green, Yellow, Orange, and Red). The output generated by the Falcon 7B Instruct model is spot on. Setting the verbose option lets us see the Prompt after formatting and shows us where the chain starts and ends. Finally, we are ready to create a UI for our conversational chat application.
Chainlit – UI for Large Language Models
In this section, we will work with the Chainlit package to create the UI for our application. Chainlit is a Python library that lets us build chat interfaces for Large Language Models in minutes. It is integrated with LangFlow and even LangChain, the library we worked with previously. Creating the chat interface with Chainlit is simple. We have to write the following code:
@cl.langchain_factory(use_async=False)
def factory():
    prompt = PromptTemplate(template=template, input_variables=['question'])
    falcon_chain = LLMChain(llm=falcon_llm,
                            prompt=prompt,
                            verbose=True)
    return falcon_chain
Steps
- First, we start with the decorator from Chainlit for LangChain, @cl.langchain_factory.
- Then we define a factory function that contains the LangChain code. The code we need here is the Prompt Template and the LLMChain module of LangChain, which builds and chains our Falcon LLM.
- Finally, the return value must be a LangChain instance. Here, we return the final chain created, i.e., the LLMChain instance, falcon_chain.
- use_async=False tells the code not to use the async implementation for the LangChain agent.
Let’s Run the Code!
That's it. Now when we run the code, a chat interface will appear. But how is this possible? The thing is, Chainlit takes care of everything. Behind the scenes, it manages the webhook connections and is responsible for creating a separate LangChain instance (Chain, Agent, etc.) for each user that visits the site. To run our application, we type the following in the terminal.
$ chainlit run app.py -w
The -w flag enables auto-reload whenever we make live changes to our application code. After entering this, a new tab opens with localhost:8000.

This is the opening page, i.e., the welcome screen of Chainlit. We see that Chainlit builds an entire chat interface for us with just a single decorator. Let's try interacting with the Falcon model through this UI.


We see that the UI and the Falcon Instruct model are working perfectly fine. The model provides swift answers to the questions asked. It actually tried to explain the second question based on the user's context (explain to a 5-year-old). This is just the beginning of what we can achieve with these open-source Generative AI models. With a few modifications, we will be able to create far more problem-oriented, real-scenario-based applications.
Because the chat interface is a website, it is entirely possible to host it on any cloud platform. We can containerize the application and then deploy it to any container-based service on Google Cloud, AWS, Azure, or other cloud providers. With that, we can share our applications with the outside world.
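As a rough sketch, containerizing the app could look like the following. The base image, file layout, and unpinned package versions are all assumptions for illustration, not a tested deployment recipe (app.py is assumed to sit in the current directory, as in this guide):

```shell
# Write a minimal Dockerfile for the Chainlit app
cat > Dockerfile <<'EOF'
FROM python:3.10-slim
WORKDIR /app
COPY . .
RUN pip install huggingface_hub transformers langchain chainlit
EXPOSE 8000
CMD ["chainlit", "run", "app.py", "--host", "0.0.0.0", "--port", "8000"]
EOF

# Build and run (requires Docker; pass the HuggingFace token at runtime):
#   docker build -t falcon-chat .
#   docker run -p 8000:8000 -e API_KEY=your_api_key falcon-chat
```

Passing the token via -e at runtime, rather than baking it into the image, keeps the API key out of the container layers.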
Conclusion
In this walkthrough, we have seen how to build a simple chat application with the new open-source Falcon Large Language Model, LangChain, and Chainlit. We leveraged these three packages and interconnected them to create a full-fledged solution from code to working application. We have also seen how to obtain the HuggingFace Inference API key to access thousands of pre-trained models in the HuggingFace library. With the help of LangChain, we chained the LLM with custom Prompt Templates. Finally, with Chainlit, we could create a chat application interface around our LangChain Falcon model within minutes.
Some of the key takeaways from this guide include:
- Falcon is an open-source model and one of the most powerful LLMs, currently at the top of the OpenLLM Leaderboard
- With Chainlit, it is possible to create a UI for an LLM within minutes
- The Inference API lets us connect to many different models in HuggingFace
- LangChain helps in building custom Prompt Templates for Large Language Models
- Chainlit's seamless integration with LangChain lets us build LLM applications quicker and with fewer errors
Frequently Asked Questions
Q1. What is the HuggingFace Inference API?
A. The Inference API is created by HuggingFace, allowing you to access thousands of pre-trained models in the HuggingFace library. With this API, you can access a variety of models, including Generative AI models, Natural Language Processing models, Audio Classification, and Computer Vision models.
Q2. Are the Falcon models really state-of-the-art?
A. They are. Especially the Falcon 40B (40 billion parameters) model. This model has surpassed other state-of-the-art models like Llama and DeepMind and took the top place on the OpenLLM Leaderboard.
Q3. What is Chainlit?
A. Chainlit is a Python library developed for creating UIs. With Chainlit, creating ready-to-work chat interfaces for Large Language Models within minutes is possible. The Chainlit package integrates seamlessly with LangFlow and LangChain, other packages used to create applications with Large Language Models.
Q4. Are the Falcon models open source?
A. Yes. The Falcon 40B (40 billion parameters) and the Falcon 7B (7 billion parameters) are open-sourced, which means anyone can work with these models to create commercial applications without restrictions.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author's discretion.
