🧠 Build a Local FAQ Chatbot with RAG, Streamlit & Ollama
Turn a simple text or PDF file into a local AI chatbot using Ollama, LangChain, and Streamlit
In my previous article, I built a local chatbot that could read and answer questions based on my personal notes using RAG (Retrieval-Augmented Generation) and Ollama.
Now, let’s take it a step further. This time, I wanted to:
- Use a structured FAQ document as the source
- Add a simple UI using Streamlit
- Keep everything running locally, fast and private
🤖 What We’re Building
We’re building a chatbot that:
- Reads your FAQ document (TXT or PDF)
- Uses RAG to search relevant answers
- Runs entirely on your machine (thanks to Ollama)
- Has a browser-based UI using Streamlit
🧾 Sample FAQ Data
Here’s the format I used in my text file:
Q: How can I reset my password?
A: Go to Settings > Account > Reset Password.
Q: Does the app work offline?
A: Yes. You can enable Offline Mode from the Settings menu.
Q: How do I export my data?
A: Navigate to Profile > Export Data to download all your information.
Q: Where can I find support?
A: Reach out to us via the Help section or email support@example.com.
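Since each entry is a self-contained Q/A pair, one simple design choice (not necessarily what the repo does) is to chunk the file by pair, so an answer never gets separated from its question. Here’s a minimal sketch, assuming pairs are separated by blank lines and the file is named faq.txt:

```python
# Minimal sketch: one chunk per "Q: ... A: ..." pair.
# Assumes blank lines between pairs; "faq.txt" is a placeholder path.
def split_faq(text: str) -> list[str]:
    return [
        block.strip()
        for block in text.split("\n\n")
        if block.strip().startswith("Q:")
    ]

with open("faq.txt", encoding="utf-8") as f:
    for pair in split_faq(f.read()):
        print(pair, "\n---")
```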
📂 You can find the full sample FAQ data in my GitHub repo:
👉 https://github.com/deepak786/faq-chatbot
🛠️ Backend Setup with RAG
This is powered by the LangChain + Ollama stack:
- Document Loading: Load .txt or .pdf files using LangChain’s loaders
- Text Splitting: Divide content into manageable chunks
- Embeddings: Use nomic-embed-text via Ollama to convert chunks into vectors
- Vector DB: Store and search chunks using Chroma
- LLM: Query the documents using a local model like llama3.2
All of this is written in Python; the full code is in the GitHub repo linked above, and a sketch of how the pieces fit together is below.
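Here’s a minimal sketch of the pipeline, assuming the langchain-community, langchain-text-splitters, langchain-ollama, and langchain-chroma packages are installed and a local Ollama server has both models pulled. The repo’s actual code may be organized differently.

```python
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_ollama import ChatOllama, OllamaEmbeddings
from langchain_chroma import Chroma

# 1. Document loading ("faq.txt" is a placeholder path).
docs = TextLoader("faq.txt").load()

# 2. Text splitting into manageable chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# 3. Embeddings via Ollama, stored in a Chroma vector DB.
store = Chroma.from_documents(chunks, OllamaEmbeddings(model="nomic-embed-text"))

# 4. Retrieve the chunks most similar to the user's question.
question = "How can I reset my password?"
context = "\n\n".join(
    doc.page_content for doc in store.similarity_search(question, k=3)
)

# 5. Ask the local LLM to answer from the retrieved context only.
llm = ChatOllama(model="llama3.2")
reply = llm.invoke(
    f"Answer using only these FAQs:\n{context}\n\nQuestion: {question}"
)
print(reply.content)
```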
💬 Add a Streamlit UI
To make the chatbot more user-friendly, I added a basic Streamlit interface.
What it includes:
- Input box for your question
- Answer area powered by the local LLM
You can edit faq_chatbot.py to customize the appearance or logic.
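For reference, the core of the Streamlit layer can be this small. In this sketch, answer_question() is a hypothetical stand-in for the repo’s actual RAG call:

```python
# Minimal Streamlit UI; run with: streamlit run faq_chatbot.py
import streamlit as st

def answer_question(question: str) -> str:
    # Hypothetical stand-in: replace with the real retrieval + LLM call.
    return f"(answer for: {question})"

st.title("Local FAQ Chatbot")

question = st.text_input("Ask a question about the FAQs:")
if question:
    with st.spinner("Searching the FAQs..."):
        st.write(answer_question(question))
```

One thing to keep in mind: Streamlit reruns the script on every interaction, so in a real app the vector store should be cached (for example with st.cache_resource) instead of being rebuilt on each question.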
🎥 Demo
Here’s a quick demo of the chatbot in action:
🛑 Bonus: Common Issues with RAG
Even though Retrieval-Augmented Generation (RAG) is a powerful technique, it isn’t perfect. One of the most common issues is that RAG can miss semantically relevant information if the user’s query doesn’t closely match the wording in the source documents.
🔍 Example: Missed Retrieval
Let’s say a user asks: “How to change my username?”
But the FAQs only contain: “How can I change my email address?”
RAG might not retrieve anything useful, because “username” and “email address” are different terms, even if the user’s intent is similar.
With a strict prompt like:
Rules:
1. Only answer based on the provided FAQs
2. If the exact question isn't in the FAQs, respond with "I don't have information about that in the FAQs"
The chatbot will respond: “I don’t have information about that in the FAQs.”
✅ This is the correct response — the model is following the rules.
❌ But from a user perspective, it feels like a failure — especially when there might be related info in the data.
🛠️ How to Fix It:
Query Expansion, which enriches the original query with synonyms or related terms, can help retrieve better context. Read more about query expansion here: https://deepakdroid.medium.com/smarter-rag-improving-faq-chatbots-with-query-expansion-ed5bcfc1beac
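As a rough illustration of the idea (a sketch, not the linked article’s exact approach), you can ask the same local model to rephrase the question before searching:

```python
# Sketch of query expansion with the same local model; the prompt
# wording is illustrative.
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.2")

def expand_query(query: str) -> str:
    """Generate rephrasings so at least one may match the FAQ wording."""
    prompt = (
        "Rewrite the following question three different ways using "
        f"synonyms and related terms, one per line:\n{query}"
    )
    variants = llm.invoke(prompt).content
    # Search the vector store with the original query plus each variant,
    # then merge the retrieved chunks before prompting the LLM.
    return f"{query}\n{variants}"

print(expand_query("How to change my username?"))
```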
✅ Conclusion
In just a few steps, we created a local chatbot that:
- Reads FAQ documents
- Answers questions using a local LLM
- Runs with a clean UI in your browser
And all of it without calling external APIs or sending your data to the cloud. 🚀
🔗 Check out the full code on GitHub:
👉 https://github.com/deepak786/faq-chatbot
Happy coding! :)