A production-ready Python RAG (Retrieval-Augmented Generation) Chatbot with Streamlit UI. Chat normally or upload PDFs to have AI-powered conversations about your documents.
- 💬 Normal Chat Mode: Have natural conversations with the AI
- 📚 RAG Mode: Upload PDFs and ask questions about their content
- 🔄 Auto-Switch: Automatically switches between normal and RAG mode
- 💾 Persistent Storage: Vector embeddings are saved locally
- ⚙️ Configurable: Adjust chunk size and overlap via UI
- ⚡ Super Fast: Uses Groq API for lightning-fast inference (free tier available)
| Component | Technology |
|---|---|
| LLM | Groq (Llama 3.3 / Mixtral) |
| Embeddings | sentence-transformers (all-MiniLM-L6-v2) |
| Vector Store | FAISS |
| PDF Parsing | PyPDF2 |
| UI | Streamlit |
| Orchestration | LangChain |
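The retrieval stack in the table above can be sketched end to end. This toy version swaps the real pieces for stand-ins (a bag-of-words counter in place of all-MiniLM-L6-v2 embeddings, and a linear scan in place of a FAISS index), but the ranking idea is the same: embed the query and the chunks, then return the top-k most similar chunks.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; the real app uses sentence-transformers.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query, chunks, k=4):
    # Rank chunks by similarity to the query (FAISS does this at scale).
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = ["FAISS indexes dense vectors", "Streamlit renders the chat UI"]
print(top_k("which library indexes vectors", chunks, k=1)[0])
# → FAISS indexes dense vectors
```

The real pipeline replaces `embed` with a neural encoder and `top_k` with an approximate-nearest-neighbor search, but the interface is the same shape.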
Make sure you have Python 3.10 or higher installed:

```bash
python --version
```

Get your free API key from Groq Console:
- Visit https://console.groq.com/keys
- Sign in or create an account
- Click "Create API Key"
- Copy the key for use in the setup
```bash
cd rag_chatbot

# Create virtual environment
python -m venv venv

# Activate it
# Windows:
venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate
```

Install the dependencies:

```bash
pip install -r requirements.txt
```

Set your API key. Choose one of these methods:
**Option A: Environment Variable (Recommended)**

```bash
# Windows PowerShell:
$env:GROQ_API_KEY = "your-api-key-here"

# Windows Command Prompt:
set GROQ_API_KEY=your-api-key-here

# macOS/Linux:
export GROQ_API_KEY="your-api-key-here"
```

**Option B: Direct in config.py**

Edit `config.py` and add your key:

```python
GROQ_API_KEY = "your-api-key-here"
```

Then launch the app:

```bash
streamlit run app.py
```

The app will open in your browser at http://localhost:8501.
**Normal Chat Mode**

- Simply type your message in the chat input
- Press Enter to send
- The AI will respond naturally

**RAG Mode**

1. Upload PDFs: Use the sidebar to upload one or more PDF files
2. Configure Chunking (optional): Adjust chunk size and overlap
3. Process: Click "Process PDFs" to index the documents
4. Ask Questions: Type questions about your documents
5. View Sources: Expand the "Sources" section to see where answers came from
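The chunk size and overlap settings control how documents are split before embedding. A rough character-based sketch of the idea (the app's actual splitter, via LangChain, is recursive and smarter about sentence boundaries): each chunk is `chunk_size` characters long, and consecutive chunks share `chunk_overlap` characters so context is not cut off mid-thought.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    # Slide a window of chunk_size characters, advancing by
    # (chunk_size - chunk_overlap) so neighbors share overlap characters.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "x" * 2500
chunks = chunk_text(doc)
print(len(chunks))      # → 4
print(len(chunks[0]))   # → 1000
```

Larger chunks give the model more context per retrieved passage; larger overlap reduces the chance that an answer straddles a chunk boundary, at the cost of storing redundant text.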
```
rag_chatbot/
│
├── app.py                 # Streamlit entry point
├── config.py              # Configuration settings
├── requirements.txt       # Python dependencies
├── README.md              # This file
│
├── loaders/
│   ├── __init__.py
│   └── pdf_loader.py      # PDF loading & parsing
│
├── embeddings/
│   ├── __init__.py
│   └── embedder.py        # Sentence transformer embeddings
│
├── vectorstore/
│   ├── __init__.py
│   └── store.py           # FAISS vector storage
│
├── retriever/
│   ├── __init__.py
│   └── retriever.py       # Document retrieval
│
├── chat/
│   ├── __init__.py
│   └── conversation.py    # Groq chat integration
│
├── prompts/
│   ├── __init__.py
│   ├── chat_prompt.py     # Normal chat prompts
│   └── rag_prompt.py      # RAG-specific prompts
│
├── utils/
│   ├── __init__.py
│   └── helpers.py         # Utility functions
│
└── data/
    ├── pdfs/              # Uploaded PDFs (auto-created)
    └── vectorstore/       # Saved embeddings (auto-created)
```
Edit `config.py` to customize:

```python
# Change the Groq model
GROQ_MODEL = "llama-3.3-70b-versatile"  # or "mixtral-8x7b-32768", etc.

# Adjust default chunking
DEFAULT_CHUNK_SIZE = 1000
DEFAULT_CHUNK_OVERLAP = 200

# Modify retrieval settings
TOP_K_DOCUMENTS = 4
SIMILARITY_THRESHOLD = 0.3
```

**Error: Groq API key not found**
Solution:
- Make sure you've set the `GROQ_API_KEY` environment variable
- Or add it directly to `config.py`
- Get a free key at: https://console.groq.com/keys
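The key lookup typically works like this sketch (the helper name `resolve_api_key` is illustrative, not the app's actual function): the environment variable wins, and a value from `config.py` is the fallback.

```python
import os

def resolve_api_key(config_key=None):
    # Hypothetical helper: environment variable takes precedence,
    # then fall back to a key hard-coded in config.py.
    key = os.environ.get("GROQ_API_KEY") or config_key
    if not key:
        raise RuntimeError(
            "Groq API key not found: set GROQ_API_KEY or edit config.py"
        )
    return key

os.environ["GROQ_API_KEY"] = "gsk-demo"  # placeholder value
print(resolve_api_key())
# → gsk-demo
```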
**Error: Rate limit reached**

Solution:
- Groq's free tier has limits on requests per minute (RPM) and tokens per minute (TPM).
- Wait a few seconds before retrying.
- Switch to a smaller model like `llama-3.1-8b-instant` in `config.py` if needed.
Contributions are welcome! Feel free to:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
- Groq for lightning-fast AI inference
- Sentence Transformers for embeddings
- FAISS for vector search
- Streamlit for the UI framework
- LangChain for orchestration