A modern, responsive chat interface for interacting with PDF documents through AI. Upload any PDF and start asking questions about its content with our sleek dark-mode UI.
RAG PDF Chatbot is a Retrieval-Augmented Generation (RAG) chatbot built with FastAPI and Llama-2. It processes PDF documents and lets users interact with their content through a conversational interface, leveraging language models and embeddings to provide accurate, context-aware responses. The chatbot runs locally and completely offline, which improves security, and it supports any model file in GGUF format or integration with LMStudio for added flexibility. (A minimal sketch of the retrieval pipeline follows the feature list below.)
- Load and process PDF documents.
- Upload PDF files dynamically via the frontend.
- Generate embeddings using `sentence-transformers`.
- Use `Chroma` as a vector store for efficient retrieval.
- Integrate with Llama-2 for natural language understanding and generation. (Any other open-source model in GGUF format can be used, or the model can be replaced with LMStudio.)
- FastAPI backend with endpoints for chat, file upload, and health checks.
- Next.js-based frontend for user interaction.
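To make the flow above concrete, here is a minimal, self-contained sketch of the general RAG pattern (load, chunk, embed, store, retrieve, generate). It is illustrative only, not the exact code in `app.py`; the embedding model name, chunk size, sample file name, collection name, and prompt format are all assumptions.

```python
# Minimal RAG sketch: embed PDF chunks into Chroma, retrieve, answer with a GGUF model.
# Illustrative only -- model names, chunk size, and prompt format are assumptions.
import chromadb
from sentence_transformers import SentenceTransformer
from llama_cpp import Llama
from pypdf import PdfReader

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
llm = Llama(model_path="models/llama-2-7b-chat.Q4_K_M.gguf", n_ctx=2048)
collection = chromadb.Client().create_collection("pdf_chunks")

# 1. Load the PDF and split it into fixed-size chunks.
text = "".join(page.extract_text() or "" for page in PdfReader("sample.pdf").pages)
chunks = [text[i:i + 500] for i in range(0, len(text), 500)]

# 2. Embed each chunk and store it in the vector store.
collection.add(
    ids=[str(i) for i in range(len(chunks))],
    documents=chunks,
    embeddings=embedder.encode(chunks).tolist(),
)

# 3. Retrieve the chunks most similar to the question and generate an answer.
question = "What is this document about?"
hits = collection.query(query_embeddings=embedder.encode([question]).tolist(), n_results=3)
context = "\n".join(hits["documents"][0])
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"
print(llm(prompt, max_tokens=256)["choices"][0]["text"])
```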
- Python 3.10 or higher
- CUDA-enabled GPU (optional, for faster processing)
- Clone the repository:

  ```bash
  git clone https://github.com/Anshulgada/RAG-Chatbot.git
  cd RAG-Chatbot
  ```

- Install dependencies (using the `uv` package manager is recommended):

  ```bash
  uv venv
  uv sync
  ```

- Ensure the Llama-2 model file is placed in the `models/` directory (a quick load check is sketched after these steps):
  - File: `llama-2-7b-chat.Q4_K_M.gguf`

- Start the FastAPI server:

  ```bash
  uvicorn app:app --reload
  ```

- Navigate to the frontend directory and start the React app:

  ```bash
  cd frontend
  npm install
  npm run dev
  ```
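Before starting the server, you can optionally confirm that the model file loads. This is a quick sanity check with llama-cpp-python, not a required step; the `n_ctx` and `n_gpu_layers` values shown are illustrative, not the project's settings.

```python
# Optional sanity check: confirm llama-cpp-python can load the GGUF file.
# n_ctx and n_gpu_layers are illustrative values, not the project's settings.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-2-7b-chat.Q4_K_M.gguf",
    n_ctx=2048,        # context window size
    n_gpu_layers=-1,   # offload all layers to GPU if available; use 0 for CPU-only
)
print(llm("Q: What is 2 + 2? A:", max_tokens=8)["choices"][0]["text"])
```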
- Access the backend at `http://0.0.0.0:8000`.
- Use the `/upload` endpoint to upload a PDF file for processing.
- Use the `/chat` endpoint to send chat messages and receive responses (a scripted example follows this list).
- Open the frontend at `http://localhost:3000` for a user-friendly interface.
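For scripted use, the same endpoints can be called without the frontend. Below is a sketch using `requests`; the field names (`file`, `message`) are assumptions about the request schema, so check `app.py` for the exact shape.

```python
# Calling the backend directly (field names are assumed; see app.py for the real schema).
import requests

BASE = "http://localhost:8000"

# Upload a PDF for processing.
with open("sample.pdf", "rb") as f:
    print(requests.post(f"{BASE}/upload", files={"file": f}).json())

# Ask a question about the uploaded document.
resp = requests.post(f"{BASE}/chat", json={"message": "Summarize chapter one."})
print(resp.json())
```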
- File Upload: Upload a PDF file directly from the interface.
- Chat Interface: Type messages and receive responses in real-time.
- Responsive Design: Optimized for both light and dark modes.
- `POST /upload`: Upload a PDF file for processing.
- `POST /chat`: Send a chat message and receive a response.
- `GET /`: Health check endpoint.
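As a rough picture of how these routes fit together, here is a stripped-down FastAPI sketch. The handler bodies and the request schema are placeholders, not the actual `app.py` implementation.

```python
# Skeleton of the three endpoints (placeholder logic; see app.py for the real implementation).
from fastapi import FastAPI, File, UploadFile
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    message: str  # assumed request schema

@app.get("/")
def health():
    return {"status": "ok"}

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    pdf_bytes = await file.read()  # the real app chunks, embeds, and indexes the PDF here
    return {"filename": file.filename, "size": len(pdf_bytes)}

@app.post("/chat")
def chat(req: ChatRequest):
    return {"response": f"(retrieved answer for: {req.message})"}  # RAG pipeline goes here
```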
- `app.py`: FastAPI backend implementation.
- `frontend/`: React-based frontend.
- `models/`: Directory for storing the Llama-2 model.
- `Harry Potter and the Sorcerers Stone.pdf`: Sample PDF for testing.
- `pyproject.toml`: Project dependencies and configuration.
For further details and updates on llama-cpp-python, refer to the llama-cpp-python GitHub repository and its documentation.
Before installing PyTorch, ensure that the CUDA Toolkit is downloaded and installed. The CUDA Toolkit version must match the CUDA version your PyTorch build targets. For example, if you install a PyTorch wheel built for CUDA 12.8 (`cu128`), the CUDA Toolkit should also be version 12.8.
- For the latest CUDA Toolkit, visit: NVIDIA CUDA Downloads
- If you need an older version of the CUDA Toolkit, visit: NVIDIA CUDA Toolkit Archive
At the time of writing, the latest CUDA Toolkit release is version 13, while the latest PyTorch wheels target CUDA 12.9. Ensure compatibility by downloading matching versions.
Torch should be installed based on your system configuration. For Windows machines using CUDA and Python, visit the PyTorch Get Started page to find the appropriate installation command.
For example, if you are using CUDA version cu128, the installation command would be:
```bash
uv pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
```

Replace `cu128` with the CUDA version current at the time. Always refer to the PyTorch website for the latest instructions.
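After installing, a quick check confirms that PyTorch was built against the CUDA version you expect:

```python
# Verify the installed PyTorch build and whether a CUDA GPU is usable.
import torch

print(torch.__version__)           # e.g. "2.7.0+cu128" -- the suffix shows the CUDA build
print(torch.version.cuda)          # CUDA version the wheel was compiled against
print(torch.cuda.is_available())   # True if the driver and toolkit are set up correctly
```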
Anshul Gada
