# RAG Chatbot

This repository contains a Retrieval-Augmented Generation (RAG) chatbot that leverages OpenAI's GPT models and Pinecone for semantic search and retrieval. The chatbot answers user queries by retrieving relevant documents, reranking them, and generating a response with a language model. It is built with a FastAPI backend and a Streamlit frontend.
## Table of Contents

- What is RAG?
- Features
- Project Structure
- Setup Instructions
- Environment Variables
- How to Run
- Evaluation
- Usage
- Model Details
- Contributing
- License
## What is RAG?

Retrieval-Augmented Generation (RAG) is a framework that combines information retrieval with generative models. Instead of relying solely on a language model's training data, RAG retrieves relevant documents from an external knowledge base (e.g., Pinecone) and uses them as context for generating responses. This approach improves the accuracy and relevance of responses, especially for domain-specific queries.
The pipeline works in three stages:

1. **Retrieval**: The chatbot first retrieves relevant document chunks using a vector similarity search in Pinecone, so the most contextually relevant data is available for response generation.
2. **Augmentation**: The retrieved document chunks are passed to the LLM as context, ensuring that the response is generated from actual document content rather than generic knowledge.
3. **Generation**: The LLM synthesizes a response from the retrieved context, keeping answers accurate and grounded in the uploaded documents.
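The retrieve-then-generate flow above can be sketched in a few lines of Python. This is a minimal, self-contained illustration: toy in-memory vectors stand in for a Pinecone index, and a templated string stands in for a real GPT call, so all names and embeddings here are hypothetical.

```python
import math

# Toy "index": (text, embedding) pairs standing in for a Pinecone index.
# Real embeddings would come from a model such as all-MiniLM-L6-v2.
DOCS = [
    ("RAG combines retrieval with generation.", [0.9, 0.1, 0.0]),
    ("Streamlit builds simple web frontends.",  [0.1, 0.9, 0.0]),
    ("FastAPI serves the backend API.",         [0.0, 0.2, 0.9]),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, top_k=2):
    # Vector similarity search: rank documents by cosine similarity.
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

def generate(question, context_chunks):
    # Stand-in for an LLM call: a real system would build a prompt from
    # the question plus the retrieved chunks and send it to the model.
    context = " ".join(context_chunks)
    return f"Based on the documents: {context}"

query_vec = [1.0, 0.0, 0.1]  # pretend embedding of the user question
chunks = retrieve(query_vec)
answer = generate("What is RAG?", chunks)
```

In the real pipeline, `retrieve` is a Pinecone query, a reranking step reorders the hits, and `generate` is a chat-completion request with the chunks in the prompt.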
## Features

- **Document Retrieval**: Uses Pinecone to retrieve relevant documents based on user queries.
- **Reranking**: Reranks retrieved documents using a SentenceTransformer model for better relevance.
- **Generative Responses**: Generates responses using OpenAI's GPT-3.5-turbo model.
- **Frontend**: A user-friendly Streamlit interface for interacting with the chatbot.
- **Backend**: A FastAPI-based backend for handling queries and managing retrieval logic.
## Project Structure

```
├── backend/
│   ├── lambda_handler.py
│   └── requirements.txt
├── frontend/
│   ├── app.py
│   ├── query_handler.py
│   ├── data_ingestion.py
│   └── requirements.txt
├── evaluation/
│   └── evaluation.py
├── Dockerfile
├── env_template
└── README.md
```
- **Backend**: Handles query processing, document retrieval, and response generation.
- **Frontend**: Provides a web interface for users to interact with the chatbot.
- **Model**: Pretrained SentenceTransformer model for embedding and reranking.
- **Evaluation**: Contains scripts for evaluating the chatbot's performance.
## Setup Instructions

### Prerequisites

- Python 3.9 or higher
- Docker (optional, for containerized deployment)
- Pinecone account and API key
- OpenAI API key
### Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/ajith_vernekar/rag-chatbot.git
   cd rag-chatbot
   ```

2. Set up the environment variables:

   - Copy the `env_template` file to `.env`:

     ```bash
     cp env_template .env
     ```

   - Fill in the required values in the `.env` file:

     ```
     OPENAI_API_KEY=<your_openai_api_key>
     PINECONE_API_KEY=<your_pinecone_api_key>
     PINECONE_ENVIRONMENT=<your_pinecone_environment>
     INDEX_NAME=<your_pinecone_index_name>
     BASE_URL=<backend_base_url>
     ```

3. Create a virtual environment and install dependencies:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   pip install -r backend/requirements.txt
   pip install -r frontend/requirements.txt
   ```
## Environment Variables

The project requires the following environment variables:

- **OpenAI configuration**:
  - `OPENAI_API_KEY`: Your OpenAI API key for GPT models.
- **Pinecone configuration**:
  - `PINECONE_API_KEY`: Your Pinecone API key.
  - `PINECONE_ENVIRONMENT`: The Pinecone environment (e.g., `us-west1-gcp`).
  - `INDEX_NAME`: The name of the Pinecone index.
- **Backend configuration**:
  - `BASE_URL`: The public URL used to access the backend service.
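A small sketch of how the backend might read these settings at startup, failing fast when any are missing. This is an illustration, not the project's actual configuration code; the `load_config` helper is hypothetical.

```python
import os

# The variable names documented above.
REQUIRED = ["OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_ENVIRONMENT",
            "INDEX_NAME", "BASE_URL"]

def load_config():
    """Read the required settings, raising early if any are missing."""
    missing = [name for name in REQUIRED if not os.getenv(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: os.environ[name] for name in REQUIRED}
```

If you use `python-dotenv`, calling `load_dotenv()` before `load_config()` will populate the process environment from the `.env` file.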
## How to Run

### Run Locally

1. Navigate to the backend folder and start the backend server:

   ```bash
   cd backend
   uvicorn lambda_handler:app --host 0.0.0.0 --port 8000
   ```

2. Navigate to the frontend folder and start the Streamlit app:

   ```bash
   cd frontend
   streamlit run app.py
   ```

3. Open your browser and go to `http://localhost:8501`.

### Run with Docker

1. Build the Docker image:

   ```bash
   docker build -t rag-chatbot .
   ```

2. Run the Docker container:

   ```bash
   docker run -p 8080:8080 rag-chatbot
   ```

   The backend will be accessible at `http://localhost:8080`.
## Evaluation

The evaluation script assesses the performance of the RAG chatbot pipeline using metrics such as `context_recall`, `faithfulness`, and `answer_relevancy`.

1. **Set up environment variables**: Ensure the following variable is set in your `.env` file:

   - `OPENAI_API_KEY`: Your OpenAI API key.

2. **Install dependencies**:

   ```bash
   pip install -r requirements.txt
   pip install ragas
   ```

3. **Run the evaluation script** from the repository root, so the `evaluation` package is importable:

   ```bash
   python -m evaluation.evaluation
   ```

The results are saved to `evaluation/evaluation_results.csv` and also printed to the console.

### Metrics

- **Context Recall**: Measures how well the retrieved documents align with the context of the question.
- **Faithfulness**: Evaluates whether the generated answers are consistent with the retrieved documents.
- **Answer Relevancy**: Assesses the relevance of the generated answers to the questions.

The evaluation script processes a set of predefined questions and reference answers from the book *Atomic Habits*. It queries the RAG API, retrieves documents, and generates answers, which are then evaluated against the reference answers.
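To illustrate the shape of the saved results, here is a sketch of writing per-question scores to CSV. The column names and scores below are hypothetical examples, not the script's actual schema; the real values are computed by ragas.

```python
import csv
import io

# Hypothetical per-question results; the real script computes these with ragas.
rows = [
    {"question": "What is a habit loop?", "answer": "...",
     "context_recall": 0.91, "faithfulness": 0.88, "answer_relevancy": 0.93},
]

def write_results(rows, fh):
    # One CSV row per evaluated question, with a header row first.
    writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)

buf = io.StringIO()  # in the real script this would be the CSV file on disk
write_results(rows, buf)
```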
### Troubleshooting

- **Validation errors**: Ensure that the `retrieved_documents` field in the API response is a list of strings.
- **API errors**: Check that `BASE_URL` and `OPENAI_API_KEY` are correctly configured in the `.env` file.
- **Dependencies**: Ensure all required libraries are installed and compatible with your Python version.
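The first check above can be automated with a small helper that validates a response before it reaches the evaluation code. The function name is hypothetical; only the `retrieved_documents` field comes from the troubleshooting notes.

```python
def validate_response(payload):
    """Check that a /query response has the shape the evaluation expects.

    Raises ValueError when retrieved_documents is missing or is not a
    list of strings, which is the validation error described above.
    """
    docs = payload.get("retrieved_documents")
    if not isinstance(docs, list) or not all(isinstance(d, str) for d in docs):
        raise ValueError("retrieved_documents must be a list of strings")
    return docs
```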
## Usage

### API Endpoints

- **Test endpoint**: `GET /`
- **Query endpoint**: `POST /query`
  - Request body:

    ```json
    {
      "user_input": "What is RAG?",
      "openai_api_key": "<your_openai_api_key>"
    }
    ```

  - Response:

    ```json
    {
      "response": "RAG stands for Retrieval-Augmented Generation..."
    }
    ```
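The query endpoint can be exercised from Python. The sketch below only builds and prints the request body; the commented-out lines show how one might send it with the `requests` library against a running backend.

```python
import json
# import requests  # uncomment to actually send the request

BASE_URL = "http://localhost:8000"  # or the deployed backend URL

def build_query(user_input, openai_api_key):
    # Request body matching the /query endpoint documented above.
    return {"user_input": user_input, "openai_api_key": openai_api_key}

payload = build_query("What is RAG?", "<your_openai_api_key>")
print(json.dumps(payload, indent=2))
# resp = requests.post(f"{BASE_URL}/query", json=payload, timeout=30)
# print(resp.json()["response"])
```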
### Streamlit Frontend

1. Enter your OpenAI API key in the sidebar.
2. Upload a document for indexing.
3. Ask questions about the uploaded document.
## Model Details

The project uses the `all-MiniLM-L6-v2` model from SentenceTransformers. This model maps sentences and paragraphs to a 384-dimensional dense vector space, making it suitable for tasks like semantic search and clustering.

- **Source**: Hugging Face Model Hub
- **Usage**:
  - For embedding: `SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')`
  - For fine-tuning: Refer to the training scripts in the model folder.
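As a self-contained illustration of what these 384-dimensional vectors enable, the sketch below substitutes a deterministic toy "embedding" for the real model and reranks candidate texts by cosine similarity, which is the same operation the chatbot's reranking step performs with real vectors. The toy embedding is purely illustrative and carries no semantic meaning; in the actual project, `model.encode()` produces the vectors.

```python
import hashlib
import math

DIM = 384  # all-MiniLM-L6-v2 output dimensionality

def toy_embed(text):
    """Deterministic stand-in for model.encode(); NOT semantically meaningful."""
    vec = []
    for i in range(DIM):
        h = hashlib.sha256(f"{i}:{text}".encode()).digest()
        vec.append(int.from_bytes(h[:4], "big") / 2**32 - 0.5)
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rerank(query, candidates):
    # Score every candidate against the query and sort best-first.
    q = toy_embed(query)
    return sorted(candidates, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
```

With real embeddings, texts that mean similar things land close together in the vector space, so the sort order reflects semantic relevance rather than string identity.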
## Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository.
2. Create a new branch:

   ```bash
   git checkout -b feature-name
   ```

3. Commit your changes:

   ```bash
   git commit -m "Add feature-name"
   ```

4. Push to the branch:

   ```bash
   git push origin feature-name
   ```

5. Open a pull request.
## License

This project is licensed under the Apache 2.0 License. See the LICENSE file for details.
