RavulaTharun/FinSight-RAG

FinSight RAG – Financial Document Question Answering

FinSight is a Retrieval-Augmented Generation (RAG) application that lets users upload financial PDF reports and query them using semantic search and an LLM. It supports text and table extraction, chunking, embedding, vector search, and conversational memory.

This version includes only the Flask backend and is prepared for deployment on Hugging Face Spaces with Docker.


Features

  • Upload financial PDFs (one at a time)
  • Extract text and tables using pdfplumber and Camelot
  • Chunk and embed using sentence-transformers
  • Store and retrieve embeddings with FAISS
  • Query using semantic search (top-k)
  • Generate answers through a Groq-hosted LLM
  • Maintain a short conversation history
  • API-based backend suitable for any frontend
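
The chunking and top-k retrieval steps listed above can be illustrated with a minimal, dependency-free sketch. It is not the code in chunker.py or embedder.py: the character-window chunking, the `size` and `overlap` defaults, and the function names are assumptions, and a plain cosine-similarity scan stands in for the FAISS index. In the real backend the vectors would come from sentence-transformers rather than being typed by hand.

```python
import math
from typing import List, Sequence, Tuple


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into character windows of `size` that overlap by `overlap`.

    The window size and overlap are illustrative values, not the project's.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (chunk_index, score) for the k chunks most similar to the query."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    print(chunk_text("abcdefghij", size=4, overlap=2))
    # Toy 2-dimensional "embeddings", chosen only to exercise the ranking.
    print(top_k([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], k=2))
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.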

Project Structure

/
├── backend/
│   ├── app.py
│   ├── ingest.py
│   ├── embedder.py
│   ├── chunker.py
│   ├── groq_client.py
│   ├── vectorstore/
│   │   ├── index.faiss
│   │   └── metadata.json
│   └── uploads/
│
├── frontend/
│   ├── index.html
│   ├── package.json
│   ├── package-lock.json
│   ├── postcss.config.js
│   ├── tailwind.config.ts
│   ├── vite.config.ts
│   ├── tsconfig.json
│   ├── public/
│   │   ├── favicon.png
│   │   └── logo.svg
│   └── src/
│       ├── App.tsx
│       ├── main.tsx
│       ├── index.css
│       ├── lib/
│       │   └── utils.ts
│       ├── components/
│       │   ├── navbar.tsx
│       │   ├── pdfUploader.tsx
│       │   ├── chatInput.tsx
│       │   ├── chunkList.tsx
│       │   └── ui/
│       │       ├── button.tsx
│       │       ├── card.tsx
│       │       ├── textarea.tsx
│       │       ├── input.tsx
│       │       ├── tooltip.tsx
│       │       ├── dialog.tsx
│       │       ├── avatar.tsx
│       │       ├── scroll-area.tsx
│       │       └── toast.tsx
│       ├── pages/
│       │   ├── home.tsx
│       │   └── chat.tsx
│       ├── hooks/
│       │   └── useChat.ts
│       └── api/
│           ├── queryClient.ts
│           └── api.ts
│
├── requirements.txt
├── Dockerfile
├── app.yaml
└── README.md


API Endpoints

1. Health Check

GET /api/health

2. Upload PDF

POST /api/upload

Body: multipart form data containing the PDF file

3. Query Document

POST /api/query

JSON body: { "query": "your question" }

4. Get Specific Chunk

GET /api/chunk/<chunk_id>

5. Reset Server State

POST /api/reset
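
As a rough illustration of how a client might call these endpoints, the sketch below builds the requests with Python's standard library only. The base URL assumes the local server described under Running Locally, and the helper names are hypothetical. The response format is not documented here, so `send` returns the raw body as text. The upload endpoint is left out because building a multipart body by hand is verbose.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:7860"  # assumption: server running locally


def health_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """GET /api/health"""
    return urllib.request.Request(f"{base_url}/api/health", method="GET")


def query_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """POST /api/query with a JSON body of the form {"query": ...}."""
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chunk_request(chunk_id, base_url: str = BASE_URL) -> urllib.request.Request:
    """GET /api/chunk/<chunk_id>, with the id percent-encoded for the path."""
    safe_id = urllib.parse.quote(str(chunk_id), safe="")
    return urllib.request.Request(f"{base_url}/api/chunk/{safe_id}", method="GET")


def reset_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """POST /api/reset with an empty body."""
    return urllib.request.Request(f"{base_url}/api/reset", data=b"", method="POST")


def send(req: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Requires the backend to be running and a PDF to have been uploaded.
    print(send(health_request()))
    print(send(query_request("What was the total revenue last year?")))
```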


Environment Variables

Set on Hugging Face (recommended):

GROQ_API_KEY=<your_key>

Or create a .env file locally:

GROQ_API_KEY=your_key
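
One way the backend could read this key at startup is sketched below. The function name and error message are hypothetical, and how groq_client.py actually loads the key is not shown in this README. Note that `os.environ` does not read a .env file by itself: the file has to be loaded into the environment first (for example with a dotenv loader, if the project uses one) or the variable exported in the shell.

```python
import os
from typing import Mapping, Optional


def get_groq_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GROQ_API_KEY from the environment, failing fast if it is missing.

    `env` defaults to os.environ; passing a dict makes the function easy to test.
    """
    source = os.environ if env is None else env
    key = source.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GROQ_API_KEY is not set; add it as a Hugging Face secret "
            "or export it locally before starting the server"
        )
    return key
```

Failing at startup gives a clearer error than letting the first query fail with an authentication error from the LLM provider.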

Running Locally

Create environment:

python3.10 -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate
pip install -r requirements.txt

Run server:

python backend/app.py

Backend starts at:

http://localhost:7860

Deployment (Hugging Face)

app.yaml

runtime: docker
app_file: backend/app.py
port: 7860

Add secret key

Hugging Face → Settings → Variables and Secrets:

GROQ_API_KEY=your_key

Push the repo to your HF Space. HF then builds the Docker container and runs the Flask app.

About

AI-powered semantic search and insights from financial documents.
