Proposed cookbook
A notebook demonstrating how to build a RAG pipeline for Indian financial regulatory documents using Claude.
Why this would be valuable
Indian financial regulation has unique challenges that make it an excellent RAG case study:
- Cross-referencing circulars - SEBI circulars reference and amend each other (e.g., "as amended by SEBI/HO/IMD/DF2/CIR/P/2021/024")
- Table-heavy documents - Regulatory thresholds, fee structures, and compliance limits sit in tables, and rows lose their headers and surrounding context when chunked naively
- Amendment chains - Understanding a regulation requires tracing its full amendment history
- Domain-specific terminology - BFSI jargon that generic embeddings handle poorly
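To make the cross-referencing and amendment-chain points concrete, here is a minimal sketch of how circular IDs could be pulled out of text and followed transitively. The regex is an assumption generalized from the single example ID quoted above and would need checking against real SEBI numbering; the `EXAMPLE` IDs and document texts are placeholders, not real circulars, and the function names are hypothetical.

```python
import re

# Assumed ID shape, generalized from the one example in this issue:
# "SEBI/HO/<dept segments>/<year>/<serial>". Real circulars may differ.
CIRCULAR_ID = re.compile(r"\bSEBI/HO/(?:[A-Za-z0-9_-]+/)+\d{4}/\d+\b")


def extract_circular_refs(text: str) -> list[str]:
    """Return circular IDs in order of first appearance, without duplicates."""
    seen: dict[str, None] = {}
    for match in CIRCULAR_ID.finditer(text):
        seen.setdefault(match.group(0), None)
    return list(seen)


def build_reference_graph(docs: dict[str, str]) -> dict[str, list[str]]:
    """Map each circular ID to the other circular IDs its text mentions."""
    return {
        doc_id: [ref for ref in extract_circular_refs(text) if ref != doc_id]
        for doc_id, text in docs.items()
    }


def trace_chain(graph: dict[str, list[str]], start: str) -> list[str]:
    """Follow references depth-first from `start`, visiting each ID once."""
    order: list[str] = []
    visited: set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue  # guards against circulars that reference each other
        visited.add(node)
        order.append(node)
        stack.extend(reversed(graph.get(node, [])))
    return order


# Placeholder corpus: only the first ID comes from this issue.
ORIGINAL = "SEBI/HO/IMD/DF2/CIR/P/2021/024"
AMEND_1 = "SEBI/HO/EXAMPLE/CIR/P/2022/001"  # hypothetical
AMEND_2 = "SEBI/HO/EXAMPLE/CIR/P/2023/002"  # hypothetical

docs = {
    ORIGINAL: "Placeholder text of the original circular.",
    AMEND_1: f'Placeholder text, "as amended by {ORIGINAL}".',
    AMEND_2: f"Placeholder text modifying {AMEND_1}.",
}
graph = build_reference_graph(docs)
chain = trace_chain(graph, AMEND_2)
```

Starting from the latest placeholder circular, `chain` lists every circular that would need to be retrieved alongside it. A real pipeline would also need to distinguish "amends" from "merely mentions", which this sketch does not attempt.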
What the notebook would cover
- Parsing SEBI/RBI circular PDFs with structure preservation
- Domain-specific chunking strategies for regulatory text
- Hybrid search (semantic + keyword) for regulatory QA
- Citation tracking back to specific circular numbers
- Evaluation on real regulatory questions
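As a rough idea of what the hybrid search and citation tracking items could look like, the sketch below fuses a keyword ranking and a similarity ranking with reciprocal rank fusion, and keeps the source circular ID on every chunk so answers can cite it. Everything here is an assumption for illustration: the bag-of-words cosine is only a stand-in for a real embedding model, the `Chunk` type and function names are hypothetical, and the chunk texts and IDs are placeholders rather than real regulatory content.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    circular_id: str  # carried with the text so results can be cited
    text: str


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_rank(query: str, chunks: list[Chunk]) -> list[int]:
    """Rank chunk indices by how many chunk tokens match a query term."""
    terms = set(tokenize(query))
    scores = [sum(1 for t in tokenize(c.text) if t in terms) for c in chunks]
    return sorted(range(len(chunks)), key=lambda i: -scores[i])


def similarity_rank(query: str, chunks: list[Chunk]) -> list[int]:
    """Stand-in for embedding search: cosine over bag-of-words counts."""
    qv = Counter(tokenize(query))

    def cosine(chunk: Chunk) -> float:
        cv = Counter(tokenize(chunk.text))
        dot = sum(qv[t] * cv[t] for t in qv)
        norm = math.sqrt(sum(v * v for v in qv.values())) * math.sqrt(
            sum(v * v for v in cv.values())
        )
        return dot / norm if norm else 0.0

    scores = [cosine(c) for c in chunks]
    return sorted(range(len(chunks)), key=lambda i: -scores[i])


def rrf_fuse(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Reciprocal rank fusion: each ranking adds 1 / (k + rank) per item."""
    fused: Counter = Counter()
    for ranking in rankings:
        for rank, idx in enumerate(ranking, start=1):
            fused[idx] += 1.0 / (k + rank)
    return sorted(fused, key=lambda i: -fused[i])


def hybrid_search(query: str, chunks: list[Chunk], top_n: int = 3) -> list[Chunk]:
    order = rrf_fuse([keyword_rank(query, chunks), similarity_rank(query, chunks)])
    return [chunks[i] for i in order[:top_n]]


# Placeholder chunks; IDs and wording are invented for the example.
chunks = [
    Chunk("EXAMPLE/2021/001", "Placeholder text about nomination of directors"),
    Chunk("EXAMPLE/2022/002", "Placeholder text about the expense ratio limit for schemes"),
    Chunk("EXAMPLE/2023/003", "Placeholder text about disclosure timelines"),
]
results = hybrid_search("expense ratio limit", chunks, top_n=2)
citations = [c.circular_id for c in results]
```

Rank fusion is used here because it combines the two result lists without needing their scores to be on the same scale; a weighted score blend would be another reasonable choice. The `citations` list is what would be passed to Claude alongside the chunk text so the answer can name the circulars it relied on.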
Context
I'm building this as RegAI and would be happy to contribute a cleaned-up notebook version. India's financial services industry is one of the fastest-growing sources of Claude use cases, so a cookbook for this domain would help many developers.
Related
- The existing RAG cookbook covers general use cases well
- This would add a domain-specific, non-English-centric example
- Data sources are all publicly available (sebi.gov.in, rbi.org.in)