This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
This is a Course Materials RAG (Retrieval-Augmented Generation) system - a full-stack web application that enables semantic search and AI-powered question answering over course documents. The system uses ChromaDB for vector storage, Anthropic's Claude for generation, and serves a web interface for user interaction.
# Quick start (recommended)
./run.sh
# Manual start
cd backend && uv run uvicorn app:app --reload --port 8000
# Development with different port (if 8000 is occupied)
cd backend && uv run uvicorn app:app --reload --port 8001

# Install dependencies
uv sync
# Create environment file (required)
cp .env.example .env
# Then edit .env with your ANTHROPIC_API_KEY

The system follows a modular RAG architecture with clear separation of concerns:
RAGSystem (backend/rag_system.py) - Main orchestrator that coordinates all components:
- DocumentProcessor: Chunks course documents into searchable segments
- VectorStore: Manages ChromaDB for semantic search using sentence transformers
- AIGenerator: Handles Anthropic Claude API calls for response generation
- SessionManager: Maintains conversation context and history
- ToolManager: Coordinates search tools for enhanced retrieval
Data Models (backend/models.py):
- Course: Contains title, instructor, lessons, and course metadata
- Lesson: Individual lessons with titles and links
- CourseChunk: Text segments with course/lesson context for vector storage
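
To make the relationship between the three models concrete, here is a minimal sketch using standard-library dataclasses. The field names (`lesson_number`, `link`, `chunk_index`, and so on) are assumptions for illustration; the actual definitions in backend/models.py may use different names, types, or a different base class.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Lesson:
    """One lesson inside a course (field names are hypothetical)."""
    lesson_number: int
    title: str
    link: Optional[str] = None


@dataclass
class Course:
    """Course-level metadata plus its lessons (field names are hypothetical)."""
    title: str
    instructor: Optional[str] = None
    lessons: List[Lesson] = field(default_factory=list)


@dataclass
class CourseChunk:
    """A text segment that carries its course/lesson context into the vector store."""
    content: str
    course_title: str
    lesson_number: Optional[int] = None
    chunk_index: int = 0


# Illustrative data only; not taken from the repository.
course = Course(title="Intro to RAG", instructor="A. Instructor")
course.lessons.append(Lesson(lesson_number=1, title="Embeddings"))
chunk = CourseChunk(
    content="Embeddings map text to vectors.",
    course_title=course.title,
    lesson_number=1,
)
```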
Frontend Architecture:
- Vanilla HTML/CSS/JS with pink-themed UI (frontend/style.css)
- JavaScript handles API communication and UI updates (frontend/script.js)
- FastAPI serves both static files and API endpoints
- Uses Claude Sonnet 4 model (claude-sonnet-4-20250514)
- Embedding model: all-MiniLM-L6-v2
- Document chunks: 800 characters with 100 character overlap
- Conversation history: 2 messages retained
- ChromaDB path: ./chroma_db
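
The chunk size and overlap settings interact in a way that is easier to see in code. The sketch below is plain fixed-width character chunking with the 800/100 values listed above; `chunk_text` is a hypothetical helper, and the real DocumentProcessor may well split on sentence or lesson boundaries instead.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> List[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share `overlap` characters, so each new window
    starts (chunk_size - overlap) characters after the previous one.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")

    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# A 2000-character document yields windows starting at 0, 700, and 1400.
sample = "".join(str(i % 10) for i in range(2000))
pieces = chunk_text(sample)
```

With these defaults, the last 100 characters of one chunk are repeated at the start of the next, which helps keep context that would otherwise be cut at a chunk boundary.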
- POST /api/query: Submit questions, returns AI response with sources
- GET /api/courses: Retrieve course statistics and titles
- /: Static file serving for frontend
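
For quick manual testing of the query endpoint, a small standard-library client can be enough. This is a sketch under assumptions: the JSON field names `query` and `session_id` are guesses at the request schema, and the base URL assumes the server is running locally on port 8000 as in the manual start command. Check the request model in the backend before relying on it.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # assumes the default local dev port


def build_query_request(question: str, session_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/query.

    The payload keys "query" and "session_id" are assumptions, not confirmed
    against the backend's request model.
    """
    payload = {"query": question}
    if session_id is not None:
        payload["session_id"] = session_id
    return urllib.request.Request(
        f"{BASE_URL}/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, session_id: Optional[str] = None) -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(build_query_request(question, session_id)) as resp:
        return json.load(resp)


request = build_query_request("What is covered in lesson 1?")
```

Calling `ask("What is covered in lesson 1?")` with the server running should return the decoded response body, which according to the endpoint description includes the AI response and its sources.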
- Course documents (.txt files) placed in docs/ folder
- DocumentProcessor extracts course metadata and lessons from structured text
- Content is chunked and stored in ChromaDB with course/lesson context
- User queries trigger semantic search + Claude generation with retrieved context
- Session manager maintains conversation history for follow-up questions
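
The bounded conversation history used for follow-up questions can be pictured as a fixed-size queue. The sketch below is an assumption about the approach, not the actual SessionManager: `SessionHistory` and its methods are hypothetical names, and the limit of 2 comes from the configuration listed earlier.

```python
from collections import deque

MAX_HISTORY = 2  # number of messages retained, per the configuration above


class SessionHistory:
    """Keeps only the most recent messages of one conversation (hypothetical helper)."""

    def __init__(self, max_messages: int = MAX_HISTORY) -> None:
        # A deque with maxlen silently drops the oldest entry when full.
        self._messages = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        self._messages.append((role, content))

    def as_prompt_context(self) -> str:
        """Render the retained messages as plain text for the next prompt."""
        return "\n".join(f"{role}: {content}" for role, content in self._messages)


history = SessionHistory()
history.add("user", "What is lesson 1 about?")
history.add("assistant", "It introduces embeddings.")
history.add("user", "And lesson 2?")
context = history.as_prompt_context()
```

After the third message is added, the first one has been dropped, so only the last assistant reply and the newest user question are passed along as context.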
- Uses uv for Python dependency management
- FastAPI with auto-reload for backend development
- Frontend styling uses CSS custom properties for theming
- ChromaDB persists to local filesystem for data retention
- Requires ANTHROPIC_API_KEY environment variable
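
Because a missing key only surfaces at the first Claude call, it can help to fail early at startup. The check below is a minimal sketch: `require_api_key` is a hypothetical helper, and it assumes the values from .env have already been loaded into the process environment by whatever mechanism the backend uses.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return ANTHROPIC_API_KEY or raise a clear error if it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; copy .env.example to .env and add your key."
        )
    return key


# Passing a mapping explicitly makes the check easy to exercise without real credentials.
key = require_api_key({"ANTHROPIC_API_KEY": "placeholder-key"})
```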