An AI-powered medicine identification and information system that uses computer vision and retrieval-augmented generation (RAG) to provide accurate, source-backed answers about medications from their official patient leaflets.
MedaBot solves the problem of quickly accessing reliable medication information by:
- Identifying medicines from photos of packaging/pills
- Retrieving official patient leaflets from regulatory databases
- Processing leaflets with AI to enable natural language queries
- Providing accurate, source-backed answers constrained to official documentation
- React 19 - Modern UI components with latest features
- TanStack Router - Type-safe routing and navigation
- TanStack Start - Full-stack React framework with SSR
- Tailwind CSS 4 - Utility-first styling with latest features
- React Webcam - Camera integration for medicine photography
- TypeScript - Type safety throughout the application
- Vinxi - Modern build tool and development server
- Node.js - Server-side runtime environment
Technology: OpenAI GPT-4.1-nano with Vision API
- Analyzes uploaded medicine photos
- Extracts structured data: name, brand, active substance, dosage
- Uses Zod schema validation for type-safe responses
- Optimized prompts for pharmaceutical accuracy
Technology: Playwright browser automation
- Automated web scraping of Portuguese INFARMED database
- Headless browser navigation and form submission
- Network interception to capture PDF downloads
- Robust error handling and timeout management
Technology: LangChain.js + OpenAI Embeddings
- PDF Processing: LangChain PDFLoader extracts text from patient leaflets
- Text Chunking: RecursiveCharacterTextSplitter creates overlapping chunks (1000 chars, 200 overlap)
- Vector Embeddings: OpenAI embeddings model creates semantic representations
- Vector Storage: In-memory vector store for fast similarity search
Technology: LangChain Retriever + ChatGPT-4
- Semantic Search: Vector similarity search finds relevant leaflet sections
- Context Assembly: Top-k relevant chunks provide context to LLM
- Constrained Generation: System prompts ensure responses use only leaflet content
- Source Attribution: Responses include references to specific leaflet sections
- Source Constraint: AI responses limited to official leaflet content only
- Medical Disclaimers: Clear guidance to consult healthcare professionals
# Install dependencies
npm install
# Set up environment variables
cp .envrc.example .envrc
# Add your OPENAI_API_KEY
# Start development server
npm run dev
# Build for production
npm run build
npm start- 📸 Capture: Take a photo of medicine packaging
- 🔍 Identify: AI extracts medicine details from image
- 📄 Fetch: System retrieves official patient leaflet
- 🤖 Process: LangChain creates searchable knowledge base
- 📝 Overview: Generate initial medicine summary
- ✅ Query: Ask natural language questions about the medicine
Built with modern web technologies and AI to make medication information more accessible and reliable.