Darsh29/NeuroDigest-AI


🧠 NeuroDigest AI (End-to-End GenAI Pipeline)

An automated system that collects AI content, summarizes it with LLMs, ranks the summaries, and delivers a personalized daily email digest. The whole pipeline is deployed on the cloud.


🚀 Pipeline Flow

graph TD
    A[Sources: YouTube / OpenAI / Anthropic] --> B[Scrapers]
    B --> C[Raw Data Stored in PostgreSQL]

    C --> D[Content Processing]
    D --> E[LLM Summarization]

    E --> F[Digest Table]

    F --> G[Curator Agent Ranking]
    G --> H[Top-N Selection]

    H --> I[Email Agent]
    I --> J[HTML + Markdown Email]

    J --> K[Gmail SMTP Delivery]

    F --> L[Mark as Sent]
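
The last stages of the flow (Email Agent → HTML + Markdown Email → Gmail SMTP Delivery) could look roughly like the sketch below. It is not the repository's actual code: the function names, the item fields (`title`, `url`, `summary`), and the subject line are assumptions for illustration. The Markdown version is sent as the plain-text part and the HTML version as the alternative part.

```python
import html
import smtplib
from email.message import EmailMessage


def build_digest_email(sender, recipient, items):
    """Build a multipart/alternative message: Markdown as plain text, plus HTML."""
    markdown_lines = ["# Your Daily AI Digest", ""]
    html_lines = ["<h1>Your Daily AI Digest</h1>", "<ul>"]
    for item in items:
        markdown_lines.append(
            f"- [{item['title']}]({item['url']}): {item['summary']}"
        )
        # Escape user-facing text so titles like "A <B>" do not break the HTML.
        url = html.escape(item["url"], quote=True)
        title = html.escape(item["title"])
        summary = html.escape(item["summary"])
        html_lines.append(f'<li><a href="{url}">{title}</a>: {summary}</li>')
    html_lines.append("</ul>")

    msg = EmailMessage()
    msg["Subject"] = "Your Daily AI Digest"  # hypothetical subject line
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(markdown_lines))
    msg.add_alternative("\n".join(html_lines), subtype="html")
    return msg


def send_via_gmail(msg, username, app_password):
    """Send over Gmail SMTP with STARTTLS (needs network access and a Gmail app password)."""
    with smtplib.SMTP("smtp.gmail.com", 587) as smtp:
        smtp.starttls()
        smtp.login(username, app_password)
        smtp.send_message(msg)
```

Mail clients that cannot render HTML fall back to the Markdown text, which stays readable as plain text.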

⚙️ Overview

The system runs as a daily cron job and performs:

  • Multi-source scraping
  • LLM-based summarization
  • Personalized ranking
  • Email delivery
  • State tracking (sent_at) to prevent duplicates
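
The `sent_at` state tracking can be illustrated with a minimal sketch. The project itself uses PostgreSQL with SQLAlchemy; the example below uses the standard-library `sqlite3` module only to stay self-contained, and the table name `digests` and every column other than `sent_at` are assumptions.

```python
import sqlite3
from datetime import datetime, timezone


def create_schema(conn):
    """Create a hypothetical digests table; sent_at stays NULL until emailed."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS digests (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            summary TEXT NOT NULL,
            sent_at TEXT
        )
        """
    )


def fetch_unsent(conn):
    """Return only digests that have never been emailed."""
    return conn.execute(
        "SELECT id, title, summary FROM digests "
        "WHERE sent_at IS NULL ORDER BY id"
    ).fetchall()


def mark_sent(conn, digest_ids):
    """Stamp the given digests so the next run skips them."""
    now = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "UPDATE digests SET sent_at = ? WHERE id = ?",
        [(now, digest_id) for digest_id in digest_ids],
    )
    conn.commit()
```

Calling `mark_sent` only after the email has been delivered means a failed send leaves the rows unsent, so the next daily run retries them instead of dropping them.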

🏗️ Architecture

Sources → Scrapers → DB → LLM → Ranking → Email → User

🧰 Tech Stack

  • Python 3.12
  • PostgreSQL (local + Render)
  • SQLAlchemy
  • OpenAI API
  • Docker + Render
  • uv (dependency management)

🔁 Pipeline Steps

  1. Scrape latest AI content
  2. Extract transcripts / text
  3. Generate summaries using LLM
  4. Store digests in database
  5. Rank based on user profile
  6. Send email digest
  7. Mark digests as sent
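
Steps 5 and 6 depend on picking the most relevant digests for a user. The README does not describe how the Curator Agent scores content, so the sketch below swaps in a simple keyword-overlap score as a stand-in. The `Digest` class, the `interests` list, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Digest:
    title: str
    summary: str


def score(digest, interests):
    """Count how many profile interests appear in the title or summary.

    This is naive substring matching, so "rag" would also match "storage".
    """
    text = f"{digest.title} {digest.summary}".lower()
    return sum(1 for interest in interests if interest.lower() in text)


def top_n(digests, interests, n=5):
    """Return the n highest-scoring digests; ties keep their original order."""
    ranked = sorted(digests, key=lambda d: score(d, interests), reverse=True)
    return ranked[:n]
```

For example, with interests `["agents", "llm", "rag"]`, a digest that mentions both agents and LLMs ranks above one that only mentions RAG, and an unrelated digest falls to the bottom.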

🔐 Environment Handling

| Environment | Database |
| --- | --- |
| LOCAL | PostgreSQL via POSTGRES_* |
| PRODUCTION | DATABASE_URL (Render) |
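
One way to implement this switch is to prefer `DATABASE_URL` when it is set and otherwise assemble a URL from the `POSTGRES_*` variables. This is a sketch, not the repository's code: the README only says `POSTGRES_*`, so the exact variable names (`POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_HOST`, `POSTGRES_PORT`, `POSTGRES_DB`) and their defaults are assumptions, as is the function name.

```python
import os
from urllib.parse import quote_plus


def resolve_database_url(env=None):
    """Return a SQLAlchemy-style PostgreSQL URL for the current environment."""
    env = os.environ if env is None else env

    url = env.get("DATABASE_URL")
    if url:
        # SQLAlchemy 1.4+ no longer accepts the legacy "postgres://" scheme,
        # so normalize it in case the provider hands one out.
        if url.startswith("postgres://"):
            url = "postgresql://" + url[len("postgres://"):]
        return url

    # Local development: build the URL from individual (assumed) variables.
    user = env.get("POSTGRES_USER", "postgres")
    password = quote_plus(env.get("POSTGRES_PASSWORD", ""))
    host = env.get("POSTGRES_HOST", "localhost")
    port = env.get("POSTGRES_PORT", "5432")
    db = env.get("POSTGRES_DB", "postgres")
    return f"postgresql://{user}:{password}@{host}:{port}/{db}"
```

The result can be passed to `sqlalchemy.create_engine`. Accepting an optional `env` mapping keeps the function easy to test without touching the real process environment.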

📦 Key Features

  • ✅ End-to-end automated pipeline
  • ✅ LLM-powered summarization
  • ✅ Smart ranking system
  • ✅ Duplicate prevention (sent_at)
  • ✅ Cloud deployment with cron jobs
  • ✅ Clean modular architecture

🚀 Deployment

  • Uses render.yaml

  • Deploys:

    • PostgreSQL DB
    • Cron job (daily-digest-job)

📁 Project Structure

app/
├── agent/
├── database/
├── scrapers/
├── services/
├── profiles/
├── daily_runner.py
└── runner.py
main.py
Dockerfile
render.yaml

👨‍💻 Author

Darsh Vora, MS Data Analytics Engineering, Northeastern University


⭐ Final Note

This project demonstrates building and deploying a real-world GenAI system combining data pipelines, LLMs, and cloud infrastructure.

If you found this useful, consider giving the repo a ⭐!
