Binary file added .DS_Store
Binary file not shown.
148 changes: 148 additions & 0 deletions video-to-notes-generator/README.md
@@ -0,0 +1,148 @@
# YouTube Video to Markdown Notes Converter

This project takes YouTube video links as input and automatically generates detailed notes in Markdown. It supports adding slides, creating transcripts, generating images for specific sections with Tune AI, and inserting timestamps that link directly to the YouTube video.

## Table of Contents

- [Project Overview](#project-overview)
- [Features](#features)
- [Installation](#installation)
- [Usage](#usage)
- [Configuration](#configuration)
- [Examples](#examples)
- [Contributing](#contributing)
- [License](#license)

## Project Overview

This tool extracts meaningful notes from a YouTube video, converts them to Markdown, and enriches the content with various additional features, such as:

- Time-stamped links to specific sections of the video.
- Integration with Tune AI to automatically generate images based on the context (examples, concepts, etc.).
- Support for code blocks, tables, graphs, and more.
- Optional inclusion of screenshots or PDF lecture slides along with the video.

We recently migrated from a custom version of Gemini to Tune AI to leverage their larger context length and faster processing.

## Features

- **Video-to-Markdown**: Generate structured notes from a YouTube video, with automatic time-stamped links to the video.
- **Tune AI Integration**: Automatically generate images based on examples or references in the video.
- **Support for Rich Media**: Handle code blocks, tables, graphs, and other complex structures in Markdown.
- **Slide Support**: Optionally include screenshots or PDFs containing lecture slides as input.
- **Transcript Generation**: Automatically create transcripts from the video.

## Installation

To set up this project, ensure you are using Python 3.9 and Poetry for environment and dependency management.

### Step 1: Install Python 3.9

Download and install Python 3.9 from the official [Python website](https://www.python.org/downloads/release/python-390/).

### Step 2: Install Dependencies

1. Clone the repository and navigate into the project directory:

```bash
git clone <repository-url>
cd <project-directory>
```

2. Point Poetry at your Python 3.9 interpreter and add the project directory to `PYTHONPATH`. The commands below are for Windows Command Prompt; substitute your own Python 3.9 path and the equivalent commands on macOS/Linux:

```cmd
poetry env use "C:\Program Files\Python39\python.exe"
set PYTHONPATH=%PYTHONPATH%;%CD%
```

3. Activate the Poetry virtual environment:

```bash
poetry shell
```

4. Install PyTorch (the command below pulls the CUDA 12.4 wheels; pick the index URL that matches your hardware):

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
```

5. Install all project dependencies using Poetry:

```bash
poetry install
```

> If the installation fails, you may need to adjust the `pyproject.toml` file to ensure compatibility with Python 3.9 and other dependencies.

## Usage

### Running the Tool

Once the environment is set up, run the entry-point script. The YouTube video link and optional slide inputs (screenshots or PDFs) are configured inside the script rather than passed on the command line (see the configuration sketch after the command):

```bash
python core/notes_generator/create_notes.py
```
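
As shipped, `create_notes.py` hardcodes its inputs in the `__main__` block rather than reading command-line arguments. The snippet below is a minimal sketch of that configuration; the URL and paths are placeholders, not project defaults, and the comment on `image_path` reflects an assumption about how `SlideReplacer` uses it:

```python
from core.notes_generator.create_notes import NotesCreator

# Placeholder inputs -- replace with your own video URL and slide folder.
youtube_url = "https://www.youtube.com/watch?v=abc123"
slides_folder_path = "path/to/slides"  # folder of slide screenshots to OCR and insert
image_path = "/notes/images/"          # assumed: path prefix used when linking slide images

notes_creator = NotesCreator(youtube_url, slides_folder_path, image_path, language="en")
markdown_path = notes_creator.generate_notes()  # returns the path of the saved markdown file
print(markdown_path)
```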

### Generating Notes with Images

The tool will analyze the content of the video, generate notes, and add timestamped links to the relevant sections. It will also generate images for certain sections (such as examples) using Tune AI.

## Configuration

If you encounter any issues with the environment, ensure your `pyproject.toml` file is set up correctly:

```toml
[tool.poetry]
name = "youtube-markdown-notes"
version = "0.1.0"
description = "Generate markdown notes from YouTube videos with images and timestamps."
authors = ["Your Name <[email protected]>"]

[tool.poetry.dependencies]
python = "^3.9"
torch = "^1.13"
torchvision = "^0.14"
torchaudio = "^0.13"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
```

## Examples

Below are some examples of how the notes are structured:

````markdown
## Section 1: Introduction [00:02:30](https://www.youtube.com/watch?v=abc123&t=150s)

- Overview of the video
- Key points

![Generated Image](images/intro_example.png)

---

## Section 2: Main Topic [00:15:00](https://www.youtube.com/watch?v=abc123&t=900s)

- Explanation of key concepts
- Example:
- Code snippet:
```python
def example():
print("Hello World")
```

![Generated Image](images/main_topic_example.png)
````

## Contributing

Contributions are welcome! Please submit a pull request or open an issue for any bugs or feature requests.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
Empty file.
Empty file.
43 changes: 43 additions & 0 deletions video-to-notes-generator/core/notes_generator/add_timestamp.py
@@ -0,0 +1,43 @@
import json
import re


class TimestampAdder:
def __init__(self, notes_location, transcript_location) -> None:
self.notes_location = notes_location
self.transcript_location = transcript_location
self.phrase_to_timestamp = self._load_transcript()

def _load_transcript(self) -> dict:
with open(self.transcript_location) as file:
transcript = json.load(file)
return {entry["phrase"]: entry["timestamp"] for entry in transcript}

def _read_notes(self) -> list:
with open(self.notes_location) as file:
return file.readlines()

def _write_notes(self, notes, new_location) -> str:
with open(new_location, "w") as file:
file.writelines(notes)
return new_location

def add_timestamps(self, new_location) -> str:
notes = self._read_notes()
updated_notes = []

for line in notes:
updated_line = line
for phrase, timestamp in self.phrase_to_timestamp.items():
if re.search(r"\b" + re.escape(phrase) + r"\b", line):
updated_line = line.strip() + f" [{timestamp}]\n"
break
updated_notes.append(updated_line)

return self._write_notes(updated_notes, new_location)


# Example usage:
# adder = TimestampAdder('path/to/notes.md', 'path/to/transcript.json')
# new_file_path = adder.add_timestamps('path/to/new_notes.md')
# print(f"New file saved at: {new_file_path}")
163 changes: 163 additions & 0 deletions video-to-notes-generator/core/notes_generator/create_notes.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,163 @@
# NotesCreator takes a YouTube link and a folder path of slides.
#
# The flow is as follows:
# 1. Download the audio and title from YouTube.
# 2. Extract the text from the audio.
# 3. Clean the text.
# 4. Extract the text from the slide images.
# 5. Clean the text.
# 6. Prepare the notes.
import logging
import os
import time
from datetime import datetime, timezone

from core.post_processing.fill_slides import SlideReplacer
from core.post_processing.timestamp import TimestampedNoteProcessor
from tools.audio.audio_extractor.whisper_extractor import WhisperAudioExtractor
from tools.text.generator.notes_generator import NotesGenerator
from tools.text.slide_processor.extractors.img_handler import ImageHandler
from tools.text.slide_processor.slide_inserter import SlideInserter
from tools.text.text_formatter.computer_science_text_formatter import (
ComputerScienceTextFormatter,
)
from tools.video.downloader import YouTubeAudioExtractor

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


class NotesCreator:
def __init__(
self,
youtube_url,
slides_folder_path,
path,
language="en",
tesseract_cmd=None,
) -> None:
logger.info(
f"Initializing NotesCreator with YouTube URL: {youtube_url} and "
f"slides folder path: {slides_folder_path}",
)
self.youtube_url = youtube_url
self.slides_folder_path = slides_folder_path
self.tesseract_cmd = tesseract_cmd
self.audio_extractor = YouTubeAudioExtractor(youtube_url)
self.whisper_audio_extractor = WhisperAudioExtractor()
self.image_handler = ImageHandler(slides_folder_path, tesseract_cmd)
self.text_formatter = ComputerScienceTextFormatter()
self.image_path = path
self.language = language
logger.info("NotesCreator initialized successfully")

def generate_notes(self) -> str:
logger.info("Starting note generation process")

# Extract audio from YouTube
logger.info("Extracting audio from YouTube")
audio_path, video_title = self.audio_extractor.extract_audio()
if not audio_path:
logger.error("Error extracting audio from YouTube")
return "Error extracting audio from YouTube"
logger.info(f"Audio extracted successfully: {audio_path}")
print(f"\nAudio Path: {audio_path}\nVideo Title: {video_title}\n")

# Extract text from audio
logger.info("Extracting text from audio")
audio_text, segments = self.whisper_audio_extractor.extract_text(
audio_path,
language=self.language,
)
cleaned_audio_text = self.text_formatter.format_text(
audio_text,
domain=video_title,
)
logger.info("Text extracted and formatted from audio")
print(f"\nAudio Text: {audio_text}\nCleaned Audio Text: {cleaned_audio_text}\n")

# Extract text from images
logger.info("Extracting text from images")
image_texts = self.image_handler.process_images()
cleaned_image_texts = {
slide: self.text_formatter.format_text(text, domain=video_title)
for slide, text in image_texts.items()
}
logger.info("Text extracted and formatted from images")
print(
f"\nImage Texts: {image_texts}\nCleaned Image Texts: {cleaned_image_texts}\n",
)

# Generate notes
logger.info("Generating notes")
title = video_title
transcript = cleaned_audio_text
slides = cleaned_image_texts
notes_generator = NotesGenerator(title, transcript)
notes_content = notes_generator.generate_notes()
logger.info("Notes generated successfully")
print(f"\nNotes Content: {notes_content}\n")

# Insert slides into notes
logger.info("Inserting slides into notes")
slide_inserter = SlideInserter(notes_content, slides)
notes_content = slide_inserter.insert_slides()
logger.info("Slides inserted into notes")

# Process notes with timestamps
logger.info("Processing notes with timestamps")
note_processor = TimestampedNoteProcessor(segments)
new_notes_content, matches = note_processor.process_notes(
notes_content,
self.youtube_url,
)
logger.info("Notes processed with timestamps")
print(f"\nNew Notes Content: {new_notes_content}\nMatches: {matches}\n")

# Save notes to a markdown file
return NotesCreator.save_notes_to_file(
new_notes_content,
self.slides_folder_path,
self.image_path,
)

@staticmethod
def save_notes_to_file(notes_content, slides_folder_path, image_path) -> str:
logger.info("Saving notes to a markdown file")

# Determine the parent directory of the slides folder
parent_dir = os.path.dirname(slides_folder_path)
folder_path = os.path.join(parent_dir, "ai_generated_notes")

timestamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M")
filename = f"note_{timestamp}.md"
os.makedirs(folder_path, exist_ok=True)
file_path = os.path.join(folder_path, filename)
replaced_markdown = SlideReplacer.replace_slides(
notes_content,
slides_folder_path,
image_path,
)
print(replaced_markdown)

try:
with open(file_path, "w", encoding="utf-8") as file:
file.write(replaced_markdown)
logger.info(f"Notes saved to {file_path}")
print(f"\nNotes saved to: {file_path}\n")
return f"{file_path}"
except Exception as e:
logger.error(f"Failed to save notes: {e}")
return f"Failed to save notes: {e}"


# Usage
if __name__ == "__main__":
start = time.time()
youtube_url = "https://youtu.be/TSYNHb6YBEE"
slides_folder_path = "test/econ/slides"
path = "/jott/econ/lec6/"
notes_creator = NotesCreator(youtube_url, slides_folder_path, path, language="hi")
notes_creator.generate_notes()
end = time.time()
print(f"Time taken: {end - start} seconds")
Empty file.