Binary file added .DS_Store
Binary file not shown.
148 changes: 148 additions & 0 deletions video-to-notes-generator/README.md
@@ -0,0 +1,148 @@
# YouTube Video to Markdown Notes Converter

This project takes YouTube video links as input and automatically generates detailed notes in Markdown. It supports adding slides, creating transcripts, generating images for specific sections with Tune AI, and inserting timestamps that link directly to the YouTube video.

## Table of Contents

- [Project Overview](#project-overview)
- [Features](#features)
- [Installation](#installation)
- [Usage](#usage)
- [Configuration](#configuration)
- [Examples](#examples)
- [Contributing](#contributing)
- [License](#license)

## Project Overview

This tool extracts meaningful notes from a YouTube video, converts them to Markdown, and enriches the content with various additional features, such as:

- Time-stamped links to specific sections of the video.
- Integration with Tune AI to automatically generate images based on the context (examples, concepts, etc.).
- Support for code blocks, tables, graphs, and more.
- Optional inclusion of screenshots or PDF lecture slides along with the video.

We recently migrated from a custom version of Gemini to Tune AI to leverage their larger context length and faster processing.

## Features

- **Video-to-Markdown**: Generate structured notes from a YouTube video, with automatic time-stamped links to the video.
- **Tune AI Integration**: Automatically generate images based on examples or references in the video.
- **Support for Rich Media**: Handle code blocks, tables, graphs, and other complex structures in Markdown.
- **Slide Support**: Optionally include screenshots or PDFs containing lecture slides as input.
- **Transcript Generation**: Automatically create transcripts from the video.

## Installation

To set up this project, ensure you are using Python 3.9 and Poetry for environment and dependency management.

### Step 1: Install Python 3.9

Download and install Python 3.9 from the official [Python website](https://www.python.org/downloads/release/python-390/).

### Step 2: Install Dependencies

1. Clone the repository and navigate into the project directory:

```bash
git clone <repository-url>
cd <project-directory>
```

2. Point Poetry at your Python 3.9 interpreter and add the project directory to `PYTHONPATH`. The commands below are for Windows Command Prompt; substitute your own Python 3.9 path and the equivalent commands on macOS/Linux:

```cmd
poetry env use "C:\Program Files\Python39\python.exe"
set PYTHONPATH=%PYTHONPATH%;%CD%
```

3. Activate the Poetry virtual environment:

```bash
poetry shell
```

4. Install PyTorch (the command below pulls the CUDA 12.4 wheels; pick the index URL that matches your hardware):

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
```

5. Install all project dependencies using Poetry:

```bash
poetry install
```

> If the installation fails, you may need to adjust the `pyproject.toml` file to ensure compatibility with Python 3.9 and other dependencies.

## Usage

### Running the Tool

Once the environment is set up, run the entry-point script. The YouTube video link and optional slide inputs (screenshots or PDFs) are configured inside the script rather than passed on the command line (see the configuration sketch after the command):

```bash
python core/notes_generator/create_notes.py
```
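
As shipped, `create_notes.py` hardcodes its inputs in the `__main__` block rather than reading command-line arguments. The snippet below is a minimal sketch of that configuration; the URL and paths are placeholders, not project defaults, and the comment on `image_path` reflects an assumption about how `SlideReplacer` uses it:

```python
from core.notes_generator.create_notes import NotesCreator

# Placeholder inputs -- replace with your own video URL and slide folder.
youtube_url = "https://www.youtube.com/watch?v=abc123"
slides_folder_path = "path/to/slides"  # folder of slide screenshots to OCR and insert
image_path = "/notes/images/"          # assumed: path prefix used when linking slide images

notes_creator = NotesCreator(youtube_url, slides_folder_path, image_path, language="en")
markdown_path = notes_creator.generate_notes()  # returns the path of the saved markdown file
print(markdown_path)
```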

### Generating Notes with Images

The tool will analyze the content of the video, generate notes, and add timestamped links to the relevant sections. It will also generate images for certain sections (such as examples) using Tune AI.

## Configuration

If you encounter any issues with the environment, ensure your `pyproject.toml` file is set up correctly:

```toml
[tool.poetry]
name = "youtube-markdown-notes"
version = "0.1.0"
description = "Generate markdown notes from YouTube videos with images and timestamps."
authors = ["Your Name <[email protected]>"]

[tool.poetry.dependencies]
python = "^3.9"
torch = "^1.13"
torchvision = "^0.14"
torchaudio = "^0.13"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
```

## Examples

Below are some examples of how the notes are structured:

````markdown
## Section 1: Introduction [00:02:30](https://www.youtube.com/watch?v=abc123&t=150s)

- Overview of the video
- Key points

![Generated Image](images/intro_example.png)

---

## Section 2: Main Topic [00:15:00](https://www.youtube.com/watch?v=abc123&t=900s)

- Explanation of key concepts
- Example:
- Code snippet:
```python
def example():
print("Hello World")
```

![Generated Image](images/main_topic_example.png)
````

## Contributing

Contributions are welcome! Please submit a pull request or open an issue for any bugs or feature requests.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
Empty file.
Empty file.
43 changes: 43 additions & 0 deletions video-to-notes-generator/core/notes_generator/add_timestamp.py
@@ -0,0 +1,43 @@
import json
import re


class TimestampAdder:
def __init__(self, notes_location, transcript_location) -> None:
self.notes_location = notes_location
self.transcript_location = transcript_location
self.phrase_to_timestamp = self._load_transcript()

def _load_transcript(self) -> dict:
with open(self.transcript_location) as file:
transcript = json.load(file)
return {entry["phrase"]: entry["timestamp"] for entry in transcript}

def _read_notes(self) -> list:
with open(self.notes_location) as file:
return file.readlines()

def _write_notes(self, notes, new_location) -> str:
with open(new_location, "w") as file:
file.writelines(notes)
return new_location

def add_timestamps(self, new_location) -> str:
notes = self._read_notes()
updated_notes = []

for line in notes:
updated_line = line
for phrase, timestamp in self.phrase_to_timestamp.items():
if re.search(r"\b" + re.escape(phrase) + r"\b", line):
updated_line = line.strip() + f" [{timestamp}]\n"
break
updated_notes.append(updated_line)

return self._write_notes(updated_notes, new_location)


# Example usage:
# adder = TimestampAdder('path/to/notes.md', 'path/to/transcript.json')
# new_file_path = adder.add_timestamps('path/to/new_notes.md')
# print(f"New file saved at: {new_file_path}")
163 changes: 163 additions & 0 deletions video-to-notes-generator/core/notes_generator/create_notes.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,163 @@
# NotesCreator takes a YouTube link and a folder path of slides.
#
# The flow is as follows:
# 1. Download the audio and title from YouTube.
# 2. Extract the text from the audio.
# 3. Clean the text.
# 4. Extract the text from the slide images.
# 5. Clean the text.
# 6. Prepare the notes.
import logging
import os
import time
from datetime import datetime, timezone

from core.post_processing.fill_slides import SlideReplacer
from core.post_processing.timestamp import TimestampedNoteProcessor
from tools.audio.audio_extractor.whisper_extractor import WhisperAudioExtractor
from tools.text.generator.notes_generator import NotesGenerator
from tools.text.slide_processor.extractors.img_handler import ImageHandler
from tools.text.slide_processor.slide_inserter import SlideInserter
from tools.text.text_formatter.computer_science_text_formatter import (
ComputerScienceTextFormatter,
)
from tools.video.downloader import YouTubeAudioExtractor

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


class NotesCreator:
def __init__(
self,
youtube_url,
slides_folder_path,
path,
language="en",
tesseract_cmd=None,
) -> None:
logger.info(
f"Initializing NotesCreator with YouTube URL: {youtube_url} and "
f"slides folder path: {slides_folder_path}",
)
self.youtube_url = youtube_url
self.slides_folder_path = slides_folder_path
self.tesseract_cmd = tesseract_cmd
self.audio_extractor = YouTubeAudioExtractor(youtube_url)
self.whisper_audio_extractor = WhisperAudioExtractor()
self.image_handler = ImageHandler(slides_folder_path, tesseract_cmd)
self.text_formatter = ComputerScienceTextFormatter()
self.image_path = path
self.language = language
logger.info("NotesCreator initialized successfully")

def generate_notes(self) -> str:
logger.info("Starting note generation process")

# Extract audio from YouTube
logger.info("Extracting audio from YouTube")
audio_path, video_title = self.audio_extractor.extract_audio()
if not audio_path:
logger.error("Error extracting audio from YouTube")
return "Error extracting audio from YouTube"
logger.info(f"Audio extracted successfully: {audio_path}")
print(f"\nAudio Path: {audio_path}\nVideo Title: {video_title}\n")

# Extract text from audio
logger.info("Extracting text from audio")
audio_text, segments = self.whisper_audio_extractor.extract_text(
audio_path,
language=self.language,
)
cleaned_audio_text = self.text_formatter.format_text(
audio_text,
domain=video_title,
)
logger.info("Text extracted and formatted from audio")
print(f"\nAudio Text: {audio_text}\nCleaned Audio Text: {cleaned_audio_text}\n")

# Extract text from images
logger.info("Extracting text from images")
image_texts = self.image_handler.process_images()
cleaned_image_texts = {
slide: self.text_formatter.format_text(text, domain=video_title)
for slide, text in image_texts.items()
}
logger.info("Text extracted and formatted from images")
print(
f"\nImage Texts: {image_texts}\nCleaned Image Texts: {cleaned_image_texts}\n",
)

# Generate notes
logger.info("Generating notes")
title = video_title
transcript = cleaned_audio_text
slides = cleaned_image_texts
notes_generator = NotesGenerator(title, transcript)
notes_content = notes_generator.generate_notes()
logger.info("Notes generated successfully")
print(f"\nNotes Content: {notes_content}\n")

# Insert slides into notes
logger.info("Inserting slides into notes")
slide_inserter = SlideInserter(notes_content, slides)
notes_content = slide_inserter.insert_slides()
logger.info("Slides inserted into notes")

# Process notes with timestamps
logger.info("Processing notes with timestamps")
note_processor = TimestampedNoteProcessor(segments)
new_notes_content, matches = note_processor.process_notes(
notes_content,
self.youtube_url,
)
logger.info("Notes processed with timestamps")
print(f"\nNew Notes Content: {new_notes_content}\nMatches: {matches}\n")

# Save notes to a markdown file
return NotesCreator.save_notes_to_file(
new_notes_content,
self.slides_folder_path,
self.image_path,
)

@staticmethod
def save_notes_to_file(notes_content, slides_folder_path, image_path) -> str:
logger.info("Saving notes to a markdown file")

# Determine the parent directory of the slides folder
parent_dir = os.path.dirname(slides_folder_path)
folder_path = os.path.join(parent_dir, "ai_generated_notes")

timestamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M")
filename = f"note_{timestamp}.md"
os.makedirs(folder_path, exist_ok=True)
file_path = os.path.join(folder_path, filename)
replaced_markdown = SlideReplacer.replace_slides(
notes_content,
slides_folder_path,
image_path,
)
print(replaced_markdown)

try:
with open(file_path, "w", encoding="utf-8") as file:
file.write(replaced_markdown)
logger.info(f"Notes saved to {file_path}")
print(f"\nNotes saved to: {file_path}\n")
return f"{file_path}"
except Exception as e:
logger.error(f"Failed to save notes: {e}")
return f"Failed to save notes: {e}"


# Usage
if __name__ == "__main__":
start = time.time()
youtube_url = "https://youtu.be/TSYNHb6YBEE"
slides_folder_path = "test/econ/slides"
path = "/jott/econ/lec6/"
notes_creator = NotesCreator(youtube_url, slides_folder_path, path, language="hi")
notes_creator.generate_notes()
end = time.time()
print(f"Time taken: {end - start} seconds")
Empty file.