
tensors2transformers

A ground-up implementation of a machine learning framework — from scalar derivatives to a working GPT model — built over 30 structured days.


Overview

This project is a self-directed, deep-dive study into the internals of modern machine learning systems. Rather than relying on high-level libraries such as PyTorch or TensorFlow, it builds every component from scratch in Python: tensors, automatic differentiation, neural layers, optimizers, and attention.

The work follows a structured 30-day roadmap across 7 phases, drawing from Andrej Karpathy's Micrograd and the TinyTorch curriculum, supplemented by mathematical exploration through Engel's Problem-Solving Strategies.

Core question driving this project:
What actually happens when a neural network learns?


Project Structure

tensors2transformers/
│
├── 01_phase1_prerequisite/       # Python fundamentals, derivative intuition, math env setup
├── 02_phase2_tensors+scalars/    # Tensor class, broadcasting, matmul, shape manipulation
├── 03_phase3_autodiff/           # Autograd engine, chain rule, topological sort, backprop
├── BuildYourOwnMLSYS/            # Neural layers, activations, loss functions, training loop
├── LinAlg/                       # Linear algebra notebooks and exercises
├── notebooks/                    # Exploratory and milestone notebooks
├── scripts/                      # Utility and training scripts
├── media/                        # Diagrams, plots, and visualizations
├── Logs/                         # Training logs and experiment records
├── study-plan-resources/         # Roadmap PDF and reference materials
├── weekend-with-engel/           # Mathematical explorations (fractals, Markov chains, etc.)
│
└── environment.yml               # Conda environment specification

File Naming Convention

All files in this project follow a consistent naming scheme for clarity and reproducibility.

Notebooks (notebooks/)

p<phase>_d<day>_<topic>.ipynb

Examples:
  p1_d03_derivative_intuition.ipynb
  p2_d07_tensor_class.ipynb
  p3_d13_autograd_engine.ipynb
  p4_d20_milestone_xor_perceptron.ipynb

Scripts (scripts/)

<function>_<descriptor>.py

Examples:
  train_mnist_classifier.py
  benchmark_matmul.py
  visualize_computation_graph.py

Logs (Logs/)

<experiment>_<YYYY-MM-DD>.log

Examples:
  training_run_2025-06-01.log
  autograd_test_2025-06-05.log

Media / Figures (media/)

<topic>_<descriptor>.<ext>

Examples:
  backprop_computation_graph.png
  loss_curve_xor.png
  attention_heatmap_gpt.png

Milestone Outputs

milestone_<number>_<name>.ipynb

Examples:
  milestone_01_perceptron_1957.ipynb
  milestone_02_xor_crisis.ipynb
  milestone_05_gpt_text_generation.ipynb
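
A small helper can keep these names consistent. The sketch below is illustrative only (the function names and regex are not part of the repository); it builds and validates notebook filenames following the convention above:

import re

# Notebook names: p<phase>_d<day>_<topic>.ipynb, e.g. p3_d13_autograd_engine.ipynb
NOTEBOOK_PATTERN = re.compile(r"^p(\d+)_d(\d{2})_([a-z0-9_]+)\.ipynb$")

def notebook_name(phase: int, day: int, topic: str) -> str:
    """Build a notebook filename; the day is zero-padded to two digits."""
    return f"p{phase}_d{day:02d}_{topic}.ipynb"

def is_valid_notebook_name(name: str) -> bool:
    """Check a filename against the notebook naming convention."""
    return NOTEBOOK_PATTERN.match(name) is not None

assert notebook_name(3, 13, "autograd_engine") == "p3_d13_autograd_engine.ipynb"
assert is_valid_notebook_name("p2_d07_tensor_class.ipynb")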

Phases at a Glance

Phase   Days    Focus
1       1–5     Python & Math Prerequisites, Derivative Intuition
2       6–10    Tensor Engine: N-D Arrays, Broadcasting, Matmul
3       11–15   Automatic Differentiation: Backprop & Chain Rule
4       16–20   Neural Layers: Activations, Linear, XOR Milestone
5       21–25   Training Infrastructure: Loss, DataLoader, Optimizers
6       26–30   Advanced Architectures: CNNs, Attention, Transformers
7       31+     Optimization: Quantization, KV Caching, Profiling
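
To make the Phase 2 focus concrete: the broadcasting rule a from-scratch tensor engine must implement can be stated in a few lines of pure Python. This is a minimal sketch of NumPy-style shape broadcasting, not the repository's actual API (the name broadcast_shapes is illustrative):

def broadcast_shapes(a: tuple, b: tuple) -> tuple:
    """Result shape of broadcasting two shapes, NumPy-style:
    shapes are right-aligned, and each dimension pair must be
    equal or contain a 1 (which stretches to match)."""
    result = []
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1   # missing leading dims act like size 1
        db = b[-i] if i <= len(b) else 1
        if da == db or da == 1 or db == 1:
            result.append(max(da, db))
        else:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
    return tuple(reversed(result))

assert broadcast_shapes((3, 1), (1, 4)) == (3, 4)
assert broadcast_shapes((5,), (2, 5)) == (2, 5)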

Key Milestones

  • Milestone 1: Recreated the 1957 perceptron from scratch
  • Milestone 2: Solved the 1969 XOR problem, demonstrating that depth enables non-linear decision boundaries
  • Milestone 3: Built a reverse-mode automatic differentiation engine (see the sketch after this list)
  • Milestone 4: Trained a full model with a custom DataLoader and SGD + Adam optimizers
  • Milestone 5: Assembled a working GPT model capable of text generation
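
Milestone 3 is the conceptual core, so here is a minimal sketch of the idea in the style of Karpathy's Micrograd (illustrative, not the repository's actual class): each operation records its inputs and a local chain-rule step, and backward() replays the graph in reverse topological order.

class Value:
    """A scalar that records its computation graph for reverse-mode autodiff."""

    def __init__(self, data, _parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = _parents
        self._backward = lambda: None  # local chain-rule step, set by each op

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad            # d(out)/d(self) = 1
            other.grad += out.grad           # d(out)/d(other) = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad   # d(out)/d(self) = other
            other.grad += self.data * out.grad   # d(out)/d(other) = self
        out._backward = _backward
        return out

    def backward(self):
        # Topological sort, then apply the chain rule from output to inputs
        order, visited = [], set()
        def visit(v):
            if v not in visited:
                visited.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

x, y = Value(2.0), Value(3.0)
z = x * y + x        # dz/dx = y + 1 = 4, dz/dy = x = 2
z.backward()
assert (x.grad, y.grad) == (4.0, 2.0)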

Technical Stack

  • Language: Python 3.x
  • Core Libraries: NumPy (for validation), Matplotlib (for visualization)
  • Environment: Conda (environment.yml)
  • Key Constraint: No PyTorch or TensorFlow — all ML primitives built by hand

Why This Project

Most machine learning courses teach how to use frameworks. This project is about understanding how they work. Building autograd by hand, implementing backpropagation step by step, and assembling a transformer from the attention mechanism up give an intuition that no API call can provide.
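
As one example of that payoff: once tensors and matmul exist, the attention mechanism at the heart of the GPT milestone is only a few lines. Here is a minimal NumPy sketch of single-head scaled dot-product attention (the function name and shapes are illustrative, not the repository's code):

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V"""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (seq_q, seq_k) similarity logits
    scores -= scores.max(axis=-1, keepdims=True)    # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted mix of value vectors

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
assert out.shape == (4, 8)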

This kind of foundational understanding is what I am bringing into postgraduate study — not just the ability to run models, but to reason about, debug, and extend them.


Setup

# Clone the repo
git clone https://github.com/Yusuf-Abol/tensors2transformers.git
cd tensors2transformers

# Create and activate environment
conda env create -f environment.yml
conda activate tensors2transformers
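
# Optional sanity check (assumes numpy and matplotlib are listed in environment.yml)
python -c "import numpy, matplotlib"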

References

  • Karpathy, A. — Micrograd and Neural Networks: Zero to Hero lecture series
  • TinyTorch curriculum — modular deep learning framework guide
  • Engel, A. — Problem-Solving Strategies (weekend mathematical explorations)
  • Davison, B. — Exploring Mathematics with Python
