liereynaldo/Mini-GPT

This repository implements a replication of the GPT language model (GPT-2 Small with 124M parameters) from the ground up using PyTorch, covering all major components of a modern transformer-based large language model (LLM).

In this project, I replicate the GPT language model (GPT-2 Small, 124 million parameters) from the ground up using PyTorch, covering all major components of a modern transformer-based large language model (LLM): multi-head self-attention, feed-forward neural networks, layer normalization, residual connections, and causal masking for autoregressive text generation.
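
To make those components concrete, here is a minimal sketch of one GPT-style transformer block in PyTorch. It is an illustration only, not this repository's code: the names (`TransformerBlock`, `emb_dim`, `n_heads`, `drop_rate`) are hypothetical, and it leans on `torch.nn.MultiheadAttention` for brevity, whereas a from-scratch implementation would typically build the attention mechanism itself.

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """One pre-norm GPT-style block: masked multi-head self-attention and a
    position-wise feed-forward network, each wrapped in a residual connection."""

    def __init__(self, emb_dim: int, n_heads: int, drop_rate: float = 0.1):
        super().__init__()
        self.norm1 = nn.LayerNorm(emb_dim)
        self.attn = nn.MultiheadAttention(
            emb_dim, n_heads, dropout=drop_rate, batch_first=True
        )
        self.norm2 = nn.LayerNorm(emb_dim)
        self.ff = nn.Sequential(              # feed-forward with 4x expansion and GELU
            nn.Linear(emb_dim, 4 * emb_dim),
            nn.GELU(),
            nn.Linear(4 * emb_dim, emb_dim),
        )
        self.drop = nn.Dropout(drop_rate)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: True above the diagonal, so position i attends only to positions <= i.
        seq_len = x.size(1)
        causal_mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device), diagonal=1
        )
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=causal_mask, need_weights=False)
        x = x + self.drop(attn_out)                 # residual connection around attention
        x = x + self.drop(self.ff(self.norm2(x)))   # residual connection around feed-forward
        return x

# Shape check: a batch of 2 sequences, 16 tokens each, 768-dimensional embeddings.
block = TransformerBlock(emb_dim=768, n_heads=12)
print(block(torch.randn(2, 16, 768)).shape)  # torch.Size([2, 16, 768])
```

Stacking twelve such blocks between a token/position embedding layer and a final projection onto the vocabulary yields the autoregressive generator described above.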

The model architecture includes 12 transformer blocks, each with 12 attention heads and an embedding dimension of 768, closely following the original GPT-2 (124M) configuration described by OpenAI. This replication aims to provide a transparent, end-to-end understanding of how generative transformers operate — from tokenization and training loop design to inference and text sampling — while maintaining code readability and modularity for exploration or educational use.
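
For reference, the hyperparameters above can be collected into a single configuration dictionary. The sketch below is illustrative; the dictionary name and keys are assumptions rather than the ones used in this repository, with the standard GPT-2 values for vocabulary size and context length filled in.

```python
# Hypothetical configuration mirroring the GPT-2 Small (124M) setup.
GPT2_124M_CONFIG = {
    "vocab_size": 50257,     # GPT-2's BPE vocabulary size
    "context_length": 1024,  # maximum sequence length
    "emb_dim": 768,          # embedding / hidden dimension
    "n_layers": 12,          # number of transformer blocks
    "n_heads": 12,           # attention heads per block
    "drop_rate": 0.1,        # dropout probability (training-time regularization)
    "qkv_bias": True,        # GPT-2 uses bias terms in the query/key/value projections
}
```

With these settings, the token and position embeddings plus twelve transformer blocks account for roughly 124 million trainable parameters.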

The implementation follows the principles outlined in Sebastian Raschka’s book “Build a Large Language Model (From Scratch)”.
