kanishkez/Paper-implementations


In this repo I try to implement the architectures behind modern LLMs.

Implemented so far:

  1. Transformer
  2. KV caching
  3. Flash Attention (a rough version of the algorithm only, no CUDA code)
  4. Rotary Positional Embeddings
  5. Minimal GPT implementation (model code only, without training or inference)
  6. Mixture of Experts module
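
To give a feel for the "rough algorithm" behind item 3, here is a minimal pure-Python sketch of the tiled, online-softmax idea that Flash Attention builds on. It is not the code from this repo: the function names `naive_attention` and `tiled_attention` and the `block_size` parameter are made up for illustration, and it handles a single query vector with no masking, batching, or GPU memory management.

```python
import math


def naive_attention(q, keys, values):
    """Reference: softmax(q . k / sqrt(d)) weighted sum of values, one query."""
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim_v = len(values[0])
    return [
        sum(w * v[j] for w, v in zip(weights, values)) / total
        for j in range(dim_v)
    ]


def tiled_attention(q, keys, values, block_size=2):
    """Same result, but keys/values are visited one block at a time.

    Only a running max, a running normaliser, and an unnormalised output
    accumulator are kept, so the full score vector is never stored.
    (Hypothetical helper for illustration, not an API from this repo.)
    """
    d = len(q)
    dim_v = len(values[0])
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * dim_v

    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]

        # Scaled dot-product scores for this block only.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in k_blk
        ]
        new_max = max(running_max, max(scores))

        # Rescale what was accumulated so far to the new running max.
        scale = math.exp(running_max - new_max)
        weights = [math.exp(s - new_max) for s in scores]

        running_sum = running_sum * scale + sum(weights)
        acc = [
            a * scale + sum(w * v[j] for w, v in zip(weights, v_blk))
            for j, a in enumerate(acc)
        ]
        running_max = new_max

    # Normalise once at the end.
    return [a / running_sum for a in acc]
```

Both functions should agree up to floating-point error for any block size; the tiled version only ever holds one block of scores at a time, which is the property a real kernel uses to stay in fast on-chip memory.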

Also check out my blog, where I sometimes explain these concepts: https://kanishkez.github.io/#blogs

About

Implementation of papers I read and stuff
