Thanks for the suggestion! Coincidentally, I am already working on something like that, although a bit more comprehensive and focused on decoder models.
Hi,
I would like to ask what you think about adding a showcase of how to build, train, and test a Transformer "from scratch" with PyTorch, for instance for a translation task such as English to French. I have already done some research and found the following resources on the topic:
Attention is all you need: A Pytorch Implementation
Build your own Transformer from scratch using Pytorch
Transformers from Scratch in PyTorch
Transformers from scratch
From my point of view, it would be nice to see a combination of these approaches. I believe this would also be a great opportunity to demonstrate post-LN (as in the original paper) versus a pre-LN implementation (which tends to train more stably and can give a performance boost).
In my opinion, the first resource would be a good reference for an end-to-end workflow on real text data (although it is not properly maintained), but with the simplicity of the 2nd and 3rd resources (those just use some artificially sampled random data points), all in one single notebook.
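To illustrate the post-LN versus pre-LN point, here is a minimal sketch of the difference in a single encoder layer, built from standard PyTorch modules; the class names, hyperparameters, and layer layout are just assumptions for the sketch, not a definitive implementation:

```python
import torch
import torch.nn as nn

class PostLNEncoderLayer(nn.Module):
    """Post-LN (original paper): sublayer -> residual add -> LayerNorm."""
    def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x):
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + self.drop(attn_out))    # normalize AFTER the residual
        x = self.norm2(x + self.drop(self.ff(x)))
        return x

class PreLNEncoderLayer(nn.Module):
    """Pre-LN: LayerNorm -> sublayer -> residual add; trains more stably."""
    def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x):
        h = self.norm1(x)                          # normalize BEFORE the sublayer
        x = x + self.drop(self.attn(h, h, h)[0])
        x = x + self.drop(self.ff(self.norm2(x)))
        return x

# Quick shape check on random data, in the spirit of the simpler tutorials above.
x = torch.randn(8, 32, 512)                        # (batch, seq_len, d_model)
print(PostLNEncoderLayer()(x).shape, PreLNEncoderLayer()(x).shape)
```

For comparison, recent PyTorch versions expose the same switch directly via the norm_first flag of nn.TransformerEncoderLayer.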
Kind regards,
Daniel