
Transformers seminar

Authors:
Konstantin A. Maslov,
Claudio Persello

This repository contains supplementary materials for the seminar on vision transformers—slides and a Jupyter notebook with a simple implementation of ViT and its training on MNIST.
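As a minimal illustration of the notebook's first step (a sketch of the standard ViT patch-embedding scheme from Dosovitskiy et al.; the function and variable names here are illustrative and not taken from the notebook):

```python
import numpy as np

def patchify(image, patch_size=4):
    """Split a square image (H, W) into non-overlapping flattened patches."""
    h, w = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    # (h/p, p, w/p, p) -> (h/p, w/p, p, p) -> (num_patches, p*p)
    patches = image.reshape(h // patch_size, patch_size, w // patch_size, patch_size)
    return patches.transpose(0, 2, 1, 3).reshape(-1, patch_size * patch_size)

rng = np.random.default_rng(0)
image = rng.normal(size=(28, 28))            # one MNIST-sized image
patches = patchify(image)                    # (49, 16): 7x7 grid of 4x4 patches
embed = patches @ rng.normal(size=(16, 64))  # linear projection (learnable in a real ViT)
print(patches.shape, embed.shape)
```

In the full model, a class token and positional embeddings are then added to this sequence before it enters the transformer encoder.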

Materials

Literature

Main paper for the discussion

  • Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., & Houlsby, N. (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. https://doi.org/10.48550/arxiv.2010.11929

Other papers on vision transformers

  • Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., & Jégou, H. (2020). Training data-efficient image transformers & distillation through attention. https://doi.org/10.48550/arxiv.2012.12877
  • Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y., Fu, Y., Feng, J., Xiang, T., Torr, P. H. S., & Zhang, L. (2020). Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 6877–6886. https://doi.org/10.48550/arxiv.2012.15840
  • Strudel, R., Garcia, R., Laptev, I., & Schmid, C. (2021). Segmenter: Transformer for Semantic Segmentation. Proceedings of the IEEE International Conference on Computer Vision, 7242–7252. https://doi.org/10.48550/arxiv.2105.05633
  • Ranftl, R., Bochkovskiy, A., & Koltun, V. (2021). Vision Transformers for Dense Prediction. https://doi.org/10.48550/arxiv.2103.13413
  • Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., & Guo, B. (2021). Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. Proceedings of the IEEE International Conference on Computer Vision, 9992–10002. https://doi.org/10.48550/arxiv.2103.14030
  • Xie, E., Wang, W., Yu, Z., Anandkumar, A., Alvarez, J. M., & Luo, P. (2021). SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. Advances in Neural Information Processing Systems, 34, 12077–12090. https://doi.org/10.48550/arxiv.2105.15203

Related papers

  • Tolstikhin, I., Houlsby, N., Kolesnikov, A., Beyer, L., Zhai, X., Unterthiner, T., Yung, J., Steiner, A., Keysers, D., Uszkoreit, J., Lucic, M., & Dosovitskiy, A. (2021). MLP-Mixer: An all-MLP Architecture for Vision. Advances in Neural Information Processing Systems, 34, 24261–24272. https://doi.org/10.48550/arxiv.2105.01601
  • Loshchilov, I., & Hutter, F. (2017). Decoupled Weight Decay Regularization. 7th International Conference on Learning Representations, ICLR 2019. https://doi.org/10.48550/arxiv.1711.05101
  • Wightman, R., Touvron, H., & Jégou, H. (2021). ResNet strikes back: An improved training procedure in timm. https://doi.org/10.48550/arxiv.2110.00476

If you notice any inaccuracies or errors, feel free to submit a pull request.
