I want to share a little of "the technology stack underlying LLMs" every day, keeping each article to a three-minute read, so nobody feels pressured yet everyone can still grow a bit each day.
From AI說書 - 從0開始 - 37 through AI說書 - 從0開始 - 70, we completed our walkthrough of Chapter 2 of the book Transformers for Natural Language Processing and Computer Vision, Denis Rothman, 2024.
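As a one-minute recap of the chapter's core idea, here is a minimal NumPy sketch of scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V, as defined in Attention Is All You Need; the shapes and random inputs are illustrative assumptions of mine, not code from the book:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al., 2017)."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarity, scaled
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # attention-weighted sum of values

# Toy example: 3 tokens, d_model = 4 (random inputs, purely illustrative)
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((3, 4)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```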
The references are listed below:
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017, Attention Is All You Need: https://arxiv.org/abs/1706.03762
- Hugging Face transformer usage: https://huggingface.co/docs/transformers/main/en/quicktour
- Tensor2Tensor (T2T) introduction: https://colab.research.google.com/github/tensorflow/tensor2tensor/blob/master/tensor2tensor/notebooks/hello_t2t.ipynb?hl=en
- Manuel Romero’s notebook with link to explanations by Raimi Karim: https://colab.research.google.com/drive/1rPk3ohrmVclqhH7uQ7qys4oznDdAhpzF
- Google language research: https://research.google/teams/language/
- Hugging Face research: https://huggingface.co/transformers/index.html
- The Annotated Transformer: http://nlp.seas.harvard.edu/2018/04/03/attention.html
- Jay Alammar, The Illustrated Transformer: http://jalammar.github.io/illustrated-transformer/
Additional reading items are listed below:
- https://blogs.nvidia.com/blog/2022/03/25/what-is-a-transformer-model/
- https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/index.html