I want to share a little about "the technology stack behind LLMs, built from the ground up" every day, keeping each article to a three-minute read so that no one feels overwhelmed yet everyone can grow a bit each day.
From AI說書 - 從0開始 - 73 through AI說書 - 從0開始 - 96, we completed the walkthrough of Chapter 3 of the book Transformers for Natural Language Processing and Computer Vision, Denis Rothman, 2024.
References:
- Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman, 2019, SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems: https://w4ngatang.github.io/static/papers/superglue.pdf
- Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman, 2019, GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding: https://arxiv.org/abs/1804.07461
- Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Hao Tian, Hua Wu, and Haifeng Wang, 2019, ERNIE 2.0: A Continual Pre-training Framework for Language Understanding: https://arxiv.org/pdf/1907.12412.pdf
- Melissa Roemmele, Cosmin Adrian Bejan, and Andrew S. Gordon, 2011, Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning: https://people.ict.usc.edu/~gordon/publications/AAAI-SPRING11A.PDF
- Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, and Christopher Potts, 2013, Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank: https://nlp.stanford.edu/~socherr/EMNLP2013_RNTN.pdf
- Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, and Jamie Brew, 2019, HuggingFace’s Transformers: State-of-the-art Natural Language Processing: https://arxiv.org/abs/1910.03771
- Hugging Face transformer usage (a minimal usage sketch follows below): https://huggingface.co/docs/transformers/main/en/quicktour
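To make the quicktour link above concrete, here is a minimal sketch of the `pipeline` API it documents. The checkpoint name and the example sentence are my own assumptions for illustration; any SST-2 fine-tuned sentiment model would work the same way:

```python
# Minimal sketch of the Hugging Face pipeline API from the quicktour linked above.
from transformers import pipeline

# Build a sentiment-analysis pipeline. This checkpoint is an assumption for
# illustration: a DistilBERT model fine-tuned on SST-2, the sentiment dataset
# from the Socher et al. reference above.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

# Run inference; the pipeline returns a list of {label, score} dicts.
print(classifier("Transformers make benchmark tasks like SST-2 easy to try."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```

The `pipeline` call downloads the checkpoint on first use and wraps tokenization, inference, and post-processing in one step, which is why the quicktour leads with it.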
Additional reading: