I want to share a little of the technology stack underlying LLMs each day, keeping every article under three minutes of reading time, so that readers feel no pressure yet still grow a bit every day.
From AI說書 - 從0開始 - 193 | Chapter 7 Introduction through AI說書 - 從0開始 - 222 | GPT-4 & RAG Testing, we have completed our walkthrough of Chapter 7 of the book Transformers for Natural Language Processing and Computer Vision, Denis Rothman, 2024.
References:
Additional reading:
- Alex Wang et al., 2019, GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding: https://arxiv.org/pdf/1804.07461.pdf
- Alex Wang et al., 2019, SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems: https://w4ngatang.github.io/static/papers/superglue.pdf
- Tom B. Brown et al., 2020, Language Models are Few-Shot Learners: https://arxiv.org/abs/2005.14165
- Chi Wang et al., 2023, Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference: https://arxiv.org/abs/2303.04673
- Ashish Vaswani et al., 2017, Attention Is All You Need: https://arxiv.org/abs/1706.03762