Stanford CS224N: Natural Language Processing with Deep Learning
This highly acclaimed Stanford course provides a deep dive into NLP using deep learning, with a strong focus on Transformer architectures, which are fundamental to LLMs. Lecture videos and materials are usually made publicly available.
More resources on Large Language Models
The TWIML AI Podcast (This Week in Machine Learning & AI)
Covers a wide range of ML and AI topics, including frequent discussions and interviews related to natural language processing and large language models.
Lex Fridman Podcast (Interviews with AI researchers)
Lex Fridman frequently interviews leading AI researchers, many of whom have been pivotal in the development of LLMs. These interviews provide deep insights into the current state and future of the field. Search for episodes with guests such as Ilya Sutskever, Sam Altman, and Yann LeCun.
The Illustrated Transformer by Jay Alammar
A brilliant visual explanation of the Transformer architecture, making complex concepts much easier to grasp. It is often recommended as a first step for learners.
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" (Devlin et al., 2018)
Introduced BERT, a breakthrough in pre-training language representations using a Transformer encoder, significantly impacting the development of LLMs.
"Attention Is All You Need" (Vaswani et al., 2017)
The seminal paper that introduced the Transformer architecture, which is the foundation of almost all modern Large Language Models (LLMs). Essential reading for anyone serious about understanding LLMs.
Andrej Karpathy's "makemore" series and other lectures
Andrej Karpathy (former Director of AI at Tesla, founding member of OpenAI) provides incredibly insightful and practical lectures on neural networks and deep learning, including building language models from scratch. His content is highly regarded by the ML community.
