Paper · Advanced · Free

"Attention Is All You Need" (Vaswani et al., 2017)

The seminal paper that introduced the Transformer architecture, which is the foundation of almost all modern Large Language Models (LLMs). Essential reading for anyone serious about understanding LLMs.

More resources on Large Language Models

Podcast · Free

The TWIML AI Podcast (This Week in Machine Learning & AI)

Covers a wide range of ML and AI topics, including frequent discussions and interviews related to natural language processing and large language models.

Podcast · Free

Lex Fridman Podcast (Interviews with AI researchers)

Lex Fridman frequently interviews leading AI researchers, many of whom have been pivotal in the development of LLMs. These interviews offer insight into the current state and future of the field. Search for episodes with guests such as Ilya Sutskever, Sam Altman, and Yann LeCun.

Website · Free

The Illustrated Transformer by Jay Alammar

A brilliant visual explanation of the Transformer architecture that makes complex concepts much easier to grasp. It is often recommended as a first step for learners.

Paper · Free

"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" (Devlin et al., 2018)

Introduced BERT, a breakthrough in pre-training language representations using a Transformer encoder, significantly impacting the development of LLMs.

YouTube · Free

Andrej Karpathy's "makemore" series and other lectures

Andrej Karpathy (former Director of AI at Tesla, founding member of OpenAI) provides incredibly insightful and practical lectures on neural networks and deep learning, including building language models from scratch. His content is highly regarded by the ML community.

YouTube · Free

"Attention Is All You Need" Explained by Yannic Kilcher

A detailed and highly praised explanation of the Transformer architecture, the cornerstone of modern LLMs. Yannic Kilcher's channel is known for its in-depth paper reviews.
