Submissions from sebastianraschka.com

		A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com)
		23 points by ibobev 8 days ago \| past \| 1 comment
		A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com)
		5 points by mzl 9 days ago \| past \| 1 comment
		Recommendations for Getting the Most Out of a Technical Book (sebastianraschka.com)
		2 points by naves 9 days ago \| past \| discuss
		A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com)
		8 points by giuliomagnifico 9 days ago \| past \| discuss
		Getting the Most Out of a Technical Book (sebastianraschka.com)
		4 points by quietlearning 29 days ago \| past
		Beyond Standard LLMs (sebastianraschka.com)
		1 point by vismit2000 34 days ago \| past
		Beyond Standard LLMs (sebastianraschka.com)
		1 point by ibobev 38 days ago \| past
		A Researcher's Field Guide to Non-Standard LLM Architectures (sebastianraschka.com)
		2 points by ModelForge 38 days ago \| past
		Understanding the 4 Main Approaches to LLM Evaluation (From Scratch) (sebastianraschka.com)
		1 point by ibobev 58 days ago \| past
		Popular Attention Alternatives: GQA, MLA, SWA (sebastianraschka.com)
		4 points by ModelForge 58 days ago \| past
		Multi-Head Latent Attention (sebastianraschka.com)
		4 points by ModelForge 60 days ago \| past
		Understanding the 4 Main Approaches to LLM Evaluation (From Scratch) (sebastianraschka.com)
		2 points by ibobev 63 days ago \| past
		LLM Evaluation from Scratch: Multiple Choice, Verifiers, Leaderboards, LLM Judge (sebastianraschka.com)
		4 points by ModelForge 68 days ago \| past
		Understanding and Implementing Qwen3 from Scratch (sebastianraschka.com)
		1 point by ibobev 87 days ago \| past
		GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2 (sebastianraschka.com)
		490 points by ModelForge 4 months ago \| past \| 97 comments
		From GPT-2 to GPT-OSS: Analyzing the Architectural Advances (sebastianraschka.com)
		3 points by mdp2021 4 months ago \| past
		PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com)
		1 point by Anon84 4 months ago \| past
		PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com)
		4 points by mariuz 4 months ago \| past
		LLM architecture comparison (sebastianraschka.com)
		418 points by mdp2021 4 months ago \| past \| 24 comments
		The Big LLM Architecture Comparison (sebastianraschka.com)
		3 points by Quizzical4230 4 months ago \| past
		Comprehensive ML/AI questions and answers for interview prep (sebastianraschka.com)
		2 points by yaiml 5 months ago \| past
		PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com)
		4 points by sbbq 5 months ago \| past
		Intermediate ML and AI questions and answers for interview prep (sebastianraschka.com)
		3 points by sbbq 5 months ago \| past
		Understanding and Coding the KV Cache in LLMs from Scratch (sebastianraschka.com)
		6 points by sbbq 5 months ago \| past
		Understanding and Coding the KV Cache in LLMs from Scratch (sebastianraschka.com)
		2 points by tosh 5 months ago \| past
		Coding LLMs from the Ground Up: A Complete Course (sebastianraschka.com)
		4 points by sbbq 6 months ago \| past
		Coding LLMs from the Ground Up: A Complete Course (sebastianraschka.com)
		2 points by mdp2021 7 months ago \| past
		The State of Reinforcement Learning for LLM Reasoning (sebastianraschka.com)
		8 points by yaiml 7 months ago \| past
		The State of Reinforcement Learning for LLM Reasoning (sebastianraschka.com)
		9 points by jonbaer 7 months ago \| past
		The State of Reinforcement Learning for LLM Reasoning (sebastianraschka.com)
		4 points by mdp2021 7 months ago \| past