Hacker News
A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com) | 23 points by ibobev 8 days ago | 1 comment
A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com) | 5 points by mzl 9 days ago | 1 comment
Recommendations for Getting the Most Out of a Technical Book (sebastianraschka.com) | 2 points by naves 9 days ago
A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com) | 8 points by giuliomagnifico 9 days ago
Getting the Most Out of a Technical Book (sebastianraschka.com) | 4 points by quietlearning 29 days ago
Beyond Standard LLMs (sebastianraschka.com) | 1 point by vismit2000 34 days ago
Beyond Standard LLMs (sebastianraschka.com) | 1 point by ibobev 38 days ago
A Researcher's Field Guide to Non-Standard LLM Architectures (sebastianraschka.com) | 2 points by ModelForge 38 days ago
Understanding the 4 Main Approaches to LLM Evaluation (From Scratch) (sebastianraschka.com) | 1 point by ibobev 58 days ago
Popular Attention Alternatives: GQA, MLA, SWA (sebastianraschka.com) | 4 points by ModelForge 58 days ago
Multi-Head Latent Attention (sebastianraschka.com) | 4 points by ModelForge 60 days ago
Understanding the 4 Main Approaches to LLM Evaluation (From Scratch) (sebastianraschka.com) | 2 points by ibobev 63 days ago
LLM Evaluation from Scratch: Multiple Choice, Verifiers, Leaderboards, LLM Judge (sebastianraschka.com) | 4 points by ModelForge 68 days ago
Understanding and Implementing Qwen3 from Scratch (sebastianraschka.com) | 1 point by ibobev 87 days ago
GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2 (sebastianraschka.com) | 490 points by ModelForge 4 months ago | 97 comments
From GPT-2 to GPT-OSS: Analyzing the Architectural Advances (sebastianraschka.com) | 3 points by mdp2021 4 months ago
PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com) | 1 point by Anon84 4 months ago
PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com) | 4 points by mariuz 4 months ago
LLM architecture comparison (sebastianraschka.com) | 418 points by mdp2021 4 months ago | 24 comments
The Big LLM Architecture Comparison (sebastianraschka.com) | 3 points by Quizzical4230 4 months ago
Comprehensive ML/AI questions and answers for interview prep (sebastianraschka.com) | 2 points by yaiml 5 months ago
PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com) | 4 points by sbbq 5 months ago
Intermediate ML and AI questions and answers for interview prep (sebastianraschka.com) | 3 points by sbbq 5 months ago
Understanding and Coding the KV Cache in LLMs from Scratch (sebastianraschka.com) | 6 points by sbbq 5 months ago
Understanding and Coding the KV Cache in LLMs from Scratch (sebastianraschka.com) | 2 points by tosh 5 months ago
Coding LLMs from the Ground Up: A Complete Course (sebastianraschka.com) | 4 points by sbbq 6 months ago
Coding LLMs from the Ground Up: A Complete Course (sebastianraschka.com) | 2 points by mdp2021 7 months ago
The State of Reinforcement Learning for LLM Reasoning (sebastianraschka.com) | 8 points by yaiml 7 months ago
The State of Reinforcement Learning for LLM Reasoning (sebastianraschka.com) | 9 points by jonbaer 7 months ago
The State of Reinforcement Learning for LLM Reasoning (sebastianraschka.com) | 4 points by mdp2021 7 months ago