
Wow, fun to find this trending on HN this morning! I am currently also working on the associated video lecture (as the next episode of my video lecture series here https://karpathy.ai/zero-to-hero.html ), where I will build nanoGPT from scratch and aspire to spell everything out, as with the earlier videos. Hoping to get it out in ~2 weeks or so.


Openly accessible lectures and knowledge like yours have allowed many people, me included, to turn their lives around by putting in the effort and developing themselves. Thank you.


While doing my PhD some years ago (it wasn't a PhD on AI, but very much related), I trained several models with the usual stack back then (PyTorch, and some others in TF). I realized that a lot of this stack could be rewritten in much simpler terms without sacrificing much fidelity or performance in the end.

Submissions like yours, and other projects like this one (recently featured here as well) -> https://github.com/ggerganov/whisper.cpp, make it pretty clear to me that this intuition is correct.

There are a couple of tools I created back then that could push things further in this direction. Unfortunately they're not mature enough to warrant a release, but the ideas behind them are worth a look (IMHO) and I'd be happy to share them. If there's interest on your side (or from anyone reading this thread), I'd love to talk more about it.


Your YouTube playlist combined with nanoGPT and your Lex Fridman podcast is like having a university-level degree with free internship guidance. Thank you!


Just wanted to say thank you for all the incredible work and resources you publish. I've lost track of all the different skills I've learned from you: computer vision, RNNs, minGPT, even speedcubing :D


+1. I've benefited greatly from your content, e.g. your CNN lecture was incredibly accessible [0]. I still find that transformers stubbornly elude my intuition despite reading many descriptions. I would very much appreciate your video lecture on this topic.

[0] I think https://www.youtube.com/watch?v=LxfUGhug-iQ


I've found all of your code and lessons on youtube so incredibly useful. You're a wonderful teacher and I really appreciate all the work you've done with this!


Andrej: thank you!

--

To the mod (dang): IMHO Andrej's comment should probably be at the top of the page, not my comment. UPDATE: Looks like that's done. Thank you :-)


Thank you for your amazing work. Between cs231n and your recent videos, I've learned a ton, and you have a gift for explaining things in such an easy and straightforward way that I always feel like an idiot (in a positive way) for not having grasped the concept before.


Badass! A great addition would be some content on tuning pre-trained language models for particular purposes. It would be great to have examples of things like tuning a GPT model trained on language and code to take in a context and spit out code against my custom API, or using my internal terminology. Not sure if this is RL-based fine-tuning or just a bunch of language-to-code examples in a fine-tuning dataset? In essence, how can we start using language to control our software?


Ty, agreed: practically speaking, most people will be interested in finetuning rather than from-scratch pretraining. I currently have some language about it in the readme, but I agree this should get more focus, docs, examples, etc.
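Roughly, a finetune in nanoGPT is just a config that initializes from a pretrained GPT-2 checkpoint and trains at a small learning rate on your own prepared dataset. A minimal sketch of such a config; the file name, dataset name, and values here are illustrative assumptions rather than the repo's exact defaults, so check the readme and the existing finetuning config for the real thing:

    # hypothetical config/finetune_my_api.py -- illustrative values, not the repo's defaults
    out_dir = 'out-my-api'
    eval_interval = 200
    eval_iters = 40

    dataset = 'my_api_examples'        # prepared beforehand with a data/<dataset>/prepare.py script
    init_from = 'gpt2-xl'              # start from a pretrained OpenAI GPT-2 checkpoint

    # finetuning generally wants a small learning rate and relatively few iterations
    batch_size = 1
    gradient_accumulation_steps = 32
    max_iters = 2000
    learning_rate = 3e-5
    decay_lr = False

This would then be launched with something like "python train.py config/finetune_my_api.py" (again, the config file name is hypothetical).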


Appreciate the work to make GPT training accessible!

Do you leave hyperparams (like learning rate, batch size) the same when switching from 8xA100 to fewer GPUs, or do these need to be adjusted?

Separately, when going from 8xA100 GPUs to a single A100, in the worst case we can expect the same model performance after training 8x as long, correct? (And likely a bit better, because we get more gradient updates with the smaller batch size.)
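For my own sanity check, here's the back-of-the-envelope bookkeeping I have in mind; the numbers are illustrative, not necessarily the repo's defaults:

    # Rough effective-batch-size bookkeeping when dropping from 8 GPUs to 1.
    # Illustrative numbers only; check nanoGPT's train.py for the actual defaults.
    batch_size = 12          # per-GPU micro-batch (sequences)
    block_size = 1024        # tokens per sequence
    grad_accum = 5           # gradient accumulation steps per GPU
    n_gpus = 8

    tokens_per_step = batch_size * block_size * grad_accum * n_gpus
    print(tokens_per_step)   # 491520 tokens per optimizer step

    # Single-GPU run: keep tokens_per_step constant by scaling up gradient accumulation,
    # so the learning rate and schedule can stay the same; each step just takes ~8x longer.
    grad_accum_1gpu = grad_accum * n_gpus   # 40
    assert batch_size * block_size * grad_accum_1gpu * 1 == tokens_per_step

If you instead let the effective batch shrink, the common heuristic is to scale the learning rate down roughly in proportion, but keeping the effective batch fixed via gradient accumulation sidesteps that question.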


Thank you for sharing your knowledge. Anything that can be done to democratize machine learning is an invaluable social service. Hats off to you.


Saying absolutely nothing new here, but your work is so damn inspiring! I wish I had such a natural connection to my work, an ability to distill complex concepts down to the fundamentals, and such inventiveness! I took your CS231N class at Stanford as well. Implementing the fundamental building blocks like backprop was fun and insightful. Thanks again for your passion and teaching!


Your tutorials are effective and concise. Thank you for them! Accessible, from-scratch knowledge on these topics is essential at this time in history and you're really making a dent in that problem.


Thanks, I love your video about backpropagation, where you painstakingly spell out every calculation. It was like a breath of fresh air compared to other materials out there.


Thanks for your work, Andrej! I've been working through the earlier lectures and this is absolutely fantastic educational content!


Thank you for your constant contributions.


Amazing work, much appreciated


Thank you for your great work!


Just started watching your lectures! They are great!



