ndai's comments | Hacker News

No pip freeze to lock the doom,
No tangled trees in darkened bloom,
No maintainers tricked by phishing spree —
In C I hold the memory key.

Through buffer, pointer, syscall roar,
I own the land, I own the shore;
Let Python’s spiders weave their scheme,
I’ll keep my ship rock-steady in C-stream.


Isn’t NVIDIA fabless? I imagine (I jump to conclusions) that design is less of a challenge than manufacturing. EUV lithography is incredibly difficult, almost implausible. Perhaps one day a clever scientist will come up with a new, seemingly implausible, yet less difficult way, using “fractal chemical” doping techniques.


>design is less of a challenge than manufacturing.

If so, can you explain why Nvidia's market cap is much higher than TSMC's? ($4.15 trillion versus $1.10 trillion)


I'd just say "market irrationality" and call it a day. TSMC is far closer to a monopoly than NVIDIA is, and they win no matter which fabless company is buying their capacity.


You could be right. But it could also be due to things like automatic 401(k) contributions into the market, easy retail investing, and general speculative attitudes.


I’m curious where you got your training data. I will look myself, but saw this and thought I’d ask. I have a CPU-first, no-backprop architecture that works very well on classification datasets. It can do single-example incremental updates, which might be useful for continuous learning. I made a toy demo to train on tiny.txt and it can predict next characters, but I’ve never tried to make an LLM before. I think my architecture might work well as an on-device assistant or for on-premises needs, but I want to work with it more before I embarrass myself. Any open-source LLM training datasets you would recommend?
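
(Not my actual architecture, but for anyone wondering what "single-example incremental updates with no backprop" can look like in the simplest possible form, here's a toy count-based next-character predictor; the tiny.txt filename is just borrowed from above.)

    from collections import defaultdict, Counter

    # Toy sketch only: a count-based character n-gram predictor. Not the
    # architecture described above, just the simplest illustration of learning
    # by single-example incremental updates with no backprop.
    class CharNgramPredictor:
        def __init__(self, order=3):
            self.order = order                  # characters of context
            self.counts = defaultdict(Counter)  # context -> next-char counts

        def update(self, text):
            # One example in, counts bumped, done: no gradients, no epochs.
            for i in range(len(text) - self.order):
                ctx = text[i:i + self.order]
                self.counts[ctx][text[i + self.order]] += 1

        def predict(self, context):
            dist = self.counts.get(context[-self.order:])
            return dist.most_common(1)[0][0] if dist else " "

    model = CharNgramPredictor(order=3)
    with open("tiny.txt", encoding="utf-8") as f:  # filename from the comment
        for line in f:
            model.update(line)                     # incremental, per example
    print(model.predict("the qui"))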



To my untrained eye, this looks more like an instruct dataset.

For just plain text, I really like this one - https://huggingface.co/datasets/roneneldan/TinyStories
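
A quick way to poke at it, assuming you have the Hugging Face datasets library installed (the "text" column below is the plain-text story field the dataset exposes, as far as I recall):

    from datasets import load_dataset

    # Load the TinyStories training split; each record is a short plain-text story.
    ds = load_dataset("roneneldan/TinyStories", split="train")
    print(len(ds))                 # number of stories
    print(ds[0]["text"][:200])     # peek at the first one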


Hugging Face has plenty of OpenAI and Anthropic user-to-assistant chains. Beware, there are dragons (hallucinations), but they're good enough for instruction training. I'd actually recommend distilling Kimi K2 instead for instruction-following capabilities.
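
Whatever you distill from, the flattening step is usually trivial. A rough sketch (the role/content keys and the </s> end tag are just placeholders, not any particular dataset's or model's format):

    # Rough sketch: flatten a user/assistant chat into one training string.
    # The "role"/"content" keys and the </s> end-of-sequence tag are assumed
    # placeholders, not a specific dataset's or model's schema.
    def format_chat(messages):
        parts = []
        for m in messages:
            tag = "User" if m["role"] == "user" else "Assistant"
            parts.append(f"{tag}: {m['content']}")
        return "\n".join(parts) + " </s>"

    example = [
        {"role": "user", "content": "What are eclipses?"},
        {"role": "assistant",
         "content": "Eclipses occur when one celestial body moves into the shadow of another."},
    ]
    print(format_chat(example))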


The training data is contained inside main.rs; it looks like about 50 statements of general facts, probably to keep training time down. If you go off-script, things fall apart pretty quickly:

Enter prompt: hello
Model output: What are eclipses ? Assistant : Eclipses occur when one celestial body moves into the shadow of another </s>

Enter prompt: what are facts
Model output: eclipses Assistant : Eclipses Eclipses What What was when What through you ? through you meet through using of What was What tall ? of What thank help you explain using , and do you a patterns you you a patterns through air in tall you help metal metal </s>

Enter prompt: how are mountains formed?
Model output: I ? ' I ' : Of : in happy Hi wind in yeast altering it it </s>

