ndai's comments | Hacker News

No pip freeze to lock the doom,
No tangled trees in darkened bloom,
No maintainers tricked by phishing spree —
In C I hold the memory key.

Through buffer, pointer, syscall roar,
I own the land, I own the shore;
Let Python’s spiders weave their scheme,
I’ll keep my ship rock-steady in C-stream.


Isn’t NVIDIA fabless? I imagine (I jump to conclusions) that design is less of a challenge than manufacturing. EUV lithography is incredibly difficult, almost implausible. Perhaps one day a clever scientist will come up with a new, seemingly implausible, yet less difficult way, using “fractal chemical” doping techniques.


>design is less of a challenge than manufacturing.

If so, can you explain why Nvidia's market cap is much higher than TSMC's? ($4.15 trillion versus $1.10 trillion)


I'd just say "market irrationality" and call it a day. TSMC is far closer to a monopoly than NVIDIA is, and they win no matter which fabless company is buying their capacity.


You could be right. But it could also be due to things like automatic 401(k) contributions into the market, easy retail investing, and general speculative attitudes.


I’m curious where you got your training data. I will look myself, but saw this and thought I’d ask. I have a CPU-first, no-backprop architecture that works very well on classification datasets. It can do single-example incremental updates, which might be useful for continuous learning. I made a toy demo to train on tiny.txt and it can predict next characters, but I’ve never tried to make an LLM before. I think my architecture might work well as an on-device assistant or for on-premises needs, but I want to work with it more before I embarrass myself. Any open-source LLM training datasets you would recommend?
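
(Not my actual architecture, but for anyone wondering what "single-example incremental updates with no backprop" can look like in the simplest possible form, here's a toy count-based next-character predictor; the tiny.txt filename is just borrowed from above.)

    from collections import defaultdict, Counter

    # Toy sketch only: a count-based character n-gram predictor. Not the
    # architecture described above, just the simplest illustration of learning
    # by single-example incremental updates with no backprop.
    class CharNgramPredictor:
        def __init__(self, order=3):
            self.order = order                  # characters of context
            self.counts = defaultdict(Counter)  # context -> next-char counts

        def update(self, text):
            # One example in, counts bumped, done: no gradients, no epochs.
            for i in range(len(text) - self.order):
                ctx = text[i:i + self.order]
                self.counts[ctx][text[i + self.order]] += 1

        def predict(self, context):
            dist = self.counts.get(context[-self.order:])
            return dist.most_common(1)[0][0] if dist else " "

    model = CharNgramPredictor(order=3)
    with open("tiny.txt", encoding="utf-8") as f:  # filename from the comment
        for line in f:
            model.update(line)                     # incremental, per example
    print(model.predict("the qui"))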



To my untrained eye, this looks more like an instruct dataset.

For just plain text, I really like this one - https://huggingface.co/datasets/roneneldan/TinyStories
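
A quick way to poke at it, assuming you have the Hugging Face datasets library installed (the "text" column below is the plain-text story field the dataset exposes, as far as I recall):

    from datasets import load_dataset

    # Load the TinyStories training split; each record is a short plain-text story.
    ds = load_dataset("roneneldan/TinyStories", split="train")
    print(len(ds))                 # number of stories
    print(ds[0]["text"][:200])     # peek at the first one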


Hugging Face has plenty of OpenAI and Anthropic user-to-assistant chains. Beware, there are dragons (hallucinations), but they're good enough for instruction training. I'd actually recommend distilling Kimi K2 instead for instruction-following capabilities.
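
Whatever you distill from, the flattening step is usually trivial. A rough sketch (the role/content keys and the </s> end tag are just placeholders, not any particular dataset's or model's format):

    # Rough sketch: flatten a user/assistant chat into one training string.
    # The "role"/"content" keys and the </s> end-of-sequence tag are assumed
    # placeholders, not a specific dataset's or model's schema.
    def format_chat(messages):
        parts = []
        for m in messages:
            tag = "User" if m["role"] == "user" else "Assistant"
            parts.append(f"{tag}: {m['content']}")
        return "\n".join(parts) + " </s>"

    example = [
        {"role": "user", "content": "What are eclipses?"},
        {"role": "assistant",
         "content": "Eclipses occur when one celestial body moves into the shadow of another."},
    ]
    print(format_chat(example))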


The training data is contained inside main.rs; it looks like about 50 statements of general facts, probably to keep training time down. If you go off-script, things fall apart pretty quickly:

Enter prompt: hello
Model output: What are eclipses ? Assistant : Eclipses occur when one celestial body moves into the shadow of another </s>

Enter prompt: what are facts
Model output: eclipses Assistant : Eclipses Eclipses What What was when What through you ? through you meet through using of What was What tall ? of What thank help you explain using , and do you a patterns you you a patterns through air in tall you help metal metal </s>

Enter prompt: how are mountains formed?
Model output: I ? ' I ' : Of : in happy Hi wind in yeast altering it it </s>

