_0ffh's comments

You'd be surprised how quickly improvement of autoregressive language models levels off with epoch count (though, admittedly, one epoch is a LOT). Diffusion language models otoh indeed keep profiting for much longer, fwiw.

Does this also apply to LLM training at scale? I would be a bit surprised if it does, fwiw.

Yup, as soon as data is the bottleneck and not compute, diffusion wins. Tested following the Chinchilla scaling strategy from 7M to 2.5B parameters.

https://arxiv.org/abs/2507.15857


An agent is an autonomous entity that makes goal-driven decisions in an environment it can (partially) observe and influence through its actions. It is a very general term.

I went to hacker events where someone would sell lock-picking tools and practice locks like it's the most natural thing in the world.

Yeah, I think it might be a driver thing (or driver interaction with XFCE code).

After ~10 years of using XFCE, I recently for the first time encountered flickering, after an NVidia driver update. I disabled compositing and it went away. Still happy, but clearly something broke there. Pretty sure someone's trying to fix it, somewhere.



That was the Nvidia 580 driver, it's a known issue. The 575 driver is working fine.


If that's a superpower, it's a staggeringly common superpower.


Oh, I thought the added u and the bar were just two different ways to indicate that the o is stretched (the u looking like a workaround to avoid special characters).


You can still find ones that don't need to be registered online and will work without WLAN or app. They will not remember the room layouts and you won't be able to lay virtual fences, but apart from that they work fine.


Ah, no risk, no fun! };->


In case anybody is interested, when we generalize the concept we're talking about Dyck languages.

https://en.wikipedia.org/wiki/Dyck_language
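
For a concrete feel, here is a minimal sketch of a membership test for a Dyck language over several bracket pairs. The function name `is_dyck_word` and the default bracket table are made up for illustration; any set of matching pairs would do.

```python
# Hypothetical helper: maps each closing symbol to its opening partner.
PAIRS = {")": "(", "]": "[", "}": "{"}

def is_dyck_word(s, pairs=PAIRS):
    """Return True if s is a balanced string over the given bracket pairs."""
    openers = set(pairs.values())
    stack = []
    for ch in s:
        if ch in openers:
            stack.append(ch)
        elif ch in pairs:
            # A closer must match the most recently opened bracket.
            if not stack or stack.pop() != pairs[ch]:
                return False
        else:
            # Symbol outside the alphabet: not a word of this language.
            return False
    # Every opened bracket must have been closed.
    return not stack
```

With a single bracket pair the stack degenerates into a counter; the stack only becomes necessary once there are two or more kinds of brackets.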


I was surprised to not see a connection made to free groups in the article.

EDIT: The Wikipedia article, that is.
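
To sketch the connection I mean: treat each opening bracket as a generator and its closing bracket as the inverse. Free reduction in a free group cancels both x x⁻¹ and x⁻¹ x, while a Dyck word is one that reduces to the empty word using only the first kind of cancellation (open, then close). The encoding below (lowercase letter = generator, uppercase = its inverse) and the name `free_reduce` are just assumptions for illustration.

```python
def free_reduce(word):
    """Freely reduce a word; lowercase letters are generators, uppercase their inverses.

    Assumes the word consists of ASCII letters only.
    """
    out = []
    for ch in word:
        # Cancel an adjacent pair like "aA" or "Aa" (a letter next to its inverse).
        if out and out[-1] != ch and out[-1].swapcase() == ch:
            out.pop()
        else:
            out.append(ch)
    return "".join(out)
```

Reading `a` as "(" and `A` as ")", the word `aA` is both a Dyck word and trivial in the free group, whereas `Aa` reduces to the identity in the free group but is not balanced as a bracket string.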


No. Chaining subroutine calls is an implementation detail that is not inherent in the language, even if it may be a popular option because it is easy to do.

The usual implementation options are subroutine threading, indirect threading, and direct threading.
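
As a rough illustration of one of these, here is a toy model of indirect threading in Python. It only mimics the control flow: every word has a code field, a compiled definition is a list of word references, and the inner interpreter fetches the next word and jumps through its code field. All names (`Word`, `VM`, `docol`, `execute`, and the tiny word set) are invented for this sketch and are not taken from any particular system.

```python
class Word:
    """A dictionary entry: a code field plus a parameter field."""
    def __init__(self, name, code, body=None):
        self.name = name
        self.code = code          # code field: handler the inner interpreter calls
        self.body = body or []    # parameter field: for colon words, a list of Words

class VM:
    def __init__(self):
        self.stack = []     # data stack
        self.rstack = []    # return stack of (thread, ip) pairs
        self.thread = None  # the definition currently being executed
        self.ip = 0         # index of the next cell in the thread

def docol(vm, word):
    # Enter a colon definition: save where we were, continue in the word's body.
    vm.rstack.append((vm.thread, vm.ip))
    vm.thread, vm.ip = word.body, 0

def do_exit(vm, word):
    # Leave a colon definition: resume the caller's thread.
    vm.thread, vm.ip = vm.rstack.pop()

def do_lit(vm, word):
    # Push the inline literal stored in the next cell of the thread.
    vm.stack.append(vm.thread[vm.ip])
    vm.ip += 1

def do_dup(vm, word):
    vm.stack.append(vm.stack[-1])

def do_add(vm, word):
    b, a = vm.stack.pop(), vm.stack.pop()
    vm.stack.append(a + b)

def do_mul(vm, word):
    b, a = vm.stack.pop(), vm.stack.pop()
    vm.stack.append(a * b)

EXIT = Word("exit", do_exit)
LIT = Word("lit", do_lit)
DUP = Word("dup", do_dup)
ADD = Word("+", do_add)
MUL = Word("*", do_mul)

# : square dup * ;
SQUARE = Word("square", docol, [DUP, MUL, EXIT])
# : main 3 square 4 square + ;
MAIN = Word("main", docol, [LIT, 3, SQUARE, LIT, 4, SQUARE, ADD, EXIT])

def execute(word):
    """Run a word to completion and return the data stack."""
    vm = VM()
    word.code(vm, word)
    # Inner interpreter ("NEXT"): fetch a word, advance, jump via its code field.
    while vm.thread is not None:
        w = vm.thread[vm.ip]
        vm.ip += 1
        w.code(vm, w)
    return vm.stack

result = execute(MAIN)
```

In this model the thread holds references to words rather than to executable code, which is the extra level of indirection that gives the technique its name. Direct threading would store the code addresses themselves in the thread, and subroutine threading would compile each reference into a native call instruction.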

