
You obviously can do that though; diffusion models produce better (fsvo better) images the more steps you run of them.
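A toy illustration of that "more steps, better output" property of iterative samplers: each step removes a fraction of the remaining error, so the residual shrinks with the step count. (A real diffusion sampler predicts noise with a trained network; this is only the fixed-point-iteration shape, with made-up numbers.)

```python
def refine(x, target, steps, rate=0.1):
    """Run `steps` denoising-style updates pulling x toward target."""
    for _ in range(steps):
        x += rate * (target - x)  # one iterative refinement step
    return x

def residual(steps):
    """Remaining error after `steps` updates from x=0 toward target=1."""
    return abs(refine(0.0, 1.0, steps) - 1.0)

# More steps, smaller residual: residual(n) = 0.9 ** n here.
assert residual(50) < residual(10) < residual(1)
```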

Similarly, LLMs can produce better answers if you teach them thinking strategies that remind them to put the available evidence and intermediate steps in their context window. Otherwise they'll tend to hallucinate an answer out of vaguely correct words.
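A sketch of that strategy in prompt form: instead of asking for the answer directly, the prompt instructs the model to write its evidence and intermediate steps first, so the later answer tokens can condition on them. Everything here is illustrative, not a real API; it just builds the prompt string.

```python
QUESTION = "Which year had more rainfall, and by how much?"
EVIDENCE = ["2021: 812 mm", "2022: 744 mm"]

def scratchpad_prompt(question, evidence):
    """Build a prompt that puts evidence and working-out into context
    before the answer is requested (hypothetical helper)."""
    lines = ["Evidence:"]
    lines += [f"- {e}" for e in evidence]
    lines += [
        "First restate the relevant facts, then compute any",
        "intermediate quantities, and only then state the answer.",
        f"Question: {question}",
    ]
    return "\n".join(lines)

print(scratchpad_prompt(QUESTION, EVIDENCE))
```

The point is only ordering: evidence and intermediate steps appear in the context window before the answer is generated.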



Diffusion models are a different architecture, namely, a recursive or iterative one. Transformer models are not recursive or iterative.


Sure they are. A transformer natively outputs only one token per forward pass; the recursive process of feeding each output back into the context is how you get the rest out of it.
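That autoregressive loop can be sketched in a few lines. Here `next_token` is a stand-in for a transformer forward pass (a toy function, not a real model): the model maps a token sequence to a single next token, and iteration, with each output appended back into the input, produces the full sequence.

```python
def next_token(context):
    """Toy stand-in for a transformer forward pass: returns one token
    given the sequence so far, or None to stop after 5 tokens."""
    return len(context) if len(context) < 5 else None

def generate(prompt):
    """Autoregressive decoding: call the one-token model repeatedly,
    feeding each output back in as context."""
    tokens = list(prompt)
    while (tok := next_token(tokens)) is not None:
        tokens.append(tok)  # the iterative step
    return tokens

print(generate([101]))  # -> [101, 1, 2, 3, 4]
```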


You’re totally right … should’ve thought that one through more.



