I'd love to see a benchmark that tests different LLMs for slop, not necessarily ... | Hacker News

growdark 56 days ago | on: Antislop: A framework for eliminating repetitive p...

I'd love to see a benchmark that tests different LLMs for slop, not necessarily limited to code. That might be even more interesting than ARC-AGI.

Bolwin 56 days ago

See the writing benchmarks here: https://eqbench.com/creative_writing_longform.html

Der_Einzige 55 days ago

Note that this is by the same first author as the Antislop paper.

jampa 56 days ago

Not a benchmark per se, but there is a "Not x, but y" Slop Leaderboard:

https://www.reddit.com/r/LocalLLaMA/comments/1lv2t7n/not_x_b...

topaz0 55 days ago

100% of LLM output is slop. Done.
