How are you going to release an LLM eval paper in mid-2025 using *ChatGPT 3.5* Y...

IngoBlechschmid · 2025-07-04T16:33:32 1751646812

The paper was originally released in April 2023; it just got version-bumped a couple of months ago :-)

suddenlybananas · 2025-07-04T18:16:07 1751652967

>The authors should have held back a few more months and turned the paper into a 3.5 to O3 or any other 2025 SOTA improvement analysis.

If they had done that, you would then be complaining about them not using Claude or whatever.

rs186 · 2025-07-04T18:29:45 1751653785

I don't see the logic in your comment.

ethan_smith · 2025-07-04T23:08:14 1751670494

The paper was published in April 2023 (not 2025), but your point about using outdated models stands: evaluating with ChatGPT 3.5 when we now have Claude 3, GPT-4o, and other SOTA models significantly limits the paper's relevance.