
Does anyone know what the units are on the "performance improvement over SOTA" chart?


If you're talking about Fig. 4, the y-axis is a score rescaled so that random performance is 0 and perfect performance is 100 (depending on the task, the underlying metric may be accuracy or something else). Since the models are so large, good benchmarks have to be diverse, and different tasks require different metrics.
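
For anyone who wants the rescaling spelled out, here's a rough sketch. It assumes a plain linear map from the raw metric to the 0-100 scale; the paper may do something different, and `normalize_score` and the 4-way multiple-choice numbers are made up for illustration.

```python
def normalize_score(raw, random_baseline, perfect=1.0):
    """Linearly rescale a raw metric so that random_baseline -> 0 and perfect -> 100.

    Assumption: a simple linear map; the actual normalization used for the
    figure is not stated in this thread.
    """
    if perfect == random_baseline:
        raise ValueError("perfect and random baseline must differ")
    return 100.0 * (raw - random_baseline) / (perfect - random_baseline)


# Hypothetical task: 4-way multiple choice, so random guessing gets 0.25 accuracy.
print(normalize_score(0.25, 0.25))   # random guessing lands at the bottom of the scale
print(normalize_score(1.0, 0.25))    # a perfect model lands at the top
print(normalize_score(0.625, 0.25))  # halfway between random and perfect
```

Note that a model doing worse than random comes out negative under this kind of scaling.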


I was wondering the same. Without better y-axis labeling, it's not that informative of a graphic.


Poetic that the top post right now is (partially) about how over-simplified figures in science communication lead to a popular misunderstanding of science, with readers coming to believe that conducting research is easier than it actually is.


Turns out it's a composite of "normalized task-specific metrics", details in the paper. Shrug. Numbers go up!



