Does anyone know what the units are on the "performance improvement over SOTA" c...

EvgeniyZh · on April 5, 2022

If you're talking about fig. 4, the scores are rescaled so that random performance is 0 and perfect performance is 100 (depending on the task, the underlying metric may be accuracy or something else). Because the models are so large, good benchmarks have to be diverse, and different tasks require different metrics.
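For what it's worth, here is a rough sketch of that kind of rescaling. This is not taken from the paper: the function name `normalize_score` and the 4-way multiple-choice example are my own assumptions, and the sketch only shows a linear map that sends the random baseline to 0 and a perfect score to 100.

```python
def normalize_score(raw, random_baseline, perfect):
    """Linearly rescale a raw metric so random_baseline -> 0 and perfect -> 100.

    Hypothetical helper for illustration; the paper may normalize differently.
    """
    if perfect == random_baseline:
        raise ValueError("perfect and random_baseline must differ")
    return 100.0 * (raw - random_baseline) / (perfect - random_baseline)


# Assumed example: a 4-way multiple-choice task, where random guessing
# yields 25% accuracy and a perfect model yields 100% accuracy.
print(normalize_score(0.25, 0.25, 1.0))   # 0.0
print(normalize_score(0.625, 0.25, 1.0))  # 50.0
print(normalize_score(1.0, 0.25, 1.0))    # 100.0
```

Note that a model doing worse than random gets a negative score under this mapping, so the scale is not bounded below by 0.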

r-zip · on April 4, 2022

I was wondering the same. Without better y-axis labeling, it's not that informative of a graphic.

whymauri · on April 4, 2022

Poetic that the top post right now is (partially) about how science communication that over-simplifies figures leads to a popular misunderstanding of science, with readers coming to believe that conducting research is easier than it actually is.

lukasb · on April 4, 2022

Turns out it's a composite of "normalized task-specific metrics", details in the paper. Shrug. Numbers go up!