
Yes, exactly! Just plot all the bloody data and be done with it. No one is doing this by hand anymore so it is no extra work.

To my mind, if you have a genuine EDA attitude you plot it all.



> Just plot all the bloody data and be done with it

Well no, because you can compare the datasets by eye and say questionable qualitative things about them, but you can't make definitively true quantitative statements about them.

Show me two plots of data points and I can show you two people who will in good faith argue over which one has the higher mean or higher median or higher variance. Because you often can't tell.

The entire point of something like a box plot is that it does part of the quantitative analysis for you. You can see where the median is. You can see the width of the quartiles.
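To make that concrete, here is a minimal sketch of the numbers a box plot reads off for you. The two samples are made up for illustration, and `box_summary` is a hypothetical helper, not anything from the article. It uses the standard library's `statistics.quantiles` with the "inclusive" method; other quartile conventions give slightly different values.

```python
import statistics

def box_summary(values):
    """Return the numbers a box plot draws: min, Q1, median, Q3, max."""
    # "inclusive" treats the sample as the full population; this is one
    # of several quartile conventions, chosen here only for illustration.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}

# Two made-up samples with the same range and median but different spread.
a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [1, 4, 4, 5, 5, 5, 6, 6, 9]

summary_a = box_summary(a)
summary_b = box_summary(b)

# Interquartile range: the width of the box.
iqr_a = summary_a["q3"] - summary_a["q1"]
iqr_b = summary_b["q3"] - summary_b["q1"]

print(summary_a, iqr_a)
print(summary_b, iqr_b)
```

Both samples share a median, minimum and maximum, yet the box for `b` is half as wide as the box for `a`, which is hard to judge by eye from a cloud of nine points.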


But there are much better ways to do this than box plots! Lots of CS papers use CDFs, which are great and very informative once you get used to them. You can have violin plots with all the box plot elements and more. Even if you want to restrict yourself to quartiles, the author's design concepts with narrow/wide bars make much more visual sense and still convey exactly the same information as box plots.
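For anyone who has not met CDF plots, a rough sketch of what gets plotted: sort the sample and pair each value with the fraction of observations at or below it. The latency numbers and the function names `ecdf` and `fraction_at_or_below` are assumptions for illustration only.

```python
def ecdf(values):
    """Return (x, F(x)) pairs for the empirical CDF of a sample.

    Tied values appear more than once with increasing F; when drawn as
    a step plot the last pair for each x gives the height of the step.
    """
    xs = sorted(values)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]

def fraction_at_or_below(values, threshold):
    """Read a single point off the empirical CDF."""
    return sum(v <= threshold for v in values) / len(values)

# Made-up request latencies in milliseconds, with two slow outliers.
latencies_ms = [12, 15, 11, 40, 13, 14, 95, 12, 16, 13]

points = ecdf(latencies_ms)
share_fast = fraction_at_or_below(latencies_ms, 16)

for x, f in points:
    print(f"{x:>3} ms  {f:.1f}")
print("share of requests at or below 16 ms:", share_fast)
```

Every quantile can be read directly off the curve, so it carries strictly more information than the five numbers in a box plot.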


It depends on the purpose.

CDF plots are great for plotting a single distribution, but contain way too much information if you want to plot 6 distributions next to each other for easy comparison.

Violin plots are interesting but also quite complicated, since you have to arbitrarily choose a kernel shape and this artificial smoothing can make it look like you have much more data than you really do.
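A small sketch of why the smoothing choice matters, assuming a Gaussian kernel and three arbitrary bandwidths. The six data points are made up, and `gaussian_kde` and `count_peaks` are hand-rolled illustrations, not any plotting library's API. The same six points produce a different number of bumps depending on the bandwidth alone.

```python
import math

def gaussian_kde(sample, bandwidth):
    """Return a density function built from Gaussian kernels."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density

def count_peaks(density, lo, hi, steps=1400):
    """Count strict local maxima of the density on an evenly spaced grid."""
    ys = [density(lo + (hi - lo) * i / steps) for i in range(steps + 1)]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1])

# Only six observations, in two loose groups.
sample = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]

peaks_narrow = count_peaks(gaussian_kde(sample, 0.2), -2, 12)
peaks_medium = count_peaks(gaussian_kde(sample, 1.0), -2, 12)
peaks_wide = count_peaks(gaussian_kde(sample, 4.0), -2, 12)

print(peaks_narrow, peaks_medium, peaks_wide)
```

Each of those curves would be drawn as a smooth, confident-looking violin, even though there are only six observations behind it.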

I really don't like the author's "alternative designs" because I think they're even more open to misinterpretation than box plots. It's hard to judge though, because the central problem is that the author is trying to represent a bimodal distribution, and shouldn't be using box plots or the 2 "alternative designs" for that.
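To illustrate the bimodal problem with made-up numbers (this is not the author's dataset): quartiles computed from two well-separated clusters put the median in a region where there is no data at all. The quartile convention is the standard library's "inclusive" method, picked arbitrarily.

```python
import statistics

# Hypothetical bimodal sample: one cluster near 1, another near 5.
bimodal = [1.0, 1.1, 0.9, 1.2, 0.8, 5.0, 5.1, 4.9, 5.2, 4.8]

q1, median, q3 = statistics.quantiles(bimodal, n=4, method="inclusive")

# How many observations lie within 1.0 of the reported median?
near_median = sum(abs(v - median) <= 1.0 for v in bimodal)

# Distance from the median to the closest actual observation.
gap = min(abs(v - median) for v in bimodal)

print("Q1, median, Q3:", q1, median, q3)
print("observations within 1.0 of the median:", near_median)
print("distance to nearest observation:", gap)
```

Any design that only encodes quartiles, whether a classic box or a narrow/wide bar, draws its center line in the empty gap between the two clusters.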



