As a scientist, I want the figures in my reports and presentations to look good. This is not just about aesthetics or achieving “polish”, though those certainly are considerations, but is primarily a matter of presenting information effectively. That the “data speak for themselves” is a well-worn truism, but it’s entirely up to the author to make sure that the story the data are telling can actually be heard.
So the subject of statistical graphics has been holding my attention for some time now. Statistics can be dauntingly complicated, especially when large quantities of data are involved. How can we best summarize the data’s message graphically without drowning viewers in a sea of detail or, worse, misleading them?
Some of the best books on this subject are by Edward Tufte, who has been hugely influential in promoting a minimalist, “less-is-more” approach to designing data graphics. He coined the term “data-ink ratio”, among other measures of a graphic’s effectiveness. Another writer with a long-held interest in good graphics is the statistician Howard Wainer, whose 1997 book, Visual Revelations, I recently checked out from the library. The book is based mostly on previously written columns and articles, one of which (“How to display data badly”, American Statistician vol. 38 (2): 137, 1984) I strongly recommend.
One of his observations in particular caught my attention, because it’s something that I’ve noticed before. It’s best illustrated by a diagram.
Have a look at these two curves (black lines). They are spaced some distance apart from each other. Would you say that…
- The vertical distance between the lines is shrinking (from left to right)?
- The vertical distance between them stays the same?
- The vertical distance between them grows larger?
Put in this way, perhaps you might know where this is leading. A careless glance might lead you to say that the lines are getting closer as we move from left to right. But notice that I say “vertical distance”. If this were a data graphic in some publication, where the axes mean something, then that is very likely the kind of detail that the graphic was meant to call attention to in the first place. Suppose the horizontal axis were “time”, and the vertical axis “sales”. The two lines represent sales by two different branches of a store, A and B. Is branch B catching up with branch A, or always lagging behind by the same amount?
The right thing to do would be to take a ruler and set square to the picture and measure the distances. That’s what I’ve indicated by the vertical blue lines. But if you’re lazy, or viewing it on the computer screen, then you’re not going to go looking for your set square. If you just eyeball the graphic, chances are that your eye is going to do what the red line does. Instead of considering the vertical spacing between the lines, you just go for the shortest line between the two.
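To put rough numbers on this, here is a minimal sketch in Python. It is my own illustration, not something taken from Wainer’s book: the exponential shape of the curves, the vertical offset of 1, and the function names are all assumptions chosen for the example. The two curves are exactly one unit apart vertically everywhere, yet the shortest straight-line distance from the upper curve to the lower one shrinks as the curves get steeper.

```python
import math

GAP = 1.0  # assumed constant vertical offset between the two curves


def upper(x):
    """Hypothetical upper curve; it steepens from left to right."""
    return math.exp(x)


def lower(x):
    """Hypothetical lower curve: the upper curve shifted down by GAP."""
    return upper(x) - GAP


def vertical_gap(x):
    """Distance measured straight down at x (the 'blue line' reading)."""
    return upper(x) - lower(x)


def shortest_gap(x0, lo=-2.0, hi=5.0, steps=70001):
    """Brute-force shortest straight-line distance from the point
    (x0, upper(x0)) to the lower curve (the 'red line' reading)."""
    y0 = upper(x0)
    best = float("inf")
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        best = min(best, math.hypot(x - x0, lower(x) - y0))
    return best


xs = [0.0, 1.0, 2.0, 3.0]
vertical = [vertical_gap(x) for x in xs]
shortest = [shortest_gap(x) for x in xs]

for x, v, s in zip(xs, vertical, shortest):
    print(f"x={x:.1f}  vertical={v:.3f}  shortest={s:.3f}")
```

The vertical column stays at 1 throughout, while the shortest-distance column falls steadily from left to right. That second column is roughly what the eye reports when it takes the quickest path between the curves.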
Some mental effort is required to remind yourself that you’re supposed to look out for the vertical spacing (blue bars), and not to let your mental ruler go skew (red bar). To me, it’s surprising that this is so. It suggests that there is a strong mental tendency to think of the graph as a map (imagine the two lines as describing a river’s two banks), rather than as a plot with fixed axes.
Wainer and Tufte both note in their respective books how long it took for humanity to come up with the idea of the data graph. Worth reading is Michael Friendly’s profusely illustrated review, in which he claims that there was a “Golden Age” of innovation in data visualization in the late 19th century. The map (i.e. cartography) of course came much earlier than that. To represent land and water by marks on paper is abstraction enough (fantasists will remember Borges’s story of the mappa mundi); geometry goes one step further and severs the connection to the ground entirely. But to plot numbers on a chart is to divorce them from tangible stuff twice over, and it is no surprise then that our eyes should seize the earliest opportunity to fall back on a more intuitively appealing (and mentally less strenuous) interpretation, even when we try to tell ourselves otherwise.