3d Bar Chart.jpg
[infosthetics@strataconf 2011 by guest blogger Collin Sullivan]
Tuesday's afternoon tutorial on "Communicating Data Clearly" was a stripped-down, nuts-and-bolts, dos-and-do-nots presentation with Naomi Robbins, a one-woman design consulting firm. After only a few minutes it was abundantly clear: Robbins takes no guff from graphs (Zach Gemignani might call her a connoisseur).

[Disclaimer: Robbins explained at the outset that she is less interested in conveying beauty and more interested in conveying information. The two are certainly not mutually exclusive but the focus of her talk was on "information" rather than "aesthetics," so prepare to see some ugly graphs, as above.]

Despite this being a conference largely focused on Big Data, Robbins tackled the small stuff. Right off the bat she explained that "you can't possibly communicate big data accurately if you can't handle small data first." Indeed, she convinced us all that there are many, many ways to mishandle small-data representation.

After discussing some of the more common graph forms--pie charts, bar charts, 3-D bar charts, 3-D bar charts on a 3-dimensional plane, stacked and grouped bar charts, line charts, bubble plots, scatterplots and ribbon charts (a crowd favorite, and something we will return to later)--Robbins laid out some basic rules. They went something like this:


  1. Never, ever use pie charts. The only thing worse than a pie chart is several of them.

  2. OK, a 3-D pie chart is worse than a traditional pie chart. Really never use a 3-D pie chart.

  3. Bar charts are not inherently terrible but are very often done terribly. Use with caution and care.

  4. 3-D bar charts are inherently terrible. Never use a 3-D bar chart. (See: 3-D pie chart.)

  5. Dot plots are the crowning pinnacle of human achievement. I'm exaggerating, but Robbins emphasized the simplicity, conciseness and clarity in a dot plot as compared to other previously mentioned graph forms.

  6. Do not fear tables; sometimes they can convey information far more efficiently than any type of chart.

  7. Seriously, guys. I mean it about the pie charts.
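
To make rule 5 a little more concrete, here is a minimal sketch of a dot plot rendered as plain text in Python. The function name `text_dot_plot` and the sample figures are invented for illustration; they do not come from Robbins' slides. Every value sits on one common scale that starts at zero, and the rows are sorted so the ranking reads top to bottom.

```python
def text_dot_plot(data, width=40):
    """Return a list of text lines forming a simple dot plot.

    data  -- mapping of label -> non-negative number (hypothetical values)
    width -- number of character columns spanning 0..max value
    """
    if not data:
        return []
    top = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    # Sort from largest to smallest so the ranking is visible at a glance.
    for label, value in sorted(data.items(), key=lambda kv: kv[1], reverse=True):
        # Position of the dot along a common scale that starts at zero.
        pos = round(value / top * width) if top else 0
        lines.append(f"{label.rjust(label_width)} |{'.' * pos}o {value}")
    return lines


# Made-up sample data, for illustration only.
sample = {"East": 30, "West": 45, "North": 15, "South": 60}
for line in text_dot_plot(sample, width=20):
    print(line)
```

Each row carries one label, one dot and one number, which is roughly the economy Robbins was praising: position along a shared scale does the work that angles and areas do poorly in a pie chart.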

The problem with 3-D charts is a fundamental one: they do not easily convey the information they are supposed to convey. They sacrifice function for form, skewing the image in a way that misrepresents the data. And, Robbins noted, labels are no excuse; if the labels do not agree with the image, the reader is left unsure which one is correct.

Then there was this fiasco:

Strata Ribbon Chart.jpg
Ah, the ribbon chart. Where does one begin? Obscured information, confusing organization, angles that make comparison difficult. As an information consumer this chart hardly tells me anything. As Robbins explained, Stephen Few found this chart and broke it down, extracted the data (as best he could) and produced the following chart as an alternative:

Strata Ribbon-Bar Chart.jpg
Simple. Clean. Elegant. Easier to read. And not just because information designers say so.

Perhaps the most interesting and universally useful portion of Robbins' talk was about how humans read charts and graphs. She discussed and demonstrated the Gestalt Laws (proximity, similarity, connectedness, continuity, symmetry, closure, size and enclosure), Stevens' Law (about perceived scale) and the order of elementary tasks--that is, the types of visual comparison the human brain can make, ranked from most to least accurate. Ideally a graph will rely on the higher-ranked tasks, as that makes it easier for a person to interpret.
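
As a rough illustration of Stevens' Law, the sketch below assumes its usual power-law form, in which perceived magnitude is proportional to the stimulus raised to an exponent. The exponents used (about 1.0 for length, about 0.7 for area) are commonly cited approximations assumed here for the example; they were not quoted in the talk, and `perceived_ratio` is a hypothetical helper name.

```python
def perceived_ratio(actual_ratio, exponent):
    """Perceived ratio between two stimuli under Stevens' power law.

    With perceived magnitude = k * stimulus ** exponent, the constant k
    cancels when two stimuli are compared, leaving actual_ratio ** exponent.
    """
    return actual_ratio ** exponent


# Exponents below are commonly cited approximations, assumed for
# illustration; they are not figures from the talk.
LENGTH_EXPONENT = 1.0
AREA_EXPONENT = 0.7

# One value is four times another.
print(perceived_ratio(4, LENGTH_EXPONENT))  # difference encoded as length
print(perceived_ratio(4, AREA_EXPONENT))    # difference encoded as area
```

Under these assumed exponents, a fourfold difference drawn as length is read as fourfold, while the same difference drawn as area is read as noticeably less. That is one way to see why comparisons by length rank above comparisons by area.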

Take the following image as a demonstration of the Gestalt laws:

Gestalt Demo.jpg
Looking at these shapes, consider how your brain groups them together. Connectedness is stronger than proximity (top left), stronger than similarity in shade or fill (top right), and stronger than size (bottom left). But it is not stronger than enclosure (bottom right).

Robbins rounded out her talk with lots of advice and suggestions for making effective graphs. Rather than list them all out I'll group them into some general takeaways (for the moment I will do so without much supporting graphical evidence, but will hope to fix that as the images become available soon):

1. Make the data distinct from the graph.
Take a look at the following image that Robbins found in an old economics textbook:

Robbins Graph A.jpg
Leaving color choice aside (let's just assume those are the only two colors available to the designer), the first thing a person sees here is the grid. The data representation is not prominent; our eyes are first drawn to the medium, not the substance. Using only the same two colors, Robbins made a simple edit in Adobe Photoshop. Look at the result:

Robbins Graph B.jpg
This is much clearer. The information jumps off the graph, which is still there for measurement purposes but is now in the background, as it should be. Robbins advocated for graying out gridlines as a general practice for this very reason.

2. Make the data distinct from other data.
By varying the physical characteristics of data points you can distinguish variables clearly. Altering colors or shapes of dot plots is one way. This reduces confusion and allows the reader to more quickly identify distinct information. Bear in mind that contrast is more important than hue.

3. Keep data organized and clear.
Do not allow data to become cluttered and unreadable. And keep legends and other keys distinct. Robbins showed an image with a legend inside the scale-line rectangle, and it was never made clear that it was actually a legend--it simply appeared as two labeled data points. Keep that sort of stuff outside the graph itself.

4. Use common sense.
I'm editorializing a bit here--Robbins never mentioned common sense, per se--but the theme was strong. Make sure the number of tick marks you use is appropriate, keep your measurement intervals consistent, use the proper number of decimal places, and so on.

5. Make sure your graph makes sense.
The message you are trying to convey should be very easy to find, and the data should be drawn to scale. The reader/user will typically be judging things on a relative basis so scale matters.

6. Make sure your data is accessible.
Use colors that will project well, and ensure that visual clarity is maintained when the image is reduced or reproduced. Also remember that not all people see color the same way. Robbins mentioned a fantastic website, Vischeck.com, that re-renders a website of the user's choosing to simulate what a person with color vision deficiencies would see.
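
Takeaways 2 and 6 both come down to contrast rather than hue. One way to put a number on that is the contrast ratio defined in the WCAG accessibility guidelines, computed from the relative luminance of two sRGB colors. That metric is an assumption made here for illustration, not something Robbins cited, and the color pairs are arbitrary examples.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as (r, g, b) in 0..255."""
    def linearize(channel):
        c = channel / 255
        # Undo the sRGB gamma curve (piecewise definition from WCAG 2.x).
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


# Color pairs chosen for illustration only.
red, green = (255, 0, 0), (0, 128, 0)            # different hues
navy, light_blue = (0, 0, 128), (173, 216, 230)  # same hue family

print(round(contrast_ratio(red, green), 2))
print(round(contrast_ratio(navy, light_blue), 2))
```

The red and green pair differs sharply in hue yet scores much lower than the two blues, which is the sort of pairing that can disappear on a projector, in a grayscale photocopy, or for a reader with a color vision deficiency.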

Robbins' talk was less about form and more about function, but then that was clear from the title of her talk. She assured the audience that she doesn't eschew beauty, she just focuses on clarity of message more than anything else. She bemoaned what she considered an unfortunate clustering of "data art" and "statistical graphics" under the umbrella term "data visualization." When creating the former, perhaps beauty is the goal, but in creating the latter, function must be primary.

We closed out with a rousing discussion--nearly a heated debate--about whether graphs should have a zero-value on an axis. To me the answer depends on context: use whichever method best reflects the truth. If the zero-value makes the data difficult to read and only fills the graph with unused space, it is serving no practical purpose and should be left out.
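
A small calculation shows what is at stake in that debate, at least for bars, where the reader compares lengths. The sketch below uses a hypothetical helper, `drawn_ratio`, and made-up values to compare how two bars look when the axis starts at zero and when it is truncated.

```python
def drawn_ratio(larger, smaller, baseline=0.0):
    """Ratio of the drawn bar lengths for two values.

    Bars are drawn from `baseline` up to each value, so the reader is
    really comparing (larger - baseline) with (smaller - baseline).
    """
    if smaller <= baseline:
        raise ValueError("baseline must be below the smaller value")
    return (larger - baseline) / (smaller - baseline)


# Hypothetical values, for illustration only.
larger, smaller = 100, 95

print(drawn_ratio(larger, smaller))               # axis starts at zero
print(drawn_ratio(larger, smaller, baseline=90))  # axis truncated at 90
```

With the axis truncated at 90, a difference of about five percent is drawn as one bar twice the length of the other. Forms read by position rather than length, such as dot plots, suffer less from this distortion, which may be part of why the room could not settle on a single rule.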

Robbins' talk was a very useful introduction to the basics of data communication, and I have already seen many of her rules put to use in other presentations here. It was well-scheduled at the beginning of the conference in that it put participants in the proper mindset to digest the more beautiful aspects of data representation.

And I probably would not have noticed before, but something that has been conspicuously absent from all other presentations thus far? Pie charts.

This post was written by Collin Sullivan. He is a research analyst for The Sentinel Project for Genocide Prevention, where data collection, analysis and visualization are being used to design an Early Warning System (EWS) to detect and prevent genocide. Collin lives in San Francisco. You can reach him at collin [at] thesentinelproject [dot] org and follow him on Twitter at @inciteinsight.

4 COMMENTS

Very interesting stuff, thanks for taking it all down.

At the risk of sounding like a dataviz n00b, is "Never use Pie Charts" commonly accepted opinion among the experts out there?

I mean I'm on board with the terrible 3d pie charts, but even regular ones too?!

Sat 05 Feb 2011 at 5:08 AM
Andy

Hi Andy,

There's a good explanation I found from Zach Gemignani of Juice Analytics about the trouble with pie charts. It's here if you like: Link

I'm not sure how accepted the opinion is, but it didn't seem particularly controversial at the conference. I think the main trouble is that it's much easier for the human mind to compare things by length than to compare by area, and pie charts provide the latter. You'll get the same kind of problem with bubble charts.

And in looking for that link, I was reminded that there were a couple of pie charts I really liked, here and

Sat 05 Feb 2011 at 8:23 AM

Yes, exactly what is the rationale for not using pie charts? She is for clarity and not beauty, but I had the impression pie charts were quite clear, if ugly.

Sat 05 Feb 2011 at 7:00 PM
John

Careful experimentation by Cleveland and McGill shows that people don't judge angles very accurately. I showed an example that demonstrated that characteristics of data that show up clearly in a dot plot are hidden in pie charts. You asked what other experts on graphs say about pie charts. I was quoting page 178 of The Visual Display of Quantitative Information by Edward Tufte when I said that "the only worse design than a pie chart is several of them" and "pie charts should never be used." My examples were based on work by William Cleveland, who demonstrates that pie charts have severe perceptual problems and do not convey information reliably.

Sun 06 Feb 2011 at 3:08 AM