A few days ago, the Sunlight Labs blog put a post up, titled "Should data.gov Visualize? Probably Not" [sunlightlabs.com]. To provoke you into reading on, I have borrowed their title for this post as well.

Anyway, the post somehow caught my attention.

The first reason against providing government-backed data visualizations is as follows: "We didn't just leave it out because we didn't think of it. We left it out on purpose, along with lots of other feature ideas and concepts. We think that providing a centralized repository of government data in modern developer-friendly formats is a hard enough problem for government to solve. ... If the goal is to get the data in front of the most eyeballs possible, government should be providing the data in usable formats and focusing primarily on that."

The second even makes a connection between data visualization and "bells and whistles": "The second reason why government should avoid spending time on adding visualizations or other bells and whistles to Data.gov is because it actually hurts transparency. Visualizations, like any other form of news product, can be editorial-- even inadvertently. If government puts more of a priority on producing great visualizations and user experience than on providing quality accurate data with a great feedback loop, then it runs a pretty good chance of not adhering to the goal of being actually transparent. "

However, Sunlight Labs, which among other things is aiming to encourage discussion about the design of the government transparency website data.gov (see some preliminary screenshots here), seems to have quickly backed down from its initial viewpoint, at least in a post the following day. The post "Are('R') You a visualizer?" now invites readers to provide constructive feedback: "Should we hire somebody that does good data visualizations full time? Should we have a contest for best data visualization? What kinds of data visualizations would be successful in our field?" Well, maybe Tufte can be head of the US Government Visualization Office?

Seriously though, what do you think? Are they correct in stressing that data.gov should not focus on data visualization, but rather providing clean reliable data to citizens?

This seems to be a very interesting question. Data.gov goes after the ideal of offering free, transparent governmental information, but what does "free" really mean, if it is not made readily accessible for a lay person? On one side, creating a data repository/API means that data.gov will specifically target the press, advocacy groups, political lobbyists, think tanks and maybe the occasional hobby software developer and data junkie with some free time. Or, in other words, organizations and businesses that are motivated by some sort of agenda and "editorial-- even inadvertently" visualizations. So, if not data.gov themselves, who will spend time, money and effort to create visualizations that empower the individual non-expert citizen to analyze personally relevant data?

On the other side, there is the question of what government-backed data visualizations will actually be able to accomplish. Put differently, have widely acclaimed projects like Gapminder or WorldMapper truly reached, engaged and involved a large, lay audience? In terms of the population of a whole country, probably not, but one could note that they did not have the backing, nor the relevancy, provided by the US Government.

Via VizWorld.

See also:
. Government Transparency Website data.gov Will Go Live in May 2009
. data.gov: How To Open Up Government Data

16 COMMENTS

I strongly agree with their initial points. In fact, that they made those points tells me that they are aimed very well and making good use of our taxes.

That they may get sucked in to doing graphs, etc. is a shame. Understandable, yes. Good people running such a site must be truly driven to want the data "out there". And, I for one, am happy to employ (through taxes) good people like that. Too, their job must include verifying that the data can actually be used! What better way to do that than to put out pretty stuff from some bits of the data? But, it would be a shame if doing that got them off track. They have, as they say, a huge job just getting the data in reasonable formats, etc. etc.

Mon 27 Apr 2009 at 9:15 PM
Felix

I strongly agree with their perspective as well. It's hard enough to get the data, forget about presenting it - someone's going to be offended whatever's presented. Make the data available and the mashups will come. The current state of the web - many pretty pictures chasing too little data - seems to be that analytic resources are often available and the data are what're scarce.

Mon 27 Apr 2009 at 9:42 PM
Martin

There was no contradiction about visualization- the point was that the *government* shouldn't prioritize visualizations over data, which doesn't preclude Sunlight Labs (*in the private sector*) hiring a visualization developer.

You also have to take their recommendation in the context in which data.gov is being imagined. By and large, government agencies focus on visualization or service and not data. The *point* of data.gov was, you know, data --- making sure the government takes data seriously in terms of transparency etc. We don't need a data.gov to do visualizations because agencies already do that to the extent they see it as part of their mission (whereas they generally don't see data as part of their mission), and agencies are in the best position to do visualizations anyway.

Mon 27 Apr 2009 at 10:27 PM

After looking at the Sunlight Labs website, I'm not convinced that they are the ones designing the Data.gov website. I think they only posted a design that they would like to see, but they're not actually involved in the design process.

So as Josh Tauberer said above, there is no contradiction. It's a difference between government vs public sector. This post appears to be a little misinformed.

Regarding the question of government visualizing data, I think they would be in a good position to provide unbiased visualization. However, the first focus really should be on the data. This will enable a lot of people to create visualizations for topics they care about. After the data part is smoothed out, then they can think of hiring a visualization agency or doing some community-based visualizing to provide people with unbiased ways of consuming that data. I think visualization will need to be an essential part of this website if the aim is to make information "available".

Mon 27 Apr 2009 at 11:05 PM

That should say "government vs private sector" above.

Mon 27 Apr 2009 at 11:07 PM
Natasha Lloyd

Natasha, you seem to be right about the post wrongly giving the impression that Sunlight Labs is redesigning data.gov (versus trying to get involved in the discussion about its design). I have changed the post accordingly.


Others: there is a difference between prioritizing clean data access over providing objective, trustworthy visualizations, and deliberately giving up visualization altogether as a government responsibility. Within the momentum building around the data.gov initiative, it seems like a missed opportunity?

Mon 27 Apr 2009 at 11:20 PM

I assert there's absolutely no such thing as "objective, trustworthy visualizations". The moment you start graphing and charting, you're making a point. There's just no way around it. Even the scale of the axes contributes to the impression you want to leave the viewer with.

If this thing is going to be truly impartial, it's got to present the raw data itself, and make ZERO visualizations from it.

Mon 27 Apr 2009 at 11:46 PM

I absolutely agree with Dan and Martin, and I think this speaks to the purpose of the site - it's not to provide graphs, charts and pretty pictures; it needs to be an open information source, and in doing so makes itself far more susceptible to scrutiny. This is what's important from a government site.

Let others utilise this information graphically, and let this site become an auditable, reliable source.

Tue 28 Apr 2009 at 1:08 AM
Quey Joh

I agree with most of the others here -- data visualisation is about getting a message across; you tell a story with the data. Quite often a pointed story, one that uses and highlights data to its own ends.

The government's role is not to tell a story; that's called propaganda. They need to remain entirely impartial on this front, and the best way they can do that is by putting no colour, no story on the data they provide. It's just too easy for them to get burned by claims of "manipulation".

You ask "so, if not data.gov themselves, who will spend time, money and effort to create visualizations that empower the individual non-expert citizen to analyze personally relevant data?" Well, your site is full of them, and that's what I come to see. The most powerful tales of data visualisation I have seen are from empowered individuals who feel strongly enough to tell their own story.

Tue 28 Apr 2009 at 1:13 AM

Hi,
I work on data publishing at the OECD and we get into that discussion a lot. There are people who feel that gov't agencies or any official data provider should only make raw data public. Hans Rosling, for instance, strongly believes this.

I strongly disagree.

I don't agree either that visualizations "can be" editorial. Visualizations ARE editorial. There is no such thing as a neutral way to present data, even in tabular format. But is that a problem?

What I believe is our job as official data providers is to add a layer onto data so that it can be found, understood and used properly. Visualization techniques are part of that "layer". For instance, in the USA public data world, there are lots of comparable data at the county level. To represent them on a map allows the user to make sense of it in one glimpse, which is not possible if they are only available as a bulk download.

Furthermore, a visualization done by the agency that gathers the data offers a guarantee to the viewer. Any statistician knows how easy it is to mislead with biased graphs, and how difficult it can be to find the fallacy in such representations.

Finally, the people who collect the data are among those who know the data best. They can propose visualizations which non-experts could not think of.

So while I'm not saying that the private sector shouldn't try to visualize government data (only good things can come from that), I don't think that the government's role should be restricted to merely putting data online.

(and at this point I have to say that my opinions are not necessarily those of my organization)

Tue 28 Apr 2009 at 2:10 AM

I think a key distinction has to be made between providing specific visualizations of data versus providing the tools to visualize data, which might be part of the broader debate between visualization designers and visualization tool builders (disclaimer: I fall in the latter camp).

While specific visualizations can have an editorial slant, visualization tools do not. This is because, by their nature, visualization tools are designed for a type of data, before the specifics of the actual data are known.

That is not to say different types of visualizations do not create biases in interpretation, but inaccessibility of raw data for certain types of analysis creates its own biases. A structural bias is different than an editorial based on content.

At a minimum, Data.gov should provide visualization tools as an additional way for users to access and analyze the data. Whether it also provides specific visualizations, is a different question that I can see argued both ways.

Tue 28 Apr 2009 at 3:20 AM

Jerome's point about "knowing the data" is important. For anyone to make good use of data presented in tables, there needs to be very good documentation about exactly what this data means, how it was gathered, what the possible values are, and what the assumptions or gaps were -- was this only collected from certain areas? Do the gaps mean something? When looking at a single-select field, are we seeing all possible values? For the values that were never used, do we know if that's because they weren't on the paper form sent out, or because they truly never happened? (If you never see an event occur in the logs, do you know you're logging properly?)

I agree that providing data visualizations is "editorial"; providing access to raw data is too, though less so. The priority with which certain information is made available can indicate something about internal priorities; the fact that some data's collected, while other potentially interesting aspects are not, also says something about internal priorities.

Tue 28 Apr 2009 at 3:26 AM
Philip Williams

I believe Sunlight Labs can and should do both (if possible)...provide access to clean accurate data (highest priority) and provide the ability to create visualizations. (i.e. "tools" such as Trevor suggested.)
There have been many good arguments presented concerning the pros & cons of visualizations but here's what I think is the bottom line...once you get clean accurate data, you need to do something with it for it to become useful.
Sure, charts/graphs/metrics (both public mash-ups & the ones the government will produce) are manipulative but, if you think big-picture, remember that the majority of people (the tax-paying public) will need some sort of help to make sense of the raw data. (You think Obama will be downloading data, crunching numbers and analyzing the results? ...someone will be handing him a PowerPoint with charts.)
I guess the idea of providing visuals depends on the purpose of data.gov. If it is to help inform the electorate, some people will need assistance. Mountains of numbers with no reference points will turn away a large percentage of the public. If it is to provide data and data only, they can do that and let the chips fall where they may in terms of who will make the numbers sing & dance. I think there is an opportunity here to do both.

Tue 28 Apr 2009 at 10:49 AM
Steve Croteau

I agree with most of the sentiment here. And to add to Steve's points, the VAST and OVERWHELMING majority of American citizens are not like the folks commenting here. Those people need quick, easy, and relevant ways to interpret the information.

So perhaps visualizations shouldn't be created by gov, but maybe data.gov could showcase visualizations made by the infoviz masses. And hopefully there is enough diversity in the field that the data won't be slanted.

I see this much like any other media. The information goes out in a press conference, but the individual media outlets have their own interpretation.

I also couldn't agree more with Jerome's comments about the knowledge behind the data. That's crucial to getting the meaning out of the data. Not sure how that should be handled by data.gov.

Tue 28 Apr 2009 at 11:59 AM

I agree that data.gov should emphasize the providing of good, documented data way above providing visualizations, and that visualizations are editorial.

Data.gov could, as well, provide links to or a directory of data visualizations created by Federal agencies. So, if www.whitehouse.gov/omb/ has various charts illustrating the breakdown of the Federal budget, I'd be able to find it via data.gov.

Wed 29 Apr 2009 at 12:15 AM

I would also suggest that OECD eXplorer seems to prove it is possible.

On the other side, I guess that private initiatives such as Many Eyes and Google Public Data might rush to integrate the data from data.gov. However, are there many private initiatives rushing to do this with currently available data repositories, let's say UNData or World Bank Data?

Wed 29 Apr 2009 at 1:58 PM