Friday, December 26, 2008

Bias in Visualization

When faced with a large amount of data, one of the first things I do is graph the data in some way to get a visual impression. I'll even graph simple linear calibration plots, since a quick glance gives a better impression of the data than looking at the slope, intercept, and correlation coefficient. In this case, the visualization shows more than the individual numbers.
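
To make the calibration point concrete, here is a minimal Python sketch. The concentration and response values are made up for illustration (they are not from any real instrument), and linear_fit is just a helper name I chose. The fit statistics look fine on their own, but the residuals, shown as a crude text plot, curve in a way that a graph would reveal at a glance.

```python
# Hypothetical calibration data (invented for illustration only):
# concentration vs. instrument response, flattening slightly at the high end.
conc = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
resp = [0.02, 1.05, 2.02, 2.90, 3.65, 4.20]


def linear_fit(x, y):
    """Return the ordinary least-squares slope, intercept, and Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


slope, intercept, r = linear_fit(conc, resp)

# Residual = measured response minus the response predicted by the fitted line.
residuals = [yi - (slope * xi + intercept) for xi, yi in zip(conc, resp)]

print(f"slope={slope:.3f}  intercept={intercept:.3f}  r={r:.4f}")

# Crude text "plot": one row per point, bar length proportional to |residual|.
for xi, res in zip(conc, residuals):
    sign = "-" if res < 0 else "+"
    bar = "#" * round(abs(res) * 50)
    print(f"{xi:4.1f} {sign} {bar}")
```

With these invented numbers the correlation coefficient is above 0.99, yet the residuals are negative at both ends and positive in the middle. The three summary numbers alone would not tell you the response is curving.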

However, the opposite can also happen, the adage "a picture is worth a thousand words" notwithstanding: a visualization can show less than the data. Recently the Flowing Data site held a visualization contest, and the results were interesting. Even though everyone started with the same data set, each visualization tended to emphasize something different about it. Taken as a whole, the visualizations presented a complete picture and highlighted aspects of the data one couldn't see from the numbers alone, but each individual visualization tended to focus on one thing at the expense of others.

This may be your intent when visualizing data, but watch out for your own bias. Always include the data used to create your visualization (or, when this is not practical, a reference to it) so that others with a different perspective can visualize the data their own way and perhaps glean something different than you did.

There is a fine line between illuminating data for your audience and prejudicing them.

