Flow: Ambient Visualization

Sunday, June 27, 2010

Transparency and Visualization of City Data

Last year my city was considering closing one or two schools (HamptonRoads.com, WAVY.com). One of the justifications cited was decline in student enrollment. Though I can no longer find it anywhere online, one of the enrollment artifacts put forth was this image of a column chart from a (presumably) Excel spreadsheet.


To look at this picture (of data) one would assume there was a HUGE drop in enrollment, something in the vicinity of 50-60%.

But a closer look at the values reveals quite a different quantitative result. In fact, the actual decline from the peak in 1994-95 to the valley in 2008-09 appears to be 5,600 students. Keep in mind, I'm doing my best to derive the actual number based on the Y-axis scale on the picture. The data weren't provided. To make matters more difficult, the column chart is shown in 3D. This undermines the utility of the visualization by making it hard to determine an actual value on the chart.

The actual difference is 6,824 students, from the high in 1994-95 to the projected low in 2012-13.

6,824 students is 18.1% of the peak (6,824/37,707). While significant, 18.1% is a far cry from what appears to be a 50-60% decline.
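For anyone who wants to check the arithmetic, here is a minimal sketch using only the two figures quoted above. The function and variable names are mine, chosen for illustration.

```python
# Figures quoted in the post.
peak_enrollment = 37_707   # high, 1994-95
decline = 6_824            # peak to projected low, 2012-13


def percent_decline(drop, peak):
    """Return the drop expressed as a percentage of the peak value."""
    return 100.0 * drop / peak


print(f"{percent_decline(decline, peak_enrollment):.1f}%")  # prints 18.1%
```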

Fortunately, when I requested the data they were provided to me.

This is valuable for two reasons. First and foremost, I get the data, not an interpretation of the data. Second, and more the point of this post, I can show what this chart should look like and what else the data have to say.

Notice the slope of the decline and the differences between yearly observations. Neither is as exaggerated as in the first image.

The differences in view are the aspect ratio of the graph and the value of the Y-axis origin. In the first chart the aspect ratio of the picture is nearly square, or 1:1. The starting value of the Y-axis is 28000 (an arbitrary value not found anywhere in the data). In the second chart the Y-axis begins at zero (0) and the aspect ratio is closer to 3:1 (which is probably too wide).
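A rough sketch of why the Y-axis origin matters so much. It assumes an ordinary column chart, where each bar's drawn height is proportional to its value minus the axis baseline. The 5,600 figure is my own reading of the first chart (peak to the 2008-09 valley), not a number from the data, and the function name is just for illustration.

```python
def apparent_decline(peak, low, baseline=0):
    """Fraction by which the drawn bar shrinks when the Y-axis starts at `baseline`."""
    return (peak - low) / (peak - baseline)


peak = 37_707        # 1994-95 high
low = peak - 5_600   # 2008-09 valley, estimated from the chart (an assumption)

for baseline in (0, 28_000):
    print(f"Y-axis from {baseline}: bar shrinks {apparent_decline(peak, low, baseline):.0%}")
# Y-axis from 0: bar shrinks 15%
# Y-axis from 28000: bar shrinks 58%
```

Same data, same drop. Starting the axis at 28,000 is enough on its own to turn a decline of roughly 15% into one that looks like 50-60%.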

So then, which image and which set of chart variables is correct?

The answer depends on the basis for the query...or what you are trying to say. Actually the answer has nothing to do with "what you are trying to say" since you/we shouldn't be trying to say anything except to represent quantitative values in visual form. The right thing is to understand the nature of the analysis and the decision being made.

In this case, we can assume with confidence that the Y-axis has no business starting at any value other than zero. What is being claimed, via what is shown, is a long-term (since 1990) decline, not a year-to-year fluctuation. Over the 14-year span the decline is obviously not in the range of 50-60%, so the chart should not imply that it is. Otherwise, if year-to-year deltas are of interest, it probably does not make sense to show a 14-year span.

By the way, the recent year-over-year changes are -216, -200, +30, and -17 between the years 2006-2010. 200 students citywide is hardly justification for closing a school. (Our school has over 600 students.)

In the end, my city chose not to close a school. I think that was a wise choice. But I have to wonder what was going on in the minds of the decision makers. What analyses were they using? Did they, too, realize the truth in these numbers or was it some other (political) factor? Did we just get lucky?

When cities make claims based on data they should publish the data. When cities, or anyone, show you a picture of data you should question the picture and ~~demand~~ politely request a reference to the data. At minimum, know what you are seeing.

Monday, May 18, 2009

On Ambient Visualization

I want visualization to be less a part of a specific application that I go to and more of a natural extension of the computer itself, available from everywhere. I want visualization to be an ambient experience. When I encounter any table of data in any document container I'd like to be able to quickly view it as a column chart without starting up a chart-making or data processing application, without shuffling around through copy & paste. I just want to select rows and columns and pop a "window" with a chart in it in one easy step. If I can recognize with my eyes that a table contains place names or lat/lon pairs then a computer ought to be able to map it with minimal intervention on the part of the user. I should also be able to put my selected, obviously geospatial data on a virtual Earth model. With just a little more imagination I can see turning lists and tables into nodes and edges, viewable with graph layouts. Think Enso for visualization.

I might want to do more than just look at my chart, map/Earth model, and graph. I might want to start to interact with these views (assume independently for now). It starts to seem like I need an application to do that, but I'm not ready to jump the gun. This is still in the realm of a capability and not necessarily an application. Applications start to assume containers and domain-specific use cases. Most visualization techniques have standard, "off-the-shelf" things you can do with them given basic commands or input devices. Charts can be sorted and transformed into different layouts. One can pan, zoom, and rotate maps and terrain. The technique of "drill-down" and "roll-up", which can be applied to any visualization technique, is nothing more than navigation of linked data at multiple levels of detail and sometimes across multiple view contexts. At what point is a specialized application needed more than a capability?

We may be overly conditioned to assume the application model when we think of software as having utility. This is changing rapidly on the web. (It was always thus on the Unix command line, yes?) Visualization ought to change with it. Leave the application building up to subject matter experts with an application domain, not to software programmers. Ah but wait, lest too much be read into a passing editorial remark. Obviously software programmers play a key role here. The tendency among programmers who attempt to answer that call is to build an "application building framework." Again, the assumption is that subject-matter experts always need an app to make use of viz.

I wonder why visualization software shouldn't be a part of an operating system; a core capability for any application or purpose. There have been wonderful advances made through web browser extensions, but even here visualization is at best an afterthought applied to a mostly universal application. (I say "mostly" b/c there are no less than 3 different web browsers installed on my one operating system.) What happens when someone emails me some data in a flat file that I open in a text editor?

Instead, what is needed are document object models for visualization techniques and runtime software that can parse viz documents on the fly. The runtime is optimized for robust interaction and attribute manipulation of high level visual artifacts, not application-specific tasks. This runtime can be invoked from a background process or "embedded" in (contained in, called from) an application runtime. Devices having different display and user interfaces can choose how to represent what are otherwise well understood visual metaphors. Data can be more easily passed around and visualized simply by passing text documents describing interactive, dynamically updatable (or not) views. (This seems inherently more secure, too.)

Only this way is ambient visualization possible; something that is available everywhere on my computer, no matter what kind of computer/device/hardware platform I am using.
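To make the "viz document plus runtime" idea a little more concrete, here is a toy sketch. Everything in it is hypothetical: the document schema (type, title, labels, values), the function names, and the sample numbers are invented for illustration and do not describe any existing format or tool. The point is only that a view can travel as plain text and be rendered by whatever small runtime happens to be on the receiving device.

```python
import json

# A hypothetical "viz document": plain text that describes a view.
# The schema and the sample values are made up for this sketch.
VIZ_DOC = """
{
  "type": "column",
  "title": "Example",
  "labels": ["a", "b", "c"],
  "values": [3, 7, 5]
}
"""


def render_column(doc, width=20):
    """Render a column-chart document as text bars, scaled from zero."""
    top = max(doc["values"]) or 1  # avoid dividing by zero for all-zero data
    lines = [doc["title"]]
    for label, value in zip(doc["labels"], doc["values"]):
        bar = "#" * round(width * value / top)
        lines.append(f"{label:>4} | {bar} {value}")
    return "\n".join(lines)


# The runtime dispatches on the declared technique; only one is sketched here.
# A map or graph renderer would register under its own key.
RENDERERS = {"column": render_column}


def render(text):
    """Parse a viz document and hand it to the renderer for its type."""
    doc = json.loads(text)
    return RENDERERS[doc["type"]](doc)


print(render(VIZ_DOC))
```

A phone, a terminal, and a desktop could each supply their own renderer for the same document, which is the sense in which the device chooses how to represent the metaphor.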
