Thursday, June 24, 2010

Mashing up Hampton Roads School System Data

I'm tempted to not provide a narrative for this. The pictures really tell the story.

I was reading an article from my local "paper," The Virginian-Pilot, online. When I finished the article I noticed the "Quick Links" just below. To my pleasant surprise, the first category of links was "Databases." Truth be told, I'm not that surprised to find data at The Pilot, but what unfolded when I followed that link was far from the norm when it comes to online (local) journalism and data.

Pilot_Databases_QuickLink


I clicked the link to "School Violence" and found the following page. If you look closely you will see there is a form:


Pilot_Databases_Form


Clicking "Search" took me to this next page. I was not surprised to see that the results were available in HTML only:


Pilot_Databases_Example

Whenever I come across data like this, the first thing I want to know is whether it can actually be used as data. To be usable, the contents of this HTML table need to be, at minimum, copy-and-paste-able into a spreadsheet.
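Copy-and-paste is the low bar; the same test can be run in code. What follows is only a minimal sketch, not what I did for this post: it uses Python's standard-library HTML parser to pull the cells out of a table and write them as CSV, which any spreadsheet can import. The sample markup, the column names, and the names `TableExtractor` and `html_table_to_csv` are all made up for illustration; the real results page may be structured differently.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical markup standing in for the search results page.
SAMPLE_HTML = """
<table>
  <tr><th>Division</th><th>School</th><th>Violation</th><th>Count</th></tr>
  <tr><td>Example Division A</td><td>Example High</td><td>Fighting</td><td>12</td></tr>
  <tr><td>Example Division B</td><td>Sample Middle</td><td>Threats, verbal</td><td>3</td></tr>
</table>
"""

if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE_HTML))
```

If a page's table survives this kind of extraction with its rows and columns intact, it will almost certainly survive a paste into a spreadsheet too.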

Lo and behold... In about 20 minutes I had pasted all of the contents of the database into a Google spreadsheet:

Pilot_Databases_Google_Spreadsheet

From there I was able to create a few transformations of the data and about a half-dozen meaningful charts like this:

number_of_violations_by_type_and_school_division
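The transformation behind a chart like that one is essentially a pivot: total the violation counts for each type within each school division. Here is a rough sketch of that step in Python, assuming each row has already been reduced to a (division, violation type, count) triple. The rows below are invented placeholders, not figures from The Pilot's database, and the function names are hypothetical.

```python
from collections import defaultdict

# Invented placeholder rows: (school division, violation type, count).
ROWS = [
    ("Division A", "Fighting", 12),
    ("Division A", "Fighting", 5),
    ("Division A", "Threats", 3),
    ("Division B", "Fighting", 7),
    ("Division B", "Weapons", 1),
]


def violations_by_type_and_division(rows):
    """Sum counts into a nested dict: {violation type: {division: total}}."""
    totals = defaultdict(lambda: defaultdict(int))
    for division, violation, count in rows:
        totals[violation][division] += count
    return {violation: dict(per_division) for violation, per_division in totals.items()}


def pivot_table(totals):
    """Lay the totals out as rows: one per violation type, one column per division."""
    divisions = sorted({d for per_division in totals.values() for d in per_division})
    header = ["Violation"] + divisions
    body = [
        [violation] + [totals[violation].get(d, 0) for d in divisions]
        for violation in sorted(totals)
    ]
    return [header] + body


if __name__ == "__main__":
    for row in pivot_table(violations_by_type_and_division(ROWS)):
        print(row)
```

A spreadsheet's pivot table feature does the same job; the point is only that once the rows exist as data, the chart-ready summary is a few lines away.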

What if more news organizations published databases? What if journalists had a habit of digging into those databases with these simple tools? Would we create better news products? Would we become more accurately and insightfully informed as citizens? Would our governments, businesses, and civic organizations make wiser strategy and better decisions? And if we think that the answer to any of these questions is yes, how do we sustain the "model"?
