Sunday, December 28, 2008

Q: How Does Web 2.0 Make Money? A: Government.

A lot of folks are wondering how Twitter will monetize.  Will they sell premium services to businesses that want to make Twitter part of a communications strategy?  What about Flickr, YouTube, and Facebook?  Are the ads working?  Regardless, if this is Web 2.0 then why are we still talking about subscriptions and eyeballs on pages?! 

I want to suggest another strategy:  Sell to government.  

I don't mean in the general sense.  I mean, specifically, that Twitter, YouTube, Facebook, and a great many others should set up for-government operations.  I wrote to the Delicious team years ago asking when they were coming out with a solution that I could bring to government, i.e., in-house.  I never got an answer.  Regardless, figuring out how to get these companies oriented toward government is not straightforward.  Most Web 2.0 companies literally could not be further outside The Beltway.  I suspect they don't have much in the way of strategy for state, tribal, or local government either.  Google does a good job selling into government with its enterprise appliance model, and with more than just search.  But, of course, Google is a massive company.

Culture has a lot to do with things.  The Pentagon is not a T-shirt and flip flops kind of environment.  "The Bigs," i.e., large-cap companies that provide most of the contracting labor, are not at all oriented to innovate in the Web 2.0 technology space.  You don't see Macs anywhere.  You do see MS Office everywhere.  I'm not entirely sure what conclusions can be drawn from these observations but I am sure the observations are significant.

I suppose the best example of Web 2.0 penetration into the government space is tele-presence.  Adobe Breeze is ubiquitous on Defense Knowledge Online (DKO).  Just about anyone with a DKO government account can create or attend a meeting.  But tele-presence probably isn't the first thing that comes to people's minds when asked to name a Web 2.0 technology and I'm not sure Adobe is the best example of a Web 2.0 company.

Yes, there is something different about Twitter, Flickr, YouTube, Facebook, Delicious, et al.  I happen to think that something different - whatever it is - translates into unrealized opportunity for both buyers and sellers in the government space.  I choose to focus on these technologies specifically and there are others that I include.  ProgrammableWeb has a solution for the registry problem, for example.  I don't represent any of these companies, by the way.  I call them out as (mostly) well-known examples of capabilities the government needs.  I don't really care if Flickr, Picasa, or PhotoBucket is the image repository of choice.  Vimeo and YouTube can and should compete for the video infrastructure.  

The point is that government needs platform solutions.

Simple, content-based platform solutions are the most obvious plays for Web 2.0 in government:  images, video, audio.  The federal government processes a staggering amount of this stuff.  The DoD may be the first to get into the mix with TroopTube.

There also are plenty of outfits that would use social tagging tools if only they could bring them in house.  By bringing them in house, what I mean and recommend is providing enterprise services solutions.  It's nice to have applications that provide tagging, but applications are seldom the best enterprise solutions and are hardly social (except, perhaps, in MOSS).  Tagging is a domain, a Web 2.0 partition, if you will, unto itself.  It is a simple-enough-but-not-too-simple utility that scales and it can be integrated with just about any other application, regardless of whether or not the application was designed with tagging in mind.  Yes, government needs strategic guidance and support for tagging services and Web 2.0 tagging companies are just the ones to provide it...if we can figure out how.
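To make the idea concrete, here is a minimal sketch of tagging as a shared enterprise service rather than an application feature.  The class and method names are invented for illustration:

```python
# Hypothetical sketch: tagging as an enterprise service. Any resource,
# identified by URI, can be tagged, and any application can query the
# shared tag store without being designed for tagging itself.
from collections import defaultdict

class TagService:
    """A minimal in-memory tag store; names and methods are illustrative."""

    def __init__(self):
        # (tag -> URIs) and (URI -> tags), kept in sync
        self._by_tag = defaultdict(set)
        self._by_uri = defaultdict(set)

    def tag(self, uri, *tags):
        for t in tags:
            self._by_tag[t].add(uri)
            self._by_uri[uri].add(t)

    def resources(self, tag):
        return sorted(self._by_tag.get(tag, set()))

    def tags(self, uri):
        return sorted(self._by_uri.get(uri, set()))

# Any application -- a wiki, a document repository, a video portal --
# can participate through the same service.
svc = TagService()
svc.tag("https://intranet/doc/123", "logistics", "web2.0")
svc.tag("https://intranet/video/9", "logistics")
print(svc.resources("logistics"))
```

The point of the sketch is the separation: the service owns the tag data, the applications merely call it, which is what makes tagging "a domain unto itself."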

The current providers, a.k.a. "The Bigs," are not oriented to provide Web 2.0 tech support.  

This is either an opportunity to get new business before The Bigs or to create a business of showing them the way.  In any case it is an opportunity to make money.  More specifically, it is a way for Web 2.0 technologies to make money.  In the government services space the biz-speak is generally referred to as "priming" and "subbing," as in:  you are either a prime contractor or a subcontractor.  I don't see many Web 2.0 companies subcontracting to big, traditional system integrators, though.

The government is not set up to acquire Web 2.0 technologies.  

While services contracts are fairly common and understood, things are less well-defined in the products and solutions space unless the products and solutions are ubiquitous or otherwise extremely well known.  Government, especially federal, wants to buy everything in bulk.  See also:  Indefinite Delivery, Indefinite Quantity (IDIQ).  New technology is extraordinarily difficult to insert, especially in secure environments.  Protocol, procedure, and process rule the land.  Force-fitting into an existing model is too often the preferred method.

We are getting better at defining and avoiding undue process but process, by definition, is:  1)  necessary and 2)  inherently situational.

When doing business with the federal government, it's important to know how business is done inside The Beltway.  There are most probably things in the mix that need to be undone, too.  To a meaningful extent the situation is not different at state, tribal, and local levels.  Education is needed on both sides of the buyer/seller relationship.  We might "change the world" in the process of implementing Web 2.0 for government, but requisite is the obligation to have a fair understanding of "the world" first.  The obligation goes both ways but mostly falls on the shoulders of sellers.

Choices for Web 2.0 companies to make money by doing business with government:
  1. Become or spin off an enterprise systems integration unit
  2. Sell enterprise systems solutions a) to government b) to system integrators (Note:  probably can't sell solutions to government without either being an integrator or having the support of one.)
  3. Consult a) to government b) to system integrators on enterprise systems
Of course, "partner" is an option, but it still implies one or more of the previously listed options.

[Update 9:47 PM - I've decided I really don't like these choices at all.  Need to come up with an altogether new business model, perhaps...probably]

There's so much more to this than I can wrap my head around now, certainly more than I am prepared or qualified to comment on here.

[Update 9:47 PM - Forgot to comment on the need for and evidence of government investment in backbone infrastructure and understanding of cloud architecture; significant issues arise once a bunch of these services are running around on a single network.  And as always, security is different and harder.]

I think it's time Web 2.0 companies, government, and large-cap contracting companies had a grand introduction to one another.  Believe it or not, there are plenty of people who have never even heard of Web 2.0.

Monday, December 22, 2008

Common Web 2.0 Services for Government Brainstorm

  1. SMS - ex., Twitter
  2. Expertise Location - ex., Facebook, LinkedIn, semantic profilers
  3. Tagging - ex., Delicious, Magnolia, Stumble Upon
  4. Image Sharing - ex., Flickr, Picasa, PhotoBucket
  5. Video Sharing - ex., YouTube, Vimeo, TroopTube! (12.28.08)
  6. Audio Sharing - ex., SoundCloud, HuffDuff
  7. Document Sharing - ex., Google Docs, MOSS, Word Press, Blogger, Tumblr
  8. Registry - ex., ACME.ProgrammableWeb, Wiki (?)
  9. Tele-presence - ex., Breeze, WebEx, GoToMeeting
  10. Search - ex., Google, Yahoo!
  11. Visualization - ex., ManyEyes, Swivel
  12. Dictionary - ex., ([WORD])
  13. Data Transformation - ex., ??
  14. [Update 12.29.08] Geospatial, mapping - ex., Google Earth/Maps, Microsoft Live Virtual Earth/Maps
  • Utility model (i.e., like electric, water, natural gas)
  • Specialized content and application services infrastructure
  • Government needs a strategy for inserting these technologies
  • Government strategy must be capabilities-based and vendor indifferent, yet cannot be generic
  • How exactly/specifically can government do business with Facebook or LinkedIn, for example?
  • To what extent will/should third-party integrators be involved?  Is Twitter likely to provide labor resources for technology insertion or would they just want to license the platform to third parties?
  • Companies listed above need a strategy for doing business with government.  Some have them, but most don't.
  • The mutual strategy should be for companies to implement their architectures for these platform services in public and government-managed domains.
  • Probably the biggest hurdle is the massive amount of process and procedure required to navigate the government marketplace and interact with customers
  • Culture gaps

More later...

[Update 12.23.08]
  • Security model is much more complicated than just protecting access to personal information; ease of and tracking of information flow a greater concern
  • Security models need to be reconsidered by both buyers and sellers

Saturday, December 13, 2008

A Pattern is as a Pattern Does

I had a great conversation the other day with Mike McKinney.  One topic in particular really stuck with me and now I can't seem to shake it.  Mike and I had been talking about design philosophies and what it takes to do smart implementation, especially as a government contractor.  We both lamented the environment of design and implementation bloat in which we seemed to find ourselves.  Bloat does not fly in government contracting.  Time and money resources are extremely tight and skillsets are scarce.  Mike noted that "ever since the 'Gang of Four' book came out everyone thinks they are an architect and everyone wants to build a framework."  I couldn't have agreed more.  Ugh, that word:  "framework."  I've used it many times myself, but if I never see or hear or use it again I wouldn't be unhappy about it.

The purpose of design patterns isn't necessarily to build huge, abstract libraries of software components.  The purpose of design patterns isn't necessarily to solve common programming problems for other programmers.  A wonderful thing about design patterns is that they are something familiar to rely on when confronted with new problems.

I've used dozens of patterns and I've even participated in writing a  framework or two.  But the world needs only so many frameworks.  At the same time there are no shortages of hard problems to solve for which design patterns are quite, quite useful.  I don't have time, money, or expertise for all the frameworks I'd otherwise need to get the job done.  I need robust, reliable patterns and people skilled enough to recognize how to implement them to solve customers' problems.

So, attention all developers:  Don't just think of design patterns as architectural building blocks. Sure, they are that.  But think also of design patterns as tools in a toolbox.  

When we're confronted with a domain-specific problem we should build neither point-to-point solutions nor frameworks.  Instead, use patterns to solve a domain-specific problem in a reliable way.  Point-to-point solutions require niche skillsets, are complicated, and don't scale well.  Yet, we don't have time or resources to build frameworks and there wouldn't be much of a market for them if we did.  Either that or our frameworks have to be raised up to such a level of abstraction that we end up forcing domain-specific stakeholders into our patterns rather than molding our patterns around domain-specific use cases.

So, what is a good example of how we use patterns to solve domain-specific problems without building a framework?  ETL-V is a good one:

You don't need a framework to implement this pattern.  Simply recognize that the solution to interoperability among domain-specific applications is the production and consumption of well-formed Unicode data.  If someone gives you a data source and is looking for a way to visualize it you could look for a visualization API and use point-to-point integration to read the data and construct objects from the API.  But then what do you do when someone hands you another data source?  What if the new source lends itself to a new visualization technique?  What if the API you chose the first time around doesn't support that technique?  Well, perhaps you have job security as a software integrator.  On the other hand, if you want to maintain separation of concerns and implement a robust, flexible solution you could follow a pattern based on decoupling data from proprietary or domain-specific formats and transforming data into views using standard, ubiquitous processors.  This effectively changes the integration and interoperability problem into an easier-to-solve scalability problem.  Point-to-point integration is neither simple nor scalable.  However, given any data source expressed as well-formed Unicode we can write a practically boundless number of transformations to produce a practically boundless number of views and applications.  It takes a pattern but not a framework.
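As a toy illustration of the pattern (emphatically not a framework): extract records from any delimited Unicode source, then transform the same records into a view.  The CSV layout and the SVG view below are invented for illustration:

```python
# Sketch of extract-transform-view: decouple the data from its source,
# then produce views with small, reusable transforms. Another view means
# another transform, not another point-to-point integration.
import csv, io

def extract(text):
    """Extract: any delimited source becomes a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def to_svg_bars(records, label_key, value_key, bar_height=12):
    """Transform: the same records become an SVG bar-chart view."""
    parts = ["<svg xmlns='http://www.w3.org/2000/svg'>"]
    for i, r in enumerate(records):
        parts.append(
            "<rect x='0' y='%d' width='%s' height='%d'><title>%s</title></rect>"
            % (i * (bar_height + 2), r[value_key], bar_height, r[label_key])
        )
    parts.append("</svg>")
    return "\n".join(parts)

source = "unit,count\nAlpha,40\nBravo,25\n"   # stands in for any data source
records = extract(source)
print(to_svg_bars(records, "unit", "count"))
```

Handed a new data source tomorrow, you write a new `extract` or a new transform; nothing upstream or downstream has to change.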

Tuesday, December 9, 2008

Models and Simulations as Data

A colleague, Anthony Watkins, and I recently conducted a feasibility assessment of ETL-V applied to a DoD tactical simulation called JCATS:  Joint Conflict and Tactical Simulation.  JCATS was created by Lawrence Livermore National Labs in 1997 and has been supported in some fashion or another ever since.  JCATS is not unlike other military sims I've encountered over the years:  OTB, TDF.  In fact, JCATS is generally not unlike all other models and sims with which I have worked in the past 11 years - aerospace apps, power/energy sims, custom apps.


I took an M&S class in graduate school and did pretty well.  But I studied Computer Science and Applications.  That's my slant.

When I reflect on my scant 11 years of experience designing, developing, implementing, and integrating business processes and software - a fair amount of it in the M&S domain - I come to one conclusion.

Interoperability stinks.

A bit of personal history on why I think that - three words:  separation of concerns.  As in, we haven't maintained it.  I observed the problem in my first job at the ODU Research Foundation's Center for Coastal Physical Oceanography:  oceanographers were writing code...bad code.  Eventually the computer scientists got involved, but all we did was write occasionally elegant code wrapped tightly around a domain application.

Then, in September of 2001, I found Object Serialization, Reflection, and XML all at the same time (XML, casually, in '99).  My colleagues and I were using these to create 3D visualization, behavior, and human-computer interaction for the Web.  With the introduction of XML-RPC we were communicating with "back end" data services to drive client-side 3D.  The 3D applications were entirely dynamic without compilation,  the same way all other content was being delivered over HTTP to a standard browser.  It was cooler than dirt.  

Along came Web Services, circa 2003-4 for us. Finally, everything was starting to click.

That brings me back to models and simulations as data and where I am in 2009.  

Simulations aren't applications, they're data sources.

I've written previously about how we use enterprise architecture models as data sources for analytics.  It's basically the same thing for sim integration.  There is one additional component.  Simulations often stream data in real time.  The transactional model of the Web and relational databases is not always sufficient.  Regardless, the methodology is not dependent on transfer rate or protocol.  The methodology is based on separation of concerns.  The difference between yesterday and today is that we realized how to take what we were doing above the level of a single application.

Most of the applications with which I am experienced are heavy...very heavy.  Whether we are talking about footprint, documentation, training, number of participants, facilities, or duration most simulation apps require a lot of it.  Technology insertion and training are costly.  Learning curves are steep.  Exposure and knowledge is limited to very small groups.

It doesn't have to be that way, either.

[Update:  12/10/08 - Well, maybe it does.  I appreciate that M&S is hard and hard takes time.  The point I really want to make is that we can and should reuse the data generated by M&S to increase their ROI and the overall value of their insights as information assets.  We do this by decoupling their data so it can be more easily integrated with other things - things people who aren't modelers or enterprise architects care about.]

If I could wave a magic wand, I'd strip every simulation down to its algorithms and databases.  To be sure, there are some sweet front-ends out there, but they aren't maintaining separation of concerns.

JCATS, for example, is a powerful tactical simulation.  There are good smarts in there.  But JCATS has a limited graphical user interface (2D) and is strictly designed for tactical operational scenarios on maps.  While the designers of JCATS may have thought about 3D and some statistical diagnostics, they certainly did not, nor could they have, anticipated or accommodated all the many ways we can use the output of JCATS simulations.

The good news is that JCATS saves huge DAT files - i.e., data files - which means JCATS data is portable.  JCATS produces ordinary delimited text files (comma-separated) and puts similar data on the wire in real time (either TCP/IP or UDP, I think).

From here it's easy:

All of these interactive views are transformations of JCATS data using ETL-V
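To make "from here it's easy" concrete:  because the simulator emits ordinary comma-separated text, any standard processor can consume it.  The field layout below is invented for illustration - real JCATS output has its own schema:

```python
# Hedged sketch: parse delimited sim output into per-entity tracks.
# The columns (time, entity, x, y) are made up; the point is only that
# ordinary delimited text is portable data any processor can consume.
import csv, io
from collections import defaultdict

sample = """time,entity,x,y
0,tank-1,100,200
5,tank-1,110,205
0,squad-2,300,400
"""

tracks = defaultdict(list)
for row in csv.DictReader(io.StringIO(sample)):
    tracks[row["entity"]].append(
        (float(row["time"]), float(row["x"]), float(row["y"]))
    )

# Each entity's track is now decoupled data, ready to be transformed into
# a 2D map overlay, a 3D replay, or a statistical summary.
print(sorted(tracks))
```

Once the data is in this neutral form, every downstream view is just another transformation, which is the whole argument for treating simulations as data sources.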

Saturday, December 6, 2008

Government 1.0 -> Government 2.0

Observing, describing, and defining a 1.0 -> 2.0 transition may be a difficult undertaking.  Take Government 2.0, for example.  What is our context for "government":  local, state, federal, international?  All of these?  I think if we are going to attempt any articulation of a 1.0 -> 2.0 transition we should start by following a model.  In the case of all things 2.0, that model is obviously the seminal piece by Tim O'Reilly, entitled "What Is Web 2.0:  Design Patterns and Business Models for the Next Generation of Software."

I suppose we should first start by questioning the appropriateness of trying to draw such inferences from one domain to another.  However, this is occurring regardless.  The 2.0 moniker is being attached to everything under the sun, perhaps without consideration of anything but brand/buzz recognition.  So, like it or not, we are now at a point where we must articulate what we mean by Government 2.0, indeed, by 2.0 in general.

I do think it is appropriate to use Web 2.0 as a model for other domains.  Web 2.0 is not just about technology.  It is about technology that recognizes and leverages the profound role of human behavior and scale.  It certainly seems relevant to Government 2.0.

Literally, it may be appropriate to relate the seven top elements of O'Reilly's paper to other domains.  Intuitively I can see such relationships in Government 2.0.  Harnessing Collective Intelligence strikes me as very Democratic.  What could be more E Pluribus Unum?  "Data is the Next Intel Inside" becomes Open Government Data (see also:  Kundra).  The sub-elements relate as well:  "A Platform Beats an Application Every Time" is exactly the point being made by Robinson et al. in "Government Data and the Invisible Hand."

But even before we get too carried away with that exercise, perhaps we should start where O'Reilly and MediaLive International started; with a brainstorm.  What are examples of the 1.0 to 2.0 transition in government?

Here are a few examples that are obvious to me:

NOAA active weather bulletins -> NOAA active weather RSS
Agency reports -> Open Government Data
Budget competition -> transparent investment
representatives' web sites -> representatives' blogs
Public council meeting -> public council WebEx
Requests for Proposals -> Contests for Apps (solutions)

Well, perhaps it's a start anyway.
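The NOAA bulletins -> RSS transition on the list is a good example of why the shift matters:  once content is syndicated as data, any client can consume it with a standard parser.  A sketch using a made-up miniature feed (not a real NOAA document):

```python
# Sketch: a syndicated bulletin feed is just well-formed data that any
# standard XML parser can consume. The feed content below is invented.
import xml.etree.ElementTree as ET

feed = """<rss version="2.0"><channel>
  <title>Active Weather</title>
  <item><title>Gale Warning</title></item>
  <item><title>Flood Watch</title></item>
</channel></rss>"""

root = ET.fromstring(feed)
titles = [item.findtext("title") for item in root.iter("item")]
print(titles)
```

Contrast that with scraping a bulletin web page:  the 1.0 version couples every consumer to the page layout; the 2.0 version couples them only to a standard format.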

Enterprise Architectures as Data

Enterprise architectures are more than static structures, use cases, and process/sequence models.  Enterprise architectures are also queryable data sources that, once constructed, can be used to answer a great many questions relevant to decision making based on multiple stakeholder concerns:  operational, technical, financial, logistical.  In my experience, this assertion usually surprises people.  Operators want to know what complex (in their minds complicated) EA models have to do with getting the job done.  Appropriators want to know how system producer-consumer dependencies relate to purchasing decisions.  To anyone unfamiliar with enterprise architecture, EA can be seen as not only having no value, but as an unwelcome cost burden.  Yet, each of these perspectives is relevant to an enterprise architecture.  Many architects understand this problem but have been helpless to address it.  Enterprise architecture is a rigid, rigorous discipline.  The language and views of architects are complex and detailed.  The tools architects use are highly specialized.  All of this contributes to a formidable barrier to information and knowledge sharing.

The good news is that there is a relatively simple technology solution that can cut through the complexity and lead to better decision making informed by a variety of stakeholder perspectives.  Now, I'm not saying this is a case of technology riding in on a white horse to save the day, but it's darn close.  To be sure, technology's job here is to get out of the way; to provide the least amount of resistance and friction to business processes and people communicating.  By focusing on a strategy for how EA data is stored, extracted, and transformed we can make the data more versatile.  By making the data more versatile we can make the information that data describe more useable.  

The solution strategy, then, is to make enterprise architecture data more versatile using standards.  

By implementing this strategy we can use enterprise architectures as data sources to answer a diverse set of typical stakeholder questions.  Using this strategy we clearly see that otherwise detailed, complex data can be easily queried, extracted, transformed, and visualized in entirely new ways.
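As a sketch of "architecture as data," here is a toy EA model expressed as XML and queried to answer a typical stakeholder question.  The element and attribute names are illustrative, not any real EA framework's schema:

```python
# Sketch: an EA model as queryable data. Element and attribute names
# are invented for illustration.
import xml.etree.ElementTree as ET

model = """<architecture>
  <system name="Logistics"><consumes system="Inventory"/></system>
  <system name="Inventory"/>
  <system name="Finance"><consumes system="Inventory"/></system>
</architecture>"""

root = ET.fromstring(model)

# An appropriator's question: if we replace Inventory, which systems
# are affected? Answered by querying producer-consumer dependencies.
dependents = sorted(
    s.get("name")
    for s in root.iter("system")
    if any(c.get("system") == "Inventory" for c in s.iter("consumes"))
)
print(dependents)
```

The same stored model can feed an operator's view, a financial view, or a visualization, because the data is separated from any one architect's tool.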

Saturday, November 22, 2008

Priorities for Technology and Innovation?

If I could choose our country's top five strategic priorities for technology and innovation, they would be:

  1. Diversify the energy ecosystem
  2. Transform human transportation
  3. Open more government data
  4. Boost resurgence in manufacturing
  5. Transform farming practices
Diversifying the energy ecosystem means that we diversify production and consumption.  Diversifying production means that we increase the number of different means for producing energy in terms of both type and scale; everything from storing the energy you produce in your exercise machine up to expanding nuclear.  If they ever figure out the photosynthesis model of using the sun then we're literally set for life.  Diversifying energy consumption means that we increase the number of different kinds of energy a consumer can use and decrease the amount of energy the consumer requires in order to be effective.  There will always be both high energy and low energy systems that have purpose and value.  Efficiency of consumption still matters.   Ideally we will produce and require less energy to do more.  This is not a pipe dream.  This is innovation and efficiency.

Transforming human transportation means that we change the way we think about how people move around geographically.  We need to look at the logistics of national and local transportation free from the constraints of mode.  When it comes to moving people, we will also place great emphasis on the preferences of the individual.  These might be combined into the notion of demand.  Given an understanding of demand (past, present, future?) we can make better decisions among biomotive, automotive, locomotive, hydromotive, and aeromotive modes of transportation (most of which are not English words, by the way).  I'm not sure I have good answers, but I'm certain transportation is A) complex and B) broken.  That usually tells me we need to turn the problem on its head and look at it sideways to recognize what will work.

Opening more government data means that local, state, and federal governments and their agencies commit to putting all information already in the public domain online in standard, ubiquitous formats that are easy to publish, syndicate, transform, integrate, and view.  Governments should spend less time and money on applications (analysis and visualization tools, websites/portals).  Governments should spend more time and, overall, less money on opening up our data vice building applications.  Naturally, government-internal uses of data will require tools and applications.  But in all cases better tools and applications will emerge when data is open.  This may be the easiest and least costly priority to implement.

Boosting resurgence in manufacturing means that we recreate and reinvigorate our strength and ingenuity as skilled manufacturers.  More specifically, we invest in:
  1. High-tech, lightweight manufacturing; tools, devices, sensors, robotics and the like
  2. Transportation manufacturing; planes, trains, and automobiles 
Maybe farming equipment becomes an emerging market (see priority 5).  Obviously auto manufacturing needs a new direction.  Flying is a horrendous and toxic experience on many levels.  And trains are poised for a comeback here in the U.S.  Clearly transportation of goods is important and deserves attention.  But I think transportation of goods is largely solvable while transportation of people seems nearly, but not quite, intractable.

Transforming farming practices means we find the middle ground between niche farming and factory farming.  Niche farming is not efficient or practical and is more costly than Americans will endure.  The factory farming model is failing in the current environment.  There is entirely too much land not being farmed in an effort to control prices and trade, yet prices are high and trade is poor.  Some argue that cheap food is gone.  I'm not so sure.  The reason cited is the high cost of transportation undercutting the (tightly) controlled production practices that created cheap food in the first place.  I agree those practices should end.  But as long as we are purposely not farming land I don't know how we can make accurate determinations about price realism in food.  Here in Virginia and in southerly neighbor North Carolina we have acres and acres of underutilized, fertile farmland.  Whether it is land that goes unused because a mid-sized farmer can't compete in a factory-farm environment or old tobacco farms that must transform or die, a new strategy for farming can make use of that land, uplift local economies, and keep choices up and prices low.

Then again, what do I know?  This is more feeling than fact.  I could be terribly, terribly wrong.  Clearly I'm short of specifics.  What I'm after is strategic priorities.  This feels like what matters.

Friday, November 14, 2008

Pentagon Internet Utility Model

Pentagon gets it right:

Resources by Content, not Application

Now that you have a space in your cloud to put all your videos, you can come up with much simpler plans for multi-level security in that cloud using technology means that have been thoroughly vetted for decades.  Better results with less technology burden.  This can solve nearly any of your needs, not only the just-in-time requirements driven by the popularity of personal video production and sharing.  Certainly every unclassified activity is covered and that alone makes the world better by several orders of magnitude.  But, replicate this model on NIPR, SIPR, and JWICS and every possible thing you could want to do with digital video will be covered by (roughly) the same technology solution.  People who have to manage security layers will still have to manage security layers, but that concern is more easily separated from the entanglements of technology now.  This is a good thing all around in ways I hope and think you realized before I ever commented here.  

P.S. "OneSource" is a bad term.  When it comes to shared content, there is never one source.  That's the point.  You are the infrastructure, just need to figure out how to make that catchy for PR and marketing.  One Destination, maybe.  ...needs work in any case.

Friday, October 24, 2008

Pattern Recognition in Graphs

Recognition of a pattern in a graph is a curious thing. On the one hand, if patterns can be seen among relationships of nodes by their edges, then the relationships exist whether or not they can be seen with the eye. In other words, the relationship is inherently mathematical and can be conclusively tested. On the other hand, decision making is not strictly binary, i.e., "does the relationship exist, yes or no?" Even when decision making is highly quantitative it is most often multi-dimensional. This is where graph visualization comes in. There is significant utility in being able to see multiple data dimensions simultaneously by manipulating the many attributes of graphical objects.

When people consider multi-dimensional problem spaces they are engaged in a level of thinking that is poorly supported by low-level primitives (ex., tables of numbers). Decision-making based on tables of tuples is no way to go. Pattern recognition there is mind numbing. The best we can do is perform operations on rows and columns: counting, sorting, grouping, filtering. While these are immensely useful and powerful, they overlook the possibility that patterns exist among the atomic, non-aggregate entities themselves. Furthermore, we don't know what the patterns are until we see them and therefore cannot apply operations a priori. Given layered, multi-dimensional graphics, on the other hand, we can map the many atomic attributes of abstract data to the attributes of graphical objects, then use operations for manipulating graphical attributes to reveal patterns. This is particularly useful in graph visualization, i.e., of networks:

Revealing two distinct communities of interest
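A toy sketch of the idea:  map data dimensions onto graphical attributes, letting node degree drive size and a categorical attribute drive color.  The graph and palette below are invented for illustration:

```python
# Sketch: mapping data dimensions to graphical attributes. Degree drives
# glyph size, a categorical attribute drives color. With a real renderer
# the resulting attribute table would drive the drawing directly.
from collections import Counter

edges = [("a", "b"), ("b", "c"), ("a", "c"),   # community 1
         ("x", "y"), ("y", "z"), ("x", "z"),   # community 2
         ("c", "x")]                           # the lone bridge

degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

group = {"a": 1, "b": 1, "c": 1, "x": 2, "y": 2, "z": 2}
palette = {1: "steelblue", 2: "darkorange"}

# Each node becomes a dict of graphical attributes, one per data dimension.
glyphs = {n: {"size": 10 + 5 * degree[n], "color": palette[group[n]]}
          for n in sorted(degree)}
print(glyphs["c"], glyphs["y"])
```

With the mapping in hand, the two communities separate visually by color, and the bridge nodes stand out by size, without any a priori row-and-column operation.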

Monday, September 29, 2008

A Portable Language for Real Time, Interactive, Rich Media

Following is a draft section from an in-progress whitepaper on ETL-V:

The key to enabling visualization as a core capability is a robust, machine-portable description of functionality, parsable by a universal processor that enables the serialization and deserialization of any object, attribute, and behavior in a standard way. Such a description captures the language of visualization in a way that can be understood by any program or person.

Our machine portable language is not intended to express low-level artifacts like vertex points and surface normal vectors, although it can. Most often we follow the model of HTML, which uses URLs to reference images, movies, and other digital multimedia. A 3D Model is not unlike an image:

<img src="images/car.jpg" />

<Model url="objects/car.lwo"/>

Rather, our language expresses high-level artifacts that describe things a person without prior subject-matter expertise might easily recognize. Camera, Model, and Light are but a few examples of rudimentary artifacts in our language. Yet, to establish a robust platform for visualization we must be able to describe artifacts of greater complexity. We must describe the attributes of, for example, ColumnChart, AreaChart, FlowChart, TreeMap, and RadialTree.

Some objects have implied behaviors such as a Sequencer that iterates through a list of items or a Directive that traverses a graph.

Our language must contain more than nouns; it must also contain verbs. Therefore we have command objects like Append, Remove, Interpolate, Morph, Notify, and many others (see also Command schema, below).

Our language must be capable of describing attributes having multiple dimensions, for example a 3-dimensional position. We classify attributes as belonging to one of four basic categories: primitive, complex, compound, and collection. These are defined as follows:

  1. Primitive Attribute – an attribute that is expressed using a single value; ex., integer, floating point number, character; we also consider string to be primitive
  2. Complex Attribute - an attribute that is expressed using multiple primitives; ex., 3D vector of floats
  3. Compound Attribute – an attribute that is itself an Object containing other attributes; Animation has Keyframe attribute, Keyframe is an Object with its own attributes
  4. Collection Attribute - an attribute that represents a data structure, such as a list, vector, or hash map. A Collection attribute can hold any other kind of attribute.
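The four categories can be made concrete with a small sketch. The class and attribute names below are illustrative stand-ins, not actual BML/Bridgeworks types; they simply exercise each category once.

```python
# Sketch of the four attribute categories described above.
# Class names are assumptions for illustration only.

class Keyframe:                       # a Compound attribute: an Object
    def __init__(self, time, value):  # with attributes of its own
        self.time = time              # Primitive (float)
        self.value = value            # Complex (3-vector)

class Animation:
    def __init__(self):
        self.name = "spin"                   # Primitive (string)
        self.axis = (0.0, 1.0, 0.0)          # Complex (3D vector of floats)
        # Collection attribute holding Compound attributes:
        self.keyframes = [Keyframe(0.0, (0, 0, 0)),
                          Keyframe(1.0, (0, 360, 0))]

anim = Animation()
```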

Given this language, we can express any visualization technique, behavior, or interaction style as a human and machine readable document capable of being interpreted at runtime. To demonstrate our concept for this machine portable language, we have created Bridgeworks Markup Language (BML). BML is defined using Extensible Schema Definition language (XSD). Documentation and source schemata are available online in the following locations:

Documentation of BML:

Schemata: (relative to

[Update 12.02.08: realized the schemata listed above do not contain all of the elements documented...not sure what happened...see BwSchools for full list. Schemata are best viewed in XMLSpy.]
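To illustrate what "interpreted at runtime" means here, the sketch below walks a hypothetical BML-style document with a generic XML processor. The element and attribute names are assumptions modeled on the examples above; consult the published schemata for the actual vocabulary.

```python
# Sketch: a universal processor deserializing a hypothetical
# BML-style scene document at runtime, no compilation involved.
import xml.etree.ElementTree as ET

doc = """
<Scene>
  <Camera position="0 0 -10"/>
  <Light type="directional"/>
  <Model url="objects/car.lwo"/>
</Scene>
"""

scene = ET.fromstring(doc)
# Every object and attribute is recoverable in a standard way:
objects = [(el.tag, dict(el.attrib)) for el in scene]
```

Any engine that speaks the schema can rebuild the scene from this text alone, which is the point: the document, not the binary, is the unit of exchange.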

We use Bridgeworks to provide our customers with the largest and most flexible set of visualization capabilities available on the market. We deliver these capabilities through a single lightweight runtime that scales between personal devices and large facilities. The runtime engine operates through a simple input/output mechanism for parsing text-based, well-formed data, making it easy to integrate without a lot of API programming.

Platform Functional Requirements:

  1. The capability must provide graph layouts (ex., radial tree, cone tree, force-directed graph)
  2. The capability must provide chart layouts (ex., bar charts, line charts)
  3. The capability must provide geospatial layouts (ex., maps, globe, terrains)
  4. The capability must provide high-fidelity 3D modeling and animation
  5. The capability must provide video and audio
  6. The capability must provide hybridized views (ex., Extrude Map: geospatial + chart)
  7. The capability must provide split views (ex., side-by-side comparison)
  8. The capability must provide layered views
  9. The capability must provide a temporal dimension
  10. The capability must provide a mechanism for user-defined attributes
  11. The capability must provide a mechanism for user-defined events, and triggers

Platform Technical Requirements:

  1. The capability must provide multi-dimensional graphics (2D, 3D, 4D+)
  2. The capability must provide an engine that can change views and behaviors at runtime using only a text-based, interpreted language that does not require compilation in order to add/modify/delete content in real time
  3. The capability must provide serialization and de-serialization of data and objects into and out of rich media views using:
    1. XML
    2. JSON
    3. CSV
  4. The capability must provide serialization and de-serialization of standard object notations (SONs, ex., JSON) into and out of views
  5. The capability must provide Object/Attribute Reflection
  6. The capability must provide storage of domain data as Objects’ attributes loaded in memory
  7. The capability must provide an extensible API
  8. The capability must provide ECMA script language binding
  9. The capability must provide COM language binding
  10. The capability must provide Java language binding
  11. The capability must be able to render from multiple 2D and 3D file formats, including but not limited to:
    1. ESRI ArcView Shape
    2. LightWave Object
    3. Keyhole Markup Language (KML)
    4. CAD Models (AutoDesk, SolidWorks)
    5. JPEG
    6. Bitmap
    7. Extensible to other formats as needed

  12. The capability must be able to render from data located in local and remote URLs
  13. The capability must be embeddable in the following software applications:
    1. Internet Explorer versions 6 & 7
    2. Microsoft Office 2003 Products: Word, Excel, PowerPoint
    3. Mozilla Firefox
    4. Win32 Graphical User Interfaces (GUIs)
    5. .Net/C# GUIs
    6. Java GUIs
  14. The capability must provide an XML schema and a document object model (DOM)
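Requirements 3 through 5 (serialization, standard object notations, and reflection) fit together naturally: with reflection, an object's attributes can round-trip through a text notation without a hand-written mapping per class. The sketch below shows the idea using JSON; the Model class is an illustrative stand-in, not an actual engine type.

```python
# Sketch: serialize an object's attributes to JSON via reflection,
# then reflect them back into a fresh object. Illustrative only.
import json

class Model:
    def __init__(self, url="", position=(0.0, 0.0, 0.0)):
        self.url = url
        self.position = list(position)

m = Model("objects/car.lwo", (1.0, 2.0, 3.0))
text = json.dumps(vars(m))                 # serialize: reflect attributes out

restored = Model()
restored.__dict__.update(json.loads(text))  # deserialize: reflect them back
```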

Visualization as a Service

Visualization as a service is an extension of our core capability concept of visualization as a platform. Because we can express views, behaviors, and interaction models without deviating from our standards for reasonable portability we can publish and share views, behaviors, and interaction models just like any other content. This means that web services and other data servers can produce human and machine readable documents that describe visualization, behavior, and interaction. Furthermore, these documents can be parsed to create or modify any visualization, behavior, or interaction technique at runtime without intermediate compilation. Depending on transfer frequency, volume of data, and security constraints, data describing visualizations, behavior, and interaction can also be cached for use in “offline-mode” with so-called “rich Internet applications” (RIAs).

[Update 03.14.2009 - Just watched Tim Berners-Lee's TED Talk on Linked Open Data and recently watched Vint Cerf's Feb 09 lecture to FAA. Want to add here that a document-based approach is the best way to ensure that visualization follows the important lessons taught by these great teachers. Binary objects can be linked using HTTP and metadata, but binary objects are not open. Binary objects also always require a specialized software reader to be interpreted. Documents, on the other hand, are what the Web was designed to link, and text will always be readable by readily available software on the web. Even if one rendering engine disappears it is always possible for another to come along; ex., if rendering an old visualization document as graphics ever becomes important hundreds of years in the future. I realize I am contradicting this principle by encouraging the use of existing binary formats for images and 3D models, but this is a compromise I am willing to make at this point. Check out BML Hello World. KML is important and wonderful, but it is just scratching the surface.]

Tuesday, September 23, 2008

Friday, September 5, 2008

Flare gets DDV

I have learned much from the work of Stuart Card and Jeffrey Heer. It was their research on graph layout algorithms that made possible the modularization of graphing capabilities in software. Heer's premier software implementation of this research is Prefuse. Computer Science taught us Graph Theory and the structure, manipulation, and traversal of nodes and edges. Prefuse taught us Radial Tree, Force-directed Graph, and Degree-of-Interest Tree. And Prefuse is excellent not just because it is good at graphics and layouts but because it is also document-driven, i.e., it runs on XML. Sure, there's a Java API, and that's probably what most developers focus on. But the fact that it parses XML, that it recognizes a vocabulary and language for visualization, and that this language can be expressed as documents is what I think makes Prefuse special.

After Prefuse came Flare. Flare is a wonderful application of Prefuse ported to ActionScript.

Flare really gets document-driven visualization. I was just browsing through the source when I unfolded the data/converters package:

See also (emphases added by me):


Converters: Factory class for looking up the appropriate IDataConverter for a given data format.

DelimitedTextConverter: Converts data between delimited text (e.g., tab delimited) and flare DataSet instances.

GraphMLConverter: Converts data between GraphML markup and flare DataSet instances.

JSONConverter: Converts data between JSON (JavaScript Object Notation) strings and flare DataSet instances.

This is certainly familiar.

Thursday, September 4, 2008

Tufte on document-based platforms

This just posted at the end of a new thread on Ask E.T. about Google Chrome:

(Emphasis added by me)

"How's the browser?

Thanks to the many contributors who pointed out Scott's kind link in the example of amazon search.

David Pogue's report in The New York Times suggests this browser leads to an integrated platform with non-proprietary formats (very good) but is still application-based rather than document-based (not very good)."

-- Edward Tufte, September 3, 2008

This is something I think about constantly - document-based platforms can serve end-to-end needs across data extraction, transformation, portability, and visualization. See also the flow on Document Driven Visualization.

Saturday, August 30, 2008

Visualization as a Core Capability

I came back from a recent DARPA workshop ever more convinced that visualization as a core capability is both necessary and attainable. "Visualization as a core capability" means that the full spectrum of visualization techniques is available from any application. It doesn't involve so-called "pluggable" components; no automatic code generation to force big-bang integration between heterogeneous apps, tools, and APIs. Anywhere I come across some columns of data I ought to be able to generate a chart. If I can recognize that columns of data are latitudes and longitudes then the computer should also make that recognition and generate a map. The chart and the map should be independent of any application and at the same time embeddable into any application. I am increasingly of the opinion that we need visualization built into the operating system as a platform service.

Wednesday, August 13, 2008

Thursday, July 31, 2008


I love that people continue to add relevant and useful information to 5 year old threads on Edward Tufte's bulletin board. It happens all the time. Talk about a simple and effective lessons learned resource...

Monday, July 28, 2008

"Information Technology Support" is an Oxymoron

The purpose of technology is to help people perform tasks. Isn't it ironic that we need so much human support for managing technology? Or maybe oxymoron isn't quite right. "Technology support" is redundant. It's support for support, is it not? I mean, is technology doing its job if it needs lots of ancillary support to keep it going?

Sunday, July 27, 2008

Wednesday, June 4, 2008

Calling All OS Designers: Redesign "Save As..."

It's time to redux "Save As..." Keyword tagging is everywhere and there is a resurgence in metadata collection, now that benefits are being realized. Indeed, applications have included metadata forms for generations, but no one ever uses them because they have always been hidden from our awareness and critical task paths. In other words, you have to go out of your way to collect metadata. But this doesn't need to be the case. We learned while working a very complex information organization problem for the FCS Lead Systems Integrator that it is entirely practical to insert minimal steps into the task of bookmarking in order to create a much richer and more easily shared collection. It wasn't a matter of automated versus manual tagging since senior subject matter experts were already using browser-based, hierarchical bookmarking techniques to keep track of what they were finding while conducting their research online. (More on that later.) Instead, it was a matter of suggesting that, "Hey, since you are already actively browsing and bookmarking, would you mind if we pop up a window and collect a few key words?" The cost was the number of keystrokes and/or clicks to enter key words plus one to submit a pre-populated tag form.

And so this is what I am suggesting: We should add this kind of metadata collection to the "Save As..." task. Of course it can be opted out, but Save As... is the semantic equivalent of browser-based, hierarchical bookmarking. I won't argue the pros and cons of keyword tagging versus hierarchical cataloging. I know from experience that both have value and that one is much more versatile than the other. I would just ask, how many times do you find yourself saving files with names like:


Maybe it's just me and since I can really only speak for myself, I will: I damn sure would enter a few key words during the Save As... process if only I had the means. It needs to be universal, too, or as close as we can get given the hodge-podge of technology on our machines.

Monday, June 2, 2008

I Can't Believe I'm Thinking This...

I just stumbled across a post from Bruce Sterling on Wired's "Threat Level" blog: "Threat Level is fascinated by the National Cyber Range." I found it because I happened to have read the BAA last week and I was now doing a little homework for a proposal. Let me first record that I applaud Mr. Sterling for calling attention to the program, and I enjoy his quippish commentary. I have to admit that I've had many of the same thoughts over several years of working in this business, and even today as I consider the implications of potentially working this concept. But given rationalization, a most useful tool and frequent resolver of guilt, I am compelled to counter Mr. Sterling's remarks.

I'll start with the obvious: DARPA invented the Internet; the thing Mr. Sterling is using to exercise his freedom of speech (in the wide open, by the way - something that will get you jailed or killed in other places in the world). Brilliant and now legendary inventors like Vinton Cerf worked for DARPA. Two prominent California universities participated, too. Even then the Internet started with a trepidatious title: "Galactic Network," but here we are.

But enough of the obvious. It is easy to pique our fears by tapping into what is surely a bottomless well of sarcasm and criticism, particularly when one does not bother to give treatment to any other relevant possibilities. At the very least, an entirely contrary position owes some consideration of likelihood. Generally from there we do a good job of protecting ourselves from those downsides, particularly, believe it or not, where our privacy and rights are concerned. A more appropriate criticism would be one that counters the benefits an endeavor such as this program may offer. Since none are suggested by Mr. Sterling I will have to come up with my own.

I have to start by asking myself, "What existing problems can this solve?" Or, "What questions can this answer?" Or better yet, just plain, "Why?" I know I have to consider these questions in "DARPA-hard" terms. It seems naive to reduce this to a matter of...what? DoD's attempt to CONTROL EVERYONE AND RULE THE WORLD!? ...muhahahaha! (I get sarcasm.) I am absolutely certain that the kind of folks who would work on a program like this are smart enough to know that the concept of people exchanging information in the form of data packets over wavelengths has no constraints on infrastructure. There is always peer-to-peer.

I do suppose it is worth making explicit, by law, the fundamental right to operate your own network on your own infrastructure...

If there was ever an obvious example of a lab experiment having massively unforeseen consequences beyond the lab it is the Internet. Ironically, the spread of the Internet itself must have all of the same characteristics as the spread of an Internet virus, or else Internet viruses would lack the very means that enable their basic behavior. The spread of cell phone technology is even more pervasive. And it is all fantastic. But there is some really bad, bad shit that happens on OUR Internet - the one conceived of, invented, and owned by civil people - and it has nothing to do with privacy.

Spam is beyond out of control. It may be intractable. DARPA does intractable. Who would tolerate hundreds or thousands of phone calls daily? Why is Internet spam so disproportionately leveraged as an advertising medium compared to junk mail? Is it because spammers think it's better marketing? Or could it be because criminals are exploiting our Internet for personal gain?

Why would DARPA want the "ability to rapidly generate and integrate replications of new machines?" Could it have anything to do with the existence of multimillion node botnets wreaking havoc?

Then there's this CNN report of researchers demonstrating the feasibility of hacking and destroying critical infrastructure.

So, I just don't know. What exactly is wrong with trying to fix that stuff? Why can't that be DARPA's reason?

How about this one: "The ability to replicate large-scale military and government network enclaves." De facto, the military and government have every right to pursue that ability for themselves if they think they need it. Er, ourselves...if we think we need it. Whether we like it or not the military and the government is us and us and us. But, hey, wouldn't it be great if DARPA invented a revolutionary new way to replicate [FOO] large-scale network? Like the ones on which our banks run.

And this one: "The ability to connect to distributed, custom facilities and/or capabilities as necessary to incorporate specialized capabilities, effects, or infrastructures." Hello, system integrators. DARPA should call Apple for this one.

And: "Realistically replicate human behavior and frailties." Well, that's just DARPA being DARPA-weird.

Thursday, May 8, 2008

Wonderfully Simple Praise of the Hyperlink?

(Prologue: This title comes from posts written by Ben Fry and Jeremy Keith. I hope they will not mind. I find myself increasingly attracted to what they have to say on technology matters I care about.)

Here is a near perfect example of a complexity problem, one that I have yet to define, that appears when comparing the integration of two APIs: Facebook and Delicious. One API seems to show what happens when you let developers define use cases and the other shows what happens when you actually stop to think about how information is used by consumers. (I can say that because I write code.)

So in this case, I want to get a list of my friends. It's a dead simple request. I tell the platform I want my friends. I get back a well-formed (hopefully) document containing my data about my friends, right?

Exhibit A, The Facebook API:


The name suggests a clean, intuitive interface...until we get into how we use the interface.

facebook.friends.get has nine parameters. Two of the parameters are separate authentication values.

The parameter call_id is documented as follows:

"The request's sequence number. Each successive call for any session must use a sequence number greater than the last. We suggest using the current time in milliseconds, such as PHP's microtime(true) function."

Which is to say that the caller is responsible for scheduling calls with the service?! Why do we have to agree on a "sequence number" in order to do business?
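To make concrete what this pushes onto every client, here is a minimal sketch of what a caller has to maintain just to satisfy the rule. This is my own assumption of a compliant implementation, not Facebook's code.

```python
# Sketch: the client must issue a strictly increasing sequence number
# per session, as the call_id documentation demands. Illustrative only.
import time

_last_call_id = 0

def next_call_id():
    """Return a sequence number greater than any previously issued."""
    global _last_call_id
    candidate = int(time.time() * 1000)  # current time in milliseconds
    # Guard against two calls landing in the same millisecond:
    _last_call_id = max(candidate, _last_call_id + 1)
    return _last_call_id

a, b = next_call_id(), next_call_id()
```

Note that even this trivial helper needs mutable client-side state, which is exactly the scheduling burden the parameter shifts from the service onto the caller.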

How about the "variable" that isn't:

string v: This must be set to 1.0 to use this version of the API.

Which is to say, "Please pass me this parameter even though I already know the value."

It's a "placeholder," right? Because in "the future" we might want to leave room for specifying which version we want...per interface.

It goes on and on. Then you tell facebook.friends.get what format you want, then whether or not you want the results wrapped in a function call so you can get around cross-domain data security, yada, yada... And don't forget to hash the URL first! I'm only talking about one endpoint here.


Am I wrong to be disappointed? Am I the only one who is confused? Who thinks this is unnecessary? This is the standard API-centric view of interoperability but I'm not buying it.

Now, I have little doubt there is a good reason for every parameter and for this kind of programming. I've heard every reason. I argued many of them myself...until I learned better. To gnash teeth over the legitimacy of each thing is to argue details regardless of their relevancy to the matter at hand. The matter at hand is that I am requesting my friends from the platform and I am expecting back a well-formed document as a result. That's all. From here the platform should do everything it possibly can to limit the amount of additional information it requires of me. I'll grant that the platform expects me to authenticate my request.

The problem with facebook.friends.get is that it is trying to serve multiple use cases through a single request, as if that were more efficient or simpler. Most of the use cases serve developers (or so they may think). Most use cases are more easily supported than the basic task I am trying to perform. Remember? It's "get my friends." I find myself restating this important detail like a mantra in order to give it a greater voice. Why is the service burdening the consumer with all of this minutiae for a simple data call that should return a document?

Exhibit B: The interface for the same data:

It's one simple URL. It returns well-formed data as JSON. Oh, the simplicity! Oh, the possibilities!
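Consuming such an endpoint is correspondingly trivial. The sketch below parses a made-up example of the kind of JSON body a feed like /feeds/json/network might return; the field names and values are assumptions, not captured output.

```python
# Sketch: consuming a document-returning endpoint. The body string
# stands in for, e.g., urlopen(url).read() - illustrative data only.
import json

body = '[{"user": "alice"}, {"user": "bob"}]'

friends = [entry["user"] for entry in json.loads(body)]
```

One URL in, one well-formed document out, and any JSON parser in any language can take it from there. No sequence numbers, no version placeholders, no URL hashing.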

Detractors would no doubt point to the fact that this URL is neither authenticated nor encrypted. I would counter: that was a choice made to serve customers in this case as delicious certainly supports authenticated, encrypted interfaces that are just as easily used as /feeds/json/network. Beyond that, everything else is a decoration.

And, OK, so developers do need more options and features than can be provided through a simple URL interface. But those should be added as one or more layers on top of a layer that just provides well-formed data. I'll accept that if someone can show me: where's the layer that provides the data? It's obfuscated by the business layer. This is no doubt an artifact of the traditional three-tiered architecture (a model of which I am a proponent by the way).

On the Internet, the way in which data is located is via the URI and the way it is passed is HTTP. It's rudimentary: every container of data has a uniquely addressable identifier. Web service endpoints are URLs. We need to get developers and Internet platform providers back to this fundamental principle.

Exhibit C: Yahoo! Internet Location Platform:

Click the URL, get back an XML document. Now THAT'S data portability; that's interoperability.

Update 11.17.09: BUMMER, this link is broken. It looks like an auth issue.