Thursday, October 4, 2007

Not An All Or None Situation

Document-driven visualization represents a disruptively powerful innovation for visualization interoperability. It means that we can treat visualization in the same way that we treat the clear majority of data on the web.

But I'm having a mini-reawakening to the bigger picture in which we see the value in APIs and ultra-lean (and more secure) data transfer. When we tie into simulators we still use XML to configure the system, but in a way that tells it to use the API to connect to values directly as they come off a TCP/IP socket connection. The same is true for online collaboration. The object in these cases is to move data efficiently and reliably. There's no need to pull in all the DOM overhead and seldom any need to transform the data. (But even complex rules for converting Attribute types can be encoded into Objects that are then configurable through Reflection, Serialization, and ultimately XML. We call them "AttributeConverters.") The other network protocols (TCP/IP, UDP...not really FTP) contribute by giving us a lot more flexibility in terms of quality of service. Again, OO fundamentals are key. Object Reflection means we can get and set any attribute value in a standard way...in the API. Any integration more complicated than that usually indicates that our core visualization capability is missing something. Ultimately we want to move closer to the GPU...I think...dual-core processors also represent a new (finally stable) option for better-performing laptops and PCs.
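To make the reflection idea concrete, here is a minimal sketch in Python. It is not our actual API: the `Gauge` class, the `apply_update` function, and the shape of `AttributeConverter` are all made up for illustration. It only shows the pattern of setting any attribute by name, with an optional converter in front.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class AttributeConverter:
    """Hypothetical converter: maps a raw incoming value to an attribute's type."""
    attribute: str
    convert: Callable[[Any], Any]


class Gauge:
    """Stand-in for a visualization object (an assumption, not a real class)."""
    def __init__(self) -> None:
        self.value = 0.0
        self.label = ""


def apply_update(target: object, name: str, raw: Any,
                 converters: dict[str, AttributeConverter]) -> None:
    """Set an attribute by name, converting the raw value first if a converter is registered."""
    if not hasattr(target, name):
        raise AttributeError(f"{type(target).__name__} has no attribute {name!r}")
    converter = converters.get(name)
    value = converter.convert(raw) if converter else raw
    setattr(target, name, value)


# Raw values as they might arrive off a socket, as text.
converters = {"value": AttributeConverter("value", float)}
gauge = Gauge()
apply_update(gauge, "value", "42.5", converters)
apply_update(gauge, "label", "Altitude", converters)
print(getattr(gauge, "value"), getattr(gauge, "label"))
```

The point of the design is that the integration code never needs to know the target's class; it only needs a name, a value, and (sometimes) a converter.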

I digress. Document-driven viz is super neat, but it isn't everything. I suppose I'm just echoing a common theme in life. Not a Panacea. Not a Silver Bullet. Not a Cure for World Hunger. Just the best thing I've seen in my brief career.

Wednesday, October 3, 2007

Thoughts on secure data publishing and transformation governance and best practices

I'm not a security guy, so maybe I'm out on a limb here. Douglas Crockford recently called the DOM a huge security problem. I'm looking at a huge traceability and lack-of-symmetry problem. The problem I usually run into in big enterprises is not with permission to trace data, but the complete inability to do so. When people can't trace data back to sources and when transformation processes are asymmetrical, it becomes extremely, exorbitantly difficult to make sense of information; V&V goes out the door.

Claim:

Data exports should pass metadata describing, at minimum, their immediate source and any transformation applied by that source when producing the data. This metadata forms a singly-linked, directed list from data targets to their sources (e.g., from an Excel spreadsheet back to a System Architect model). The linked list can be followed only by persons and systems having clearance to do so.

Motivation:

Better traceability and symmetry in the data transformation and publishing process

Specification for metadata:

Required field: URL for source

Required field: Security Classification of source

Required field: Security Classification of metadata and traceability permission

Required field: Timestamp

Required field: Author, publisher name

Optional field: Name (aliases?) of source

Optional field: Name (aliases?) of transformation

Optional field: URL for transformation, when used (bad practice to not include when present)

Optional field: Description of source

Optional field: Description of transformation
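As a rough sketch, the field list above could be carried as a small record type. The class name, the Python field names, and the example URL are my own assumptions for illustration; the spec above only names the fields, not their encoding.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical container for the metadata fields listed above."""
    # Required fields
    source_url: str
    source_classification: str
    metadata_classification: str  # classification of the metadata / traceability permission
    timestamp: datetime
    author: str                   # author or publisher name
    # Optional fields
    source_name: Optional[str] = None
    transformation_name: Optional[str] = None
    transformation_url: Optional[str] = None
    source_description: Optional[str] = None
    transformation_description: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject empty strings for the required text fields.
        for name in ("source_url", "source_classification",
                     "metadata_classification", "author"):
            if not getattr(self, name):
                raise ValueError(f"missing required field: {name}")


record = ProvenanceRecord(
    source_url="https://example.org/models/architecture",  # placeholder URL
    source_classification="UNCLASSIFIED",
    metadata_classification="UNCLASSIFIED",
    timestamp=datetime(2007, 10, 3, tzinfo=timezone.utc),
    author="Example Publisher",
    transformation_name="export-to-spreadsheet",
)
print(record.source_url, record.transformation_name)
```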

Technical Requirements:

It is the producer's responsibility to describe itself by placing metadata into the exported target's DOM. This is not a "must" requirement. Data producers have the right to be anonymous.
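Here is one way a producer might place that metadata into an exported XML document, sketched with Python's standard `xml.etree.ElementTree`. The element names (`provenance`, `sourceUrl`, and so on) and the sample export are assumptions; nothing above fixes a particular schema.

```python
import xml.etree.ElementTree as ET


def attach_provenance(root: ET.Element, fields: dict[str, str]) -> ET.Element:
    """Append a <provenance> node (hypothetical name) holding one child per field."""
    node = ET.SubElement(root, "provenance")
    for key, value in fields.items():
        ET.SubElement(node, key).text = value
    return node


# A made-up export document and made-up field values.
export = ET.fromstring("<export><row id='1'>42</row></export>")
attach_provenance(export, {
    "sourceUrl": "https://example.org/models/architecture",
    "sourceClassification": "UNCLASSIFIED",
    "metadataClassification": "UNCLASSIFIED",
    "timestamp": "2007-10-03T12:00:00Z",
    "author": "Example Publisher",
})
print(ET.tostring(export, encoding="unicode"))
```

A producer that wants to stay anonymous simply never calls `attach_provenance`.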

Use Cases:

Cleared receivers of data can trace data back to its source by following a hyperlink to the source, or to metadata when the source is incapable of acting as an Internet asset. The link chain can be followed as long as the tracer has clearance to go to the previous node, i.e., to visit a URL. The burden for granting or denying access to URLs is already carried by existing systems (e.g., Web server permissions, CAC).
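The tracing use case can be sketched as a walk along the linked list that stops at the first node the tracer is not cleared to visit. In this toy version the chain lives in an in-memory dictionary and the permission check is a plain callback; both are stand-ins I made up. In practice each hop would be an HTTP request and the existing access controls would do the denying.

```python
from typing import Callable, Optional

# Hypothetical chain: each URL maps to the URL of its immediate source (None = origin).
SOURCE_OF: dict[str, Optional[str]] = {
    "https://example.org/report.xls": "https://example.org/export.xml",
    "https://example.org/export.xml": "https://example.org/model",
    "https://example.org/model": None,
}


def trace(start: str, can_visit: Callable[[str], bool]) -> list[str]:
    """Follow source links from `start` until the origin, a denied URL, or a cycle."""
    chain = [start]
    current = SOURCE_OF.get(start)
    while current is not None and can_visit(current) and current not in chain:
        chain.append(current)
        current = SOURCE_OF.get(current)
    return chain


# A fully cleared tracer reaches the original model.
full = trace("https://example.org/report.xls", lambda url: True)
# A tracer denied access to the model stops one hop earlier.
partial = trace("https://example.org/report.xls",
                lambda url: not url.endswith("/model"))
print(len(full), len(partial))
```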

Traceability is stopped by denying permission to get to the URL for the source.

Metadata is scrubbed by pruning its node(s) from the target's DOM.
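Scrubbing might look something like the following, again using Python's standard `xml.etree.ElementTree`. The `provenance` element name and the sample document are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def scrub_provenance(root: ET.Element, tag: str = "provenance") -> int:
    """Remove every child element named `tag` anywhere under `root`; return the count."""
    removed = 0
    # Snapshot the tree first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)
                removed += 1
    return removed


doc = ET.fromstring(
    "<export>"
    "<provenance><sourceUrl>https://example.org/model</sourceUrl></provenance>"
    "<row>42</row>"
    "</export>"
)
count = scrub_provenance(doc)
scrubbed = ET.tostring(doc, encoding="unicode")
print(count, scrubbed)
```

The data itself is untouched; only the link back to the source is gone.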