Wednesday, October 3, 2007

Thoughts on secure data publishing and transformation governance and best practices

I'm not a security guy, so maybe I'm out on a limb here. Douglas Crockford recently called the DOM a huge security problem. I'm looking at a huge traceability and lack-of-symmetry problem. The problem I usually run into in big enterprises is not with permission to trace data, but the complete inability to do so. When people can't trace data back to its sources, and when transformation processes are asymmetrical, it becomes exorbitantly difficult to make sense of information; verification and validation (V&V) go out the door.

Claim:

Data exports should carry metadata describing, at minimum, their immediate source and any transformation applied by that source when producing the data. This metadata forms a singly linked, directed list from data targets to their sources (e.g., from an Excel spreadsheet back to a System Architect model). The linked list can be followed only by persons and systems having clearance to do so.

Motivation:

Better traceability and symmetry in the data transformation and publishing process

Specification for metadata:

Required field: URL for source

Required field: Security Classification of source

Required field: Security Classification of metadata and traceability permission

Required field: Timestamp

Required field: Author, publisher name

Optional field: Name (aliases?) of source

Optional field: Name (aliases?) of transformation

Optional field: URL for transformation, when one was used (omitting it when it exists is bad practice)

Optional field: Description of source

Optional field: Description of transformation
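
To make the field list concrete, here is a minimal sketch of the metadata as a Python record. The class name, the attribute names, and the idea of rejecting a record whose required fields are empty are my assumptions for illustration; the spec above only names the fields.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    """One link in the chain: describes the immediate source of an export.

    All names here are hypothetical; only the field list comes from the spec.
    """
    # Required fields
    source_url: str
    source_classification: str
    metadata_classification: str   # also governs who may trace
    timestamp: datetime
    author_or_publisher: str
    # Optional fields
    source_name: Optional[str] = None
    transformation_name: Optional[str] = None
    transformation_url: Optional[str] = None
    source_description: Optional[str] = None
    transformation_description: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject records whose required fields are empty or None.
        required = {
            "source_url": self.source_url,
            "source_classification": self.source_classification,
            "metadata_classification": self.metadata_classification,
            "timestamp": self.timestamp,
            "author_or_publisher": self.author_or_publisher,
        }
        missing = [name for name, value in required.items() if not value]
        if missing:
            raise ValueError("missing required fields: " + ", ".join(missing))


# Example record for a spreadsheet exported from a model (made-up values).
record = ProvenanceRecord(
    source_url="https://example.org/models/architecture",
    source_classification="UNCLASSIFIED",
    metadata_classification="UNCLASSIFIED",
    timestamp=datetime(2007, 10, 3, tzinfo=timezone.utc),
    author_or_publisher="Example Publisher",
    transformation_name="model-to-spreadsheet export",
)
```

Since each record points only at its immediate source, a chain of such records gives the singly linked list described in the claim.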

Technical Requirements:

It is the producer's responsibility to describe itself by placing metadata into the exported target's DOM. This is not a "must" requirement. Data producers have the right to be anonymous.
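
As a rough illustration of a producer describing itself, the sketch below inserts a metadata node into an exported XML document using Python's standard xml.etree.ElementTree module. The element name "provenance", the child element names, and the sample document are assumptions of mine; nothing above fixes a concrete serialization.

```python
import xml.etree.ElementTree as ET


def embed_provenance(xml_text: str, fields: dict) -> str:
    """Return the document with a <provenance> node added as the first child.

    The <provenance> element and its children are a hypothetical layout.
    """
    root = ET.fromstring(xml_text)
    node = ET.Element("provenance")
    for name, value in fields.items():
        # One child element per metadata field, e.g. <source_url>...</source_url>
        ET.SubElement(node, name).text = value
    root.insert(0, node)
    return ET.tostring(root, encoding="unicode")


exported = "<workbook><sheet>data</sheet></workbook>"
tagged = embed_provenance(
    exported,
    {
        "source_url": "https://example.org/models/architecture",
        "source_classification": "UNCLASSIFIED",
    },
)
```

A producer that prefers to stay anonymous simply skips this step.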

Use Cases:

Cleared receivers of data can trace data back to its source by following a hyperlink to the source, or to metadata when the source is incapable of acting as an Internet asset. The link chain can be followed as long as the tracer has clearance to go to the previous node, i.e., to visit a URL. The burden of granting or denying access to URLs is already carried by existing systems (e.g., Web server permissions, CAC).
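
The walk back through the chain can be sketched without any network access. In the toy version below, a dictionary stands in for "fetch the metadata at this URL and read its source URL", and a callback stands in for whatever the Web server or CAC check would decide. The function name, the dictionary, and the example URLs are all hypothetical.

```python
from typing import Callable, Dict, List, Optional


def trace_sources(
    start_url: str,
    source_of: Dict[str, Optional[str]],
    can_visit: Callable[[str], bool],
) -> List[str]:
    """Follow source links from start_url until clearance runs out.

    source_of maps a URL to the URL of its immediate source (None at the origin).
    can_visit models the existing access control for a URL.
    """
    chain: List[str] = []
    seen = set()
    url: Optional[str] = start_url
    while url is not None and url not in seen:  # 'seen' guards against cycles
        if not can_visit(url):
            break  # tracing stops at the first node the tracer may not visit
        chain.append(url)
        seen.add(url)
        url = source_of.get(url)
    return chain


# Spreadsheet <- model <- warehouse, with the warehouse off limits to this tracer.
links = {
    "https://example.org/export.xls": "https://example.org/model",
    "https://example.org/model": "https://example.org/warehouse",
    "https://example.org/warehouse": None,
}
denied = {"https://example.org/warehouse"}

visited = trace_sources(
    "https://example.org/export.xls",
    links,
    lambda url: url not in denied,
)
# visited holds the export and the model, but not the warehouse
```

Note that the tracer needs no special permission model of its own; it just gets as far as the existing URL permissions allow.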

Traceability is stopped by denying permission to get to the URL for the source.

Metadata is scrubbed by pruning its node(s) from the target's DOM.
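
Scrubbing is the simplest case to sketch. Assuming the metadata lives in elements named "provenance" (a name I made up for illustration), pruning with the standard xml.etree.ElementTree module could look like this:

```python
import xml.etree.ElementTree as ET


def scrub_provenance(xml_text: str, tag: str = "provenance") -> str:
    """Return the document with every element named `tag` removed."""
    root = ET.fromstring(xml_text)
    # Snapshot the elements first so removal does not disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


tagged = (
    "<workbook>"
    "<provenance><source_url>https://example.org/model</source_url></provenance>"
    "<sheet>data</sheet>"
    "</workbook>"
)
scrubbed = scrub_provenance(tagged)
```

The data itself is untouched; only the link back to the source is gone, so the chain ends at this target.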
