Monday, 17 March 2014

Structuring Privacy Requirements pt 1

One of the toughest problems I'm having to solve, not just for my book on privacy engineering but in my daily job as well, is how to formulate a set of privacy requirements for the software engineers and the R&D teams.

Actually, it isn't that the privacy requirements themselves are hard: we have plenty at the policy level, and extrapolating these down into the more functional and security-related requirements at the implementation level can be done (OK, it is difficult, but there are harder things in this area).

What is hard is placing all of these in a structure that ties together the various classifications and aspects of information, data flow modelling, requirements and risk management; that has been interesting and, to say the least, fiendishly difficult. Getting this structure into a state that supports all of these, and getting the semantics of the kinds of thing it structures right, is critical to understanding how all of this privacy engineering works.

We assume that we understand the classification systems: for example, the almost traditional Secret-Confidential-Public style security classifications and the Location, Time, etc. classifications of information type, as well as the other aspects such as purpose, usage, provenance and identity. Each of these has its own set of classification elements, hierarchy and other ontological structure. For example, for information type:

Information Type Hierarchy
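As a rough sketch of how such a hierarchy might be represented in code (the type names here are illustrative placeholders of my own, not a fixed vocabulary), a simple parent map is enough to answer is-a questions:

```python
# A minimal sketch of an information-type hierarchy: each entry maps a
# classification element to its parent. All names are illustrative.
INFO_TYPE_PARENT = {
    "location": "information",
    "home address": "location",
    "gps coordinate": "location",
    "time": "information",
    "timestamp": "time",
    "identity": "information",
    "user id": "identity",
}

def is_a(child, ancestor):
    """True if `child` is the same as, or a descendant of, `ancestor`."""
    node = child
    while node is not None:
        if node == ancestor:
            return True
        node = INFO_TYPE_PARENT.get(node)
    return False

assert is_a("home address", "location")
assert not is_a("timestamp", "location")
```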
We also assume we understand data flow modelling, with its processes, flows, sources, sinks and logical partitionings. We can already see the link between the elements here (as shown below) and the classification systems above.
Example Data Flow with Annotations from Previously Described Ontologies/Classification Systems
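A minimal sketch of such an annotated data flow model, using only the element kinds just listed; the node names and annotation values are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A source, sink, process or store in the data flow model."""
    name: str
    kind: str          # "source", "sink", "process" or "store"
    partition: str     # logical partition, e.g. "device" or "backend"

@dataclass
class Flow:
    """A flow between two nodes, annotated with classification elements."""
    source: Node
    target: Node
    info_types: list = field(default_factory=list)  # e.g. ["home address"]
    security: str = "public"                        # e.g. "confidential", "secret"
    purpose: str = "unspecified"                    # e.g. "service provision"

app = Node("mobile app", "source", partition="device")
profile_store = Node("profile store", "store", partition="backend")

flow = Flow(app, profile_store,
            info_types=["home address"],
            security="confidential",
            purpose="service provision")
```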
Now the structure of our requirements needs to take into account the various elements from the classification systems, the aspect of the requirement we want to describe (more on this below) and the level of detail relevant to the stage in the software process. This gives us the structure below:



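One way to picture that structure in code is as a map keyed by the three coordinates: classification element, requirement aspect and level of detail. This is a hypothetical sketch only; the keys and requirement texts are placeholders of my own:

```python
# A sketch of the requirements structure as a map keyed by the coordinates
# (classification element, requirement aspect, level of detail).
# All keys and requirement texts are illustrative placeholders.
REQUIREMENTS = {
    ("location", "collection", "policy"):
        "Location data may only be collected with explicit user consent.",
    ("location", "local storage", "implementation"):
        "Location data held on the device must be encrypted at rest.",
    ("home address", "retention", "implementation"):
        "Home address records must be purged when the account is closed.",
    ("secret", "transport", "implementation"):
        "Secret-classified data must only move over mutually authenticated, "
        "encrypted channels.",
}
```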
So if we wish to find the requirements for, say, User's Home Address, Policy and Local Storage, then we take the point corresponding to those coordinates. If there happens to be nothing there, then we can use the classification systems' hierarchies to look for the requirement corresponding to a parent; a "user's home address" is-a "Location":

So, if we take the example data flow from earlier, then for each of the flows, stores and processes we can construct a set of requirements simply by reading off the corresponding requirements from the structure above.
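A minimal sketch of that read-off, including the fall-back up the is-a hierarchy described above; the tables are cut down to a couple of illustrative entries and all names are placeholders of my own:

```python
from typing import Optional

# Cut-down illustrative tables: a parent map for the is-a hierarchy and the
# three-coordinate requirements structure sketched earlier.
PARENT = {"home address": "location", "location": "information"}
REQUIREMENTS = {
    ("location", "local storage", "implementation"):
        "Location data held on the device must be encrypted at rest.",
    ("location", "transport", "implementation"):
        "Location data in transit must be protected by an encrypted channel.",
}

def lookup(element: str, aspect: str, detail: str) -> Optional[str]:
    """Read off a requirement, walking up the is-a hierarchy when the
    exact coordinates hold nothing."""
    node = element
    while node is not None:
        requirement = REQUIREMENTS.get((node, aspect, detail))
        if requirement is not None:
            return requirement
        node = PARENT.get(node)
    return None

# Reading off requirements for each element of the example data flow:
# "home address" has no entries of its own, so the ones inherited from
# its parent "location" apply.
data_flow_elements = [
    ("app -> profile store flow", "home address", "transport"),
    ("profile store",             "home address", "local storage"),
]
for name, info_type, aspect in data_flow_elements:
    print(name, ":", lookup(info_type, aspect, "implementation"))
```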

This leads to an interesting situation where it is possible to construct a set of requirements which is over-constraining. That is, we simply cannot build a system that can support everything; for example, one data point might trigger a secret classification and with it key management, encrypted content and so on.
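As an illustration (the requirement and capability names are invented for the example), an over-constrained set can be spotted by comparing what the requirements demand against what the target platform can actually provide:

```python
# Sketch: spotting an over-constrained requirement set by comparing the
# capabilities each requirement demands with what the platform offers.
# All names are illustrative.
demanded = {
    "secret classification": {"key management", "encrypted storage", "audited access"},
    "home address retention": {"scheduled deletion"},
}
platform_capabilities = {"encrypted storage", "scheduled deletion"}

unsatisfied = {
    requirement: needs - platform_capabilities
    for requirement, needs in demanded.items()
    if not needs <= platform_capabilities
}
print(unsatisfied)  # the capabilities we cannot provide, per requirement
```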

We then need to weaken the requirements such that a system can be constructed according to some economic(!) model. As we weaken, or retrench, our requirements we are introducing risk into the system.

Aside: Retrenchment - here's a good reference: Banach, Poppleton. Controlling Control Systems: An Application of Evolving Retrenchment. ZB 2002: Formal Specification and Development in Z and B, Lecture Notes in Computer Science, Volume 2272, 2002, pp. 42-61.

This gives us our link to risk management and to deciding whether each "weakness" we introduce is worth the risk. Because risk can be quantified, and we can perform further tasks such as failure mode and effects analysis, we obtain a rigorous method for predicting failures and making informed decisions about them. I'll describe more on this in later articles.
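For instance, each weakening could carry a conventional FMEA-style risk priority number (severity x occurrence x detectability, each scored 1-10); the sketch below is hypothetical, with invented requirements and scores, and is not intended as a prescription:

```python
from dataclasses import dataclass

@dataclass
class Weakening:
    """A retrenchment of a requirement, scored FMEA-style on 1-10 scales."""
    requirement: str
    weakened_to: str
    severity: int       # impact if the weakened control fails
    occurrence: int     # likelihood of that failure
    detectability: int  # 10 = very hard to detect

    @property
    def rpn(self) -> int:
        """Risk priority number = severity * occurrence * detectability."""
        return self.severity * self.occurrence * self.detectability

weakenings = [
    Weakening("encrypt home address at rest", "obfuscate only", 7, 4, 6),
    Weakening("purge location after 30 days", "purge after 90 days", 4, 3, 2),
]

# Rank the weakenings so the riskiest retrenchments are reviewed first.
for w in sorted(weakenings, key=lambda w: w.rpn, reverse=True):
    print(w.rpn, w.requirement, "->", w.weakened_to)
```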
