Monday, 29 June 2015

UK Ontology Network 2015 Presentation

I really have been a bit lax here of late - not that I have little to write about, but rather that time is short (as always!) and the day job has taken me away from privacy into new, uncharted, more security-research-related areas - which I will admit is great fun!

Back in April I attended a fascinating little workshop run by the UK Ontology Network to talk about ontologies for privacy. What makes this workshop fun is the sheer amount of interaction between the participants: the distinction between presenter and audience is completely blurred. As a presenter you get a 5-minute slot; then, after a number of presenters on similar subjects, there is a 20-minute panel session where the audience really gets into the conversation.

I'm going to be unfair to all the other presenters at UKON 2015 by picking out just one presentation: Adam Nogradi's on Sparqlycode - a tool for semantically annotating source code and establishing compliance against, in this case, certain security guarantees. Of course, to move to a different area all you need is an ontology for your subject area, say, privacy, and you have a tool for privacy compliance...more on this later :-)

So here's my presentation based on the work in the book Privacy Engineering: A Dataflow and Ontological Approach.




And the book, which contains a much more detailed description, can be bought from Amazon, Barnes & Noble, etc.

Amazon UK/EU

Saturday, 27 June 2015

An article on The Semantics of PII


A while back I wrote a short article for the IAPP's Privacy Tech Blog. With permission I'll reproduce it here for additional reference. Also, a tip of the hat to the administrator of the blog, Jedidiah Bracy of the IAPP, for his spell checking, grammar checking and editorial skills!


The Semantics of PII
Privacy Tech | Feb 26, 2015


Last year, Profs. Peter Swire and Annie Antón wrote a compelling piece in Privacy Perspectives about the need for privacy engineers and lawyers to get along. Establishing a common language in which to communicate will be essential to appropriately connect policy with technology.

It’s probably safe to say that the most common terms used in privacy are personally identifiable information (PII) and personal data, depending upon whether you come from a U.S. or European background. I think these terms are more or less self-explanatory.

But what do they really mean?

Take PII, for example. It means a chunk of data that reveals some knowledge about a person who can be unambiguously identified. Sounds more or less about right, doesn’t it? Is a computer's IP address personally identifiable? What if that IP address belongs to a router for a large, multinational corporation? Is it PII then? And what if it belongs to a family using multiple computers, tablets, phones or other devices?

We will soon find ourselves delving into the minutiae of meaning: the what-does-personal-really-mean type questions. Plus, we must ask what is information, and what does identifiable denote?

There is a whole area of linguistics, philosophy and mathematics—take your pick—that deals with the meaning of things, otherwise known as semantics, or even semiotics if you want the overall field.

Mathematicians took years to fully understand the semantics of even simple statements such as 1+1=2, which looks obvious until you try to explain what 1 is, what 2 is, what + means, what = means and then what it means to say 1+1. The English philosophers Bertrand Russell and Alfred North Whitehead spent most of their careers writing Principia Mathematica to answer this question, and after four editions and 300 pages of dense mathematics, they had an answer. That was, until a young Austrian by the name of Kurt Gödel came along and shook mathematics to its foundations with an equally "trivial" result.

So if it took 300 pages by two of the brightest minds in mathematics to give us a semantics for 1+1=2, how many pages—and years of work—will it take to give "PII" a semantics?

Now here's an interesting point: The definition of PII that is used in contemporary privacy is perfectly well defined in the privacy-legal context. I can go to various legal documents and read a formal definition of what PII or personal data means. But as we move between disciplines—in our case from privacy-legal to privacy-engineering disciplines—these definitions no longer hold, or at the very least, they don't work well.

If we move to the other end of the scale from legal to mathematics, we find concepts such as information entropy, which provides a clear, unambiguous and precise definition of what information is as well as the identifiability of a data set with respect to some population and so on. Information entropy, however, is not an easy concept with which to work. We can state now that the legal definition of PII can be defined in terms of the mathematical definition; it's just that this is obscenely difficult to do.
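To make the entropy idea a little more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the toy list of postcodes and the function names are invented for the example, and it measures the Shannon entropy of one attribute over a small population, which is a long way from a full treatment of identifiability.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def bits_to_identify(population_size):
    """Bits of information needed to single out one member of a population."""
    return math.log2(population_size)


# Hypothetical toy population: eight people described only by a postcode area.
postcodes = ["AB1", "AB1", "AB2", "AB2", "AB3", "AB3", "AB4", "AB4"]

revealed = shannon_entropy(postcodes)      # bits the postcode reveals on average
needed = bits_to_identify(len(postcodes))  # bits needed to pick out one person

print(f"postcode reveals {revealed} bits; {needed} bits are needed")
```

In this toy case the postcode alone falls short of what is needed to single anyone out, so it narrows the population down without identifying a person. Whether that makes it PII in the legal sense is exactly the translation problem.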

Somewhere between these two extremes lies software engineering, the discipline that actually implements privacy law into our systems, in ostensibly mathematical (programming language) terms.

Software engineers, much to the chagrin of privacy lawyers, do not understand legal terms. Well, ok, they do to a point, but you try coding a statement such as "reasonable privacy" into C++ or Java!

Plus, privacy lawyers don't understand all the subtle ramifications of virtual machines, machine language, object orientation, distributed computing, network protocols, XML, RDF—the list goes on!—again, much to the chagrin of software engineers.

Yet, as we stated earlier, there is a relationship between the terms and language that privacy lawyers use and the terms and language that software engineers use. That link provides the translation mechanism that allows both groups not just to talk but to properly communicate with each other.

We can spend as much time as we’d like writing manifestos and principles, designing processes, inventing new job titles such as privacy officer, privacy compliance tsar, grand chief-overseer-of-the-worshipful-court-of-privacy-dudes and so on, but without grounding semantics into terms such as PII and personal data—terms that will allow us to translate between legal-speak and engineer-speak—all of this work will be in vain.

Thursday, 25 June 2015

What is quality?

Everyone talks about quality, whether it be a quality customer experience, or quality product, or even a quality process...still not sure what that is...but ask the question of "what is quality?" and you won't get a good answer.

There are two good places to start:
  1. Robert Pirsig's Zen and the Art of Motorcycle Maintenance
  2. John Guaspari's "I know it when I see it"
Anyone working in "quality" should have not just read but completely internalised both books before uttering a further word on the subject.

Interestingly, quality doesn't necessarily mean expensive or the best - though being called the "best" occurs when you have quality.

For example, SAS actually have a "quality" long-haul economy product. This doesn't mean it is great, but they do their best to keep passengers (on an 11 hour flight) well fed and watered. Now just because I had a good experience and perceived their economy product to be of a high quality doesn't necessarily mean that they shouldn't improve on it. Trust me, there are LOTS of things SAS could do to improve on their economy product.

Airlines abound with examples of quality: for example, Finnair vs Norwegian - the former just feels like an expensive Ryanair but without the friendliness, while the latter gives you what you need plus free Wifi (slow Wifi, but Wifi still). To me, Norwegian provides a quality product. Another example is Lufthansa: while their long-haul economy offering is poor from a seating point of view, their food (SAS take note, I get beer and wine for free with my meal!) and superb crews, plus the added advantage of using the A380, mean I choose to fly Lufthansa when I can.

But today I came across something interesting while stopping at a cafe to buy ice-cream for my children. There is no doubt this particular cafe has good service, smiling and ample staff, and excellent ice-cream, though a bit pricey! But even with, by my count, 3 members of staff and a manager present, they couldn't actually keep the 10 or so tables clean. It really isn't pleasant to have to sit at a table with someone else's food still left there, or to have to clean it up yourself.

While everything about this cafe says "we provide a quality product" this is let down by a very simple action of not cleaning up.

This got me thinking, especially when referring back to Pirsig's observation that even though you might have the most expensive and best built motorcycle, it only takes a single screw's thread to strip, or its head to be ground away, to reduce that most expensive of machines to being totally unfit for purpose - in Pirsig's case, unrepairable because of a single flaw in the cheapest, most innocuous of components.

Funny how a simple act such as not clearing up a table can ruin a cafe's reputation; or, how a faulty IFE screen can ruin a flight. Sometimes it is just attitude that ruins everything...when a much respected company fails ever to ask anything of, or even to respect, a semi-regular customer, then the quality of the whole is reduced to nothing.