Wednesday, 3 December 2014

Category Theory and the Meaning of Life

I was warned many years ago, by more than one person, that dabbling in the dark arts of category theory only leads to, well, becoming a category theorist...

OK, I admit it, I've been playing with Spivak's Ologs for a while on an actual problem and I particularly like the insights, or at least structure it gives to certain problems. A long while back we even attempted to use CT on a definition for what MDA is.

Coming back to the present and category theory itself, I'm of the opinion that topology or at least topological thinking provides a very neat way of conceptualising and understanding many problems. At the moment my work is certainly deeply grounded in metric spaces and the like.

Given all this foundational work and the fact that CT is proposed by some to be the "true" foundation for mathematics. Take a look at John Baez's work on mathematics and biology/ecology for example.

So I wasn't too surprised when Amazon's suggestion engine made the same conclusion. This can not be a coincidence can it? I mean I buy books on science and mathematics but this particular juxtaposition of suggestions must have a deeper Amazon sentient I wonder?

Does this mean we have a functor between CT and spirituality?

Is '42' an initial or terminal object?

Or is the a deeper meaning in this suggestion:

Maybe that's it...."Category Theory?"...RUN AWAY. SAVE YOURSELVES POOR MAMMALS!

Note: This post contained humour!!!

Monday, 24 November 2014

CFP: TrustCom 2015

Helsinki, Finland, 20-22 August, 2015

With the rapid development and increasing complexity of computer systems and communication networks, user requirements for trust, security and privacy are becoming more and more demanding. Therefore, there is a grand challenge that traditional security technologies and measures may not meet user requirements in open, dynamic, heterogeneous, mobile, wireless, and distributed computing environments. As a result, we need to build systems and networks in which various applications allow users to enjoy more comprehensive services while preserving trust, security and privacy at the same time. As useful and innovative technologies, trusted computing and communications are attracting researchers with more and more attention.

The 14th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (IEEE TrustCom-15) will be held in Helsinki, Finland on 20-22 August 2015. The conference aims at bringing together researchers and practitioners in the world working on trusted computing and communications, with regard to trust, security, privacy, reliability, dependability, survivability, availability, and fault tolerance aspects of computer systems and networks, and providing a forum to present and discuss emerging ideas and trends in this highly challenging research field.

Accepted and presented papers will be included in the IEEE CPS Proceedings.
Distinguished papers presented at the conference, after further revision, will be recommended to high quality international journals.

Monday, 17 November 2014

A Definition of PII and Personal Data

There's been an interesting discussion on Twitter about the terms PII and "personal data", classification of information and metrics.

Personally I think the terms "PII" and "personal data" are too broadly applied. Their definitions are poor at best; when did you last see a formal definition of these terms? Indeed classifying a data set as PII only comes about from the types of data inside that data set and by measuring the amount of identifiability of that set.

There now exists two problems in that a classification system underneath that of PII isn't well established in normal terminology. Secondly metrics for information content are very much defined in terms of information entropy.

Providing these underlying classifications is critical to better comprehending the data that we are dealing with. For example, consider the following diagram:

Given any set of data, each field can be mapped into one or more of the seven broad categories on the left - If we wanted we could create much more sophisticated ontologies to express this. Within each of these we can specialise more and this is somewhat represented as we move horizontally across the diagram.

Avoiding information entropy as much as possible, we can and have derived some form of metric to at least assess the risk of data being held or processed. A high 'score' means high risk and a high degree of reidentification is possible, while a low score the opposite - though not necessarily meaning that there is no risk. Each of the categories could be further weighted such as using location is twice as risky as financial data.

There could be and are some interesting relationships between the categories, for example, identifiers such as machine addresses (IPs) can be mapped into personal identifiers and locations - depending upon the use case.

I'm not going to go into a full formalisation of the function to calculate this, but a simple function which takes in a data set's fields and produces a value, say in the range 0 to 5 to state the risk of the data set might suffice. A second function to map that value to a set of requirements to handle that risk is the needed.

What about PII?  Well, to really establish this we should go into the contents of the data and the context in which that data exists. Another, rather brutal way, is to draw a boundary line across the above diagram such that things on the right-hand-side are potentially PII and those on the left not. This might then become a useful weighting metric, that if anything appears to the right of this line then the whole data set gets tagged with being potentially PII. I guess you could also become quite clever in using this division line to normalise the risk scoring across the various information classifications.

In summary, we can therefore give the term PII (or personal data) a definition in terms of what a data set contains rather than using it as a catch-all classification. This allows us then to have a proper discussion about risk and requirements.


Ian Oliver. Privacy Engineering: A Data Flow and Ontological Approach. ISBN 978-1497569713

Tuesday, 11 November 2014

First International Workshop on Privacy Engineering (IWPE'15)

First International Workshop on Privacy Engineering

21 May 2015 - The Fairmont, San Jose, CA 

Deadline of paper submission:  23 January, 2015
Notification of acceptance:    16 February, 2015 
Accepted Paper camera ready:   3 March, 2015  

We are pleased to invite you to participate in the premier annual event of the International Workshop on Privacy Engineering (IWPE'15).

Ongoing news reports regarding global surveillance programs, massive personal data breaches in corporate databases, and notorious examples of personal tragedies due to privacy violations have intensified societal demands for privacy-friendly systems. In response, current legislative and standardization processes worldwide aim to strengthen individual’s privacy by introducing legal and organizational frameworks that personal data collectors and processors must follow.

However, in practice, these initiatives alone are not enough to guarantee that organizations and software developers will be able to identify and adopt appropriate privacy engineering techniques in their daily practices. Even if so, it is difficult to systematically evaluate whether the systems they develop using such techniques comply with legal frameworks, provide necessary technical assurances, and fulfill users’ privacy requirements. It is evident that research is needed in developing techniques that can aid the translation of legal and normative concepts, as well as user expectations into systems requirements. Furthermore, methods that can support organizations and engineers in developing (socio-)technical systems that address these requirements is of increasing value to respond to the existing societal challenges associated with privacy.

In this context, privacy engineering research is emerging as an important topic. Engineers are increasingly expected to build and maintain privacy-preserving and data-protection compliant systems in different ICT domains such as health, energy, transportation, social computing, law enforcement, public services; based on different infrastructures such as cloud, grid, or mobile computing and architectures. While there is a consensus on the benefits of an engineering approach to privacy, concrete proposals for processes, models, methodologies, techniques and tools that support engineers and organizations in this endeavor are few and in need of immediate attention.

To cover this gap, the topics of the International Workshop on Privacy Engineering (IWPE'15) focus on all the aspects surrounding privacy engineering, ranging from its theoretical foundations, engineering approaches, and support infrastructures, to its practical application in projects of different scale. Specifically, we are seeking the following kinds of papers: (1) technical solution papers that illustrate a novel formalism, method or other research finding with preliminary evaluation; (2) experience and practice papers that describe a case study, challenge or lessons learned from in a specific domain; (3) early evaluations of tools and other infrastructure that support engineering tasks in privacy requirements, design, implementation, testing, etc.; (4) interdisciplinary studies or critical reviews of existing privacy engineering concepts, methods and frameworks; or (5) vision papers that take a clear position informed by evidence based on a thorough literature review.

IWPE’15 welcomes papers that focus on novel solutions on the recent developments in the general area of privacy engineering. Topics of interests include, but are not limited to:

  • Integration of law and policy compliance into the development process
  • Privacy impact assessment
  • Privacy risk management models
  • Privacy breach recovery Methods
  • Technical standards, heuristics and best practices for privacy engineering
  • Privacy engineering in technical standards
  • Privacy requirements elicitation and analysis methods
  • User privacy and data protection requirements
  • Management of privacy requirements with other system requirements
  • Privacy requirements operationalization
  • Privacy engineering strategies and design patterns
  • Privacy architectures
  • Privacy engineering and databases
  • Privacy engineering in the context of interaction design and usability
  • Privacy testing and evaluation methods
  • Validation and verification of privacy requirements
  • Engineering Privacy Enhancing Technologies
  • Models and approaches for the verification of privacy properties
  • Tools supporting privacy engineering
  • Teaching and training privacy engineering
  • Adaptations of privacy engineering into specific software development processes
  • Pilots and real-world applications
  • Privacy engineering and accountability
  • Organizational, legal, political and economic aspects of privacy engineering

This topic list is not meant to be exhaustive; since IWPE'15 is interested in all aspects of privacy engineering. However, papers without a clear application to privacy engineering will be considered out of scope and may be rejected without full review.


We solicit unpublished short position papers (up to 4 pages) and long papers reporting technical, research or industry experience (up to 8 pages) on all dimensions of the privacy engineering domain. Each paper, written in English, must follow IEEE Proceedings format. Submission of a paper should be regarded as an undertaking that, should the paper be accepted, at least one of the authors will attend the workshop to present the paper. All papers must be submitted via EasyChair at

All IWPE'15 Papers will be published in IEEE eXplore, which is indexed by EI Engineering Index, ISI Conference Proceedings Citation Index (CPCI-S), Scopus etc.

If you have any questions regarding IWPE'15, please contact:

Jose M. del Alamo (
Norman Sadeh (
Seda Gurses (
Dawn Jutla (

Spam and Category Theory

A long time ago I came across an interesting article on the n-Category Cafe about a presentation by Fernando Zalamea on Sheaf Logic and Philosophical Synthesis.

For some reason over the past week this blog has been inundated with requests to the page I wrote on this - basically a reminder for myself to read up on this work.

I happy with a few tens of hits to this blog per day, but yesterday's 135 hits to that one page all came from the home of philosophy: France. So a merci to those who found a way to this blog and maybe onto Zalamea's original presentation - glad to be of assistance.

Part of me is delighted by the collective interest in this topic, while part of me suspects that this is spam.


Thursday, 30 October 2014

Finnish Literature

I got a request the other day for a or a few quintessential Finnish books, those that capture the Finnish psych or those that have stood the test of time and become the books of the nation. While you might consider the national epic The Kalevala, it would actually miss the point in that it needs to have author, style and content to qualify.

Actually qualifying the criteria is almost impossible so it is a lot easier to specify the books by example and the quality properties be extracted or inferred from those.

There are three books:
Translations in other languages are available.

Tuesday, 28 October 2014

Interview on KUCI 88.9FM

I've been interviewed by Mari Frank of the Privacy Piracy programme broadcast on Irvine (California) based radio station KUCI 88.9 FM.

Available via the KUCI website and on iTunes

Protect Your Privacy in the Information Age
Now on every MONDAY morning from 8:00 AM - 8:30 AM, Pacific Time
on 88.9 FM in Irvine
and WORLDWIDE live audio streaming at

I talk about privacy engineering, my book and some topics related to the adoption of engineering practices in privacy. The show will be broadcast on November the 3rd and please refer to KUCI's programme listings for more information.

I guess this is the nearest I'm going to get to Hollywood.

Monday, 13 October 2014


No one is really sure who said what and the air is full of denials and the Nuremberg Defense, but it seems like Finland is trying to suppress the word "wisky", or more specifically the Finnish "viski" just in case it corrupt, well, everyone....just think of the children...or maybe it is for my safety and security...but even if it saves just one person...

Yes, this is true:

News  |  
HS: Finnish officials ban the word "whisky" on private blog 
Organisers of a beer and whisky fair in Helsinki have been granted a license on condition that search engines do not link to the event’s website when users search for “whisky”, according to a report in Helsingin Sanomat. The officials have also asked the beer and whisky fair to remove the word “whisky” from their logo and the event’s official name.

Anyway Finnish taxes being put to work by our elected and unelected officials...good thing there's nothing else to worry about such as the economy, jobs, overly high taxation, food prices, energy prices, energy dependence, flying squirrels, child benefit reductions hirvikarpaset and a government that wants to avoid all hard decisions at all costs.

After all, banning words, moving kunta boundaries and pay rises of 6000 euros per month are far more important than the people of Finland.

Can anyone say Streisand Effect?

Whisky, Whiskey, Viski, Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski,Whisky, Whiskey, Viski ....

Thursday, 9 October 2014

Privacy Awareness Training

Awareness of the implications of information loss, data breach and privacy in general is well known, though interestingly rarely acted upon in general practice. Consider the situation I've just witnessed: someone talking loudly in a Skype conversation about some very personal details of his life, plus the odd snipped of financial information and a few names, in a coffee shop - and a relatively crowded coffee shop at that.

We actually spend a disproportionate amount of time worrying about technical solutions to privacy and yet while we are told every day about the dangers of using our technological goodies: phones, laptops, tablets etc, we seem almost oblivious about the data leakage we commit even without technology. Or maybe in the case of a video call in a crowded coffee shop, with the help of technology.

We probably panic more about sharing pictures on Facebook than exposing all our personal details in public in the manner above.

I'm also reminded of any case that happened in the UK. A man - a parent of a child at the school - was prevented taking his DSLR camera to a school sports day on grounds that he might accidentally take pictures of other children with the camera. As we all know, a big expensive camera takes good pictures. Ironically every other parent took a mobile phone - most with just as good picture resolution as the DSLR and most likely all capable of upload to various social media sites along with the all precious meta-data: location, time stamps etc. Just to add irony here he probably had a telephoto lens so that he could take a picture of just his might like to compare the capabilities of a telephoto lens with that of a camera phone. Further irony comes from the fact that how many picture were uploaded with the childrens' details to social media during and after the event without the permission of all present.

Let's us not forget the fact that an internet connected (or should I say radio network connected) mobile device is already exposing much more information than a non-network connected DSLR camera every will or can. I note in Canon's latest models this however is changing...

Buy hey, let's not let common sense and knowledge get in the way of blind panic and misunderstandings.

Let's for a moment concentrate on opposite end of the privacy spectrum, that of the software engineer or programmer trying to construct a system that processes data. For the most part these engineers receive very little in the way of specific training on algorithms, techniques etc for privacy and information processing in general.

When did you last educate your programmers and engineers on the latest data processing or security techniques?

So this brings me to the state of privacy awareness training. Most companies now mandate this for their employees and mandated training normally has a very high view rate. What is less understood is the amount of understanding gained or relevance from this training. In fact privacy awareness training is rapidly becomming the new sexual harrasment training: watch this 1 hour video and reverse 100s of years of cultural indoctrination and be reborn into a new egalitarian!

And this is one of the problems of privacy awareness training, that a short, generic introduction to the dangers of information loss and privacy magically solves everything. I am sure that many of the companies that have suffered data breaches of late have such training in place. Even the NSA surely has such training, though the outcome after this education is now well known.

One could enter into a huge sociological and cultural discussion about this, but one thing has always struck me about privacy awareness training:

It caters for the lowest common denominator

Awareness training invariably tells me of the dangers of information breaches, maybe some interesting anecdotes about Target, AOL, NSA etc, the dangers of the internet etc.

It never tells me about programming techniques, system design techniques, practical methods of protecting my email, social media use, the differences between DSLRs and mobile phones etc.

In a nutshell, privacy awareness training is rarely, if ever, relevant to the audience. By making privacy awareness training so generic it actually never properly educates the audience about privacy and information security.

Constructing training that properly addresses each of its target audiences is hard and takes time - that is not to be denied - but we can not continue with generic, information content-less material that while is tells about privacy, it does not educate.

Anyway, the man opposite me is continuing with a call to his therapist/lover...he's been through rough time recently it seems, and it is good to talk...about your religious beliefs, former marriage, your views on men/women/relationships, your friends, your financial situation etc...

Tuesday, 30 September 2014

Wither the Privacy Hero...

The Hero in computing is a well known phenomena - think of the lone programmer, sysadmin or hacker for example. However the hero also occurs in all domains including privacy. The privacy hero is the one who has hand crafted the privacy policy, set down in stone a lengthly list of privacy requirements and compliance activies without consulting the engineers and users who have to implement and use these.

As many disciplines, especially that of medicine, have discovered, the hero is the most dangerous person there is. By working against the odds, he (or she) usually creates a victory where all is solved [1]. Be it uring a patient or creating the rules by which the company is saved from an inglorious admission of a data breach. Even if there is a breach or the patient later dies, it can't be the hero's fault, but the others such as the failed care of the nurses or the engineers who never listened. In reality the nurses and the engineers are usually patching the damages caused by the hero.

Our current cultural setups in privacy, and especially now we're starting to get engineers actively involved in the privacy debate, needs to change from the Privacy Heros to a much tightly integrated team of experts.

In [2], Atul Gawande clearly states that the nurses, technicians and other personel "work for" the hero doctor. In privacy we still have the same attitude, software engineers "work for" the Privacy Officers.

We suffer from a huge lack of teamwork - the privacy hero's word is the Truth and that's it. Within the current culture of privacy, the engineers who are battling to implement or even comprehend privacy requirements written and explained at a completely different level of asbtraction than is necessary do not play any major part in those requirements.

Consider the defintions of personal data or PII for example, have these even been properly grounded in the undelying mathematical theories of what information is; or even for that matter in terms that can be properly understood by software engineers in their domain. Even within the legal domain, these terms have been defined in such a way that they are underspecified and open to legal interpretation.

In order to move from a highly ineffective privacy priesthood to a true, all encompasing and all relevant discipline based on a mutually supporting combination of legal, scientific and engineering principles we must change our culture from that of the Hero to that of the Team.


[1] Suzanne Gordon, Patrick Mendenhall and Bonnie Blair O'Connor. Beyond the Checklist
[2] Atul Gawande, Better
[3] Ian Oliver. Privacy Engineering: A Dataflow and Ontological Approach.

Wednesday, 24 September 2014

Finland's Privacy Niche

From last month, but now it is safe to post this more publicly: an article I was interviewed for by the Helsinki Times:

Finland finds niche in privacy and information security
21 Aug 2014
By David Cord
"FACEBOOK can send SMS messages on your phone, record your photos and track your location. The National Security Agency is reading your emails. Cyber criminals are trying to snatch your credit card details. In an era where we are rapidly losing all privacy online there is a demand by consumers to protect and control their personal information. As this demand grows, Finland has an opportunity to become the global leader in privacy and information security..."

Thursday, 18 September 2014

Scotland, Independence, Wales, the UK etc...

Well, what ever happens today in Scotland, I think that Salmond has won: the concessions that have been offered Scotland by the leaders of the main parties in the UK are generous to say the least. In effect a no vote will trigger practically everything but independence and full financial autonomy, while a yes vote will lead to much protracted negotiations and to be honest I'm not sure either side in this has a good idea of how these will turn out.

The questions of economy, currency union and EU membership are academic to a point. Firstly Scotland's economy will probably do quite well, though the independence vote is not specifically about the economy but rather the right of a people to decide their own future - democratically. Currency union might well become a moot point - if Scotland enters a currency union with the UK pound then there isn't really much the UK could do about it; there are alternatives such as pegging a Scottish currency against the EU in much the same way as Montenegro has done. Even if Scotland's currency did devalue then this could be a good thing for inward investment.

EU membership is interesting, especially as Spain and quite probably Belgium has vested interests in quelling their own internal independence movements - even then the pressure from their local voters and even smaller EU nations which might feel threatened by over dominance by larger members might sway this. EU membership also solves the currency issue.

Border controls and the like? Well Scotland couldn't become a Schengen country unless the UK joined too and I strongly doubt that we'd see passport checks at any time in the future other than in the more fanciful predictions.

My main issue however is - regardless of yes or no - is what happens to Wales and to a smaller extend to Northern Ireland and England. Certainly some Welsh politicians seem blissfully unaware or even naive of the implications. Even the Conservative Party in Wales has come out much more in favour of devolution of powers to Wales than Labour.

Politicians always has vested interests anyway: the Conservatives know that their only chance of any power in Wales is through the Senedd while Labour can always sit back and count on votes from their heartlands both in Wales and England. If Scotland leave, or even in Westminster is reorganised to solve the West Lothian Question this might change as Welsh MPs would have little influence even if they did vote on English specific legislations. There's an interesting discussion about this on the TrueWales web site in an article written by Rachel Banner (24 Jan 2012): The West Lothian Question.

Ironically it has been Welsh MPs who have probably made some of the most important decisions regarding England the UK: Nye Bevan and the NHS, Lloyd George and Home Rule (which unfortunately didn't come to pass due to World War 1) and even back to the Welsh advisors to Queen Elizabeth the First who promoted the idea of naval supremacy.

If we come back to what has been offered Scotland now and since the 1998 devolution votes one must seriously ask the question of why Scotland gets so much at the expense of the rest of the nations of the UK. I'm still unsure why every change to devolution in Wales requires a referendum such as that back in 2011 which granted Wales the power to make laws specific to the needs of the nation. Are politicians so weak and afraid of their decisions? Surely the populace voted those MPs into power to make such decisions for the good of the people and so they should take the responsibility themselves rather than pass it off to often under informed voters?

So on Friday, will Wales be offered DevoMax? I doubt it - no politician is that brave.

Wednesday, 20 August 2014

Design by Contract and Security

I've worked with design-by-contract for a long while; in fact its use as an essential programming concept was brought to my attention by one of my professors who didn't just teach it but demanded it as a fundamental part of any design or program. It remains today an integral technique in my toolbox of programming and design techniques. As an aside, Eiffel supports DbC as part of its programming model; it makes it one of the most elegant and easy languages not just to program in but to develop in.

DbC of course is related to the validation that one should make on web page forms and various concepts in strong typing of functions etc. Generalising DbC into other areas such as security, privacy should be an obvious step, but one seemingly not taken.

Via Schneier's brilliant security blog is a link to an article entitled Security As A Class Of Interface Guarantee. It is a long read but one more than worthwhile for software developers certainly; and maybe for any security and privacy engineers too. Actually if there's one article on security and development you read today it must be this one! Here's small sample:
Security Is Part Of Every Interface 
I prefer to think of security as a class of interface guarantee. In particular, security guarantees are a kind of correctness guarantee. At every interface of every kind — user interface, programming language syntax and semantics, in-process APIs, kernel APIs, RPC and network protocols, ceremonies — explicit and implicit design guarantees(promises, contracts) are in place, and determine the degree of “security” (however defined) the system can possibly achieve. 
Design guarantees might or might not actually hold in the implementation — software tends to have bugs, after all. Callers and callees can sometimes (but not always) defend themselves against untrustworthy callees and callers (respectively) in various ways that depend on the circumstances and on the nature of caller and callee. In this sense an interface is an attack surface — but properly constructed, it can also be a defense surface.

At the end of the article is a suprise link to a tweet :-)

Obvious isn't it...?!

Friday, 8 August 2014

Loyalty Cards

Tangentially related to privacy which I'll discuss later regarding information asymmetry, but first I wanted to comment on this article that appeared on LinkedIn: Business Travelers Are Saying "Buh-bye" to Loyalty Programs - Here's Why by Christopher Elliott, a "Reader advocate" for National Geographic Traveler

"After years of putting up with blackout dates, broken promises and bait-and-switch games, American travellers — particularly air travellers — are saying “Enough!”
They’re refusing to play the loyalty-program game, jettisoning blind brand allegiance in favor of a more pragmatic view of travel. Price and convenience are trumping mindless devotion to an airline, a car rental company or a hotel."

This applies equally to European travellers to I suspect. But let's look at the price-quality-customer service dimensions. I did as the article states a few years ago and switched allegiance from one to another, despite accumulating a sizeable amount of points and occasionally using them to upgrade it actually became too difficult to actually use those points.

I remember a few cases clearly. The first was trying to enter a business lounge at Helsinki Airport - according to the "loyalty" scheme terms and conditions I had enough flights to qualify for an upgrade to the next tier, thus allowing me access to the business lounge. Except that there was a mistranslation between the English translation and the "correct" terms and conditions. The airline customer service basically stated it wasn't their fault. Given that the lounge at the time was empty, but staffed by no less than three "customer representatives" make it all a little too surreal.

The second was at Heathrow when lounge entry was not just dependent on the card but on the airline that issued to card, despite both airlines being part of the same alliance. It worked like this: if you gained the second tier on airline A you could enter business lounges run by the alliance that A was a member of, except, if the lounges were at an airport where airline B, also of the same alliance, was the major carrier.

The third was when trying to book a flight with air miles. For my selected flight I had enough miles and successfully made the booking until it came to select my seats, where upon the whole process failed and I was unceremoniously ditched out of the booking process with a terse error about the flight no longer being available to points holders. It turns out this was an artefact of the internal IT systems and actually you had to call the customers service department to book the flights - after a long hold where you were constantly told that all of the customer service representatives were busy, but you really are valued by the company you're trying to spend you money with.

After a few flights across the Atlantic I switched to Lufthansa - Frankfurt and Munich are clean, efficient and as pleasant as airports go. Lufthansa may not have good economy seating on long haul but the crews are some of the most professional (and smiling!) I've had the pleasure to be on an aircraft with.

I actually have no idea actually how to redeem the points I've collected with Lufthansa and SAS - they'll probably expire or have expired without my knowledge without me ever receiving any bonus from my loyalty as a customer.

A few years ago a colleague of mine called a major Nordic airline in response to a request for customer feedback. It turns out that they couldn't care why he stopped flying with them. Surely with all the data analytics being applied to our data collected, the airline never stopped for a moment to question why customers suddenly stopped flying with them. 

Now due to primarily economic reasons, I fly with whoever gives me the best deal. Given that the terms and conditions are pretty much the same regardless whether you fly a national carrier or a "cheap" airline, price will always win. 

Given that many established, major airlines decided to compete with the cheap carriers by matching their standards and terms, it is no wonder that the loyalty of the customer to the airline has all but disappeared?

Then, we come to credit cards and store cards...will these go the same way as air miles as customers realise that the deal they get is barely worth the loyalty?

But as a final statement, given that this is all about customer loyalty - has anyone ever been contacted by an airline, shop or other about why they stopped using those products offered?


Thursday, 7 August 2014

Privacy and Governance

Actually this is a presentation by Prof. Alastair Scotland of the UK's NHS on governance and, ultimately, patient safety. If you work in governance, privacy, software engineering and any management discipline related with these then this is one video you must watch today.

If it took the NHS 10 years to get this far, then in areas such as privacy which share many of the same characteristics of being a flawed system (in the general sense) with many safety-critical features, we've a huge amount of work to do outside of the mainly philosophical and legal debates about what privacy is.

Alastair Scotland.mp4 from Guy Murray on Vimeo.

Tuesday, 5 August 2014

On Finding Reasonable Measures To Bridge the Gap Between Privacy Engineers and Lawyers

This article originally appeared on the Privacy Association website on 29th July 2014.

On Finding Reasonable Measures To Bridge the Gap Between Privacy Engineers and Lawyers

Getting privacy lawyers and software engineers to work together to implement privacy is a perennial problem. Peter Swire, CIPP/US, and Annie Anton's article "Engineers and Lawyers in Privacy Protection: Can We All Just Get Along?" has explored this as has The Privacy Engineer's Manifesto, of which their article is part.

But here's the problem: We cannot keep addressing privacy from a top-down, legally driven perspective. No amount of additional processes and compliance checks is going to change the fact that software itself is so complex. Software engineering is often assumed to be the final stage and by some a mere consequence of many requirements from a large number of often conflicting sources.

Often privacy issues are “solved” by hashing an identifier, encrypting a communications link or anonymizing a data set, for some definition of anonymization. Many of these are piecemeal, ineffective, Band-Aid type solutions—the proverbial rearranging of deck chairs on the Titanic. Privacy Theater at its worst … or best.

So how do we address this?

First, maybe we should realize that we don't really understand each other’s disciplines. Very few lawyers are trained software engineers and vice-versa; therefore, constructing a lingua franca between these groups is part of that first step.

Often, understanding ideas that are simple in one domain do not translate to the other. For example, the use of the term “reasonable” means nothing in the software engineering domain. Plus, the simple act of “encryption” to a software engineer hides an enormous complexity of network protocols, key management, performance, encryption libraries and so on. Similarly, the now-ubiquitous use of “app” by lawyers to mean something that runs on your phone means a lot more to the software engineer.

What does “app” really mean to software engineers?

That little piece of code that downloads your news feed each morning and presents it in a friendly way—allowing you the scroll through using your touch screen device and maybe now and again presenting an advert because you didn't want to buy the paid version—is, in fact, not a tiny bit of code at all.

Effectively, what runs on your device is a multi-layered, complex interaction of hundreds of components, each sharing the device's memory, long-term storage, network components, display, keyboard, microphone, camera, speaker, GPS and so on. Each of those individual components interacts with layers underneath the interface, passing messages between components, scheduling when pieces of code must run and which piece of code gets the notification that you've just touched the screen as well as how and where data is stored.

When the app needs to get the next news item “from the web,” we have a huge interaction of network protocols that marshal the contents of the news feed, check its consistency, perform error-correction, marshal the individual segments of message from the network, manage addressing messages and internal format. There are protocols underneath this that decide on the routing between networks and those that control the electrical pulses over wires or antennae. This itself is repeated many, many times as messages are passed between routing points all over the Internet and includes cell towers, home wireless routers, undersea cables and satellite connections. Even the use of privacy-preserving technologies—encryption, for example—can be rendered virtually meaningless because underneath lies a veritable treasure trove of metadata. And that is just the networking component!

Let's for the moment look at the application's development itself, probably coded in some language such as Java, which itself is a clever beast that theoretically runs anywhere because it runs on top of a virtual machine and can be ported to almost any computing platform. The language contains many features to make a programmer's life easier by abstracting away complex details of the underlying mechanisms of how computers work. These are brought together through a vast library of publicly available libraries for a plethora of tasks—from generating random numbers to big data analytics—in only a 'few' lines of well-chosen code.

Even without diving deep into the code and components of which it’s made, similar complexity awaits in understanding where our content flows. Maybe your smartphone’s news app gets data from a news server and gives you the opportunity to share it through your favorite social network. Where does your data flow after these points? To advertisers, marketers, system administrators and so on? Through what components? And what kinds of processing? How about cross-referencing? Where is that data logged and who has access?

Constructing a full map or data flow of a typical software system—even a simple mobile app—becomes a spider web of interacting information flows, further complicated by layering over the logical and physical architectures, the geographical distribution and concepts such as controller and processor. Not to mention the actual content of the data, its provenance, the details of its collection, the risks, the links with policies and so on. And yet we still have not considered the security, performance and other functional and nonfunctional aspects!

Software engineers have enough of a problem managing this complexity without having to learn and comprehend legal language, too. Indeed, privacy won't get a foothold in software engineering until a path from “reasonable” can be traced through the myriad requirements to those millions of lines of code that actually implement “reasonable.”

Engineers love a challenge, but often when solutions are laid out before them in terms of policy and vague concepts—concepts which to us might be perfectly reasonable (there's that word again!)—then those engineers are just going to ignore or, worse, mis-implement those requirements. Not out of any malicious intent but because those requirements are almost meaningless in their domain. Even concepts such as data minimization can lose much if not all of their meaning as we move through the layers of code and interaction.

Life as a software engineer is hard: juggling a complex framework of requirements, ideas, systems, components and so on without interpreting what a privacy policy actually means across and inside the Internet-wide environment in which we work.

Software isn't a black box protected by a privacy policy and a suite of magic privacy-enhancing technologies but a veritable Pandora's Box of who-knows-what of “things.”

As privacy professionals, we should open that box often to fully comprehend what is really going on when we say “reasonable.” As software engineers, we'll more than make you welcome in our world, and we'd probably relish the chance to explore yours. But until we both get visibility in our respective domains and make the effort to understand each other's languages and how these relate to each other, this isn't going to happen—at least not in any “reasonable” way

Sunday, 3 August 2014

Messenger 10 year Anniversary

One of the first posts I made on this blog was about the Messenger probe to Mercury - at the time it had just made the 3rd fly-by before orbit insertion about 18 months later. On 1st August it celebrated 10 years since its launch.

So a large beer to all involved!

So now we have Messenger at Mercury, Venus Express performing dangerous aerobraking manoeuvres in Venus' atmosphere, a fleet robots trundling over Mars - not forgetting a small fleet of satellites around the planet, Juno on its way to Jupiter and venerable Cassini still going strong around Saturn.

Then there are the three that I'm most anticipating:

and then in just a few days, Rosetta finally arrives at 67P/Churyumov-Gerasimenk after 10 years. At this moment she's 3 days away with about 500 km to go! She's caring a lander called Philae which'll deeply later this year. ESA have a good track record of this kind of landings in exotic places with Huygens.

Can't wait!

In the meantime:

Wednesday, 30 July 2014

The Story of the Privacy Engineering Book

It was never going to be the next Harry Potter novel in neither content nor sales; thought I am open to offers from major film studios on buying the rights to my book in which case it isn't about data flow modelling and ontologies but the story of how Alice and Bob tried to keep their relationship private from Eve, Dave, Frank and the rest of the security character gang.

When writing a book, people often think that you must be an expert or genius to start. In fact this is quite the opposite, by writing a book you actually release that you are not an expert or genius in that area but become one (maybe) through making your thoughts and ideas explicit through the medium of print. Actually I think at the end you realise that being an expert is quite something else altogether.

The adage of "if you want to understand something then you should teach it" is what applies here. Following in the footsteps of Richard Feynman isn't too bad an idea regarding teaching.

The point of starting a technical book like Privacy Engineering was more to the conceptualise and concretise my thoughts and ideas on how I'd like systems to be modelled, analysed and understood from the privacy perspective.

In many ways writing the book was very much like writing my PhD thesis involving research just to understand the area, a thesis or hypothesis of how things work followed by the book keeping work of writing it down and documenting the sources of ideas and wisdom though copious references. Interestingly it took about the same amount of time from start to finish, approximately four years. I probably could have made it much quicker if it wasn't for the day job actually trying to analyse real systems but without that experience it would have been a dry, theoretical text without practical underpinning.

What surprised me, and maybe this comes from the training one received making a PhD, is that how "easy" it was to carve a niche in which I could be an expert (I'll come to what I mean by expert in a minute). It isn't that everything in the book is new but rather that the overall structure and application is "new". The Illustrated guide to a PhD really explains it best

Your Contribution to Knowledge
(from The Illustrated Guide to a PhD)
What was particularly exciting was bringing together the ideas from the following areas
  • Software Engineering - modelling, analysis, coding, data-flow modelling, requirements analysis
  • Ontologies, Taxonomies, Semantics
  • Law (Privacy)
  • Law (Philosophical)
  • The Philosophy of Privacy
  • Safety Critical Systems Design
  • Aviation, Surgery, Anaesthesia, Chemical Plant Design
  • etc
Maybe some of those areas surprise you? For example, what the $%^* has surgery got to do with privacy engineering? Many years ago we used to have a formal procedure for approving software project through various phases - concept, architecture, design, release. My job was to analyse and give approval for privacy from a software engineering perspective. Up until that point there were no software engineers in privacy, and an question of "do you have a privacy policy?" didn't really tell us anything. So procedures were put in place in the form of a set of questions - a checklist of things that must be done to move to the next stage. I got the idea from aviation!

It worked to a point, except that it was too rigid and didn't really fit into the more "agile" ways of working (agile, ad hoc, hacked...). After this is became a quest to find something that did fit, that did work in an environment where only when dissecting a piece of software you actually saw what was inside.

It was one of those serendipitous moments while reading some books on aircraft safety that I finally read Atul Gawande's book The Checklist Manifesto which lead me to Peter Pronovost's work and how CULTURE was the driver behind the workings of a safety oriented process. From this point onwards it was obvious (to me at least - with caveats) how we should approach privacy in software engineering: as a safety critical aspect!

Many, many experts have already discovered this - in aviation, surgery, anaesthesia, chemical plant design and so on. So, obviously I can't be an expert because *I* didn't know about this! Anyway, you can read about some this here:

There were closer at hand experts too other than those famous names appearing on the covers of books and papers. My colleagues at Nokia and Here explicitly and implicitly influenced my ideas. One thing was painfully obvious was the lack of common terminology, not just inside privacy but when working between domains such as software engineer and law. Construction of a lingua franca was our main priority and much of this was influenced with the ontological work made earlier in NRC's Semantic Web Infrastructure project M3 and work with a certain Ora Lassila of RDF and Semantic Web fame.

Even more interestingly was that our ontologies turned out to be "implementable" in the sense that we could construct reasoning systems and tie these with our analytics systems running HADOOP etc, to perform "in-line analysis" of what was really passing through these systems. Furthermore we had started to workout how to calculate privacy policies - those legal texts that appear on the start-up of applications and you have to click on OK or Accept to continue. We never quite got around to integrating this with established policy languages, but the main thing was we now knew how all this fitted together.

For a long time however, and this drove much of the terminological work above, was my worry - shared by others - that privacy was just a bunch of people talking in different languages albeit with the same, or similar words. Worse was that everyone felt that their terminology was right and it was so obvious that everyone understood what they meant regardless of legal, engineering or other background. I wrote an article back in January 2013 entitled "On The Naivety of Privacy" that expressed my feeling on this and stated that we really didn't have a formal foundation to privacy. The replies to that article were surprising in that many people wrote to me, mainly privately, that they felt the same way. I had supporters, albeit a silent group that seemed to fear some kind of privacy orthodoxy. Either way, the path I needed to take in order to understand my role was clear(er).

So, as a summary so far, the only way to be an expert is to surround yourself with experts in lots of different areas. But, they have to be willing participants and willing to share their knowledge. At this point an overall structure was coming into focus and the initial plans on how to engineer privacy were coalescing nicely. Documenting this was an interesting struggle and a number of presentations were made on exploring ideas in this area to get feedback and understand what people really needed, or thought they needed. It turned out that some of the areas I was looking at formed great analogies:

The trouble with ideas such as these is that you can get side tracked easily, though such side tracking is essential to the though process. Being challenged on simple questions such as why the terminology was structured so, what that particular hierarchy of concepts, why that particular naming etc and being presented with links to other areas however is critical to obtaining some kind of focus to the work you are embarking upon.

It was April 2012 that I travelled to Tallin, Estonia to talk about obscure topics like category theory, topology and homotopy type theory with a colleague from the University of Brighton. On the ferry crossing from Helsinki to Tallin, accompanied with copious amounts of coffee (I'm with Alfred Renyi - misattributed to Paul Erdos - on this one!) I wrote the first full draft of all of the ontologies or taxonomies and their structuring that we needed. After this and a further meeting with Brighton's VMG group later that year the thesis was however set.

It is critical to state that during this time I wasn't working in some theoretical vacuum. The ideas, concepts, terms, modelling etc were being applied, albeit somewhat silently - subterfuge was the game. Formal methods were outlawed, agile was the true way...said by those who understand neither. It appeared that everything worked in both the software engineering and legal domains; with the caveat that it wasn't introduced as a whole but rather as bits of tooling and technique to be used appropriately when necessary.

At this point the book started in earnest, though the actual writing didn't start until late 2013 and the initial few chapters were collected together from earlier technical reports and presentations. Much, if not all, of the writing in those initial texts were rewritten many, many times. Ernest Hemmingway was telling the truth when we said, "The first draft of anything is shit."

Apart from the practical battles: 
  • I chose the Tufte-LaTeX style in the end because it looked nice and they way it dealt with references and notes forced a certain way of writing.
  • Sublime and vi as text editors
  • Microsoft Visio as the drawing tool - I really wish they'd release a Linux, Mac and/or Cloud version.
  • Version Control .... erm, a certain cloudified, file store...
things generally went well. Initial drafts complete with unique spelling and grammatical errors were well received by those who reviewed them. I even ran a short series of lectures with colleagues to go through the chapters. I joked that these lectures were very much in the style of Andrew Wiles' secret lectures to a trusted colleague explaining the proof of Fermat's Last Theorem.

By February 2014 the overall structure and plan was in place, albeit a little nebulous and fluid in places - the structure of sections and chapters was changing but not the content. Then I started on the requirements chapter and there it stopped. Nothing seemed to work, the formal structure of requirements was wrong, I couldn't get the mapping between the terminologies, data flows and requirements to work at all. And there I got stuck for two months. I knew what the requirements need to look like but nothing fitted together...nada, zilch...."f***" was the word of the day, week, month. With the possibility of missing my self inflicted deadline of May was it even worth continuing? Luckily I persevered.

In another moment similar to that of Wiles, there came the nightmare of another book on the same subject with the same title being published. "F***" times 2. I bought this damned book entitled The Privacy Engineers Manifesto and started to read it, hoping and praying that they didn't cover the same material. This is actually where it got interesting, PEM didn't cover my material but rather provided a hook between the ubiquitous Privacy-by-Design principles and software engineering. It actually laid out a path that linked the two. This wasn't a rival but rather a symbiotic, co-conspirator in the development of the discipline of privacy engineering. With some hope I pushed the deadline forward to June and attempted to restart the work. 

It was actually back to pen and paper for drawing figures for a while as Microsoft just purchased Nokia's device's division and IT upgraded laptops, which meant a "many week" wait while Microsoft Visio was upgraded. During a latte fuelled moment there came the revelation on how these damned requirements and risk classifications would all link together:

3 Dimensions of Requirements
Simple eh?  Well, not perfect but it did provide a high-level structure in which things did fit and did work. Hunting through academic papers on the subject gave some kind of impetus to the work and writing started afresh and at great pace. May and June were spent in-between work and family finalising the draft. The deadline slipped again - oh the joys of self-publishing and having no editor nor real deadline.

July was the sprint finish mainly rewriting paragraphs, spell checking and actually removing a chapter of examples and patterns as the text now contained these - due to in no small part to the secret lecture series which turned the book from academic text into something more practical. In mid-July it was finished with only the proofing to go and on the 17th of July it finally went on sale.

Somewhat of an anticlimactic moment it seemed, but that was it. Whether the book is perfect or not and whether ideas have changed or become refined in the meantime was now irrelevant, it was now public and another contribution to knowledge existed. After this was many days of thinking of "what the hell do I do now?"

A colleague once explained that writing a book is like pregnancy: there are three trimesters followed by, well, the trimesters are: excitement, boredom and panic - the latter as in, it has to come out. What follows after all this gore, mess and pain is the desire to write a new book. 

So, am I an expert in this now? Well, yes in the sense that there aren't too many privacy engineers around but this belittles the term expert and gives it the wrong meaning. I now think that an expert is someone who understands what they don't know. There are huge areas that I want to know more about: human factors (cf: James Reason's Swiss Cheese Model) in privacy, formal underpinnings of privacy - yes, I love to write on a category theory foundation of this. There are experts around in requirements management, risk management etc that I'd love to talk to in bringing these areas into a much closer relationship with the structures we see in privacy. Information system security is another - it is something just assumed in privacy - whereas in software engineering it is an integral part.

The making of knowledge explicit is hard - unbelievably so. In fact, I am of the opinion that if you think you are an expert then you should go through the process of explaining and formalising your ideas in whatever your chosen area is; in other words, write a book about it. As presented earlier in the Illustrative Guide to a PhD, you spend all of your effort in adding that tiny amount to the sum of human knowledge but are rewarded with the ability to look back and see all of human knowledge and how it all fits together as a huge, holistic system. For a while you get your bit of this all to yourself, but this is closely followed by the desire to add another bit, and another, and another and so on. Knowledge is an addictive drug.

So that's the story. The book doesn't have wizards and car crashes, or a galactic princess who needs rescuing; the royalties will probably earn me a beer or two, but that really wasn't the point. Despite this sadomasochistic process what I have is an explicit, embodiment of my knowledge that can be shared - a conglomeration or summation of others' knowledge that I found a small niche to add to. I guess somewhere someone will find a flaw in my reasoning about privacy engineering and with luck suggest a solution and thereby adding a further contribution to our overall knowledge and development of this domain.

Actually I hope so.

ps: the second book...due 2015...deadlines permitting

Monday, 28 July 2014

Privacy Engineering Book

Privacy Engineering

A dataflow and ontological approach

An essential companion book for those of us who have to model systems from small mobile apps to large, cloudified BigData system from the perspective of privacy and personal data handling.

Available via Amazon (US, UK and all Amazon worldwide sites), CreateSpace and selected bookstores such as Barnes and Noble. Kindle version available, also with Kindle Match Book enabling you to the Kindle version for just 2.99USD when purchasing the paperback.

Table of Contents:
  1. Introduction
  2. Case Study
  3. Privacy Engineering Process Structure
  4. Data Flow Modelling
  5. Security and Information Classifications
  6. Additional Classification Structures
  7. Requirements
  8. Risk and Assessments
  9. Notice and Consent
  10. Privacy Enhancing Techniques
  11. Auditing and Inspections
  12. Developing a Privacy Programme
Information privacy is the major defining issue of today's Internet enabled World. To construct information systems from small mobile 'apps' to huge, heterogeneous, cloudified systems requires merging together skills from software engineering, legal, security and many other disciplines - including some outside of these fields! 

Only through properly modelling the system under development can we full appreciate the complexity of where personal data and information flows; and more importantly, effectively communicate this. This book presents an approach based upon data flow modelling, coupled with standardised terminological frameworks, classifications and ontologies to properly annotate and describe the flow of information into, out of and across these systems. 

Also provided are structures and frameworks for the engineering process, requirements and audits; and even the privacy programme itself, but takes a pragmatic approach and encourages using and modifying the tools and techniques presented as the local context and needs require.

Published July 2014
ISBN-13: 978-1497569713
ISBN-10: 1497569710
264 Pages, B/W on White Paper

Saturday, 19 July 2014

A Privacy Engineer's Bookshelf

There's a huge amount of material about privacy, software engineering etc already existing. So what should every privacy engineer have at minimum on his or her bookshelf? Here are my suggestions (I might be biased in some cases) which I think everyone working privacy should know about.
The reasoning behind the above is that entering the privacy engineering field one needs a good cross section and balance of understanding the legal and ethical foundations of privacy (Nissenbaum, Solove) through the software engineering process (Dennedy et al) to the actual task of modelling and analysing the system (Oliver). Scheneier's book is included to provide a good perspective on the major protection technology of encryption.

Of course this does not preclude other material nor a thorough search through the privacy literature, conferences and academic publications.

To be a privacy engineer really does mean engineering and specifically system and software engineering skills.

Monday, 14 July 2014

Final Proof...

And so it starts...the final proof before publication...

Privacy Engineering - a Data Flow and Ontological Approach

Friday, 11 July 2014

Privacy Engineering Book Update

Well it has been a while since I posted last and that's primarily because I've been making to push to finalise the book. I think it was Hemmingway that said that a book is never truly finished but reaches a state where it can be abandoned...

Well, this is very much the same. I'm happy with the draft, it contains the story I want to tell so far in as much detail I can put into it at the moment. This doesn't mean many chapters could have gone much further but there have to be compromises in content. If the book provides enough to tie these areas of ontology, data flow, requirements etc together and get the reader to a state where they can see that structure and use the references to move deeper into the subject, then this will have been a success.

I'll write more when I finally send the book for official publication next week.

But, what a journey...just like a PhD but without all the fun of being a student again :-)

Monday, 9 June 2014

Word Clouds

Just a bit of fun, but also quite nice to see an overall idea of what I write about on this blog. Generated by Wordle:

So it seems I'm seriously interested in engineering, privacy (privacy engineering too!), data, technologies, analytics and so on.

Friday, 6 June 2014

Privacy Engineering - The Book ... real soon now, I promise

Final push to complete the draft. Many many thanks to all that have provided numerous comments...We're probably looking at late July after the editorial process and the production of the first proof copy.

Ian Oliver (2014) Privacy Engineering - A Dataflow and Ontological Approach. ISBN 978-1497569713

Official Website:

Tuesday, 3 June 2014

Privacy SIG @ EIT Labs

Yesterday I was fortunate enough to be given the chance to speak at the founding of the a Privacy Special Interest Group (facilitated by EIT ICT Labs) on the subject of privacy engineering and some of the technologies and areas that will make up the future of privacy engineering technologies.

The presentation is below (via SlideShare):

The PrivacySIG group's charter is simply:
The Privacy Special Interest Group is a non profit organisation consisting of companies which are developing or involved in the next generation of visitor analytics. We work hard to ensure we can build a future where everybody can benefit from the new technologies available. The Privacy Special Interest Group has developed and maintains a common "Code of Conduct" which is an agreement between all members to follow common rules to ensure and improve the privacy of individuals. We also work on educating our customers, the media and the general public about the possibilities and limitations of the new technology. We also maintain a common opt-out list to make it easy for anyone who wishes to opt-out in one step, this list is used by all our members. Any company who agrees to follow the code of conducts is qualified to join.
This is certainly a worthwhile initiative and one that really has taken the need for an engineering approach to privacy as part of its ethos.

Wednesday, 28 May 2014

How much data?!?!

I took part as one of the speakers in a presentation about analytics today; explaining how data is collected through instrumentation of applications, web pages etc, to an audience who are not familiar with the intricacies of data collection and analytics.

We had a brief discussion about identifiers and what identifiers actually are which was enlightening and hopefully will have prevented a few errors later on. This bears explaining briefly: an identifier is rarely a single field, but should be considered any one of the subsets of the whole record. There are caveats there of course, some fields can't be used as part of some compound identifier, but the point here was to emphasis that you need to examine the whole record not just individual fields in isolation.

The bulk of the talk however introduced from where data comes from. For example if we instrument an application such that a particular action is collected, then we're not just collecting an instance of that action but also whatever contextual data provided by the instrumentation and the data from the traffic or transport layer. This came as a surprise that there is so much information available via the transport/traffic layers:

Said meta-data includes location, device/application/session identifiers, browser and environment details and so on, and so on...

Furthermore data can be cross-referenced with other data after collection. A canonical example is geolocation over IP addresses to provide information about location. Consider the case where a user switches off the location services on his or her mobile device; location can still be inferred later in the analytics process to a surprisingly high-level of accuracy.

If data is collected over time, then even though we are not collecting specific latitude-longitude coordinates we are collecting data about movements of a single, unique human being; even though no `explicit' location collection seems to be being made. If you find that somewhat disturbing, consider what happens every time you pay with a credit card or use a store card.

Then of course there's the whole anonymisation process where once again we have to take into consideration not just what an identifier is, but the semantics of the data, the granularity etc. Only then can we obtain an anonymous data set. Such a data set can be shared publicly...or maybe not as we saw in a previous posting.  

Even when one starts tokenising and suppressing fields, the k-anonymity remains remarkably low, typically with more than 70% of the records remaining unique within that dataset. Arguments about the usefulness of k-anonymity notwithstanding - on the other hand it is one of the few privacy metrics we have,

So, the lesson here is rather simple, you're collected a massive amount more than you really think.

The next surprise was how tricky or "interesting" this becomes when developing a privacy policy that contains all the necessary details about data collection, meta-data collection, traffic data collection; and then the uses to which that data is put, whether it is primary or secondary collection and so on.

Friday, 23 May 2014

Surgical privacy: Information Handling in an Infectious Environment

What has privacy engineering, data flow modelling and analysis got to do with how infectious materials and the sterile field are handled in medical situations?  Are there things we can learn by exploiting by drawing an analogy between these seemingly different fields?

We've discussed this subject earlier and a few links can be found here. Indeed privacy engineering has a lot to learn from analogous environments such as aviation, medicine, anaesthesia, chemical engineering and so on; the commonality here is that those environments understood they had to take a whole systems approach rather than relying upon a top-down driven approach or relying upon embedding the semantics of the area in one selected discipline.

Tuesday, 20 May 2014

Foundations of Privacy - Yet Another Idea

Talking with a colleague about yesterday's post on "a" foundation for privacy, or privacy engineering, he complained that the model wasn't complete. Of course the structuring is just one possible manifestation and others can be put together to take into consideration other views, or to provide a semantics of privacy in differing domains. For example, complete with semantic gaps, we might have a model which presents privacy law and policies in terms of economic theory which in turn is grounded in mathematics:

Then place the two models side-by-side and "carve" along the various tools, structures, theories etc that each uses and note the commonalities and differences, and then try to reconcile those.

The real challenge here is to decompose each of those areas into those theories, tools etc that are required to properly express each level. Then for each of those areas such as listed earlier,eg: type theory, programming, data flow, entropy etc, map each of these together. For example, a privacy policy might talk about anonymity and in turn anonymity of a data set can be given a semantics in terms of entropy.

Actually this is where the real details are embedded and we the levels as we have depicted them are vague, fuzzy classifications for convenience of grouping  these together.