Thursday, 18 September 2014

Scotland, Independence, Wales, the UK etc...


Well, whatever happens today in Scotland, I think that Salmond has won: the concessions that have been offered to Scotland by the leaders of the main parties in the UK are generous to say the least. In effect a no vote will trigger practically everything but independence and full financial autonomy, while a yes vote will lead to protracted negotiations - and to be honest I'm not sure either side has a good idea of how these will turn out.

The questions of economy, currency union and EU membership are academic to a point. Firstly, Scotland's economy will probably do quite well, though the independence vote is not specifically about the economy but rather the right of a people to decide their own future - democratically. Currency union might well become a moot point: if Scotland simply keeps using the pound there isn't really much the UK could do about it, and there are alternatives such as adopting or pegging a Scottish currency to the euro, much as Montenegro has done. Even if a Scottish currency did devalue, that could be a good thing for inward investment.

EU membership is interesting, especially as Spain and quite probably Belgium have vested interests in quelling their own internal independence movements - even then, pressure from their local voters, and from smaller EU nations that might feel threatened by the dominance of larger members, might sway this. EU membership would also solve the currency issue.

Border controls and the like? Well, Scotland couldn't become a Schengen country unless the UK joined too, and I strongly doubt we'd see passport checks at any point in the future other than in the more fanciful predictions.

My main issue however - regardless of yes or no - is what happens to Wales and, to a smaller extent, to Northern Ireland and England. Certainly some Welsh politicians seem blissfully unaware of, or even naive about, the implications. Even the Conservative Party in Wales has come out much more in favour of devolution of powers to Wales than Labour has.

Politicians always have vested interests anyway: the Conservatives know that their only chance of any power in Wales is through the Senedd, while Labour can always sit back and count on votes from their heartlands in both Wales and England. If Scotland leaves, or even if Westminster is reorganised to solve the West Lothian Question, this might change, as Welsh MPs would have little influence even if they did vote on England-specific legislation. There's an interesting discussion about this on the TrueWales web site in an article written by Rachel Banner (24 Jan 2012): The West Lothian Question.

Ironically it has been Welsh MPs who have probably made some of the most important decisions regarding England and the UK: Nye Bevan and the NHS, Lloyd George and Home Rule (which unfortunately didn't come to pass due to World War 1) and, further back, the Welsh advisors to Queen Elizabeth the First who promoted the idea of naval supremacy.

If we come back to what has been offered to Scotland, now and since the 1998 devolution votes, one must seriously ask why Scotland gets so much at the expense of the other nations of the UK. I'm still unsure why every change to devolution in Wales requires a referendum, such as the one back in 2011 which granted Wales the power to make laws specific to the needs of the nation. Are politicians so weak and afraid of their decisions? Surely the populace voted those MPs into power to make such decisions for the good of the people, and so they should take the responsibility themselves rather than pass it off to often under-informed voters.

So on Friday, will Wales be offered DevoMax? I doubt it - no politician is that brave.

Wednesday, 20 August 2014

Design by Contract and Security

I've worked with design-by-contract for a long while; in fact its use as an essential programming concept was brought to my attention by one of my professors, who didn't just teach it but demanded it as a fundamental part of any design or program. It remains an integral part of my programming and design toolbox. As an aside, Eiffel supports DbC as part of its programming model, which makes it one of the most elegant and easy languages not just to program in but to develop in.

DbC of course is related to the validation that one should make on web page forms, and to various concepts in the strong typing of functions and so on. Generalising DbC into other areas such as security and privacy should be an obvious step, but it is one seemingly not taken.
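To make that connection concrete, here's a minimal sketch of DbC-style pre- and postconditions - in Python rather than Eiffel, with a home-made `contract` decorator invented purely for illustration (Eiffel builds `require`/`ensure` into the language itself):

```python
import functools

def contract(requires=None, ensures=None):
    """Wrap a function with explicit pre- and postcondition checks (DbC style)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if requires is not None:
                assert requires(*args, **kwargs), f"precondition violated in {fn.__name__}"
            result = fn(*args, **kwargs)
            if ensures is not None:
                assert ensures(result, *args, **kwargs), f"postcondition violated in {fn.__name__}"
            return result
        return wrapper
    return decorator

@contract(
    requires=lambda record: "user_id" in record,             # the caller must supply an identified record
    ensures=lambda result, record: "user_id" not in result,  # the callee promises the identifier is gone
)
def strip_identifier(record: dict) -> dict:
    """Return a copy of the record with the direct identifier removed."""
    return {k: v for k, v in record.items() if k != "user_id"}

print(strip_identifier({"user_id": 42, "query": "back pain"}))  # {'query': 'back pain'}
```

Read as a privacy guarantee rather than merely a correctness one, the postcondition is a promise any caller can rely on: the identifier never leaves this interface. That is exactly the framing the article below takes.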

Via Schneier's brilliant security blog is a link to an article entitled Security As A Class Of Interface Guarantee. It is a long read, but one more than worthwhile certainly for software developers, and probably for security and privacy engineers too. Actually, if there's one article on security and development you read today it must be this one! Here's a small sample:
Security Is Part Of Every Interface 
I prefer to think of security as a class of interface guarantee. In particular, security guarantees are a kind of correctness guarantee. At every interface of every kind — user interface, programming language syntax and semantics, in-process APIs, kernel APIs, RPC and network protocols, ceremonies — explicit and implicit design guarantees (promises, contracts) are in place, and determine the degree of “security” (however defined) the system can possibly achieve.
Design guarantees might or might not actually hold in the implementation — software tends to have bugs, after all. Callers and callees can sometimes (but not always) defend themselves against untrustworthy callees and callers (respectively) in various ways that depend on the circumstances and on the nature of caller and callee. In this sense an interface is an attack surface — but properly constructed, it can also be a defense surface.

At the end of the article is a surprise link to a tweet :-)

Obvious isn't it...?!

Friday, 8 August 2014

Loyalty Cards

This is tangentially related to privacy and information asymmetry, which I'll discuss later, but first I wanted to comment on an article that appeared on LinkedIn: Business Travelers Are Saying "Buh-bye" to Loyalty Programs - Here's Why by Christopher Elliott, a "reader advocate" for National Geographic Traveler.

"After years of putting up with blackout dates, broken promises and bait-and-switch games, American travellers — particularly air travellers — are saying “Enough!”
They’re refusing to play the loyalty-program game, jettisoning blind brand allegiance in favor of a more pragmatic view of travel. Price and convenience are trumping mindless devotion to an airline, a car rental company or a hotel."

This applies equally to European travellers too, I suspect. But let's look at the price-quality-customer service dimensions. I did exactly what the article describes a few years ago and switched allegiance from one airline to another: despite accumulating a sizeable number of points, and occasionally using them to upgrade, it simply became too difficult to actually use them.

I remember a few cases clearly. The first was trying to enter a business lounge at Helsinki Airport - according to the "loyalty" scheme terms and conditions I had enough flights to qualify for an upgrade to the next tier, thus allowing me access to the business lounge. Except that there was a discrepancy between the English translation and the "correct" terms and conditions. The airline's customer service basically stated it wasn't their fault. That the lounge at the time was empty, yet staffed by no fewer than three "customer representatives", made it all a little too surreal.

The second was at Heathrow, where lounge entry depended not just on the card but on the airline that issued the card, despite both airlines being part of the same alliance. It worked like this: if you gained the second tier on airline A you could enter business lounges run by A's alliance - except at an airport where airline B, also of the same alliance, was the major carrier, in which case you couldn't.

The third was when trying to book a flight with air miles. For my selected flight I had enough miles and successfully made the booking until it came to selecting my seats, whereupon the whole process failed and I was unceremoniously ditched out of the booking with a terse error about the flight no longer being available to points holders. It turned out this was an artefact of the internal IT systems and you actually had to call the customer service department to book such flights - after a long hold during which you were constantly told that all of the customer service representatives were busy, but that you really are valued by the company you're trying to spend your money with.

After a few flights across the Atlantic I switched to Lufthansa - Frankfurt and Munich are clean, efficient and as pleasant as airports go. Lufthansa may not have good economy seating on long haul but the crews are some of the most professional (and smiling!) I've had the pleasure to be on an aircraft with.

I actually have no idea how to redeem the points I've collected with Lufthansa and SAS - they'll probably expire, or have expired without my knowledge, without me ever receiving any bonus for my loyalty as a customer.

A few years ago a colleague of mine called a major Nordic airline in response to a request for customer feedback. It turned out that they couldn't have cared less why he had stopped flying with them. Surely, with all the data analytics applied to the data they collect, the airline might have stopped for a moment to question why customers suddenly stop flying with them.

Now, due primarily to economic reasons, I fly with whoever gives me the best deal. Given that the terms and conditions are pretty much the same regardless of whether you fly a national carrier or a "cheap" airline, price will always win.

Given that many established, major airlines decided to compete with the cheap carriers by matching their standards and terms, is it any wonder that customers' loyalty to the airlines has all but disappeared?

Then, we come to credit cards and store cards...will these go the same way as air miles as customers realise that the deal they get is barely worth the loyalty?

But as a final thought, given that this is all about customer loyalty - has anyone ever been contacted by an airline, shop or other company asking why they stopped using the products offered?


Thursday, 7 August 2014

Privacy and Governance

This is actually a presentation by Prof. Alastair Scotland of the UK's NHS on governance and, ultimately, patient safety. If you work in governance, privacy, software engineering or any management discipline related to these, then this is one video you must watch today.

If it took the NHS 10 years to get this far, then in areas such as privacy - which shares many of the same characteristics of being a flawed system (in the general sense) with many safety-critical features - we have a huge amount of work to do beyond the mainly philosophical and legal debates about what privacy is.


Alastair Scotland.mp4 from Guy Murray on Vimeo.

Tuesday, 5 August 2014

On Finding Reasonable Measures To Bridge the Gap Between Privacy Engineers and Lawyers

This article originally appeared on the Privacy Association website on 29th July 2014.

Getting privacy lawyers and software engineers to work together to implement privacy is a perennial problem. Peter Swire, CIPP/US, and Annie Anton's article "Engineers and Lawyers in Privacy Protection: Can We All Just Get Along?" has explored this, as has The Privacy Engineer's Manifesto, of which their article is part.

But here's the problem: We cannot keep addressing privacy from a top-down, legally driven perspective. No amount of additional processes and compliance checks is going to change the fact that software itself is so complex. Software engineering is often assumed to be the final stage - and by some, a mere consequence - of many requirements from a large number of often conflicting sources.

Often privacy issues are “solved” by hashing an identifier, encrypting a communications link or anonymizing a data set, for some definition of anonymization. Many of these are piecemeal, ineffective, Band-Aid type solutions—the proverbial rearranging of deck chairs on the Titanic. Privacy Theater at its worst … or best.
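As a concrete illustration of the Band-Aid problem - my own toy example, not one from the article - consider "solving" privacy by hashing a phone number. The input space is small and structured, so the pseudonym can simply be brute-forced back to the original:

```python
import hashlib

def pseudonymise(phone: str) -> str:
    """The naive 'fix': replace the identifier with its SHA-256 hash."""
    return hashlib.sha256(phone.encode()).hexdigest()

stored = pseudonymise("+358401234567")   # hypothetical number, for illustration only

# Anyone who knows the numbering format just hashes every candidate.
# (A real attack would cover the full numbering plan; a few million hashes
# take seconds on commodity hardware.)
for n in range(1_000_000, 10_000_000):
    if pseudonymise(f"+35840{n}") == stored:
        print("re-identified:", f"+35840{n}")
        break
```

Salted or keyed hashing raises the bar, but the wider point stands: each such fix patches one narrow leak in a much larger system.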

So how do we address this?

First, maybe we should realize that we don't really understand each other’s disciplines. Very few lawyers are trained software engineers and vice-versa; therefore, constructing a lingua franca between these groups is part of that first step.

Often, ideas that are simple in one domain do not translate to the other. For example, the term “reasonable” means nothing in the software engineering domain. Meanwhile, the seemingly simple act of “encryption” hides, for a software engineer, an enormous complexity of network protocols, key management, performance, encryption libraries and so on. Similarly, the now-ubiquitous use of “app” by lawyers to mean something that runs on your phone means a lot more to the software engineer.
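To make the encryption point concrete, here's a sketch of my own (using Python's cryptography package purely as an example; neither the library nor the code is from the article). Even the most minimal "just encrypt it" immediately raises engineering questions that a one-line requirement hides:

```python
from cryptography.fernet import Fernet  # pip install cryptography

key = Fernet.generate_key()   # Where does this key live? Who can read it? How is it rotated?
f = Fernet(key)

token = f.encrypt(b"user_id=12345&article=health-news")
# Encrypted at rest, in transit, or both? What about the metadata around the payload?

plain = f.decrypt(token)
# Which components are allowed to decrypt - and do they log the plaintext afterwards?
print(plain)
```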

What does “app” really mean to software engineers?

That little piece of code that downloads your news feed each morning and presents it in a friendly way—allowing you to scroll through on your touch-screen device and maybe now and again presenting an advert because you didn't want to buy the paid version—is, in fact, not a tiny bit of code at all.

Effectively, what runs on your device is a multi-layered, complex interaction of hundreds of components, each sharing the device's memory, long-term storage, network components, display, keyboard, microphone, camera, speaker, GPS and so on. Each of those individual components interacts with layers underneath the interface, passing messages between components, scheduling when pieces of code must run and which piece of code gets the notification that you've just touched the screen as well as how and where data is stored.

When the app needs to get the next news item “from the web,” we have a huge interaction of network protocols that marshal the contents of the news feed, check its consistency, perform error correction, reassemble the individual segments of the message from the network, and manage addressing and internal formats. There are protocols underneath this that decide on the routing between networks and those that control the electrical pulses over wires or antennae. This is itself repeated many, many times as messages are passed between routing points all over the Internet, including cell towers, home wireless routers, undersea cables and satellite connections. Even the use of privacy-preserving technologies—encryption, for example—can be rendered virtually meaningless because underneath lies a veritable treasure trove of metadata. And that is just the networking component!

Let's for the moment look at the application's development itself, probably coded in some language such as Java, which is itself a clever beast that theoretically runs anywhere because it runs on top of a virtual machine that can be ported to almost any computing platform. The language contains many features to make a programmer's life easier by abstracting away complex details of the underlying mechanisms of how computers work. These are brought together with a vast collection of publicly available libraries for a plethora of tasks—from generating random numbers to big data analytics—in only a 'few' lines of well-chosen code.

Even without diving deep into the code and components of which it’s made, similar complexity awaits in understanding where our content flows. Maybe your smartphone’s news app gets data from a news server and gives you the opportunity to share it through your favorite social network. Where does your data flow after these points? To advertisers, marketers, system administrators and so on? Through what components? And what kinds of processing? How about cross-referencing? Where is that data logged and who has access?

Constructing a full map or data flow of a typical software system—even a simple mobile app—becomes a spider web of interacting information flows, further complicated by layering over the logical and physical architectures, the geographical distribution and concepts such as controller and processor. Not to mention the actual content of the data, its provenance, the details of its collection, the risks, the links with policies and so on. And yet we still have not considered the security, performance and other functional and nonfunctional aspects!
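For flavour, here is a toy sketch of what annotating even a single hop of such a map might look like; the field names, categories and component names are my own invention, not a standard notation:

```python
from dataclasses import dataclass

@dataclass
class DataFlow:
    source: str
    target: str
    data_categories: list   # e.g. "device identifier", "coarse location"
    purpose: str            # content delivery, advertising, analytics...
    jurisdiction: str       # where the receiving component physically runs
    transport: str          # protection on this hop

flows = [
    DataFlow("news_app", "news_server", ["article id", "device id"], "content delivery", "EU", "TLS"),
    DataFlow("news_server", "ad_network", ["device id", "coarse location"], "advertising", "US", "TLS"),
]

# Two hops already raise the questions above: who sees the device identifier,
# under which purpose, and in which jurisdiction?
for f in flows:
    print(f"{f.source} -> {f.target}: {f.data_categories} ({f.purpose}, {f.jurisdiction})")
```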

Software engineers have enough of a problem managing this complexity without having to learn and comprehend legal language, too. Indeed, privacy won't get a foothold in software engineering until a path from “reasonable” can be traced through the myriad requirements to those millions of lines of code that actually implement “reasonable.”

Engineers love a challenge, but often when solutions are laid out before them in terms of policy and vague concepts—concepts which to us might be perfectly reasonable (there's that word again!)—then those engineers are just going to ignore or, worse, mis-implement those requirements. Not out of any malicious intent but because those requirements are almost meaningless in their domain. Even concepts such as data minimization can lose much if not all of their meaning as we move through the layers of code and interaction.

Life as a software engineer is hard enough: juggling a complex framework of requirements, ideas, systems, components and so on, without also having to interpret what a privacy policy actually means across and inside the Internet-wide environment in which we work.

Software isn't a black box protected by a privacy policy and a suite of magic privacy-enhancing technologies but a veritable Pandora's Box of who-knows-what of “things.”

As privacy professionals, we should open that box often to fully comprehend what is really going on when we say “reasonable.” As software engineers, we'll more than make you welcome in our world, and we'd probably relish the chance to explore yours. But until we both get visibility into our respective domains and make the effort to understand each other's languages and how these relate to each other, this isn't going to happen—at least not in any “reasonable” way.

Sunday, 3 August 2014

Messenger 10 year Anniversary

One of the first posts I made on this blog was about the Messenger probe to Mercury - at the time it had just made the 3rd fly-by before orbit insertion about 18 months later. On 1st August it celebrated 10 years since its launch.

So a large beer to all involved!

So now we have Messenger at Mercury, Venus Express performing dangerous aerobraking manoeuvres in Venus' atmosphere, a fleet of robots trundling over Mars - not forgetting the small fleet of satellites around the planet - Juno on its way to Jupiter and the venerable Cassini still going strong around Saturn.

Then there are the three that I'm most anticipating:

and then in just a few days, Rosetta finally arrives at 67P/Churyumov-Gerasimenko after 10 years. At this moment she's 3 days away with about 500 km to go! She's carrying a lander called Philae which will be deployed later this year. ESA has a good track record of this kind of landing in exotic places, with Huygens.

Can't wait!

In the meantime:

Wednesday, 30 July 2014

The Story of the Privacy Engineering Book

It was never going to be the next Harry Potter novel in either content or sales; though I am open to offers from major film studios for the rights to my book, in which case it isn't about data flow modelling and ontologies but the story of how Alice and Bob tried to keep their relationship private from Eve, Dave, Frank and the rest of the security-character gang.

When writing a book, people often think that you must be an expert or genius to start. In fact it is quite the opposite: by writing a book you realise that you are not an expert or genius in that area, but (maybe) become one through making your thoughts and ideas explicit in print. Actually, I think at the end you realise that being an expert is something else altogether.

The adage of "if you want to understand something then you should teach it" is what applies here. Following in the footsteps of Richard Feynman isn't too bad an idea regarding teaching.

The point of starting a technical book like Privacy Engineering was more to conceptualise and concretise my thoughts and ideas on how I'd like systems to be modelled, analysed and understood from the privacy perspective.

In many ways writing the book was very much like writing my PhD thesis: research just to understand the area, a thesis or hypothesis of how things work, followed by the bookkeeping work of writing it down and documenting the sources of ideas and wisdom through copious references. Interestingly, it took about the same amount of time from start to finish - approximately four years. I probably could have done it much quicker if it weren't for the day job of actually trying to analyse real systems, but without that experience it would have been a dry, theoretical text without practical underpinning.

What surprised me - and maybe this comes from the training one receives doing a PhD - is how "easy" it was to carve a niche in which I could be an expert (I'll come to what I mean by expert in a minute). It isn't that everything in the book is new, but rather that the overall structure and application is "new". The Illustrated Guide to a PhD really explains it best.

Your Contribution to Knowledge
(from The Illustrated Guide to a PhD)
What was particularly exciting was bringing together the ideas from the following areas
  • Software Engineering - modelling, analysis, coding, data-flow modelling, requirements analysis
  • Ontologies, Taxonomies, Semantics
  • Law (Privacy)
  • Law (Philosophical)
  • The Philosophy of Privacy
  • Safety Critical Systems Design
  • Aviation, Surgery, Anaesthesia, Chemical Plant Design
  • etc
Maybe some of those areas surprise you? For example, what the $%^* has surgery got to do with privacy engineering? Many years ago we used to have a formal procedure for approving software projects through various phases - concept, architecture, design, release. My job was to analyse and give approval for privacy from a software engineering perspective. Up until that point there were no software engineers in privacy, and a question of "do you have a privacy policy?" didn't really tell us anything. So procedures were put in place in the form of a set of questions - a checklist of things that must be done to move to the next stage. I got the idea from aviation!

It worked to a point, except that it was too rigid and didn't really fit the more "agile" ways of working (agile, ad hoc, hacked...). After this it became a quest to find something that did fit, that did work in an environment where you only saw what was inside a piece of software by dissecting it.

It was one of those serendipitous moments while reading some books on aircraft safety that I finally read Atul Gawande's book The Checklist Manifesto, which led me to Peter Pronovost's work and how CULTURE was the driver behind the workings of a safety-oriented process. From this point onwards it was obvious (to me at least - with caveats) how we should approach privacy in software engineering: as a safety-critical aspect!

Many, many experts have already discovered this - in aviation, surgery, anaesthesia, chemical plant design and so on. So, obviously I can't be an expert because *I* didn't know about this! Anyway, you can read about some of this here:


There were experts closer at hand too, other than those famous names appearing on the covers of books and papers. My colleagues at Nokia and Here explicitly and implicitly influenced my ideas. One thing that was painfully obvious was the lack of common terminology, not just inside privacy but when working between domains such as software engineering and law. Construction of a lingua franca was our main priority, and much of this was influenced by the ontological work made earlier in NRC's Semantic Web Infrastructure project M3 and by work with a certain Ora Lassila of RDF and Semantic Web fame.

Even more interesting was that our ontologies turned out to be "implementable", in the sense that we could construct reasoning systems and tie these to our analytics systems running Hadoop etc. to perform "in-line analysis" of what was really passing through them. Furthermore, we had started to work out how to calculate privacy policies - those legal texts that appear on the start-up of applications, which you have to click OK or Accept on to continue. We never quite got around to integrating this with established policy languages, but the main thing was that we now knew how it all fitted together.

For a long time, however - and this drove much of the terminological work above - my worry, shared by others, was that privacy was just a bunch of people talking in different languages, albeit with the same or similar words. Worse, everyone felt that their terminology was right and so obvious that everyone else must understand what they meant, regardless of legal, engineering or other background. I wrote an article back in January 2013 entitled "On The Naivety of Privacy" that expressed my feelings on this and stated that we really didn't have a formal foundation for privacy. The replies to that article were surprising in that many people wrote to me, mainly privately, to say that they felt the same way. I had supporters, albeit a silent group that seemed to fear some kind of privacy orthodoxy. Either way, the path I needed to take in order to understand my role was clear(er).

So, as a summary so far, the only way to be an expert is to surround yourself with experts in lots of different areas. But, they have to be willing participants and willing to share their knowledge. At this point an overall structure was coming into focus and the initial plans on how to engineer privacy were coalescing nicely. Documenting this was an interesting struggle and a number of presentations were made on exploring ideas in this area to get feedback and understand what people really needed, or thought they needed. It turned out that some of the areas I was looking at formed great analogies:



The trouble with ideas such as these is that you can get side-tracked easily, though such side-tracking is essential to the thought process. Being challenged on simple questions - why the terminology was structured so, why that particular hierarchy of concepts, why that particular naming - and being presented with links to other areas is, however, critical to obtaining some kind of focus for the work you are embarking upon.

It was in April 2012 that I travelled to Tallinn, Estonia to talk about obscure topics like category theory, topology and homotopy type theory with a colleague from the University of Brighton. On the ferry crossing from Helsinki to Tallinn, accompanied by copious amounts of coffee (I'm with Alfred Renyi - misattributed to Paul Erdos - on this one!), I wrote the first full draft of all the ontologies or taxonomies, and their structuring, that we needed. After this, and a further meeting with Brighton's VMG group later that year, the thesis was set.

It is critical to state that during this time I wasn't working in some theoretical vacuum. The ideas, concepts, terms, modelling etc. were being applied, albeit somewhat silently - subterfuge was the game. Formal methods were outlawed, agile was the true way...said by those who understood neither. It appeared that everything worked in both the software engineering and legal domains, with the caveat that it wasn't introduced as a whole but rather as bits of tooling and technique to be used appropriately when necessary.

At this point the book started in earnest, though the actual writing didn't start until late 2013, and the initial few chapters were collected together from earlier technical reports and presentations. Much, if not all, of the writing in those initial texts was rewritten many, many times. Ernest Hemingway was telling the truth when he said, "The first draft of anything is shit."

Apart from the practical battles: 
  • I chose the Tufte-LaTeX style in the end because it looked nice and the way it dealt with references and notes forced a certain way of writing.
  • Sublime and vi as text editors
  • Microsoft Visio as the drawing tool - I really wish they'd release a Linux, Mac and/or Cloud version.
  • Version Control .... erm, a certain cloudified, file store...
things generally went well. Initial drafts complete with unique spelling and grammatical errors were well received by those who reviewed them. I even ran a short series of lectures with colleagues to go through the chapters. I joked that these lectures were very much in the style of Andrew Wiles' secret lectures to a trusted colleague explaining the proof of Fermat's Last Theorem.

By February 2014 the overall structure and plan was in place, albeit a little nebulous and fluid in places - the structure of sections and chapters was changing, but not the content. Then I started on the requirements chapter and there it stopped. Nothing seemed to work: the formal structure of the requirements was wrong, and I couldn't get the mapping between the terminologies, data flows and requirements to work at all. And there I got stuck for two months. I knew what the requirements needed to look like but nothing fitted together...nada, zilch... "f***" was the word of the day, week, month. With the possibility of missing my self-inflicted deadline of May, was it even worth continuing? Luckily I persevered.

In another moment similar to that of Wiles, there came the nightmare of another book on the same subject with much the same title being published. "F***" times 2. I bought this damned book, entitled The Privacy Engineer's Manifesto, and started to read it, hoping and praying that they didn't cover the same material. This is actually where it got interesting: PEM didn't cover my material but rather provided a hook between the ubiquitous Privacy-by-Design principles and software engineering. It actually laid out a path that linked the two. This wasn't a rival but rather a symbiotic co-conspirator in the development of the discipline of privacy engineering. With some hope I pushed the deadline to June and attempted to restart the work.

It was actually back to pen and paper for drawing figures for a while, as Microsoft had just purchased Nokia's devices division and IT upgraded laptops, which meant a many-week wait while Microsoft Visio was upgraded. During a latte-fuelled moment came the revelation of how these damned requirements and risk classifications would all link together:

3 Dimensions of Requirements
Simple, eh? Well, not perfect, but it did provide a high-level structure in which things did fit and did work. Hunting through academic papers on the subject gave some impetus to the work, and writing started afresh at great pace. May and June were spent, in between work and family, finalising the draft. The deadline slipped again - oh, the joys of self-publishing with no editor and no real deadline.

July was the sprint finish: mainly rewriting paragraphs, spell-checking and actually removing a chapter of examples and patterns, as the text now contained these - due in no small part to the secret lecture series, which turned the book from an academic text into something more practical. In mid-July it was finished with only the proofing to go, and on the 17th of July it finally went on sale.

Somewhat of an anticlimactic moment it seemed, but that was it. Whether the book is perfect or not, and whether ideas have changed or become refined in the meantime, was now irrelevant: it was public and another contribution to knowledge existed. After this came many days of thinking, "what the hell do I do now?"

A colleague once explained that writing a book is like pregnancy: there are three trimesters - excitement, boredom and panic, the latter as in it has to come out. What follows after all this gore, mess and pain is the desire to write a new book.

So, am I an expert in this now? Well, yes, in the sense that there aren't too many privacy engineers around, but this belittles the term expert and gives it the wrong meaning. I now think that an expert is someone who understands what they don't know. There are huge areas that I want to know more about: human factors in privacy (cf. James Reason's Swiss Cheese Model), and the formal underpinnings of privacy - yes, I'd love to write on a category-theoretic foundation for this. There are experts in requirements management, risk management etc. that I'd love to talk to about bringing these areas into a much closer relationship with the structures we see in privacy. Information system security is another: it is something just assumed in privacy, whereas in software engineering it is an integral part.

The making of knowledge explicit is hard - unbelievably so. In fact, I am of the opinion that if you think you are an expert then you should go through the process of explaining and formalising your ideas in whatever your chosen area is; in other words, write a book about it. As presented earlier in The Illustrated Guide to a PhD, you spend all of your effort adding that tiny amount to the sum of human knowledge but are rewarded with the ability to look back and see all of human knowledge and how it fits together as a huge, holistic system. For a while you get your bit of this all to yourself, but this is closely followed by the desire to add another bit, and another, and another, and so on. Knowledge is an addictive drug.

So that's the story. The book doesn't have wizards and car crashes, or a galactic princess who needs rescuing; the royalties will probably earn me a beer or two, but that really wasn't the point. Despite this sadomasochistic process, what I have is an explicit embodiment of my knowledge that can be shared - a conglomeration or summation of others' knowledge that I found a small niche to add to. I guess somewhere someone will find a flaw in my reasoning about privacy engineering and, with luck, suggest a solution, thereby adding a further contribution to our overall knowledge and the development of this domain.

Actually I hope so.

ps: the second book...due 2015...deadlines permitting

Monday, 28 July 2014

Privacy Engineering Book

Privacy Engineering

A dataflow and ontological approach


An essential companion book for those of us who have to model systems - from small mobile apps to large, cloudified Big Data systems - from the perspective of privacy and personal data handling.


Available via Amazon (US, UK and all Amazon worldwide sites), CreateSpace and selected bookstores such as Barnes and Noble. A Kindle version is available, also with Kindle Match Book enabling you to get the Kindle version for just 2.99 USD when purchasing the paperback.

Table of Contents:
  1. Introduction
  2. Case Study
  3. Privacy Engineering Process Structure
  4. Data Flow Modelling
  5. Security and Information Classifications
  6. Additional Classification Structures
  7. Requirements
  8. Risk and Assessments
  9. Notice and Consent
  10. Privacy Enhancing Techniques
  11. Auditing and Inspections
  12. Developing a Privacy Programme
Information privacy is the major defining issue of today's Internet-enabled world. Constructing information systems - from small mobile 'apps' to huge, heterogeneous, cloudified systems - requires merging skills from software engineering, law, security and many other disciplines, including some outside of these fields!

Only through properly modelling the system under development can we fully appreciate the complexity of where personal data and information flow and, more importantly, effectively communicate this. This book presents an approach based upon data flow modelling, coupled with standardised terminological frameworks, classifications and ontologies to properly annotate and describe the flow of information into, out of and across these systems.

Also provided are structures and frameworks for the engineering process, requirements, audits and even the privacy programme itself. The book takes a pragmatic approach and encourages using and modifying the tools and techniques presented as the local context and needs require.

Published July 2014
ISBN-13: 978-1497569713
ISBN-10: 1497569710
264 Pages, B/W on White Paper

Saturday, 19 July 2014

A Privacy Engineer's Bookshelf

There's a huge amount of material about privacy, software engineering etc. already in existence. So what, at a minimum, should every privacy engineer have on his or her bookshelf? Here are my suggestions (I might be biased in some cases), which I think everyone working in privacy should know about.
The reasoning behind the above is that someone entering the privacy engineering field needs a good cross-section and balance: from understanding the legal and ethical foundations of privacy (Nissenbaum, Solove), through the software engineering process (Dennedy et al.), to the actual task of modelling and analysing the system (Oliver). Schneier's book is included to provide a good perspective on the major protection technology, encryption.

Of course this does not preclude other material nor a thorough search through the privacy literature, conferences and academic publications.

To be a privacy engineer really does mean engineering and specifically system and software engineering skills.


Monday, 14 July 2014

Final Proof...

And so it starts...the final proof before publication...

Privacy Engineering - a Data Flow and Ontological Approach

Friday, 11 July 2014

Privacy Engineering Book Update

Well, it has been a while since I posted last and that's primarily because I've been making the push to finalise the book. I think it was Hemingway who said that a book is never truly finished but reaches a state where it can be abandoned...

Well, this is very much the same. I'm happy with the draft: it contains the story I want to tell so far, in as much detail as I can put into it at the moment. That's not to say many chapters couldn't have gone much further, but there have to be compromises in content. If the book provides enough to tie the areas of ontology, data flow, requirements etc. together, and gets the reader to a state where they can see that structure and use the references to go deeper into the subject, then it will have been a success.

I'll write more when I finally send the book for official publication next week.

But, what a journey...just like a PhD but without all the fun of being a student again :-)

www.facebook.com/privacyengineering
www.privacyengineeringbook.net


Monday, 9 June 2014

Word Clouds

Just a bit of fun, but also quite nice to see an overall idea of what I write about on this blog. Generated by Wordle:


So it seems I'm seriously interested in engineering, privacy (privacy engineering too!), data, technologies, analytics and so on.

Friday, 6 June 2014

Privacy Engineering - The Book ... real soon now, I promise

Final push to complete the draft. Many, many thanks to all who have provided numerous comments... We're probably looking at late July, after the editorial process and the production of the first proof copy.


Ian Oliver (2014) Privacy Engineering - A Dataflow and Ontological Approach. ISBN 978-1497569713

Official Website: www.privacyengineeringbook.net
Facebook:  facebook.com/privacyengineering


Tuesday, 3 June 2014

Privacy SIG @ EIT Labs

Yesterday I was fortunate enough to be given the chance to speak at the founding of a Privacy Special Interest Group (facilitated by EIT ICT Labs) on the subject of privacy engineering and some of the technologies and areas that will make up its future.

The presentation is below (via SlideShare):


The PrivacySIG group's charter is simply:
The Privacy Special Interest Group is a non profit organisation consisting of companies which are developing or involved in the next generation of visitor analytics. We work hard to ensure we can build a future where everybody can benefit from the new technologies available. The Privacy Special Interest Group has developed and maintains a common "Code of Conduct" which is an agreement between all members to follow common rules to ensure and improve the privacy of individuals. We also work on educating our customers, the media and the general public about the possibilities and limitations of the new technology. We also maintain a common opt-out list to make it easy for anyone who wishes to opt-out in one step, this list is used by all our members. Any company who agrees to follow the code of conducts is qualified to join.
This is certainly a worthwhile initiative and one that really has taken the need for an engineering approach to privacy as part of its ethos.

Wednesday, 28 May 2014

How much data?!?!

I took part as one of the speakers in a presentation about analytics today, explaining how data is collected through the instrumentation of applications, web pages etc. to an audience who were not familiar with the intricacies of data collection and analytics.

We had a brief discussion about identifiers and what identifiers actually are, which was enlightening and hopefully will have prevented a few errors later on. This bears explaining briefly: an identifier is rarely a single field; any subset of the fields in a record can act as one. There are caveats of course - some fields can't be used as part of a compound identifier - but the point was to emphasise that you need to examine the whole record, not just individual fields in isolation.
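Here's a small sketch of that point, using made-up records and field names: treat every subset of fields as a candidate identifier and see how many records each combination singles out.

```python
from itertools import combinations
from collections import Counter

# Hypothetical records purely for illustration.
records = [
    {"postcode": "00100", "birth_year": 1975, "gender": "F", "device": "X1"},
    {"postcode": "00100", "birth_year": 1975, "gender": "M", "device": "X2"},
    {"postcode": "00120", "birth_year": 1980, "gender": "F", "device": "X1"},
    {"postcode": "00120", "birth_year": 1975, "gender": "F", "device": "X3"},
]
fields = list(records[0])

# For every subset of fields, count how many records that combination of values
# singles out - any subset with many "1-record" groups is acting as an identifier.
for r in range(1, len(fields) + 1):
    for subset in combinations(fields, r):
        groups = Counter(tuple(rec[f] for f in subset) for rec in records)
        unique = sum(1 for c in groups.values() if c == 1)
        print(f"{subset}: uniquely identifies {unique} of {len(records)} records")
```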

The bulk of the talk, however, introduced where data comes from. For example, if we instrument an application so that a particular action is collected, then we're not just collecting an instance of that action but also whatever contextual data the instrumentation provides and the data from the traffic or transport layer. It came as a surprise to many that there is so much information available via the transport/traffic layers:

Said meta-data includes location, device/application/session identifiers, browser and environment details and so on, and so on...

Furthermore, data can be cross-referenced with other data after collection. A canonical example is geolocation of IP addresses to provide information about location. Consider the case where a user switches off the location services on his or her mobile device: location can still be inferred later in the analytics process to a surprisingly high level of accuracy.
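A toy sketch of that kind of post-collection enrichment is below; the prefix-to-city table is invented for illustration, standing in for the commercial or open GeoIP databases a real pipeline would use:

```python
# Invented prefix-to-city table standing in for a real GeoIP database.
GEO_BY_PREFIX = {
    "192.0.2.": ("Helsinki", 60.17, 24.94),
    "198.51.100.": ("Espoo", 60.21, 24.66),
}

def enrich_with_location(event: dict) -> dict:
    """Attach an inferred location even though the event carries no lat/lon field."""
    for prefix, (city, lat, lon) in GEO_BY_PREFIX.items():
        if event["client_ip"].startswith(prefix):
            return {**event, "inferred_city": city, "inferred_lat": lat, "inferred_lon": lon}
    return event

event = {"action": "open_article", "client_ip": "192.0.2.44", "location_services": "off"}
print(enrich_with_location(event))
```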

If data is collected over time then, even though we are not collecting specific latitude-longitude coordinates, we are collecting data about the movements of a single, unique human being - even though no 'explicit' location collection appears to be taking place. If you find that somewhat disturbing, consider what happens every time you pay with a credit card or use a store card.

Then of course there's the whole anonymisation process, where once again we have to take into consideration not just what an identifier is but also the semantics of the data, its granularity and so on. Only then can we obtain an anonymous data set. Such a data set can be shared publicly...or maybe not, as we saw in a previous posting.

Even when one starts tokenising and suppressing fields, the k-anonymity remains remarkably low, typically with more than 70% of the records remaining unique within the dataset - arguments about the usefulness of k-anonymity notwithstanding; on the other hand, it is one of the few privacy metrics we have.
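Measuring this is straightforward. Here's a minimal sketch, with invented data, of computing the k-anonymity of a "tokenised" dataset over its quasi-identifiers:

```python
from collections import Counter

# Hypothetical "tokenised" records: direct identifiers dropped, quasi-identifiers
# (postcode prefix, birth year, gender) retained.
records = [
    ("001", 1975, "F"), ("001", 1975, "F"), ("001", 1980, "M"),
    ("002", 1975, "F"), ("002", 1990, "M"), ("003", 1985, "F"),
]

groups = Counter(records)
k = min(groups.values())                              # the dataset's k-anonymity
unique = sum(c for c in groups.values() if c == 1)    # records in groups of size 1

print(f"k = {k}")                                     # k = 1: some records are still unique
print(f"{unique / len(records):.0%} of records remain unique on these quasi-identifiers")
```

A k of 1 means at least one record is still unique on those fields - precisely the situation described above.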

So the lesson here is rather simple: you're collecting a massive amount more than you really think.

The next surprise was how tricky, or "interesting", this becomes when developing a privacy policy that must contain all the necessary details about data collection, meta-data collection and traffic data collection, and then the uses to which that data is put, whether the collection is primary or secondary, and so on.