Wednesday, 1 July 2015

Pluto

On the 14th of July a small and very fast space probe will fly by Pluto and its menagerie of moons. In the space of a few hours we'll learn more about Pluto and its companions than we have since its discovery on the 18th of February 1930.

In fact we've learnt huge amounts so far even at the huge distances New Horizons is away from the planet (yes PLANET!).

At the moment we have four exciting missions ongoing and actively producing data: five if you include Cassini, which just seems to keep going and going, to the point where one could even be quite blasé about its constant stream of images. OK, there *are* others, but for those of us who particularly like planetary exploration... I'll come back to Mars and Jupiter in a moment.

Of the four I'll pick out Rosetta and Philae, circling and sitting on a comet nucleus respectively - I think Philae should be renamed Phoenix after its return from the dead. Then there's Dawn orbiting Ceres, tantalising us with ever better resolution of a world that has gone from probably being considered no more than a lump of rock to one that is possibly even active. Finally, New Horizons itself.

We've also lost two probes this year: Messenger and Venus Express, as their missions came to an end. Juno is still on its way and hopefully the mission planners will get us a better look at Europa along the way, and of course there is a flotilla at and on Mars.

I've probably missed a few from the above list - it is getting difficult to keep track especially when even the Chinese surprise us with things such as a quick visit to a passing asteroid!

But despite all of this excitement, New Horizons brings a little sadness: we'll have completed the initial exploration of the nine planets. As a child I watched the Voyager probes, sometimes in the middle of the night, especially when Voyager 2 reached Uranus and Neptune. In both cases it returned not just surprises but shocks - the cliffs of Miranda, nitrogen geysers on Triton, anyone?

Pluto was left alone, unvisited and somewhat unloved.

Now we get a few weeks of excitement and a day or so of wondrous revelations and New Horizons departs giving us our first and last view for a long time of this mysterious place.

And that will be all nine, and so the first major milestone in the exploration of our Solar System comes to an end.

There are silver linings to this cloud: the stream of data from all the probes and years of research on New Horizons' data will hopefully provide more impetus to space exploration. We can't stay rooted to our Blue Marble forever and we must pave the way forward, not just in exploring new worlds but also in understanding how the Universe operates and what is out there. These space probes furthermore push technology to unexplored boundaries, with many unexpected innovations reaching even our daily lives.

Finally, New Horizons will look back at Pluto, probably go into hibernation and continue to a possible second target. A world that even until the 1980s and 1990s was purely hypothetical. Considering that until 1989 we expected Neptune's moons to be inert and barren rocks and found something completely different, and now expecting the same - at least in terms of surprises - at Pluto, we should have learnt by now, after the exploration of the first nine planets, that whatever comes next is going to be truly wondrous.





Monday, 29 June 2015

UK Ontology Network 2015 Presentation

I really have been a bit lax here of late - not that I have little to write, but rather that time (as always!) is short and the day job takes me away from privacy into new, uncharted, more security-research-related areas - which I will admit is great fun!

Back in April I attended a fascinating little workshop run by the UK Ontology Network to talk about ontologies for privacy. What makes this workshop fun is the sheer amount of interaction between the participants: the distinction between presenter and audience is completely blurred. As a presenter you get a 5 minute slot; then, after a number of presenters on similar subjects, there is a 20 minute panel session where the audience really gets into the conversation.

I'm going to be unfair to all the presenters at UKON 2015 but I'm going to pick out a presentation by Adam Nogradi on Sparqlycode - a tool for semantically annotating source code and establishing compliance against, in this case, certain security guarantees. Of course, to move to a different area all you need is an ontology for your subject area, say, privacy, and you have a tool for privacy compliance...more on this later :-)

So here's my presentation based on the work in the book Privacy Engineering: A Dataflow and Ontological Approach.




And the book, containing a much more detailed description, can be bought from Amazon, Barnes and Noble etc.


Saturday, 27 June 2015

An article on The Semantics of PII


A while back I wrote a short article for the IAPP's Privacy Tech Blog. With permission I'll reproduce it here for additional reference. Also, a tip of the hat to the administrator of the blog, Jedidiah Bracy of the IAPP, for his spell checking, grammar checking and editorial skills!


The Semantics of PII
Privacy Tech | Feb 26, 2015


Last year, Profs. Peter Swire and Annie Antón wrote a compelling piece in Privacy Perspectives about the need for privacy engineers and lawyers to get along. Establishing a common language in which to communicate will be essential to appropriately connect policy with technology.

It’s probably safe to say that the most common terms used in privacy are personally identifiable information (PII) and personal data, depending upon whether you come from a U.S. or European background. I think these terms are more or less self-explanatory.

But what do they really mean?

Take PII, for example. It means a chunk of data that reveals some knowledge about a person that can be unambiguously identified. Sounds more or less about right, doesn’t it? Is a computer's IP address personally identifiable? What if that IP address belongs to a router for a large, multinational corporation? Is it PII then? And what if it belongs to a family using multiple computers, tablets, phones or other devices?

We will soon find ourselves delving into the minutiae of meaning - the what-does-personal-really-mean type questions. Plus, we must ask what is information, and what does identifiable denote?

There is a whole area of linguistics, philosophy and mathematics—take your pick—that deals with the meaning of things, otherwise known as semantics, or even semiotics if you want the overall field.

Mathematicians took years to fully understand the semantics of even simple statements such as 1+1=2, which looks obvious until you try to explain what 1 is, what 2 is, what + means, what = means and then what it means to say 1+1. The English philosophers Bertrand Russell and Alfred North Whitehead spent most of their careers writing Principia Mathematica to answer this question, and after four editions and 300 pages of dense mathematics, they had an answer. That was, until a young Austrian by the name of Kurt Gödel came along and shook mathematics to its foundations with an equally "trivial" result.

So if it took 300 pages by two of the brightest minds in mathematics to give us a semantics for 1+1=2, how many pages—and years of work—will it take to give "PII" a semantics?

Now here's an interesting point: The definition of PII that is used in contemporary privacy is perfectly well defined in the privacy-legal context. I can go to various legal documents and read a formal definition of what PII or personal data means. But as we move between disciplines—in our case from privacy-legal to privacy-engineering disciplines—these definitions no longer hold, or at the very least, they don't work well.

If we move to the other end of the scale from legal to mathematics, we find concepts such as information entropy, which provides a clear, unambiguous and precise definition of what information is as well as the identifiability of a data set with respect to some population and so on. Information entropy, however, is not an easy concept with which to work. We can state now that the legal definition of PII can be defined in terms of the mathematical definition; it's just that this is obscenely difficult to do.
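As a rough illustration of what the mathematical end of the scale looks like, here is a minimal Python sketch of Shannon entropy and of the "bits of identifying information" gained when a population is narrowed down to a smaller matching group. The function names and the toy postcode data are hypothetical and chosen only for this example; this is not a working definition of PII.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def identifying_bits(population_size, matching):
    """Bits of information revealed by narrowing a population of
    `population_size` people down to the `matching` people who share a value."""
    return math.log2(population_size / matching)


# Hypothetical toy data set: one postcode area per person in a population of 8.
postcodes = ["A", "A", "A", "A", "B", "B", "C", "D"]

print(shannon_entropy(postcodes))   # average information carried by the attribute
print(identifying_bits(8, 4))       # living in "A" narrows 8 people down to 4
print(identifying_bits(8, 1))       # living in "D" singles out one person
```

In this toy data the attribute carries 1.75 bits on average, while the rare value "D" carries the full 3 bits needed to single out one person in eight. The same field can therefore be identifying for one person and not for another, which is part of why a legal definition is so hard to express in these terms.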

Somewhere between these two extremes lies software engineering, the discipline that actually implements privacy law into our systems, in ostensibly mathematical (programming language) terms.

Software engineers, much to the chagrin of privacy lawyers, do not understand legal terms. Well, ok, they do to a point, but you try coding a statement such as "reasonable privacy" into C++ or Java!

Plus, privacy lawyers don't understand all the subtle ramifications of virtual machines, machine language, object orientation, distributed computing, network protocols, XML, RDF—the list goes on!—again, much to the chagrin of software engineers.

Yet, as we stated earlier, there is a relationship between the terms and language that privacy lawyers use and the terms and language that software engineers use. That link provides the translation mechanism that allows both groups not just to talk but to properly communicate with each other.

We can spend as much time as we’d like writing manifestos and principles, designing processes, inventing new job titles such as privacy officer, privacy compliance tsar, grand chief-overseer-of-the-worshipful-court-of-privacy-dudes and so on, but without grounding semantics into terms such as PII and personal data—terms that will allow us to translate between legal-speak and engineer-speak—all of this work will be in vain.

Thursday, 25 June 2015

What is quality?

Everyone talks about quality, whether it be a quality customer experience, or quality product, or even a quality process...still not sure what that is...but ask the question of "what is quality?" and you won't get a good answer.

There are two good places to start:
  1. Robert Pirsig's Zen and the Art of Motorcycle Maintenance
  2. John Guaspari's "I know it when I see it"
Anyone working in "quality" should have not just read but completely internalised both books before uttering a further word on the subject.

Interestingly, quality doesn't necessarily mean expensive or the best - though being called the "best" occurs when you have quality.

For example, SAS actually have a "quality" long-haul economy product. This doesn't mean it is great, but they do their best to keep passengers (on an 11 hour flight) well fed and watered. Now just because I had a good experience and perceived their economy product to be of a high quality doesn't necessarily mean that they shouldn't improve on it. Trust me, there are LOTS of things SAS could do to improve on their economy product.

Airlines abound with examples of quality: for example, Finnair vs Norwegian - the former just feels like an expensive Ryanair but without the friendliness, while the latter gives you what you need plus free Wifi (slow Wifi, but Wifi still). To me, Norwegian provides a quality product. Another example is Lufthansa: while their long-haul economy offering is poor from a seating point of view, their food (SAS take note, I get beer and wine for free with my meal!) and superb crews, plus the added advantage of using the A380, mean I choose to fly Lufthansa when I can.

But today I came across something interesting while stopping at a cafe to buy ice-cream for my children. There is no doubt this particular cafe has good service, smiling and ample staff, and excellent ice-cream, though a bit pricey! But even with 3 members of staff and a manager present (I counted), they couldn't actually keep the 10 or so tables clean. It really isn't pleasant to have to sit at a table with someone else's food still left there, or to have to clean it up yourself.

While everything about this cafe says "we provide a quality product" this is let down by a very simple action of not cleaning up.

This got me thinking, especially when referring back to Pirsig's observation that even though you might have the most expensive and best built motorcycle, it only takes a single screw's thread to strip, or its head to be ground away, to reduce that most expensive of machines to being totally unfit for purpose - in Pirsig's case, unrepairable because of a single flaw in the cheapest, most innocuous of components.

Funny how failing at a simple act such as clearing a table can ruin a cafe's reputation; or how a faulty IFE screen can ruin a flight. Sometimes it is just attitude that ruins everything... when a much respected company fails ever to ask anything of, or even to respect, a semi-regular customer, then the quality of the whole is reduced to nothing.

Sunday, 3 May 2015

Privacy Engineering on Wikipedia

I noticed that an entry for privacy engineering was missing on Wikipedia, so in the true traditions of Wikipedia I created it:




It is a bit sparse at the moment, but here's hoping that the privacy community adds material and the Gods of Wikipedia look favourably on the entry...


Friday, 1 May 2015

Goodbye Messenger

One of the first posts here was about NASA's Messenger probe to Mercury, and today its mission ended. Here's Messenger's final image and tip of the hat, or a beer or two to NASA's engineers!


Now it is just waiting for ESA's BepiColombo probe to arrive - it departs in January 2017, makes 2 fly-bys of Venus, 5 of Mercury before entering orbit in January 2024.

Thursday, 30 April 2015

Automating Privacy Compliance

I was at the 2015 meeting of the UK Ontology Network in Leeds earlier this month where, amongst many, there was a presentation about a tool called Sparqlycode - which if you get the chance you must check out!

Anyway, Paul Worral of Interition Ltd wrote a very nice summary of my work:

Ian Oliver [Nokia Networks, Espoo, Finland] presented Ontologies for Privacy.  The whole idea behind Sparqlycode is to provide an information tier for software that enables it to be linked to knowledge about the business.  Ian's work is a perfect example of this. He demonstrates how high-level policies on Personally Identifiable Information should and can be directly related to the code responsible for adhering to them. Ian has authored a book on the subject, Privacy Engineering. I bought it and hope to have some examples of how it can be applied to Sparqlycode soon.

I look forward to seeing how this works out, but it certainly is in the direction of where I hoped. It'll be very interesting to see how this particular approach matches with a more data-flow based modelling approach.

At UKON2015 there was also an extremely interesting presentation about a tool called TawnyOWL for programmatically generating ontologies. Given that Clojure is my current language of choice this seems a perfect fit for the privacy ontologies themselves.

Tuesday, 28 April 2015

Modelling Privacy

Here's a teaser from the next book on privacy. ETA late 2015, December - just in time for Christmas - if I work really hard!


It will complement the existing book Privacy Engineering and build upon more of the data flow modelling, use of taxonomies and techniques for analysing models such as those from safety critical engineering, e.g. FMEA, RCA etc.

In the meantime, Privacy Engineering is available from Amazon.com, Amazon.co.uk, CreateSpace as well as Barnes and Noble and even book stores such as CDON.fi here in Finland.

Monday, 27 April 2015

Privacy Awareness Training

I had the pleasure of presenting at the IAPP's DPIntensive workshop in London this month. After my session I got to talk with many about how to move privacy forward beyond an insular group discussion properly towards the engineers whose job it is to build the systems that implement these privacy rules.

One thing that came up was the need for training, and that privacy awareness training hasn't had the effect hoped for. Given that awareness training is exactly that, is it any surprise that once the (usually) one hour presentation on how we should all care about privacy has been given, nothing happens?

Primarily this is because awareness training is by its very nature very abstract at best and irrelevant at worst. Awareness training is also rarely followed up by more context relevant training, for example, for the software architects or programmers or marketers and so on.

There are various reasons for this, mainly, that to continue training in such a manner takes a great deal of effort to set up and comes with an interesting catch-22 problem: the privacy department/group/... probably doesn't have any engineers; which makes generating relevant training for engineers remarkably difficult.

Worse is that because of the current nature of privacy - it is primarily a legal discipline, albeit one trying to break through to engineering - very few engineers move towards or even into privacy.

One member of the audience at the DPIntensive workshop remarked on this stating that this was one of their biggest problems, especially as they had so much to learn from engineering.

The other major difficulty is that the structures that need to be put in place in order to translate between a legal discipline and an engineering one are undoubtedly complex. Consider a linguist trying to create a translation into an as yet not understood language: first one must understand the script, the syntactic structure and then the semantic ones - not to mention the whole problem of the pragmatic structures and idioms that exist before a degree of fluency is reached that makes translation or even basic conversation possible.

So, the problem with privacy awareness training is that it becomes almost impossible to follow up and continue beyond anything more than a broad, common denominator.

Such training is, however, fantastic for metrics... make the training compulsory and you'll get 99% of the company taking the training, which normally lasts an hour and can be delivered by webcast or similar. Working with metrics and a delivery mechanism like that makes it an amazing vehicle for improving 'management' metrics, which in this case are exactly the wrong metrics, at least from the point of view of the good of the company.

So next time you create a privacy awareness training consider:


  • whether that training is aimed at a particular audience, or is broad and generic
  • how that training is to be followed up
  • what effects you expect to see
  • how you will measure which effects of the training actually went into practice

We can go further and ask what cultural changes happened due to the training, from the point of view of:
  • the programmers
  • the engineers
  • the overall R&D
  • the management
  • the marketing department
  • the legal department
  • the privacy group

Unless all of the above can be answered then the privacy awareness training will have no overall or lasting effect.



Monday, 6 April 2015

Quote of the Day about Truth

Emile Zola:

 “If you shut up truth and bury it under the ground, it will but grow, and gather to itself such explosive power that the day it bursts through it will blow up everything in its way.”

Saturday, 4 April 2015

2015 UK Election Leader TV Debate

Whether leadership debates are a good thing or not is itself a debate; however, ITV's UK Leadership Debate with seven party leaders was held, with the result that Miliband (Lab) "beat" Cameron (Con) by a small margin. YouGov ran a survey asking who people thought won the debate, with the results shown below (source: Guardian).

One thing however is not explained, and that is who was asked. Obviously if you'd polled in Ceredigion or Gwynedd then Wood (PC) would have won, if in Brighton then Bennett (Greens) and so on. I assume that we could say that this was a representative sample from across the UK, but still it is going to be heavily weighted in favour of the national parties and then especially the two leading parties.

This got me thinking. You can tell whatever story you want with statistics - think of it as accountancy with more leeway - so could the above figures be weighted according to the UK electorate, especially as two of the parties involved do not campaign outside of Wales or Scotland?

The electorate figures for England, Scotland and Wales for 2013 according to the Electoral Commission are 40,100,000, 4,100,000 and 2,300,000 (to nearest 100,000). Given this I think it is obvious that the above results are going to be skewed towards the established parties.
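To make the skew concrete, here is a small Python sketch that turns the three electorate figures (taken as 40.1, 4.1 and 2.3 million) into shares of a proportionally drawn sample. The sample size of 1,000 is purely hypothetical, since the survey's actual sample is not stated here, and Northern Ireland is left out because it is not among the quoted figures.

```python
# Electorate figures quoted in the text (2013, to the nearest 100,000).
# Northern Ireland is not among the quoted figures, so it is omitted here.
ELECTORATE = {
    "England": 40_100_000,
    "Scotland": 4_100_000,
    "Wales": 2_300_000,
}


def electorate_shares(electorate):
    """Return each nation's fraction of the combined electorate."""
    total = sum(electorate.values())
    return {nation: size / total for nation, size in electorate.items()}


def expected_respondents(sample_size, electorate):
    """Expected respondents per nation in a proportionally drawn sample."""
    shares = electorate_shares(electorate)
    return {nation: sample_size * share for nation, share in shares.items()}


# Hypothetical sample of 1,000 people; the real survey size is not given.
for nation, count in expected_respondents(1000, ELECTORATE).items():
    print(f"{nation}: about {count:.0f} of 1000 respondents")
```

On these figures roughly 86% of a proportional sample would be voters in England, under 9% in Scotland and under 5% in Wales. A leader who only stands in Wales or Scotland therefore starts with a very small pool of respondents who could ever vote for their party.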

Furthermore the SNP are fairly well known and have a more 'national' or UK-wide agenda than Plaid Cymru who are much more focussed on Wales. Leanne Wood (PC) for example is standing as a member of the Welsh Parliament rather than Westminster. Welsh politics rarely feature outside of Wales, except for a strange incident back in 1997 (one, certainly for the conspiracy theorists). Ironically given the current constitutional issues since the Scottish independence vote, it has been Rhodri Morgan, leader of the Welsh Government who has been proposing ideas (even at EU level) of how the UK and Northern Ireland should be governed.

That given, the above figures on who "won" really should be taken much more in the context of the audiences to which they are most relevant. The above is so heavily biased towards an English view - not that there's a problem with that - that it gives a false impression to voters in Wales and Scotland. Furthermore, given the size of England and its electorate, even the above figure does not truly represent England - how would it look in the context of Thanet versus Toxteth?

So getting back to how the leaders actually did in the debate, it would be best to take each individually, especially as each has very different leadership goals. Probably the best overview of each of the party leaders' performances was given by the Telegraph. Or as another put it, four out of touch public school boys taken to task by three women :-)


Sunday, 22 March 2015

Slowing Down Software Development


Stephen Wilson, in his blog post Programming is like Playwriting (23 Feb 2011), which recently resurfaced via a Twitter conversation, makes a few interesting points about how we write software and how the tools and speed of development cause some very interesting quality problems.

Coding is fast and furious. In a single day, a programmer can create a system probably more complex than an airport that takes more than 10,000 person-years to build. And software development is tremendous creative fun. Let's be honest: it's why the majority of programmers chose their craft in the first place.

Actually I found this statement ironic, especially in light of the Denver Airport Baggage System - which itself became far more complex than the rest of the airport's operations.

So, picking out two salient points:

We took our time. I was concerned that the CASE tools we introduced in the mid 90s might make code rather too easy to trot out, so at the same time I set a new rule that developers had to turn their workstations off for a whole day once a week, and work with pen and paper.

I worked a long while back in software-hardware co-design. To best understand the difference, consider these situations:

Software - compilation and testing phases

$ vi myProg.c
$ gcc myProg.c
$ ./a.out


repeat multiple times per minute/hour as necessary. The cost of compiling and editing is measured only in man hours.

Hardware - compilation and testing phases

  • Send net list to TI, Philips or whoever for ASIC manufacture
  • Pay $1,000,000
  • Wait 3-6 months
  • Receive single ASIC in post
  • Test

Maybe the solution is to charge for each compilation? Actually I knew one developer who added sleep statements to his compilation scripts so that the act of compilation would become so 'expensive' that he spent much more time ensuring that the code worked before compiling.
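The sleep trick is easy to picture. The following is only a sketch of the idea in Python, not that developer's actual script: the function name `slow_compile` and the default 60 second delay are assumptions made for illustration, and the `sleep` and `run` parameters exist only so the wrapper can be exercised without really waiting or invoking a compiler.

```python
import subprocess
import time


def slow_compile(command, delay_seconds=60.0, sleep=time.sleep, run=subprocess.run):
    """Make compilation artificially 'expensive'.

    Waits `delay_seconds` before running `command` (a list such as
    ["gcc", "myProg.c"]) and returns the compiler's exit code.
    """
    # The delay is the whole point: it is the price paid for each compilation.
    sleep(delay_seconds)
    result = run(command)
    return result.returncode
```

Calling `slow_compile(["gcc", "myProg.c"])` would then wait a minute before handing over to gcc. That is nowhere near the 3-6 months of the hardware case, but it is enough to make a second read-through of the code cheaper than a speculative rebuild.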

My internal coding standard included a requirement that when starting a new module, developers write their comments before they write their code, and their comments had to describe ‘why’ not ‘what’. Code is all syntax; the meaning and intent of any software can only be found in the natural language comments.

Formal specification? Now whether you use B, Z, VDM or any of the other host of mathematical languages (and by the way, C, Java etc are mathematical languages in that sense) along with their tools and techniques is largely irrelevant, though they are rather good at actually expressing the WHY and WHAT!

We have had some excellent results regarding so called 'light-weight' usage of formal methods. The main learning however is not about doing formal methods for the sake of doing formal methods, but the fact that the communication and clarity of the requirements and subsequent code were much improved.
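One way to picture what 'light-weight' can mean in practice is to write the pre- and postconditions of a function down explicitly, even if only as assertions. The sketch below uses Python purely for illustration: it is not B, Z or VDM, it is not the approach used in the referenced work, and the `withdraw` example is hypothetical.

```python
def withdraw(balance, amount):
    """Return the new balance after withdrawing `amount` from `balance`."""
    # Preconditions: what the caller must guarantee before calling.
    assert amount > 0, "amount must be positive"
    assert amount <= balance, "cannot withdraw more than the balance"

    new_balance = balance - amount

    # Postconditions: what this function guarantees in return.
    assert new_balance >= 0
    assert new_balance + amount == balance
    return new_balance
```

Even this much forces the questions a specification is meant to raise: can a balance go negative, and is a zero withdrawal meaningful? Answering them up front is where the improvement in communication and clarity comes from.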

References:

[1] Ian Oliver Experiences of Formal Methods in 'Conventional' Software and Systems Design. FACS 2007 Christmas Workshop: Formal Methods in Industry. BCS London, UK, 17 December 2007 

[2]  Ian Oliver Experiences of Formal Methods in 'Conventional' Software and Systems Design

Thursday, 19 March 2015

Messenger at Mercury .. the "end game"

A long time ago, and probably one of the reasons I started writing this blog, Messenger arrived, or more correctly made a fly-by of Mercury. Now after many years NASA plan some audacious manoeuvres before they finally crash Messenger into Mercury.

Sad to see Messenger's mission end, but the results have been amazing. You can read about the planned hovering and low passes at Science Daily.



Tuesday, 17 February 2015

IWPE2015 Keynote

I'm giving the keynote speech at IWPE2015 which is provisionally entitled

"Engineering Privacy as a Safety-Critical Concern"

I'll talk about some tools and techniques which we can use from other domains such as aviation and medicine and how privacy in software engineering is synonymous with safety in these other domains.

Conference details can be found from an earlier posting or via the link above. Conference date is 21st May 2015 and it will be held in conjunction with the 36th IEEE Symposium on Security and Privacy in San Jose, California.

Tuesday, 10 February 2015

Privacy Engineering Tutorial Session held in conjunction with IEEE TrustCom-15

August 20-22, 2015, Helsinki, Finland


Privacy from legal aspects through to engineering concepts has become a defining aspect of system design. Knowledge of how this relatively young and important area links together lawyers and engineers is critical to a proper implementation of privacy beyond mere lip-service and obscure privacy policies.

What would make this tutorial session unique is the presentation of the end-to-end privacy ‘process’ with examples drawn from industry demonstrating how Privacy-by-Design becomes Privacy Engineering with foundational aspects, tools and techniques, risk management, requirements management, checklists, auditing etc being properly integrated together.

Organisers

Dr. Ian Oliver, Nokia, Finland
Michelle Dennedy, VP/Chief Privacy Officer, McAfee/Intel, US
Jonathan Fox, Director Data Privacy, McAfee/Intel, US

Dates

This tutorial will be held on the 20th of August 2015.

Content

This tutorial session will be held in five parts and presented by the three organisers listed above.

  1. Legal Aspects of Privacy For Managers and Engineers (JF)
  2. Privacy Development in the Software Process (MD)
  3. Engineering Foundations of Privacy (IO)
  4. Guest Lectures
    1. Privacy at F-Secure, Antti Vaha-Sipila, F-Secure
    2. Privacy at Nokia, TBD
  5. Discussion (All)

The above sessions are supported by material in the following books:

  • The Privacy Engineer's Manifesto - Apress
  • Privacy Engineering: A dataflow and ontological approach - CreateSpace

Contact

Please direct enquiries and registration for the tutorial to Ian Oliver.