Saturday 25 February 2012

The Requirement of using Formal Methods

A very interesting posting by way of LtU about the ethics, morals and necessity of using formal methods when developing software: When Formal Systems Kill: Computer Ethics and Formal Methods by Darren Abramson and Lee Pike. I'll cherry pick two quotes that stood out for me from this paper: the first being:

How to mathematically model formal systems and their environments. Unlike other engineering artifacts that are mathematically-modelled (bridge stresses, aircraft aerodynamics, etc.), many concepts in computer science are brand new, and a research challenge is simply how to mathematically model these concepts. Such concepts of course include modelling software and hardware and the environments in which they operate.

The fact that much of the work we're doing in the computer science area is brand appears curious, but if you think that in many cases the mere act of formalising something reveals how little is understood about the underlying theory. So while a new design of bridge can be built, it still uses well known and (importantly!) well understood underlying theories to verify and validate its design. Compare this to situations that arise in computer science, eg: the formalisation of policy structures for privacy, or the design of a space-based semantic web inspired, personal cloud and you quickly realise how little we know.

Privacy is an interesting issue, for example we have many groups proposing "engineering methods" for design of software systems but then failing on the actual theory.

Another case was the M3 project [1,2] and now there's a group trying to replicate that work without even attempting to learn the underlying theory...go figure...

But as the paper goes on to state, verification is truly difficult - the size, internal relationships and complexity and interactions within a computer system is immense and very fragile. The paper notes a comparison between aircraft and computer systems; while an aircraft as complex as the Airbus A380 has as many "parts" as a typical large software installation, it can survive partial failure and, more importantly, we knew how the thing would behave mechanically before it even was built.

The discussion, and source of the second quote I'll take, then moved to the ethics and morality of using formal methods as best practice:

If formal methods is a best practice of software engineering, then an engineer who does not employ it is either negligent or incompetent. But formal methods is beyond the capability of typical software engineers (otherwise, why do we need formal methods experts and researchers?) or is too time-intensive to employ, so it cannot be considered to be a best practice today.

Sadly this quote is too true, but is deeply troubling when you consider how much software is actually in production.

I feel there is an irony here operating at many levels. After years of working with the UML - a language severely criticised by some for not being formal (and vice versa by others) and techniques such as MDA, we had developers using formal mechanisms without even really knowing it. Programming languages themselves are formal languages.

Then there's the irony of agile methods which apparently eschow any use for formality and modelling (so I am told and have experienced) - the irony here is that agile methods rely upon very strict communication between the developers, testing, validation, verification and common, consistent shared understanding of what is being developed.

Our own experiences of formal methods (Alloy, B-Method [4]) being used as a best practice have been documented [3] and the results we had very extremely encouraging. Though I do remember one episode where some "uber-architect" almost had a fit that we reduced his grand designs to a single page of relatively simple specification. I guess that simplicity was overrated - suffice to say, that particular design failed for many of the reasons given the the above paper.

Overall, the paper "When Formal Systems Kill..." is an extremely important discussion on software engineering failure to embrace and utilise the very techniques that it is based upon will some very damaging and frightening outcomes.

References:
  • [1] Oliver I, (2009) Information Spaces as a Basis for Personalising the Semantic Web. 11th International Conference on Enterprise Information Systems6 - 10, May 2009 Milan, Italy
  • [2] Oliver Ian, Honkola Jukka, Ziegler Jurgen (2008). “Dynamic, Localized Space Based Semantic Webs”. WWW/Internet 2008. Proceedings, p.426, IADIS Press, ISBN 978-972-8924-68-3
  • [3]Ian Oliver (2007). Experiences of Formal Methods in 'Conventional' Software and Systems Design. BCS FACS Christmas Workshop: Formal Methods in Industry, BCS London, UK, 17 December 2007.
  • [4] Ian Oliver (2006). A Demonstration of Specifying and Synthesising Hardware using B and Bluespec. Forum on Design Languages FDL'06

Thursday 23 February 2012

Modelling

I've been recently involved in some work involving modelling of information systems (or software systems if you prefer). What has been quite interesting is the amount of effort it takes to move beyond the idea of "let's make a model" to actually doing modelling.

Without going into certain specifics, we were tasked to build a "classification hierarchy" of something - or more specifically, could I build a database (with the implication that the database might be in Microsoft Excel or something similarly unsuitable for the task).

Ontologies are complex beasts to build, especially when one must take into consideration its semantics and expressability (see: [1]). Combined with the fact that you then have relationships, attributes, aspects and a host of other details to take into account, finally topped with mapping (architecting) this to something implementable such as a relationship database structure.

What I have noticed, implementation difficulties aside, is that there seems to be a trend for those providing the requirements to over simplification, typically, dropping of aspects; overcomplication, ie, the concentration on certain specific minutae of details, and a general misunderstanding, or is it an incomprehension of the underlying concepts that need to be modelled.

Indeed it might even be so that the act of modelling reveals too many details about the underlying, and still somewhat undiscovered underlying theory that are just too hard and too complex to initially understand.

As an example, I give the concept of "Personally Identifiable Information" or PII from the area of privacy. What constitutes PII?  Obviously, it is names, identifiers etc? Obviously...? Now what if I stated that we don't need any of that, but maybe just two abstracted locations to uniquely identify people, eg: home and work addresses to the nearest 100m (or even greater). Two items of data that might not qualify to be PII under normal conditions conspire together to be PII.  PII is considered one of the central concepts in privacy and yet remains stubbornly elusive to precise definition.

My next point regarding modelling is the lack of analysis that is made. Let's start with a few definitions: If I draw something, I'll call it a picture. If I give some semantics (maybe formally) then we can call it a diagram. If I describe some system in terms of those diagrams I'm actually going through the process of modelling, but still I don't have a model until I can use those diagrams to actually say something about the thing or system I have described. If we have procedures and techniques for analysing the information contained within those diagrams then I have a model. Until then we're just drawing pictures...

Diagrams + Analysis = Model

Aside: A diagram could also be textual as well as pictorial, or a heterogeneous combination of both.

So, given the ontologies that I (or we) have been developing, do we have methods of analysis associated with those, and if now, why are we actually going through this procedure? Now, one could argue that the act of trying to formally specify an ontology provides some valuable insights into the underlying theory but the final outcomes still needs to be used and provide value in its application.

I'm giving a course on conceptual modelling next week, should be interesting to place the participants actually in the situation where they need to think more about the concepts, the underlying semantics and theory rather than the modelling language itself [2] or some bizarre implementation. Always a revealing course to run...

References:
[1] The superb, Description Logic Complexity Navigator
[2] Alex Bell (2004). Death by UML Fever. ACM Queue. Volume 2 Issue 1, March 2004

Wednesday 22 February 2012

Visiting Fellow

Just became a Visiting Fellow at the University of Brighton in the School of Computing, Mathematical and Information Sciences working with a great team of people in the Visual Modelling Research Group.

This formalises work we've made over a number of years and gives ample opportunity to return to some more academic and research routes, especially in the area of information systems and privacy.

Thursday 16 February 2012

Some maths and programming links for later

Not a discussion but just some links to follow up on later: Math2.0 - a discussion forum for mathematical topics and a discussion found via Lambda-the-Ultimate on concatentive programming: Why Concatenative Programming Matters.

Will write something on classifying privacy later...

Saturday 11 February 2012

Home cooking: aioli

There's something about home cooking, a few, simple ingredients that just beats even the best restaurants. Some of the best meals, if not, the best meals I've had have been home cooked.

So, in the spirit of the above, I decided to make some aioli today (done this before). Now, all an aioli is, is a mayonnaise made with garlic and maybe mustard as additional ingredients. Actually despite its reputation, a mayonnaise is actually quite simple to make - as long as you can whisk fast enough.

The surprising thing about good, home made mayonnaise, or aioli this case, is how different it tasted from anything is a bottle or jar, ie: mass manufactured. Rather than getting a sweet, white, runny sauce, you get a complicated, rich, yellow, fruity, peppery, spicy sauce with lingering and changing aftertastes - superb with pasta or fish.

My particular combination today: aioli made with, garlic, egg, olive oil, whole-grain dijon mustard and a touch of black pepper.

Wednesday 8 February 2012

A hierarchy of privacy?

I've been wondering about privacy and its hierarchy. Specifically after reading the article Hierarchy and Emergence by David Corfield on The n-Cateogry Cafe (July 18, 2008),  do we have a good hierarchy of privacy?

In that article, Corfield presents a list of hierarchies, eg:

sound, vocable, word, utterance, conversation, discourse

and discusses things such as while mathematics deals with the lower levels, eg: the theory of sound waves, it does not handle the upper levels well. Corfield goes on to discuss whether mathematics can explain symphonies etc.

So, with that as a starting point, can we do something for privacy and in particular, information privacy and its relationship with data. Would a hierarchy such as:

field, type, object/table/class, database, corpus ....

work or be sensible? I'm not suggesting the above is strictly correct or the only way of defining this, but we need a start and this seems vaguely sensible at the moment

Certainly, mathematics works well at the lower level here...how do we deal with privacy at a field level? Typically, encryption, hashing and various other obfusication techniques. We know what a data field is, how to define it, how to manipulate it and what a field means in various semantic defintions.

At a type level we know the transformations and semantics similarly. For an example of information transformation - street being mapped to city - we have ontologies and consistent mappings that work well here.

There's even a nice metric over these as well: entropy, though typically this manifests itself as a probability distribution rather than a specific value that can be applied - though an ordering relationship obviously exists and this is isomorphic to the subclass relationship found in most ontologies.

At an object/class/table level things get more complicated as individual fields have relationships with other fields which complicates things. A trival example here might be that one obfusicates the date of birth but doesn't obfusicate the starsign and age fields. Both fields have a relationship though in this case the entropy does increase but the amount of information loss is less than one might perceive just by dealing with a field or fields in isolation.

Here I think we already start seeing problems...I don't think really that mathematics has a problem per se but rather the complexity of the solution starts to emerge and we don't readily know how to deal with this, except maybe in an abstract sense in that we can define the properties and structures but not the actual inner workings consistently. Even just the simple act of defining what an object is is fraught with difficulty. One can look at the work on RDF Molecules [1] to see an example of how difficult it can be in defining the boundary of the concept of an object. I'd actually love to perform some analysis on using the expressivity of description logics to provide the various bounds of an object and then combine this with a power law relationship to give probability bounds, eg: this chunk of data constitutes 95% of what I would consider to be the core of some object - aside to self: need to write that down fully.

How does privacy work at the database level; what do databases look like after processing for privacy...as we have seen with AOL [2] and Netflix, there are interesting issues when a database is "anonymised" and then combined with other databases.

Finally, what happens at the corpus level? Definitions get very weak here and maybe not amentiable to our traditional, formal mathematical treatment.

In the comments to the original Corfield's hierarchy article, there is a good point made in that mathematics can describe every level in the hierarchy equally well; but this doesn't mean that mathematics currently explains the links between the levels (hint: composition and aggregation are not easy) nor do we have a good model that explains the hierarchy at all. This might imply that our hierarchy is either wrong, we've missed levels or not decomposed the concepts well.

The main thing here is that I've written this down, even if it is in very early draft form...maybe this'll help with the writer's block...so, A theory of privacy anyone?

References

[1] Tim Finin. RDF molecules and lossless decompositions of RDF graphs
[2] Andrew Orlowski. AOL publishes database of users' intentions. The Register 7 Aug 2006.

Sunday 5 February 2012

Fried carrots with cheese and sage

Have to try this:

Fry 500g small sliced carrots in butter, garlic, touch of salt, pepper and chopped sage leaves. When cooked add a tablespoon (or two) of dark syrup. Serve with goats cheese broken over the hot carrots and spinach.

Might do this as the accompaniment with seared duck's breasts and a port wine sauce...

Also on a food theme, a blog from a photographer/chef friend: From A Cook's Heart

Thursday 2 February 2012

Welsh Independence

The Guardian today has an excellent, well-written, balanced article on the issue of Welsh independence in light of the moves by the Scottish Government.

Could Wales leave the United Kingdom?
Talk of independence is growing – and the referendum in Scotland in 2014 is eagerly awaited. But could Wales really break free from England – and stand on its own?John Harris
guardian.co.uk, Wednesday 1 February 2012 20.00 GMT
 
Leanne Wood is rather different from most of the UK's politicians. Forty years old and a mother of one, she still lives in the same street in the Rhondda Valley where she was born and brought up. She thinks the crash of 2008 should have "resulted in the rejection of capitalism and many of its basic economic and political assumptions", and that the UK's coalition amounts to a "hyper-competitive, imperial/militaristic, climate-change-ignoring and privatising government". She is also a proud republican, who refuses to attend the kind of official events at which the Queen turns up, and was once thrown out of the Welsh Assembly for referring to the reigning monarch as "Mrs Windsor". If any of this chimes with your general view of what's wrong with the world, it's fair to say that you'd like her.

 
Quite interestingly, this article (and comments) are devoid of the usual rabit, anti-devolutionist talk but rather focus on some interesting issues. Now if looking at a post-independent Scotland UK, what would this mean for Northern Ireland and Wales. Certainly in this situation we'd have an extremely powerful English central parliement with two very small additions. Does this imply we'd automatically get a federal UK (whatever the UK means at this point) - Spanish or German style federation? More importantly, at least from the English perspective, is what happens to England; especially what happens to the bulk of England that isn't London or the South-East?

Anyway, I digress, most of the talk about the Welsh Government is along the lines of a certain scene in Monty Python's Life of Brian .... "what have the Romans ever done for us?"