Friday, 19 July 2013

Systems Safety - Defining Moments

As I've been concentrating on "safety improvements", or at least techniques for improving systems, I've tended to focus on four areas:
  • Aviation
  • Industrial
  • Medical (specifically surgical)
  • Software Engineering (specifically information privacy)
The parallels between these areas, both in what each defines as "safety" and in the techniques each uses, should be obvious. However, the question remains: what actually triggered each of these areas to take a more inherent-safety approach, and what will trigger a similar approach with regards to information safety?

Diagram key: y-axis, relative degree of safety embedded into each discipline; x-axis, years since the seminal incident.

Aviation safety's seminal moment was the 1935 crash of a Boeing Model 299 aircraft during a demonstration flight. Instead of blaming the pilots, effort was made to understand the causes of the accident and to develop techniques to help prevent similar accidents in the future, most famously the pre-flight checklist.

For industrial safety, the seminal moment was the 1974 Flixborough Disaster in the UK. This led to a rethinking of how industrial plants are designed and to the development of the notion of "inherent safety".

Surgical safety has quite a long tradition, especially the development of anaesthetic safety from the 1960s onwards and the introduction of a proper systems approach. However, anaesthetists do not feature as prominently as surgeons and physicians, so the fame would probably go to Peter Pronovost et al. for the central-line checklist. This was probably one of the major contributors to the WHO Surgical Safety Checklist, discussed in detail in Atul Gawande's book The Checklist Manifesto, which brings together many of the incidents above.

If you're still in doubt, Atul Gawande's article in the New Yorker magazine entitled "The Checklist: If something so simple can transform intensive care, what else can it do?" (Dec 10, 2007) might help.

Getting back to the crux of this article: what incident will cause a wholesale change in the attitudes and techniques of software engineering, instilling such a sense of discipline that we eradicate errors to a degree that lets us compare ourselves favourably with these other disciplines?

The increasingly frequent hacking and information leaks? The NSA wiretapping and mass surveillance? Facebook's and Google's privacy policies? None of these has had any lasting effect on the core of software engineering. Which suggests either that we place very little value on the safety of our information, or that the economics of software are so badly formulated in society that the triggering catastrophe would have to be huge enough to force societal change.

Interestingly, in software engineering and computer science we're certainly not short of techniques for improving the quality and reliability of the systems we develop: formal methods (e.g. Alloy, B, Z, VDM), proof, simulation, testing, and modelling in general. What we probably lack is the simplicity of a checklist to guide us through the morass of problems we encounter. In this respect we're more like surgeons than modern-day aviators; or maybe some of us are like the investigators of the 1935 Boeing crash and the other aviation heroes still learning their trade?
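To make the checklist idea a little more concrete, here is a minimal sketch in Python of how an information-safety checklist might be expressed and run as part of a release process. The item names, checks, and structure are entirely hypothetical illustrations of the shape such a thing could take, not a proposed standard or anyone's actual method.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ChecklistItem:
    description: str           # what must be confirmed before release
    check: Callable[[], bool]  # returns True if the item is satisfied

def run_checklist(name: str, items: List[ChecklistItem]) -> bool:
    """Walk the checklist, report each item, and stop at the first failure."""
    print(f"Checklist: {name}")
    for item in items:
        ok = item.check()
        print(f"  [{'x' if ok else ' '}] {item.description}")
        if not ok:
            return False
    return True

# Hypothetical example items with placeholder checks; a real project would
# wire these to its own build, configuration, and data-handling tooling.
release_checklist = [
    ChecklistItem("Personal data fields are inventoried and classified",
                  lambda: True),
    ChecklistItem("Data in transit is encrypted",
                  lambda: True),
    ChecklistItem("Retention periods are defined for every data store",
                  lambda: False),
]

if __name__ == "__main__":
    if not run_checklist("Information safety - pre-release", release_checklist):
        raise SystemExit("Checklist incomplete: do not release.")

The point of the sketch is the simplicity: like the pre-flight and surgical checklists above, it doesn't replace the underlying engineering techniques, it just forces the questions to be asked every time.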



