Friday, 20 April 2012

A month in space

The Guardian runs a series (actually various series) on science topics; this month the series "A Month in Space" has a set of simply stunning pictures. Alas, I can't link to the pictures directly, but I can link to the article:

A month in space: A Martian dust devil, Milky Way bubbles and a star trek navigational aid – in pictures

This month's roundup of some of the best space-related images includes a dust devil and its shadow on the surface of Mars, some of the cosmic bubbles spotted by 35,000 citizen scientists, and a new navigational technique for starships exploring the final frontier
Eric Hilaire and James Kingsland  Friday 20 April 2012 19.04 BST

As for the other series, there's an excellent one working through the Periodic Table of the Elements... this month, that most well-known of elements: Praseodymium.

Tuesday, 17 April 2012

Privacy, Dataflow and Nissenbaum ... formalisation?

I read the article by Alexis Madrigal of The Atlantic about Helen Nissenbaum's approach to privacy. It is good to see someone talking about the sharing of information as being good for privacy. Maybe this is one of the rare instances where the notion of privacy has been liberated from being all about hiding your data and protecting the "consumer", and becomes, in my opinion, about how data flows.

To quote from the article, which gives a good example:
This may sound simple, but it actually leads to different analyses of current privacy dilemmas and may suggest better ways of dealing with data on the Internet. A quick example: remember the hubbub over Google Street View in Europe? Germans, in particular, objected to the photo-taking cars. Many people, using the standard privacy paradigm, were like, "What's the problem? You're standing out in the street? It's public!" But Nissenbaum argues that the reason some people were upset is that reciprocity was a key part of the informational arrangement. If I'm out in the street, I can see who can see me, and know what's happening. If Google's car buzzes by, I haven't agreed to that encounter. Ergo, privacy violation.

The first thing here is that Nissenbaum gets us past privacy as a binary thing - it's private or public, where private means hidden. Nissenbaum actually promotes the idea that what matters is how we perceive the data flow, rather than whether something is private or public; again quoting from the article:

Nissenbaum argues that the real problem "is the inappropriateness of the flow of information due to the mediation of technology." In her scheme, there are senders and receivers of messages, who communicate different types of information with very specific expectations of how it will be used. Privacy violations occur not when too much data accumulates or people can't direct it, but when one of the receivers or transmission principles change. The key academic term is "context-relative informational norms." Bust a norm and people get upset.

For a while I've been working on formalising architectures, ontologies, taxonomies and so on for privacy (privacy engineering) - the common factor in all of these is the data-flow. Actually I think some of this is quite simple when thought of in this manner. Firstly, we construct a simple data-flow model:

[Figure: a simple data-flow model - some information I flowing from a node A to a node B]

Aside: this is quite informal and the following just sketches out a line of thinking rather than being a definition.

Some information I flows from A to B. For this information I we can extract a number of aspects: sensitivity, information type, identity (amount of), etc. We can also ask of this particular interaction (A, I, B) whether that information I is relevant to the particular set of transactions or services that B provides. If B requires a set of information H to fulfil the contract with A, then I<=H in this case, which allows A to supply less but should discourage B from asking for more.

We can also look at other factors to make that decision: the longevity of information in B, the ownership of the information once passed to B and, importantly, whether B passes this information on - we come to this latter point later. Ultimately we can assign a weight to this data-flow, though what form of metric this is I don't have a good idea about at the moment; let's call it a, ie: a(I) is some measure of the 'amount of information' weighted by the various aspects and classifications. The above I<=H should then be rewritten as a(I)<=a(H), which better takes account of the weightings of the information classifications.
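
A minimal sketch of this idea in Python (the classification weights and item names are invented for illustration, not a proposal for the actual metric):

    # Illustrative weights per information classification; in practice these
    # would come from the sensitivity/identity analysis described above.
    WEIGHTS = {"name": 1.0, "email": 1.5, "address": 2.0, "account_no": 5.0}

    def a(items):
        """Weighted 'amount of information' carried by a flow."""
        return sum(WEIGHTS.get(item, 1.0) for item in items)

    I = {"name", "email"}             # what A actually supplies
    H = {"name", "email", "address"}  # what B requires for the contract

    # a(I) <= a(H): A may supply less than B's stated requirement,
    # but B should be discouraged from asking for more than H.
    assert a(I) <= a(H)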

We can continue this through a number of other flows and introduce a typing or taxonomic structure for the nodes:

[Figure: data-flows from a user to several typed nodes B, C, D and E]

As B is a bank, the amount of information required tends to be high; if C is an on-line shop, this tends to be lower, and so on. Such a rule might be:

forall u:User, b:Bank, c:OnlineShop, d:NewsSite |
    a( u-->b ) >= a( u-->c ) and
    a( u-->c ) >= a( u-->d )
    ...
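
Continuing the Python sketch, this ordering could be checked against hypothetical per-type requirement sets (the sets themselves are invented for illustration):

    # Hypothetical information requirements per node type.
    REQUIRED = {
        "Bank":       {"name", "address", "account_no"},
        "OnlineShop": {"name", "email", "address"},
        "NewsSite":   {"email"},
    }

    assert a(REQUIRED["Bank"]) >= a(REQUIRED["OnlineShop"]) >= a(REQUIRED["NewsSite"])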

For each node we can better describe the expectation in terms of this metric, ie: a(b), where b is the Bank node from above; restated in these terms, the rule from earlier becomes:

forall u:User, b:Bank |
    a( u-->b ) <= a(b)

Now our weighting function a deals with particular instances, whereas we have stated that there are expectations, so let's introduce a new function that computes a range for a given type; for example, r(Bank) returns a range [ r_min, r_max ]. Then for a particular instance of Bank we get

forall b:Bank |
      r_min(Bank) <= a(b) <= r_max(Bank)
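
In the Python sketch, these per-type ranges might look like this (the numbers are placeholders):

    # Hypothetical norm ranges per node type: (r_min, r_max).
    RANGES = {"Bank": (5.0, 10.0), "OnlineShop": (2.0, 6.0), "NewsSite": (0.0, 2.0)}

    def within_norm(node_type, items):
        """Does the weighted information a(items) sit inside the
        norm range for this node type?"""
        r_min, r_max = RANGES[node_type]
        return r_min <= a(items) <= r_max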

If a given instance, for example e in the above data-flow, requires something outside the range for its type, then we are "busting a norm" for that particular type. Following on from the above rules:

forall u:User, b:Bank |
      r_min(Bank) <= a(b) <= r_max(Bank)
         and
      a( u-->b ) <= a(b)
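
Putting the two conditions together, a sketch of a norm check for a single flow (still under the invented weights and ranges above) might be:

    def flow_ok(node_type, required_items, supplied_items):
        # The node's requirement must sit within the norm range for its
        # type, and the user must supply no more than the node requires.
        return (within_norm(node_type, required_items)
                and a(supplied_items) <= a(required_items))

    # A bank asking only for typical banking information passes; a bank
    # demanding far more would push a(required) above r_max(Bank).
    print(flow_ok("Bank", REQUIRED["Bank"], {"name", "address"}))  # True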


The next thing is to look at the next level in the data-flow graph: where do B, C, D and E send their information, how much, and how do those second-level data-flows affect the first? I guess there's a very interesting feedback loop there. A few other things spring to mind as well: do we see a power law operating over the weighting of the data-flows? Does it matter where, and how much, data flows?
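
One crude way to start exploring that feedback is to fold downstream flows back into the weight of the first hop, so that onward sharing by B makes the u-->B flow itself more expensive. A sketch, where the onward-flow table and discount factor are entirely invented:

    # Hypothetical onward flows: node type -> list of (recipient, items passed on).
    ONWARD = {"Bank": [("CreditAgency", {"name", "account_no"})]}

    def effective_weight(node_type, supplied_items, discount=0.5):
        # Weight of the first hop plus a discounted contribution from
        # each onward flow of the items the user actually supplied.
        w = a(supplied_items)
        for _, items in ONWARD.get(node_type, []):
            w += discount * a(items & supplied_items)
        return w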

Introduce a temporal dimension and plot the above over time and you get a picture of the change in norms and consumer expectations.

Getting back to Nissenbaum's thesis, which is that the expectation of privacy over data-flows is the key and not whether the data flows at all, I think we could reasonably model this.

Monday, 16 April 2012

LoD 2012: Layout of Diagrams 2012


3rd International Workshop on Layout of Diagrams 

Diagrams are an effective means of conveying a wide variety of different sources of information. They play a vital role in communication at many levels, and the quality of diagram layout affects the ease of communication. Creating task-adequate layouts is surprisingly difficult, and the cognitive factors involved are often not very well understood. The automatic generation of diagrams is essential for many tasks such as the presentation of multiple views of large scale data sets. In many domains, tool support is not satisfactory. The workshop aims to bring together different communities that can learn from each other, both within different academic disciplines and between academia and industry.
We solicit original submissions related to diagram layout, in areas including, but not limited to, the following:
  • Layout algorithms;
  • Layout design styles, guidelines and patterns;
  • Automatic diagram generation and transformation techniques;
  • Visual language theory (e.g. quality metric development);
  • Visualisation of constraints, algorithms, and tools;
  • Cognitive or empirical research on diagram layout;
  • Diagram layout for diagrammatic reasoning, knowledge representation, etc.;
  • Application areas (e.g. system modelling, ontology visualisation, etc., with an emphasis on layout requirements and benefits).
All diagram types fall within the scope of the workshop series (e.g. graphs, hypergraphs, Euler diagrams, maps, knot diagrams, etc). Of particular interest are research and techniques that may encourage interaction and knowledge transfer between fields. However, for this instalment of the workshop series, we particularly encourage submissions that have a software engineering orientation (e.g. those within UML, IDEF or ARIS), building on the success of two previous workshops at VL/HCC on the Layout of Software Engineering Diagrams (LED).

The proceedings of this workshop will be published in the journal of Electronic Communications of the EASST (TBC). All papers must conform to the ECEASST format. All papers must be submitted electronically in the PDF format via the EasyChair submission system. Each submission will be reviewed by 3 reviewers, as usual.

The intention is that after the workshop the authors of the best papers will be invited to submit a revised and significantly extended version of their article to a special issue of a suitable journal (e.g. the Journal of Visual Languages and Computing or the journal of Software and Systems Modeling).

Important Dates

  • Abstract Submission: June 11th, 2012
  • Paper Submission: June 18th, 2012
  • Notification to Authors: August 8th, 2012
  • Camera Ready Submission: August 30th, 2012
  • Workshop Dates: Oct 4th, 2012

Software Documentation and Formality

An article linked via Slashdot about using documentation as a bug-finding tool [1] gets me wondering why documentation is so neglected. There are of course many well-researched and known reasons, but a common theme I keep seeing is that documentation implies a degree of formality, ie: the necessity to describe something accurately.

Add to this the fact that documentation tends to be very static - the antithesis of agile methods, which concentrate on the code rather than the documents.

But maybe here is one issue that does need to be actively tackled: firstly, code is part of the documentation, and secondly, the documentation necessarily contains more than the code. Keeping these various aspects consistent is, of course, hard - exceedingly so given that the underlying model that links the various levels of description and abstraction together is never visible. The underlying model implies a formal definition of how documentation relates to code.
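
One small, concrete instance of code and documentation being kept consistent by tooling is Python's doctest, where the example embedded in the documentation is executable - a toy illustration of the direction, not a solution to the general problem:

    def word_count(text):
        """Count whitespace-separated words in text.

        >>> word_count("the quick brown fox")
        4
        """
        return len(text.split())

    if __name__ == "__main__":
        import doctest
        doctest.testmod()  # fails if the documented example drifts from the code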

So far I've just managed to describe some of the ideas of Model Driven Architecture, which, apart from some academic work, never really succeeded in solving the documentation-code issue. Probably the best example I've seen is the B-Method, which does make the link between various descriptions of a system clear through formal refinement, albeit in the same language and with a very well-defined underlying model.

But what is it that we're really trying to convey using "documentation" - is it really an abstract description of the code, or, better still, an abstract description of what the system does, of which the code is then a refinement?

And at what level of abstraction?

[1] Eric, "Documentation as a Bug-Finding Tool", April 11, 2012.

Monday, 2 April 2012

Visual Modelling Group @ Brighton

Just a quick plug for the Visual Modelling Group at the University of Brighton:


The group’s work focuses on visual languages, often developing tool support alongside defining their theoretical underpinnings.

One of the group's main strands of research is designing new diagrammatic logics that are appropriate for practical application. These logics have included spider diagrams, constraint diagrams and, most recently, concept diagrams. In all cases, these logics have formally defined syntax and semantics, and the group has developed inference rules that allow sound reasoning to be performed. The group has also established expressiveness, decidability and completeness results for some of these logics and their fragments.

In terms of tool support, the group has devised novel automatic drawing and layout techniques for Euler diagrams. These diagrams are commonly used for visualizing information concerning grouped data. Automated theorem provers have been implemented for Euler diagrams and spider diagrams.

And a link to an impressive set of papers on the subject: