Ian's Blog: Categorification of Databases

Friday, 8 January 2010

Categorification of Databases

Interesting paper: SIMPLICIAL DATABASES David I Spivak (Arxiv 0904.2012.v1) and the discussion on LtU.

From the abstract:

In this paper, we dene a category DB, called the category of
simplicial databases, whose objects are databases and whose morphisms are
data-preserving maps. Along the way we give a precise formulation of the
category of relational databases, and prove that it is a full subcategory of
DB. We also prove that limits and colimits always exist in DB and that they
correspond to queries such as select, join, union, etc.

and then from the introduction:

One reason that relational databases have been so successful is that their de-
nition can be phrased within a precise mathematical language.

and

It is no accident that SQL uses tables instead of relations: Tables are inherently
more useful, yet just as easy to implement.

So, a category of databases based around tables and morphisms between tables - now as my interest is rather in graph based databases, ie: RDF - there are papers on categorification of RDF or at least using category theory over RDF structures. Of course an isomorphism can be shown (and does exist), but how easily do operations such as join etc (which form limits in the above category DB) correspond to "objects" or more accurately structures such as "molecules" and named graphics in RDF. The description logic handbook has references to papers about the expressability of ER in terms of various DLs...something to think about in the meantime.

Friday, 8 January 2010

Categorification of Databases

No comments: