Saturday, 28 July 2012

Opa Language Tutorial: Part 1

I've been experimenting with the Opa programming language for a couple of weeks now and, for me, it addresses various issues of writing web/cloud services really quite well. It has a bunch of features and some very nice abstractions that make the job much easier. In a nutshell it combines the features and semantics of OCaml, Erlang and JavaScript and binds them together. I'm not going to draw comparisons with other languages and frameworks - I'll leave that to others - but suffice to say, I like Opa and if it does the job, in this case really quite well, then I'm happy.

So that aside I thought it would be nice to present some of my results and experiences of application design and programming in Opa as a tutorial. Opa already has a good overview tutorial and there's further discussion on StackOverflow, the original Opa tutorial (a tour of Opa) and on the Opa forums, but nothing beats learning a language by actually applying it to a real problem and within the process and method structures found in practice.

So what we're going to do is develop a system or application for managing regular expressions. Just something simple enough to build and understand easily, yet using enough basic features to make an interesting application. Here are the requirements:
  1. Store regular expressions with descriptions
  2. Store logical sets of regular expressions with descriptions
  3. Present the data back in JSON format
  4. Have some nice reports and a static front-end to this database for documentation, debugging etc...
  5. Must be REST
  6. Must be "Cloudified"
OK, the last two are architectural in nature and part of the current web/cloud zeitgeist.

The REST paradigm has some good architectural ideas and fits in quite well with that other love of mine: formal methods. Now before you run away screaming saying that "agile" is the way to go and we don't need that stinking design stuff, agile is just a way of managing your processes such that you focus on what functionality to produce next in order to immediately satisfy the customer. In order to understand and prioritise you need to understand what you're doing and effectively communicate that - formal methods give us that clarity. This isn't a tutorial (nor argument/discussion - I've written about this before here and here) about agile and formal methods but I'll be using their principles (see [1]).

But first, let's get started with some design. REST and web design talk about resources which are accessed via HTTP verbs and URIs. Skipping a long thought process, let's partially tackle requirement #4 and we get something like:

Resource    GET
/           Return welcome page. Always successful (HTTP 200 OK)

Let's now write some Opa.

I assume you have Opa installed and are comfortable with programming in general. Here's the entire code for this part of the tutorial:

// Welcome page, served for the root path
function hello() {
   Resource.styled_page("Expressions Server - Hello", ["/resources/css.css"],
       <div>
         This server contains various regular expressions for data analysis either presented individually or by grouping and returned as JSON objects via the REST interface. See API documents for more information
       </> );
}

// Error page, served for every other path
function error() {
   Resource.styled_page("Expressions Server - Error", ["/resources/css.css"],
       <div>
          No idea of what you want, go read the API documents for the web and REST/JSON interfaces, or better still go read the source code!
       </> );
}

// Dispatcher: chooses a page based on the path of the requested URI
function start(uri) {
   match (uri) {
       case {path: [] ... }: hello();
       case {~path ...}: error();
   }
}

Server.start(
   Server.http,
     [ {resources: @static_include_directory("resources")} , {dispatch: start} ]
);

The two functions hello() and error() above should be self-explanatory - their syntax isn't too different from Java, C++ etc. Inside, however, each has a single statement:

Resource.styled_page("Expressions Server - Hello", ["/resources/css.css"],
       <div>
         This server contains various regular expressions for data analysis either presented individually or by grouping and returned as JSON objects via the REST interface. See API documents for more information
       </> );

Resource.styled_page is an Opa function that returns a web page with a title, some attached style files and some content.

  • The first parameter is a string containing the title of the page, in this case it is the string  "Expressions Server - Hello".
  • The second parameter is a list, denoted by square brackets and comma-separated items (we have only one item here). Each item is a string containing a link to some style resource. In this case the link is /resources/css.css which points to some (currently undefined) CSS file.
  • The last entry is XHTML containing the content of the page.

Opa treats XHTML as part of the language and allows Opa code and XHTML to be freely mixed - cool feature! Strictly speaking XHTML is a datatype in Opa, recognised by the use of opening and closing tags; in this case we're using div tags. We'll look more at what you can do with Opa and XHTML later, but for now we can see that this parameter in the above function simply presents us with a div section and some welcoming text.

When called, Resource.styled_page places all this together and returns a resource object which will be presented back to the caller as, effectively, a web page with DOCTYPE and other necessary structures. More strictly a Resource is anything that can be accessed by a URL, eg: webpage, picture, XML, JSON object etc...

Opa treats the last expression of any function (usually its last line) as the value to be returned by that function. Some languages use an explicit return statement whereas Opa doesn't - this somewhat betrays Opa's functional heritage.
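To make the implicit return concrete, here is a minimal sketch. The function page_title() is a hypothetical helper that is not part of the tutorial code, and the cut-down hello() here only exists to show the helper in use; the sketch assumes Opa's ^ operator for string concatenation.

```opa
// Hypothetical helper, not part of the tutorial code: builds a page title.
function page_title(suffix) {
   prefix = "Expressions Server";
   prefix ^ " - " ^ suffix   // no return keyword: the last expression is the result
}

// Cut-down hello() using the helper. The resource built by
// Resource.styled_page is the last expression, so it is what hello() returns.
function hello() {
   Resource.styled_page(page_title("Hello"), ["/resources/css.css"],
       <div>Hello from the Expressions Server</div> );
}
```

Here page_title("Hello") evaluates to the same title string that the tutorial code writes out literally.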

Let's now jump to the Server.start() function:

Server.start(
   Server.http,
     [ {resources: @static_include_directory("resources")} , {dispatch: start} ]
);

Server.start() is Opa's equivalent of main(). In this case we're passing two parameters:
  • the first is the configuration of the server - in this case Server.http is a convenience function that returns a suitable configuration for a basic, default http protocol server.
  • the second is a list of handlers and resources for the server. This appears to be a fairly complex and sophisticated structure; here we provide two entries: a static resource definition and a function that gets called when the server is accessed.
The interesting thing here is the dispatch function: start(). When we call our http server, it in turn dispatches that call to a function called start() which is where the interpretation of whatever we send to the server is done:

function start(uri) {
   match (uri) {
       case {path: [] ... }: hello();
       case {~path ...}: error();
   }
}

Dispatch functions like start(), hard to believe, return objects of type Resource - we'll come to this little piece of static typing in a moment. For now, start() takes a URI as its parameter, which is passed in by the server. So if I point my browser to "http:/" then the URI received by start() is "http:/".

The match-case takes this URI and tries to match it against the cases listed. URIs are actually structures containing fields such as path, query fields and port etc. In this case we're interested in the path. The syntax looks a little confusing, but match compares the URI structure against each case pattern in turn, and each pattern here inspects the path field of that URI. The case statement:

case {path: [] ... }:

tries to match the path of the URI against an empty list pattern. URI paths are lists of strings, if you're wondering. If a match is made then the function hello() is called.

If we pass the URI "http:/" to start() then the path of uri (ie: uri.path) will be empty - there is no path - and the empty list is denoted [] in Opa.

When we call the function hello(), a Resource object as described earlier is returned. This in turn is returned from the function start() back to our Server and ultimately to the program which called the server.

The second statement in the case pattern matching matches everything else. The tilde character (~) here is shorthand for the pattern "path=path", which binds whatever is in the path field to a variable called path. That always matches, so it is effectively a tautology, and we call the function error(), which behaves similarly to hello().
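Since the match syntax can be confusing at first, the sketch below spells out the same dispatcher with each pattern annotated, plus one extra case. The "expressions" path segment is a hypothetical name used only for illustration, and hello() and error() are cut down to keep the sketch short. It assumes the [head | tail] list pattern and the longhand {path: path ...} record pattern, so treat it as a sketch rather than the tutorial's actual code.

```opa
// Cut-down pages, only here so the sketch stands on its own
function hello() {
   Resource.styled_page("Expressions Server - Hello", [], <div>Hello</div>);
}

function error() {
   Resource.styled_page("Expressions Server - Error", [], <div>Error</div>);
}

function start(uri) {
   match (uri) {
       // the path field is the empty list; "..." ignores all other fields of the URI
       case {path: [] ... }: hello();
       // hypothetical extra case: first segment is "expressions", the rest is ignored
       case {path: ["expressions" | _] ... }: hello();
       // longhand for {~path ...}: bind the path field to a variable named path
       case {path: path ... }: error();
   }
}
```

Cases are tried from top to bottom, so the catch-all has to come last. The final case still binds path without using it, so the compiler would be expected to warn about it just as it does for {~path ...}.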

I guess that more or less explains the code. So let's compile it:

ian@U11-VirtualBox:~/opatutorial$ opa tutorial1.opa
Warning unused
File "tutorial1.opa", line 24, characters 14-18, (24:14-24:18 | 705-709)
  Unused variable path.
Warning inclusions.directory_empty
Directory /home/ian/opatutorial/resources is empty.

The command to compile is "opa" followed by the name of the file. The extension is optional but I use .opa for identification. You'll notice the compiler complaining about unused variables and empty resource directories. Take all warnings seriously:
  • in line 24, case {~path ...}: error(), we just don't use path for anything other than the pattern matching, so this is fine(ish).
  • We've included information about resources and their location but not provided anything at runtime. We won't worry about this for now; I've no CSS files there anyway.
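If the unused-variable warning bothers you, one option is to avoid binding path at all. A minimal sketch, assuming the wildcard pattern _ (which matches anything without naming it); hello() and error() are cut down here so the sketch stands on its own:

```opa
function hello() {
   Resource.styled_page("Expressions Server - Hello", [], <div>Hello</div>);
}

function error() {
   Resource.styled_page("Expressions Server - Error", [], <div>Error</div>);
}

function start(uri) {
   match (uri) {
       case {path: [] ... }: hello();
       // catch-all: matches any other URI without binding a variable,
       // so there is no unused variable for the compiler to report
       case _: error();
   }
}
```

The behaviour should be the same as the original dispatcher: the root path gets the welcome page and everything else gets the error page.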
The compiler generates an executable, which in this case happens to be a file containing JavaScript and requires node.js on your system. You might need to make this file executable (I'm using Linux for these examples, never tried this on Windows nor Mac, sorry). The file contains the necessary headers to allow the various Linux shells to execute the script from the command line. Note, it may take a short while to start up.

ian@U11-VirtualBox:~/opatutorial$ chmod u+rwx tutorial1.js
ian@U11-VirtualBox:~/opatutorial$ ls -l tutorial1.js
-rwx------ 1 ian ian 6171 Jul 28 10:59 tutorial1.js
ian@U11-VirtualBox:~/opatutorial$ ./tutorial1.js
http serving on http://U11-VirtualBox:8080

You can leave out the chmod and ls in future; these are there just for explanatory purposes here. Once the executable reports that the http server is running and where (note the default use of port 8080), we can either connect a browser there or use something like curl:

ian@U11-VirtualBox:~/opatutorial$ curl -i -X GET
HTTP/1.1 200 OK
Cache-Control: no-cache
Pragma: no-cache
Date: Sat, 28 Jul 2012 08:03:16 GMT
Server: http
Content-Type: application/xhtml+xml; charset=utf-8

NB: I've cut most of the output just to show the success message (HTTP/1.1 200 OK) from the server. In a browser window it looks like so:

and if we use a longer path (any longer path in this instance will work):

As you can see both of the above correspond to the pattern matching and the calls we make to the hello() and error() functions for this statically defined content in our application.

To close the server use Ctrl+C (or kill -15, kill -9 as necessary).

So that's it for the moment...a fully functional, running web server with a couple of static pages. Not much, but it is a start. Next time we'll get into the database and some REST calls.


[1] Ian Oliver (2007) Experiences of Formal Methods in Conventional Software and Systems Design. BCS-FACS Meeting on Formal Methods in Industry. December 2007. London, UK


Ora said...

Well done. I must say, though, that the MATCH statement (or your explanation thereof, I cannot decide) is a bit confusing...

Ian Oliver said...

Thanks, I'll work on that...match-case is probably too powerful in Opa for its own good :-) But I'll give it some thinking certainly, maybe part 2 which I'm writing now will explain more - bigger match-case statement.