Mini OCaml Tutorial

Task: Simple Formula Library

We think it is best to illustrate the problems encountered and how we decided to organize code in Toss on an example task. In this tutorial, we will build a very simple program to manipulate propositional formulas. Let's state the task as follows: Build a program to compute the NNF of formulas given on input. We will construct the program in several steps, starting with basic OCaml and adding complexity.

Step 1 — The Functions

In this step we construct the main file with types and functions that do the work. This requires nothing more than the basic OCaml tutorial and results in the following file.
Formula.ml
type formula =
  | Var of string
  | Not of formula
  | Or of formula * formula
  | And of formula * formula

let rec str = function
  | Var s -> s
  | Not f -> "(not " ^ (str f) ^ ")"
  | Or (f, g) -> "(" ^ (str f) ^ " or " ^ (str g) ^ ")"
  | And (f, g) -> "(" ^ (str f) ^ " and " ^ (str g) ^ ")"
    
let rec nnf ?(negate=false) = function
  | Var s -> if not negate then Var s else Not (Var s)
  | Not f -> nnf ~negate:(not negate) f
  | Or (f, g) -> if not negate then Or (nnf f, nnf g) else
      And (nnf ~negate:true f, nnf ~negate:true g)
  | And (f, g) -> if not negate then And (nnf f, nnf g) else
      Or (nnf ~negate:true f, nnf ~negate:true g)

let _ = print_endline (str (nnf (Not (And (Var "X", Not (Var "Y"))))))      
    
Note that we have only two functions: str to print a formula to string and nnf to convert it to NNF, with an optional argument that allows to negate the formula we are converting. The last line calls the functions on an example formula (not (X and not Y)), and if you execute ocaml Formula.ml you will get the result, namely ((not X) or Y).

Step 2 — Splitting and Compiling

During development, we will almost never keep implementations of types and functions together with the calls which are user-visible — like the one in the last line in the code above. To account for this, let us split the above code into a Formula.ml file containing everything but the last line, and a Main.ml file with two lines: one saying open Formula and another with exactly the last line of the code above. You can now compile these files by running ocamlopt Formula.ml Main.ml and execute the resulting file ./a.out. Since with more files it becomes harder and harder to account for the dependencies, we will not call ocamlopt but use ocamlbuild instead. It is a build system which automatically recognizes dependencies between OCaml files, so to build our two files you just need to execute ocamlbuild Main.native. This results in a file Main.native which, when executed, will print out the same result ((not X) or Y). Note that ocamlbuild will create a directory _build to store intermediate files.

Step 3 — Lexer and Parser

To allow the user to input formulas, we have to construct a lexer and a parser. We will not do it by hand, but use a parser generator called menhir. Here is the simple lexer file we use for lexing formulas.
Lexer.mll
{
type token =
  | ID of (string)
  | AND
  | OR
  | NOT
  | OP
  | CL
  | EOF
}

rule lex = parse
  | [' ' '\t']      { lex lexbuf }
  | "and"           { AND }
  | "or"            { OR }
  | "not"           { NOT }
  | "("             { OP }
  | ")"             { CL }
  | ['A'-'Z' 'a'-'z' '_']['0'-'9' 'A'-'Z' 'a'-'z' '_']* as s { ID (s) }
  | eof             { EOF }
    
and here is a very simple parser.
Parser.mly
%token <string> ID
%token AND OR NOT OP CL EOF

%left OR AND
%nonassoc NOT

%{
  open Lexer
  open Formula
%}

%start parse_formula
%type <Formula.formula> parse_formula formula_expr

%%

%public formula_expr:
  | ID                                    { Var ($1) }
  | NOT formula_expr                      { Not ($2) }
  | formula_expr AND formula_expr         { And ($1, $3) }
  | f = formula_expr OR g = formula_expr  { Or (f, g) }
  | OP f = formula_expr CL                { f }

parse_formula:
  | formula_expr EOF                      { $1 }
    
Note that the syntax is similar to the Yacc system, and a complete documentation can be found in the reference manual of menhir. The Main.ml file can now look like this.
Main.ml
open Formula

let formula_of_string s = Parser.parse_formula Lexer.lex (Lexing.from_string s)

let _ = print_endline (str (nnf (formula_of_string "not (X and not Y)")))
    
and to build it we need to execute to tell ocamlbuild how to call menhir, as follows.
ocamlbuild -use-menhir -menhir "menhir --external-tokens Lexer" Main.native
    
The result of executing ./Main.native should not change, so it should still be ((not X) or Y).

Step 4 — Testing with oUnit

Having a lexer and a parser, we can now complete the simple program for reading formulas from the input and calculating their NNF. But we would also like to keep the test we had above — to make sure that, during further development, the NNF of at least this sample formula will remain correct. This is called unit testing and we will use the oUnit package for it. A small test file with our one test is given below.
FormulaTest.ml
open OUnit
open Formula

let formula_of_string s = Parser.parse_formula Lexer.lex (Lexing.from_string s)

let tests = "Formula" >::: [
  "nnf" >::
    (fun () ->
      assert_equal ~printer:(fun x -> str x)
        (formula_of_string "((not X) or Y)") 
        (nnf (formula_of_string ("not (X and not Y)")))
    );
]
    
We can now complete the Main.ml file. We will make it run the tests if the option -t is given or read a formula from the input and convert it to NNF otherwise.
Main.ml
open Formula
  
let formula_of_string s = Parser.parse_formula Lexer.lex (Lexing.from_string s)

let main =
  let run_tests = ref false in
  let opts = [
    ("-t", Arg.Unit (fun () -> run_tests := true), "run unit tests");
  ] in
  Arg.parse opts (fun _ -> ()) "Try -help for help or one of the following.";
  if !run_tests then 
    ignore (OUnit.run_test_tt ~verbose:true FormulaTest.tests)
  else
    print_endline (str (nnf (formula_of_string (read_line ()))))
    
To compile Main.ml now, with oUnit, we have to add the path to oUnit to ocamlbuild. The following command works both on Linux and on MacOSX with MacPorts. If you use OPAM, the path might be different, e.g. ~/.opam/4.00.1/lib/oUnit — check it out.
ocamlbuild -libs unix,oUnit -lflags -I,+oUnit,-I,+site-lib/oUnit
  -cflags -I,+oUnit,-I,+site-lib/oUnit -use-menhir
  -menhir "menhir --external-tokens Lexer" Main.native
    

Step 5 — Interface Files and Documentation

While we are ready with our small program, it is still useful to look which functions from Formula.ml we use in Main.ml. Since we just call them, we could in principle use all of them. This is not a good practice, especially for larger modules with many functions which should remain private in the module. To declare which functions should be public, and to document them, we use mli files with documenting comments with double **. Here is an example of the mli file for our Formula module.
Formula.mli
(** Module for computing the NNF of formulas. *)

(** Representation of propositional formulas. *)
type formula =
  | Var of string  (** Propositional variable *)
  | Not of formula (** Negation of a formula  *)
  | Or of formula * formula   (** Disjunction *)
  | And of formula * formula  (** Conjunction *)

(** Convert a formula to a string. *)
val str: formula -> string

(** Negation normal form of a formula. *)
val nnf: ?negate: bool -> formula -> formula      
    
To build documentation in HTML from this file, create a new file called Formula.odocl listing the mli files to document, i.e. in this case containing just a single line saying Formula.
Formula.odocl
Formula
    
With this file present, run the following command and look at the file index.html in the created directory Formula.docdir.
ocamlbuild -libs unix,oUnit -lflags -I,+oUnit,-I,+site-lib/oUnit
  -cflags -I,+oUnit,-I,+site-lib/oUnit -use-menhir
  -menhir "menhir --external-tokens Lexer" Formula.docdir/index.html
    

Step 6 — JavaScript Interface

While the program for computing the NNF is finished now and can be run from the command line, most people nowadays do not use the command line but rather run their programs in the browser. OCaml programs can be compiled to JavaScript and run in the browser using js_of_ocaml. While you can find the complete manual on the above site, we present here a simple way of using js_of_ocaml to compute NNF in the browser. Below is the HTML file with the interface and the JavaScript calls, have a look at it and note the new Worker ("JsClient.js"); call, which we explain in detail below.
index.html
<html>
<head>
  <meta http-equiv="Content-Type" content="text/xhtml+xml; charset=UTF-8" />
  <title>Tutorial</title>
  <script type="text/javascript">
<!--
var worker = new Worker ("JsClient.js");
var worker_handler = new Object ();

worker.onmessage = function (m) {
    if (typeof m.data == 'string') {
        console.log("" + m.data);
    } else {
        console.log ("[ASYNCH] back from " + m.data.fname);
        var handler = worker_handler[m.data.fname];
        handler (m.data.result);
    }
}

function ASYNCH (action_name, action_args, cont) {
    worker_handler[action_name] = cont;
    worker.postMessage ({fname: action_name, args: action_args});
    console.log ("[ASYNCH] " + action_name + " (" + action_args + ")");
}

function convert_nnf () {
  var txt = document.getElementById ("formula").value;
  ASYNCH ("nnf", [txt], function (resp) { alert (resp) })
}
//-->
</script>
</head>

<body>
<textarea id="formula" rows="2" cols="40">
not (X and not Y)</textarea> 
<button onclick="convert_nnf()">Convert to NNF</button>
</body>

</html>
As you can see, the code above sends the content of the textarea to the function labeled nnf in the worker. Recall that web workers were introduced in HTML5 to start a parallel thread from JavaScript. We always execute compiled OCaml code in a worker thread, to make sure that it does not make your browser hang, even if the computation takes very long. The postMessage function is the standard way to pass data to the worker, and worker.onmessage is used to get the response. We wrap the calls to the worker in an ASYNCH function, which also allows us to log all calls (useful for debugging). Let us now present the worker itself, which results from compiling the following ml file.
JsClient.ml
open Formula

(* Boilerplate code for calling OCaml in the worker thread. *)
let js_object = Js.Unsafe.variable "Object"
let js_handler = jsnew js_object ()
let postMessage = Js.Unsafe.variable "postMessage"

let log s = ignore (Js.Unsafe.call postMessage (Js.Unsafe.variable "self")
		      [|Js.Unsafe.inject (Js.string s)|])

let onmessage event =
  let fname = event##data##fname in
  let args = event##data##args in
  let handle = Js.Unsafe.get js_handler fname in
  let result = Js.Unsafe.fun_call handle (Js.to_array args) in
  let response = jsnew js_object () in
  Js.Unsafe.set response (Js.string "fname") fname;
  Js.Unsafe.set response (Js.string "result") result;
  Js.Unsafe.call postMessage (Js.Unsafe.variable "self") [|Js.Unsafe.inject response|]

let _ = Js.Unsafe.set (Js.Unsafe.variable "self") (Js.string "onmessage") onmessage

(* The NNF conversion and registration in JS. *)
let formula_of_string s = Parser.parse_formula Lexer.lex (Lexing.from_string s)
let js_nnf s = 
  log ("computing nnf of " ^ (Js.to_string s));
  Js.string (str (nnf (formula_of_string (Js.to_string s))))

let _ = Js.Unsafe.set js_handler (Js.string "nnf") (Js.wrap_callback js_nnf)
    
The first, longer part of the above file is just boilerplate code which creates the postMessage function, required for a web worker, in js_of_ocaml. We will not explain it in detail here, note only the log function, which allows to log messages to the browser console. The actual NNF computation and registration in JavaScript is done in the last 6 lines of the above file. As you can see, we use the same OCaml functions formula_of_string and nnf as in previous steps. The only difference is that we have to convert from JavaScript strings to OCaml with Js.to_string and back with Js.string. In the last line, we register the js_nnf OCaml function under the name nnf in JavaScript. It is then called in JavaScript using the invocation ASYNCH ("nnf", [args], continuation), as shown above in the index.html file. Here is how we compile JsClient.ml.
ocamlbuild -use-menhir -menhir "menhir --external-tokens Lexer" \
  -pp "camlp4o -I /opt/local/lib/ocaml/site-lib js_of_ocaml/pa_js.cmo" \
  -cflags -I,+js_of_ocaml,-I,+site-lib/js_of_ocaml -libs js_of_ocaml \
  -lflags -I,+js_of_ocaml,-I,+site-lib/js_of_ocaml JsClient.byte
js_of_ocaml JsClient.byte
    
Note that the first command is just a standard ocamlbuild call, only this time we compile to bytecode, not to a native executable, and we use the preprocessor that comes with js_of_ocaml (used e.g. in the call event##data##args). The second command invokes js_of_ocaml and compiles the bytecode to a JavaScript file JsClient.js. You can now open index.html in the browser and compute the NNF of formulas. You should also open the JavaScript console (Control-Shift-k in Firefox and Control-Shift-j in Chrome) and see the logs, to make sure it works (use google-chrome --allow-file-access-from-files to start Chrome and allow web workers to be executed from files on disk). While this tutorial presented only a very simple program, the general rules discussed above and the JavaScript boilerplate are at the core of code organization and user interface in Toss.