Getting Started with LLVM and the Ocaml LLVM Tutorial

I recently started playing with LLVM, and I ran into a few issues with getting started that didn’t have google-able solutions, so I’m posting this on the off chance it might help someone else.

The directions on the LLVM website are very detailed, but intimidatingly long. Getting a build working is actually really simple. Aaron Gray put together an amazingly easy four step guide on bulding LLVM for cygwin here (and the directions worked fine for me on Unbuntu, too). It looks like he was working with an older version of LLVM, and you can actually skip his “build the tools” step. If you want to do that step anyway, and follow his instructions with the current version of cygwin, the tools will drop into a “Debug+Asserts” directory, instead of a “debug” directory.

Also, on cygwin, you can build with gcc3 or gcc4 but, if you have gcc3 and ocaml installed, you have to install gcc4 (or fiddle with your path). Ocaml installs enough components of gcc4 that configure detects that you have gcc4, but during the compile process, cc1plus from gcc3 will complain about arguments that are only legal with gcc4:

cc1plus: error: unrecognized command line option “-Wno-variadic-macros”
make[3]: *** [llvm-main.o] Error 1

If you’re on 64-bit Ubuntu, you’ll need to do sudo apt-get install libc6-dev-i386, or you’ll get an error related to stubs-32.h

That’s it! It’s really that simple.

If you want to build a back-end, there’s a nice guide here. As with the official instructions, the prerequisite reading list, which states “These essential documents must be read before reading this document”, is intimidating long, but it’s actually simple enough that you can just dive in and refer to the detailed manual when an obscure reference pops up, JIT :-).

There’s an even simpler guide to writing a front-end by Chris Lattner here, with an Ocaml translation by Erick Tryzelaar here.

The Ocaml tutorial just requires LLVM + Ocaml + camlp4. If you don’t have camlp4, you’ll know it because you’ll get

/usr/bin/ocamldep -pp camlp4of -modules >
sh: camlp4of: not found

On Ubuntu, you just need to do sudo apt-get install camlp4-extra to fix that.

Also, the tutorial has a few minor bugs.

In chapter 3

| '+' -> build_add lhs_val rhs_val "addtmp" builder
| '-' -> build_sub lhs_val rhs_val "subtmp" builder
| '*' -> build_mul lhs_val rhs_val "multmp" builder

Should be

| '+' -> build_fadd lhs_val rhs_val "addtmp" builder
| '-' -> build_fsub lhs_val rhs_val "subtmp" builder
| '*' -> build_fmul lhs_val rhs_val "multmp" builder

Since everything is converted to a double, but the base build_* functions expect an int, you’ll get

lvm::BinaryOperator::init(llvm::Instruction::BinaryOps): Assertion `getType()->isIntOrIntVectorTy() && “Tried to create an integer operation on a non-integer type!”‘ failed.

unless you use operations that expect floats

Similarly, in Chapter 5,

let next_var = build_add variable step_val "nextvar" builder in

should be

let next_var = build_fadd variable step_val "nextvar" builder in

Also, in chapter 4, to get extern functions working, see this.

And if you compiled using the instructions above, you’ll need to build with ocamlbuild -cflags -I,+llvm-2.8 -lflags -I,+llvm-2.8 toy.byte instead of just ocamlbuild toy.byte (or you can modify your build file).

Getting Started with LLVM and the Ocaml LLVM Tutorial

Project Euler in F#: Problem 219

Skew-cost coding

Let A and B be bit strings (sequences of 0’s and 1’s).
If A is equal to the leftmost length(A) bits of B, then A is said to be a prefix of B.
For example, 00110 is a prefix of 001101001, but not of 00111 or 100110.

A prefix-free code of size n is a collection of n distinct bit strings such that no string is a prefix of any other. For example, this is a prefix-free code of size 6:

0000, 0001, 001, 01, 10, 11

Now suppose that it costs one penny to transmit a ‘0’ bit, but four pence to transmit a ‘1’.
Then the total cost of the prefix-free code shown above is 35 pence, which happens to be the cheapest possible for the skewed pricing scheme in question.
In short, we write Cost(6) = 35.

What is Cost(109) ?

A lot of people stumbled across this blog, searching for the solution to problem 219, which seemed like a good indication that it might be a fun problem, so I skipped ahead to try it out. Unlike the early problems that I’d been solving, the really brain-dead obvious solution didn’t work here: using Huffman coding and maintaining the entire tree is way too big. But, since we always take the lowest cost node, and add 1 and 4 to the costs, the range of node costs at any time is at most five, so the slightly less brain-dead solution of keeping track of node counts and iterating works fine.

module Problem219 =
    let cost (c,(n0,n1,n2,n3,n4)) = c*n0 + (c+1L)*(n1) + (c+2L)*(n2) + (c+3L)*(n3) + (c+4L)*(n4)

    let counts = (1L,(1L,0L,0L,1L,0L)) |> Seq.unfold(fun (c,(n0,n1,n2,n3,n4)) ->
        if n0 > 0L then Some((c,(n0,n1,n2,n3,n4)), (c,(n0-1L,n1+1L,n2,n3,n4+1L)))
        else Some((c,(n0,n1,n2,n3,n4)), (c+1L,(n1-1L,n2+1L,n3,n4,1L))))

    let ans = cost (Seq.nth (1000000000-2) counts)

In retrospect, using a recursive function instead of Seq.unfold probably would have been cleaner and, of course, this could be done in log time by doing n0 steps in single iteration, but I’m trying to keep my total time spent solving problems under my time spent writing up solution, so I’ll stick with this solution. I’ll probably go back and re-write the cost function at some point, though, since it’s just plain ugly.

A conversation with Seth the other day reminded me that I should be unit testing, so I started playing with xUnit, putting the incremental tests I’d been doing in the REPL into unit tests.

    let Test1 = Seq.nth 3 counts|> MustEqual (3L, (0L, 2L, 1L, 1L, 1L))
    let Test2 = Seq.nth 5 counts|> MustEqual (4L, (0L, 3L, 1L, 1L, 2L))
    let Test3 = minCost 6 |> MustEqual 35L
    let Test4 = minCost 12 |> MustEqual 96L

I’m probably going to cut back on project Euler for now; I started working through these as a way to scratch my coding itch and learn F#, since I hadn’t written any code for about a year, but the problems aren’t very coding intensive, and I’ve started writing some F# and JavaScript for my day job.

Project Euler in F#: Problem 219

Project Euler in F#: Problem 21

Evaluate the sum of all the amicable numbers under 10000.

    let sumOfDivisors n = seq {for m in n)) do if n % m = 0 then yield! [m;n/m]} |> Seq.sum |> (+) 1
    let ans = {1..10000} |> Seq.filter (fun n -> sumOfDivisors (sumOfDivisors n) = n && sumOfDivisors n <> n) |> Seq.sum

The naive solution I came up for this returns a sequence of pairs. I’m not sure what the idiomatic and/or efficient solution is in F#, so I just yield!ed the pair as a list. Any F# experts have a better way?

I should add a disclaimer that while this sum of divisors function works for this problem, it’s actually wrong for perfect squares, just in case anyone stumbles on this code and tries to use it.

After seeing how explicitly calling DivRem sped up problem 16, I wondered if explicitly saving the sum would speed this up.

    let sumOfDivisors n = seq {for m in n)) do if n % m = 0 then yield! [m;n/m]} |> Seq.sum |> (+) 1
    let amicable n =
        let m = sumOfDivisors n
        n = sumOfDivisors m && m <> n
    let ans = {1..10000} |> Seq.filter amicable |> Seq.sum

But this gives the exact same performance; I guess the compiler does the optimization automatically.

Project Euler in F#: Problem 21

Project Euler in F#: Problem 19

You are given the following information, but you may prefer to do some research for yourself.

  • 1 Jan 1900 was a Monday.
  • Thirty days has September,
    April, June and November.
    All the rest have thirty-one,
    Saving February alone,
    Which has twenty-eight, rain or shine.
    And on leap years, twenty-nine.
  • A leap year occurs on any year evenly divisible by 4, but not on a century unless it is divisible by 400.

How many Sundays fell on the first of the month during the twentieth century (1 Jan 1901 to 31 Dec 2000)?

module Problem19 =
    open System

    let start = new DateTime(1901, 1, 1)
    let finish = new DateTime(2000, 12, 31)

    let ans = start |> Seq.unfold (fun (t : DateTime) -> Some(t, t.AddMonths(1)))
                    |> Seq.filter (fun (t : DateTime) -> t.DayOfWeek = DayOfWeek.Sunday)
                    |> Seq.takeWhile (fun (t : DateTime) -> t <= finish)
                    |> Seq.length

Like all of the bigint project euler problems, the built in libraries make this completely trivial. Since my goal for this project is to learn F#/.NET, these trivial solutions are actually helpful.

Project Euler in F#: Problem 19

Project Euler in F#: Problem 18

By starting at the top of the triangle below and moving to adjacent numbers on the row below, find the maximum total from top to bottom of the triangle below:

95 64
17 47 82
18 35 87 10
20 04 82 47 65
19 01 23 75 03 34
88 02 77 73 07 63 67
99 65 04 28 06 16 70 92
41 41 26 56 83 40 80 70 33
41 48 72 33 47 32 37 16 94 29
53 71 44 65 25 43 91 52 97 51 14
70 11 33 28 77 73 17 78 39 68 17 57
91 71 52 38 17 14 91 43 58 50 27 29 48
63 66 04 68 89 53 67 30 73 16 69 87 40 31
04 62 98 27 23 09 70 98 73 93 38 53 60 04 23

NOTE: As there are only 16384 routes, it is possible to solve this problem by trying every route. However, Problem 67, is the same challenge with a triangle containing one-hundred rows; it cannot be solved by brute force, and requires a clever method! ;o)

Like problem 14, this problem, while brute forcable, just screams DP / memoization. Since I played around with memoization for 14, I decided to try a classic ‘C’ style dynamic program this time around.

module Problem18 =
    open System
    let fileName = @"C:\Users\d\Documents\Visual Studio 2010\Projects\Euler\Euler\euler 18.txt"

    let aData = System.IO.File.ReadAllLines(fileName) |> Array.rev |> (fun l ->  l.Split([|' '|], System.StringSplitOptions.RemoveEmptyEntries) |> System.Int32.Parse)
    let size = aData.Length
    let maxTable = Array2D.init size size (fun i j -> if j < size - i then aData.[i].[j] else 0)

    for i in 1 .. size - 1 do
        for j in 0 .. size - i - 1 do
            maxTable.[i,j] <- maxTable.[i,j] + (max maxTable.[i-1,j] maxTable.[i-1,j+1])
    printfn "Answer: %i" maxTable.[size-1, 0]

There’s no reason to use a 2D array here, but since I already learned how to use 1D arrays, I figured I might as well try a 2D array, to learn something new.

Project Euler in F#: Problem 18

Project Euler in F#: Problem 17

How many letters would be needed to write all the numbers in words from 1 to 1000?

module Problem17 =
    let ones = [|""; "one"; "two"; "three"; "four"; "five"; "six"; "seven"; "eight"; "nine"; "ten"; "eleven"; "twelve"; "thirteen"; "fourteen"; "fifteen"; "sixteen"; "seventeen"; "eighteen";  "nineteen"; "twenty"|] |> String.length
    let tens = [|""; "";"twenty"; "thirty"; "forty"; "fifty"; "sixty"; "seventy"; "eighty"; "ninety"|] |> String.length
    let hundred = [|"hundred"|] |> String.length

    let rec numLength n =
        match n with
        | n when n <= 20                    -> ones.[n]
        | n when n < 100                    -> tens.[(n/10)] + (numLength (n%10))
        | n when n < 1000 && n % 100 = 0    -> ones.[n/100] + hundred.[0]
        | n when n < 1000                   -> ones.[n/100] + hundred.[0] + String.length "and" + (numLength (n%100))
        | n when n = 1000                   -> String.length "oneThousand"
        | _ -> failwith "Out of range"

    let ans = {1..1000} |> numLength |> Seq.sum

I was just about to stop working through project Euler, since I hadn’t run across a problem that forced me to learn a new F# feature for a while (which was my motivation for doing this in the first place), and the algorithmic aspect of the problems hasn’t been that interesting, but this problem forced me to use arrays, which I hadn’t used before. I guess I’ll keep going for at least another few problems.

Project Euler in F#: Problem 17