Getting Started with LLVM and the Ocaml LLVM Tutorial

I recently started playing with LLVM, and I ran into a few issues with getting started that didn’t have google-able solutions, so I’m posting this on the off chance it might help someone else.

The directions on the LLVM website are very detailed, but intimidatingly long. Getting a build working is actually really simple. Aaron Gray put together an amazingly easy four step guide on bulding LLVM for cygwin here (and the directions worked fine for me on Unbuntu, too). It looks like he was working with an older version of LLVM, and you can actually skip his “build the tools” step. If you want to do that step anyway, and follow his instructions with the current version of cygwin, the tools will drop into a “Debug+Asserts” directory, instead of a “debug” directory.

Also, on cygwin, you can build with gcc3 or gcc4 but, if you have gcc3 and ocaml installed, you have to install gcc4 (or fiddle with your path). Ocaml installs enough components of gcc4 that configure detects that you have gcc4, but during the compile process, cc1plus from gcc3 will complain about arguments that are only legal with gcc4:

cc1plus: error: unrecognized command line option “-Wno-variadic-macros”
make[3]: *** [llvm-main.o] Error 1

If you’re on 64-bit Ubuntu, you’ll need to do sudo apt-get install libc6-dev-i386, or you’ll get an error related to stubs-32.h

That’s it! It’s really that simple.

If you want to build a back-end, there’s a nice guide here. As with the official instructions, the prerequisite reading list, which states “These essential documents must be read before reading this document”, is intimidating long, but it’s actually simple enough that you can just dive in and refer to the detailed manual when an obscure reference pops up, JIT :-).

There’s an even simpler guide to writing a front-end by Chris Lattner here, with an Ocaml translation by Erick Tryzelaar here.

The Ocaml tutorial just requires LLVM + Ocaml + camlp4. If you don’t have camlp4, you’ll know it because you’ll get

/usr/bin/ocamldep -pp camlp4of -modules >
sh: camlp4of: not found

On Ubuntu, you just need to do sudo apt-get install camlp4-extra to fix that.

Also, the tutorial has a few minor bugs.

In chapter 3

| '+' -> build_add lhs_val rhs_val "addtmp" builder
| '-' -> build_sub lhs_val rhs_val "subtmp" builder
| '*' -> build_mul lhs_val rhs_val "multmp" builder

Should be

| '+' -> build_fadd lhs_val rhs_val "addtmp" builder
| '-' -> build_fsub lhs_val rhs_val "subtmp" builder
| '*' -> build_fmul lhs_val rhs_val "multmp" builder

Since everything is converted to a double, but the base build_* functions expect an int, you’ll get

lvm::BinaryOperator::init(llvm::Instruction::BinaryOps): Assertion `getType()->isIntOrIntVectorTy() && “Tried to create an integer operation on a non-integer type!”‘ failed.

unless you use operations that expect floats

Similarly, in Chapter 5,

let next_var = build_add variable step_val "nextvar" builder in

should be

let next_var = build_fadd variable step_val "nextvar" builder in

Also, in chapter 4, to get extern functions working, see this.

And if you compiled using the instructions above, you’ll need to build with ocamlbuild -cflags -I,+llvm-2.8 -lflags -I,+llvm-2.8 toy.byte instead of just ocamlbuild toy.byte (or you can modify your build file).

Getting Started with LLVM and the Ocaml LLVM Tutorial

Mixed-language designs with Xilinx tools: be careful

I had a couple of bugs that each took me a few days to track down, because it just didn’t occur to me that the compiler might just be wrong. Googling around for related issues didn’t help me, but perhaps this will save someone else some time.

Using the version of Xst package with ISE 10.x, Verilog modules instantiated in VHDL files can get incorrectly optimized away. Moving to 12.x solved that issue. It wouldn’t have even occurred to me to look for that had a co-worker not mentioned that he had problems with some of his code getting optimized away in 9.X, when he was creating a multi-banked RAM (without mixed-language code) – the synthesis tool decided that one bank just wasn’t necessary, and removed it entirely. Moving backwards to version 8.x fixed his issue.

Even in the latest version, there are still mixed language issues. A VHDL IOBUF connected to a Verilog module may get Xs on the output of the IOBUF in ISim even if T, the IOBUF select, is high, and the output of the IOBUF is only connected to Verilog module inputs. Switching to a Verilog wrapper and instantiating the IOBUF in Verilog fixed that problem.  I never would have thought to try swapping out a VHDL instantiation for an identical Verilog instantiation had I not run into a mixed language previously.

Mixed-language designs with Xilinx tools: be careful

Project Euler in F#: Problem 219

Skew-cost coding

Let A and B be bit strings (sequences of 0’s and 1’s).
If A is equal to the leftmost length(A) bits of B, then A is said to be a prefix of B.
For example, 00110 is a prefix of 001101001, but not of 00111 or 100110.

A prefix-free code of size n is a collection of n distinct bit strings such that no string is a prefix of any other. For example, this is a prefix-free code of size 6:

0000, 0001, 001, 01, 10, 11

Now suppose that it costs one penny to transmit a ‘0’ bit, but four pence to transmit a ‘1’.
Then the total cost of the prefix-free code shown above is 35 pence, which happens to be the cheapest possible for the skewed pricing scheme in question.
In short, we write Cost(6) = 35.

What is Cost(109) ?

A lot of people stumbled across this blog, searching for the solution to problem 219, which seemed like a good indication that it might be a fun problem, so I skipped ahead to try it out. Unlike the early problems that I’d been solving, the really brain-dead obvious solution didn’t work here: using Huffman coding and maintaining the entire tree is way too big. But, since we always take the lowest cost node, and add 1 and 4 to the costs, the range of node costs at any time is at most five, so the slightly less brain-dead solution of keeping track of node counts and iterating works fine.

module Problem219 =
    let cost (c,(n0,n1,n2,n3,n4)) = c*n0 + (c+1L)*(n1) + (c+2L)*(n2) + (c+3L)*(n3) + (c+4L)*(n4)

    let counts = (1L,(1L,0L,0L,1L,0L)) |> Seq.unfold(fun (c,(n0,n1,n2,n3,n4)) ->
        if n0 > 0L then Some((c,(n0,n1,n2,n3,n4)), (c,(n0-1L,n1+1L,n2,n3,n4+1L)))
        else Some((c,(n0,n1,n2,n3,n4)), (c+1L,(n1-1L,n2+1L,n3,n4,1L))))

    let ans = cost (Seq.nth (1000000000-2) counts)

In retrospect, using a recursive function instead of Seq.unfold probably would have been cleaner and, of course, this could be done in log time by doing n0 steps in single iteration, but I’m trying to keep my total time spent solving problems under my time spent writing up solution, so I’ll stick with this solution. I’ll probably go back and re-write the cost function at some point, though, since it’s just plain ugly.

A conversation with Seth the other day reminded me that I should be unit testing, so I started playing with xUnit, putting the incremental tests I’d been doing in the REPL into unit tests.

    let Test1 = Seq.nth 3 counts|> MustEqual (3L, (0L, 2L, 1L, 1L, 1L))
    let Test2 = Seq.nth 5 counts|> MustEqual (4L, (0L, 3L, 1L, 1L, 2L))
    let Test3 = minCost 6 |> MustEqual 35L
    let Test4 = minCost 12 |> MustEqual 96L

I’m probably going to cut back on project Euler for now; I started working through these as a way to scratch my coding itch and learn F#, since I hadn’t written any code for about a year, but the problems aren’t very coding intensive, and I’ve started writing some F# and JavaScript for my day job.

Project Euler in F#: Problem 219

Project Euler in F#: Problem 21

Evaluate the sum of all the amicable numbers under 10000.

    let sumOfDivisors n = seq {for m in n)) do if n % m = 0 then yield! [m;n/m]} |> Seq.sum |> (+) 1
    let ans = {1..10000} |> Seq.filter (fun n -> sumOfDivisors (sumOfDivisors n) = n && sumOfDivisors n <> n) |> Seq.sum

The naive solution I came up for this returns a sequence of pairs. I’m not sure what the idiomatic and/or efficient solution is in F#, so I just yield!ed the pair as a list. Any F# experts have a better way?

I should add a disclaimer that while this sum of divisors function works for this problem, it’s actually wrong for perfect squares, just in case anyone stumbles on this code and tries to use it.

After seeing how explicitly calling DivRem sped up problem 16, I wondered if explicitly saving the sum would speed this up.

    let sumOfDivisors n = seq {for m in n)) do if n % m = 0 then yield! [m;n/m]} |> Seq.sum |> (+) 1
    let amicable n =
        let m = sumOfDivisors n
        n = sumOfDivisors m && m <> n
    let ans = {1..10000} |> Seq.filter amicable |> Seq.sum

But this gives the exact same performance; I guess the compiler does the optimization automatically.

Project Euler in F#: Problem 21

Project Euler in F#: Problem 19

You are given the following information, but you may prefer to do some research for yourself.

  • 1 Jan 1900 was a Monday.
  • Thirty days has September,
    April, June and November.
    All the rest have thirty-one,
    Saving February alone,
    Which has twenty-eight, rain or shine.
    And on leap years, twenty-nine.
  • A leap year occurs on any year evenly divisible by 4, but not on a century unless it is divisible by 400.

How many Sundays fell on the first of the month during the twentieth century (1 Jan 1901 to 31 Dec 2000)?

module Problem19 =
    open System

    let start = new DateTime(1901, 1, 1)
    let finish = new DateTime(2000, 12, 31)

    let ans = start |> Seq.unfold (fun (t : DateTime) -> Some(t, t.AddMonths(1)))
                    |> Seq.filter (fun (t : DateTime) -> t.DayOfWeek = DayOfWeek.Sunday)
                    |> Seq.takeWhile (fun (t : DateTime) -> t <= finish)
                    |> Seq.length

Like all of the bigint project euler problems, the built in libraries make this completely trivial. Since my goal for this project is to learn F#/.NET, these trivial solutions are actually helpful.

Project Euler in F#: Problem 19