an ANSI C compiler for high-level languages
New! Moved to github! And no more formal releases; just use the head of the github repository instead.
...and the bad news is that sparse has a fundamental compiler bug where it occasionally tries to do pointer arithmetic by converting a pointer to an integer, modifying it, and converting it back again. This won't work with Clue's computation model, thus leaving the project dead in the water. Sorry.
Clue currently supports the following targets:
- Lua 5.1.3
- Lua 5.2
- Perl 5
- Common Lisp (partially)
What do you mean, 'why'?
Apart from pure hack value (I'm hoping at some point to produce a back end that will emit sh script --- just because), Clue is mainly an experiment into the use of dynamic VMs to run static code. Modern JITs can do an astonishing job of producing machine code from dynamic languages, gathering all the necessary type information just from watching the program run. It therefore seems instructive to try taking a statically typed language like C, discarding all the type information, and letting the JIT have a go.
In terms of actual practical value, it may be useful for running code written for one system on another, much more restricted system. For example, using Clue you could run off-the-shelf encryption systems like gpg inside a web browser.
How well does it work? Well, let's have some numbers. (All these were calculated during a single benchmarking run on my machine. The gcc score is included for reference. The gcc version of the benchmark uses the same source code as the Clue versions.)
| Backend | Interpreter | Whetstone score | Performance relative to gcc |
|---|---|---|---|
| java | Sun Java 6 | 790 | 32% |
| lua | LuaJIT 2.0.1 (interpreter) | 155 | 6.2% |
| js | node.js (V8) 0.6.19 | 110 | 4.4% |
What's (gcc)? This is the test program compiled and run directly by gcc, without Clue being involved. This gives us a reference point to compare the benchmarks with.
What's the 'c' target? That's C code emitted by Clue. That is, we're compiling C into C. Clue's output code uses double precision floats for all numbers, but even then it's impressively fast.
Why is Lua 5.2 so much faster than Lua 5.1? Lua 5.2 supports a new goto keyword. This is incredibly useful when doing this kind of compilation as it allows me to pass execution directly from basic block to basic block. Lua 5.1 doesn't have this, which means I have to fake goto using what boils down to a switch statement. This is much slower.
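The difference between the two dispatch styles can be sketched in C, which happens to have both goto and switch. This is only an illustration, not Clue's actual output: the function names and the tiny summing loop are made up for the example. The first function jumps straight from block to block; the second fakes the jumps with a state variable and a switch inside a loop, so every block transition pays for an extra dispatch.

```c
/* Hypothetical example: sum the integers 0..n-1, written the way a
   code generator might emit it after splitting a loop into basic blocks. */

/* Style 1: each basic block ends by jumping directly to its successor. */
int sum_with_goto(int n)
{
    int i, total;

    /* entry block */
    i = 0;
    total = 0;
    goto check;

check:
    if (i < n)
        goto body;
    goto done;

body:
    total += i;
    i++;
    goto check;

done:
    return total;
}

/* Style 2: no goto available, so each block stores the number of its
   successor and control returns to a central switch to be redispatched. */
int sum_with_dispatch(int n)
{
    enum { ENTRY, CHECK, BODY, DONE } state = ENTRY;
    int i = 0, total = 0;

    for (;;) {
        switch (state) {
        case ENTRY:
            i = 0;
            total = 0;
            state = CHECK;
            break;
        case CHECK:
            state = (i < n) ? BODY : DONE;
            break;
        case BODY:
            total += i;
            i++;
            state = CHECK;
            break;
        case DONE:
            return total;
        }
    }
}
```

Both functions compute the same result; only the way control moves between blocks differs.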
Why isn't Common Lisp on that list? Because Clue's libc for Common Lisp isn't up to it yet. I don't know Lisp; anyone want to volunteer?
Holy cow! LuaJIT is faster than C! Well, not really. These figures all come from the Whetstone benchmark, which is a synthetic benchmark that's not indicative of anything much. What's more, the figures above are a composite of several different subbenchmarks. LuaJIT is really, really good at optimising some parts of the benchmark (in fact, for some things it's better than native gcc with no Clue involved!), but less good at others, and this is dragging the overall figure up. This doesn't necessarily correspond to real world performance. (It's still awesome, though.)
Clue is based on the sparse C compiler frontend. This is plugged into a custom register allocator and code generator, which emits the code.
sparse and Clue are written in gcc-dialect C. They should run on most systems, although Clue has been developed on Linux, and makes fairly major assumptions about living in a Unix environment --- Windows users will want to use Cygwin and even then you're on your own.
Documentation is provided; currently it's a bit patchy, but reasonably complete. If you have any problems, please file a github issue.
Clue is experimental software. Its sole purpose is to be interesting, and not necessarily useful. The resulting code takes between 10 and 100 times longer to run than it would if you just compiled the program with gcc (and that's when using the Lua backend with LuaJIT, possibly the fastest dynamic language around; any other target will be slower).
In addition, while Clue supports the ANSI standard, most programmers don't; non-ANSI behaviour such as casting a pointer to an integer and vice versa is very common. This will not work. So stock code is unlikely to run on Clue unless the authors have been particularly disciplined. (However, this can also be seen as an advantage: if your code works with gcc and with Clue, it's probably going to work elsewhere.)
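As a rough illustration of the kind of code this refers to, here are two ways of advancing a pointer by n bytes. The function names are invented for the example, and uintptr_t comes from C99's stdint.h rather than ANSI C; it is used here only to make the cast explicit. The first function takes the pointer-to-integer round trip that is described above as not working under Clue; the second uses ordinary pointer arithmetic and stays within what the standard guarantees.

```c
#include <stddef.h>
#include <stdint.h>

/* Non-portable: assumes a pointer is just a flat integer address.
   This usually works when compiled natively with gcc, but it is the
   pattern the text above says will not work with Clue. */
const char *advance_via_integer(const char *p, size_t n)
{
    uintptr_t address = (uintptr_t)p;  /* pointer -> integer */
    address += n;                      /* arithmetic on the integer */
    return (const char *)address;      /* integer -> pointer */
}

/* Portable: plain pointer arithmetic within the same object. */
const char *advance_via_pointer(const char *p, size_t n)
{
    return p + n;
}
```

Code written in the second style does not depend on how the target represents pointers, so it has a better chance of surviving a move to a target that is not a conventional machine.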
And I haven't even mentioned the bugs.
Clue's github repository
Send me pull requests!
Note: Right now Clue requires Sparse 0.4.1. Apparently this is pretty hard to come by and some versions vary, which means the patch doesn't work. Try this one; it seems to work for me.
Version ∞, 2016-02-24: I'm not planning on producing any more formal releases; since moving to Github it's easier to get people to use the head of the repository.
Version 0.6, 2013-03-14: Fixed quite a lot of bitrot. Added a new Lua 5.2 target, with goto support. Made it work with LuaJIT 2.
Version 0.5, 2008-12-14: Code cleanups was not attending this release; but we do have a shiny new Java backend.
Version 0.4, 2008-12-08: Son of code cleanups (in fact, a pretty major backend overhaul); new Common Lisp and C support.
Version 0.3, 2008-07-19: Code cleanups strike again; new Perl 5 support.
Version 0.1.1, 2008-07-14: The first 'real' release, with lots of code cleanups and optimisations.
Version 0.1pre1, 2008-07-07: The very first release ever.
Clue was written by David Given. The program is freely distributable under the terms of the two-clause Revised BSD License. The download package contains additional material that is distributable under the terms of the MIT License.
The Common Lisp backend was contributed by Peter Maydell and is also covered by the Revised BSD License.
Sparse was written by the Sparse team and is freely distributable under the terms of the Open Software License v1.1. See the Sparse web site for more information.