cowbel

The FAQ

Published: 2016 November 3

Why did you write cowbel?

In 2009, Google announced a shiny new programming language, Go. I loathed this on sight, and gained my 15 seconds of internet fame with a badly written essay comparing Go unfavourably to Algol-68.

(If you're interested, here's a link.)

A week or so later, I wrote another essay, putting forward some ideas about what kind of language Go should have been. Nobody read it.

(Please?)

After that, I decided to put my money where my mouth was and actually implement that language. Programming languages are harder than they look, and it took me several tries, but cowbel is the result.

Why is it in Java? Shouldn't all serious compilers be self-hosting?

Yes, I admit it. This is a total copout. It's just that Java's tooling is so superb that given how much redrafting the cowbel codebase is getting, it doesn't make sense to use anything else right now. Sorry.

Maybe once cowbel stabilises and gets some decent library support I'll rewrite the compiler in cowbel.

What sort of niche is cowbel aimed at?

It's aimed squarely at the narrow gap between hard-core low-level languages like C and C++ and the much more heavyweight VM-based languages like Java or Python. So it compiles into real machine code... but it's got a garbage collector. It's object based... but has no reflection.

It's intended to produce small, relatively standalone executables for use in systems programming. You wouldn't write an operating system kernel in it, but it's highly suited for daemons.

What's with the operator precedence?

If you're used to Algolalikes, cowbel's operator precedence may come as a surprise. This is because technically, cowbel has no operators.

The table of precedence is as follows:

lowest   infix operators
         prefix operators
         method calls or function calls
highest  parentheses, constants, identifiers

This means that all infix operators have the same precedence, and are therefore evaluated left-to-right.

The reason for this is that cowbel treats all operators as method calls. The language itself has no knowledge of what the operator means, and therefore cannot, for example, parse * at higher precedence than +.
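
To make the consequence concrete, here is a minimal sketch in Python (not cowbel, and not the compiler's actual parser) that evaluates a flat list of values and infix operators the way the table above describes: one precedence level, strictly left to right. The name eval_flat and the operator table are assumptions made up for the illustration.

```python
import operator

# Hypothetical operator table for illustration only; cowbel itself treats
# each operator as a method call on the left-hand value.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_flat(tokens):
    """Evaluate [value, op, value, op, value, ...] strictly left to right."""
    result = tokens[0]
    # Walk the (operator, operand) pairs in source order; no operator
    # binds more tightly than any other.
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        result = OPS[op](result, operand)
    return result

print(eval_flat([1, "+", 2, "*", 3]))   # evaluated as (1 + 2) * 3
```

This prints 9 rather than the 7 an Algol-style grammar would give. Since parentheses sit at the top of the precedence table, writing 1 + (2 * 3) is the way to get the conventional grouping.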

Why can't I create null pointers?

Null pointers are now generally reckoned not to be a good idea. They add a failure case that the programmer needs to think about to every single pointer dereference in the program. For a pointer-centric language like cowbel, where all variables are pointers, I don't think this is a good idea.

In addition, supporting null pointers leads to an unpleasant degree of non-orthogonality to the language: why should some types be allowed to be set to null while other types (such as primitive types) can't? This makes the language much harder to reason about and adds nasty edge cases. For example, we can't infer the type of null.

There are situations where you genuinely, really need a pointer that can be unset. Cowbel provides the Maybe type to meet this need.

Do cowbel generics use code replication or type erasure?

Code replication. While it does involve generating more code, it's basically less trouble and avoids the need to do explicit upcasts. (Currently cowbel doesn't support upcasting. Anywhere.)

At the moment the type and function inflation is rather conservative and will produce multiple copies of identical functions in places where it really ought to be producing just one copy. This needs attention, but as it's just an optimisation and not an actual language bug, I'm letting it pass for now.
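
As a rough picture of what code replication means, the following Python sketch (an illustration only, not the cowbel compiler) produces one concrete copy of a generic function body per type environment and caches the copies so that the same instantiation is never generated twice. The function inflate, the $T placeholder and the swap body are all hypothetical.

```python
from string import Template

def inflate(name, body, type_env, instances):
    """Return the concrete copy of a generic function for one type environment.

    `instances` caches copies keyed by (name, type environment), so asking
    for the same instantiation again does not generate a second copy.
    """
    key = (name, tuple(sorted(type_env.items())))
    if key not in instances:
        # Code replication: substitute the concrete types into the body.
        instances[key] = Template(body).substitute(type_env)
    return instances[key]

instances = {}
swap = "$T tmp = a; a = b; b = tmp;"   # hypothetical generic body
print(inflate("swap", swap, {"T": "int"}, instances))
print(inflate("swap", swap, {"T": "string"}, instances))
inflate("swap", swap, {"T": "int"}, instances)   # served from the cache
print(len(instances))
```

Two distinct copies end up in the cache, one for int and one for string. Type erasure would instead keep a single copy and rely on casts at the call sites.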

Odd stuff happens when I try to use a variable before I declare it.

In cowbel, all symbol and type declarations are hoisted to the top of their scope. This is to allow forward declarations to Just Work.

function f1() { f2(); }
function f2() { f1(); }

This also has some slightly counterintuitive consequences.

print(i); /* valid! */
var i = 1;

However, the type inference algorithm is currently a bit shoddy and there is no dataflow analysis. What actually happens is that (a) you get an error telling you that the compiler was unable to infer the type of i, and (b) even if it could infer the type, it shouldn't let you do the above, because you're using i before it is initialised.

Currently these areas are very rough around the edges. File bugs!
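
For illustration, here is a small Python sketch of how hoisting and a use-before-initialisation check could fit together. It is not the compiler's code: it models the check that is described above as missing, and the statement encoding and the name check_scope are assumptions.

```python
def check_scope(statements):
    """statements: list of ("declare", name) or ("use", name), in source order."""
    # Pass 1: hoist every declaration to the top of the scope, so a name
    # is known throughout the scope regardless of where it is declared.
    declared = {name for kind, name in statements if kind == "declare"}

    problems = []
    initialised = set()
    # Pass 2: walk in source order, tracking what has been initialised so far.
    for kind, name in statements:
        if kind == "declare":
            initialised.add(name)
        elif name not in declared:
            problems.append(f"{name}: undeclared")
        elif name not in initialised:
            problems.append(f"{name}: used before initialisation")
    return problems

# Mirrors the print(i); var i = 1; example above.
print(check_scope([("use", "i"), ("declare", "i")]))
```

The name i resolves because of hoisting, but the second pass still flags it as used before initialisation. The mutually recursive f1/f2 case is not a problem for a check like this, because the calls only run after both declarations exist.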

I'm trying to compare two objects and it's not working.

There are no automatic methods on interfaces. If you don't declare your interface specifically to support the == and != methods, you won't be able to compare objects of that interface.

type MyInterface =
{
  function == (other: MyInterface): boolean;
  function != (other: MyInterface): boolean;
};

var o1: MyInterface = ...;
var o2: MyInterface = ...;

if (o1 == o2)
  print("Yes!");

Yes, you do need to implement both methods; cowbel doesn't know what any method means, and so doesn't know that they are inverses.

There will eventually be a set of Comparable<> interfaces to make it harder to get this wrong by accident, but they're not there yet.

I've just got this totally incomprehensible error message.

Yeah, sorry. The error diagnostics are currently really manky. They need a lot of work. There should be enough in there to at least let you find the line number where things went wrong.

Any messages referring to something like functionName<1>(2) represent a function signature: the numbers indicate how many type and value parameters the function takes.

Likewise, messages like typeName<1> indicate type signatures.

Anything like Interface42 or {17} refers to an anonymous interface or class type. Getting human-readable names for these is a priority.

A long string that looks like int=int boolean=boolean ...long stuff here... filename.cow:123.4 indicates a specific function instantiation. The sequence at the beginning is the type environment for the instantiated function, and the location at the end is where it was defined.

File bugs!
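
If you find yourself squinting at a lot of these, the signature notation described above is easy to decode mechanically. The Python helper below is a hypothetical reading aid, not part of cowbel; it assumes the messages look exactly like the name<N> and name<N>(M) forms quoted above and gives up on anything else.

```python
import re

# Matches the notation quoted above: name<types>, optionally followed by (values).
SIGNATURE = re.compile(
    r"^(?P<name>[^\s<>()]+)<(?P<types>\d+)>(?:\((?P<values>\d+)\))?$"
)

def describe(signature):
    """Turn 'functionName<1>(2)' or 'typeName<1>' into a readable description."""
    match = SIGNATURE.match(signature)
    if match is None:
        return None   # not in the name<N> or name<N>(M) form
    name, types, values = match.group("name", "types", "values")
    if values is None:
        return f"type {name}: {types} type parameter(s)"
    return f"function {name}: {types} type parameter(s), {values} value parameter(s)"

print(describe("functionName<1>(2)"))
print(describe("typeName<1>"))
```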

I've got code that shouldn't compile, but does. / My program fails at the C compilation stage.

Cowbel's type checker works lazily, and only type checks code if it gets used. (This is a consequence of the way functions are inflated.) This means that pretty much anything goes in unused code.

In particular, if an object constructor declares that it implements an interface, it is only checked to make sure that it actually implements the methods in the interface when those methods are called. Which means that if you never call them, they never get checked for...

This is not optimal, and overhauling the type checker is on my list of things to do.
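
The effect of checking only what gets used can be sketched in a few lines of Python. This is an illustration of lazy, reachability-driven checking in general, not the cowbel type checker; the program table and the name check_lazily are made up.

```python
def check_lazily(functions, entry):
    """Check only the functions reachable from `entry`.

    functions maps a name to {"calls": [names], "valid": bool}, where
    "valid" stands in for the result of type checking that function.
    """
    checked, errors = set(), []
    worklist = [entry]
    while worklist:
        name = worklist.pop()
        if name in checked:
            continue
        checked.add(name)
        if not functions[name]["valid"]:
            errors.append(name)
        # Only callees of checked code are ever looked at.
        worklist.extend(functions[name]["calls"])
    return checked, errors

program = {
    "main":   {"calls": ["used"], "valid": True},
    "used":   {"calls": [],       "valid": True},
    "unused": {"calls": [],       "valid": False},   # broken, but never called
}
checked, errors = check_lazily(program, "main")
print(sorted(checked), errors)
```

The broken function is never visited, so no error is reported for it. That is the same shape of behaviour as the interface methods that go unchecked until something calls them.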

There are also a few edge cases where invalid code of this kind can interact with the dataflow analyser (or rather, lack thereof) and produce invalid output files. As an example:

type Interface =
{
};

function f(): Interface
{
  /* this should not be accepted by the compiler */
}

f();

What's this extern thing?

The extern keyword is used for the C call-out interface. It provides a quick and easy way to interface with external libraries. It's not documented because it's hacky and I'm still not sure it's the right way to do things.

It comes in two varieties:

extern "#include ...";

If this statement is seen in reachable code then the string constant is emitted at the top of the output file.

extern "...C statement...";

Variable references in the string constant are expanded and the entire line of code is emitted into the output file. A variable reference is a substring of the form ${variable}; this will be replaced by a C lvalue for the variable's storage. Both local variables and upvalues can be used. Expressions cannot.

For example:

function kill(pid: int, sig: int): (result: int)
{
  extern '#include <sys/types.h>';
  extern '#include <signal.h>';
  extern '${result} = kill(${pid}, ${sig});';
}
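
The ${variable} expansion can be pictured as a plain textual substitution. The Python sketch below is an assumption about the general shape of that step, not the compiler's implementation; in particular the C names on the right-hand side of the table are invented, since the real compiler chooses its own.

```python
import re

# A reference is ${name}, where name is a plain identifier (no expressions).
VAR_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

def expand_extern(text, lvalues):
    """Replace each ${name} in `text` with the C lvalue recorded for `name`."""
    def substitute(match):
        name = match.group(1)
        if name not in lvalues:
            raise KeyError(f"unknown variable in extern string: {name}")
        return lvalues[name]
    return VAR_REF.sub(substitute, text)

# Hypothetical lvalues; these names are made up for the example.
lvalues = {"result": "frame->result", "pid": "frame->pid", "sig": "frame->sig"}
print(expand_extern("${result} = kill(${pid}, ${sig});", lvalues))
```

With the made-up table this prints frame->result = kill(frame->pid, frame->sig);. Something like ${a + b} does not match the identifier pattern, so this sketch leaves it untouched instead of expanding it.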

If the variable has a primitive type, then the lvalue will be of the equivalent C type. Object references become typed pointers. Cowbel strings become pointers to objects of type s_string_t; the runtime function s_string_cdata() will extract a nul-terminated C string from them.

function mkdir(dir: string, mode: int): (result: int)
{
  extern '#include <sys/stat.h>';
  extern '#include <sys/types.h>';
  extern '${result} = mkdir(s_string_cdata(${dir}), ${mode});';
}

In addition, the special primitive type __extern is available. This represents a C void pointer. There is a special hole in the type rules which means that it can be initialised from an integer; but note carefully that such an initialisation will not actually change its value. This is used to store C pointers in cowbel objects.

function Buffer(size: int): Buffer
{
  var ptr: __extern = 0;
  /* At this point, ptr is declared but contains an undefined value. */
  extern '${ptr} = malloc(${size});';
  ...etc...
}

Why am I getting strange 'cannot unify type' errors with this code?

Do you have code that looks like this?

var o = { implements Interface; };
var o1: Interface = o;
var o2 = o;
o2 = o1;

What's happening here is that the type of o is being inferred to be that of an object constructor, which is an anonymous interface (let's call this C). This can be implicitly downcast to an Interface, as is happening in the second line.

o2 gets inferred to be a C as well. But this means that the last line is trying to assign an Interface to a C, which isn't allowed.

To fix this, change the first line to:

var o: Interface = { implements Interface; };

This will ensure that the implicit C gets downcast to an Interface before assignment, which will cause the type of o to be inferred as an Interface and not a C.
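
The rule at work in the four-line example can be modelled very simply. This Python sketch is a toy model of one-directional assignability, not cowbel's unification algorithm; the IMPLEMENTS table and the name assignable are assumptions.

```python
# Hypothetical model: each anonymous object type lists the interfaces it implements.
IMPLEMENTS = {"C": {"Interface"}}

def assignable(target, source):
    """True if a value of type `source` may be stored in a variable of type `target`."""
    return target == source or target in IMPLEMENTS.get(source, set())

print(assignable("Interface", "C"))   # var o1: Interface = o;
print(assignable("C", "C"))           # var o2 = o;
print(assignable("C", "Interface"))   # o2 = o1;
```

The first two checks succeed and the last one fails, which is the 'cannot unify type' error. Once o is declared as an Interface, every variable in the example has type Interface and all the assignments pass.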