As it’s the 50th anniversary of the Apollo 11 moon landings, I thought this would be a great opportunity to live code a demonstration of using my new compiler to write a simple lunar lander game… but first, I’m going to port the compiler to generate code for an actual lunar lander. Yes, my target platform is the Apollo Guidance Computer used by the Apollo program. And it is weird.
I’m actually using the Virtual AGC simulator for the AGC, as there are very few running AGCs in the world. One, to be precise. There may be one other functional AGC somewhere in interplanetary space, but as it hasn’t been powered on for over fifty years it could probably use some work. Plus, access is hard.
It’s bizarre: 15-bit words, 1s-complement arithmetic, asynchronous logic mostly implemented with discrete logic gates wired together by hand, ferrite core memory and ROM, and the weirdest instruction set you’ll ever find in a mainstream computer. (Is the AGC mainstream? I SAY YES.) It was a miracle of miniaturisation, and its development is largely responsible for the integrated circuit revolution: at one point, apparently, 25% of all ICs produced in the world were going into the AGC development programme.
That said, it is a proper von Neumann architecture computer, capable of self-reprogramming and executing code out of RAM. In many ways it feels startlingly modern. It was interesting (if a little frustrating) to port to.
So, the game?
First, download and decompress the binary. Then hand-weave into core rope memory and assemble into an AGC module. Locate a running AGC and insert it into the common-fixed bank 2 slot. Attach a set of LM simulated peripherals to the AGC and power on. If you have a real LM, all the better.
To play, press ENTER. The DSKY will show you (top to bottom) your current thrust level, your altitude, vertical velocity, and remaining fuel. Use the pitch control of the Rotational Hand Controller (a.k.a. the joystick) on the LM to control your throttle — push forward for more thrust, center for less thrust. Pay attention to the altitude and velocity alarm lights; if the velocity alarm is lit, you’re going too fast to land safely! Once landed (or crashed) press ENTER to reset and then ENTER again to replay.
(Note. Not certified for flight use. Do not attempt to land your LM on the moon while having the lunar lander game loaded into the AGC. Load Luminary, the real lunar lander software package, instead.
Alternatively, if you don’t have an AGC or LM handy to run it on, you can use Virtual AGC instead. Virtual AGC is a little complex to set up, so refer to the instructions there; you’ll need the ACA and DSKY simulators, and yaAGC must be configured in LM mode rather than CM mode. (Because the command module didn’t have a joystick.)
All the source code for the compiler and the game itself are available from my github repository below. It’s all BSD 2-clause licensed.
⇡Some notes on the machine
So this was way more work than I thought. I was originally expecting to get this done in a few hours. Turns out the AGC instruction set was odder than I expected.
⇡The overall architecture
So, by modern eyes, the AGC is essentially a 15-bit word system with three registers and a 12-bit address bus, meaning it could address only 4096 words of memory (15-bit words). It had 2048 words of RAM and a staggering 36,864 words of ROM, both implemented using ferrite core technology, and a complex banking system for accessing it all. There was also a 9-bit I/O address bus and instructions for accessing it.
In many ways it felt very modern, with an extremely familar-looking interrupt system and counters; in some ways it didn’t, such as the asynchronous logic implementation which meant that each instruction executes in a variable amount of time depending on how much work it was doing, or the way the registers are all memory mapped, so the accumulator can be read from address 0. And there is no support for a stack at all.
The instruction set is elegantly RISC, with each 15-bit word encoding a
3-bit opcode and 12-bit address payload (constants were always referred to by
address). However, it clearly had been expanded many times during
development, so some instructions steal some bits from the top of the payload
to allow an expanded opcode and as a result can only address the bottom 1024
words of memory. Then they still ran out of opcodes, so added an
instruction as a prefix word, allowing two-word instructions.
Needless to say, the AGC can’t handle bytes at all.
(It’s a very interesting machine. I think that maybe a cleaned up version of the ISA, expanded to 32 bits, would make a nice very-low-transistor-count embedded CPU. You might possibly want to switch to 2s-complement, though.)
⇡Registers and instructions
The instruction set is strange.
The three registers are A, the accumulator; L, the lower accumulator, used
for the other parameter of three-word operations; and Q, the return address.
ARM and MIPS users will recognise this as a link pointer. Subroutines are
called with a relatively normal
TC label, which stores the return address
in Q, and returned from with
RETURN, which just jumps to Q. To call nested
subroutines, you have to store Q somewhere. The A and Q registers are special
in that they are 16 bits wide rather than 15. The extra bit was used to track
Some instructions are fairly normal, if oddly named:
CA label loads the
value at that address into A. Some aren’t: the main data storage instruction
TS label, which stores A… but it’s also a jump instruction; if A has
overflowed, it skips the next instruction. This allows:
TS something TCF no_overflow_occurred <handle overflow here>
TCF is the normal jump instruction.) Obviously if you can guarantee that
the value cannot overflow, you don’t need the jump instruction. This idea
makes me really uncomfortable as a source for subtle bugs, because if it
did overflow, your program would just skip one instruction and keep
running, with hilarious results…
Then there’s the
CCS instruction, which is the magnitude comparison and
loop instruction. This loads its parameter into A, and then jumps to one of
the four following instructions based on the value, but with A diminished.
(Diminished means: moved one integer closer to zero. So, 5 becomes 4, but -5
CCS variable TCF greater_than_zero TCF positive_zero TCF less_than_zero TCF negative_zero
(1s complement arithmetic allows signed zeroes.) Looking at the Luminary
source code, I see situations where they could guarantee that
always greater than zero or positive zero, and so they skipped the other
jumps. Like with
TS, that makes me very uncomfortable… but it does allow
for nice tight loops:
CA memory_block_size ; A := size of block that we want to clear loop: ZL : L := 0 INDEX A ; add value of A to next instruction LXCH block ; swap L with *(block+A) (this is the only way to store L without touching A) CCS A ; A-- and depending on old A jump to... TCF loop ; if A *was* >0, loop again ... ; if A *was* +0, jump here and exit the loop
INDEX instruction, you might ask? Read on.
The instruction set has no provision for pointer handling. You can’t
(normally) dereference them and the assembler doesn’t (normally) allow you to
emit a constant address of something to memory. It’s my belief that the
original programmers never used them. The instruction set doe have robust
support for look-up tables, via that
INDEX instruction, but I don’t think
they ever actually took the address of anything, except maybe to do indirect
jumps (more on that later).
INDEX does is read its argument and then execute the next instruction
with that value added to the opcode. So, in the example above,
actually operates on the value at
block + A rather than just
allows things like:
TC get_key ; read a key and return it in A INDEX A CA key_translation_table ; A := translated key TC do_something_with_it ... key_translation_table: .dw 1 .dw 2 .dw 3 ...
But it’ll work on any instruction, not just
CA, and on any value; so you can set the opcode bits too. This allows self-modifying code without needing to modify anything!
TC compute_instruction ; get an instruction opcode in A INDEX A .dw 0
I managed to abuse this into allowing pointer dereferences by simply doing
INDEX A; CA 0, so adding A to 0 to read from the absolute address in A.
Stores worked, too, by doing
INDEX A; TS 0. However, trying to actually get
the address of anything was hard, because the original assembler didn’t have
.dw label. I eventually managed to bodge around this by finding
the instruction whose opcode bits were 0. (It was
TC.) Emitting this in a
data section would end up writing the raw address.
⇡Multiplication and division
The AGC does have hardware multiplication and division. You can multiple two 15-bit numbers to get a 30-bit result, or divide a 30-bit value by a 15-bit value to get a 15-bit result (and 15-bit remainder).
The multiplication instruction is relatively sane, but the division one is strictly limited. It can only compute \(x/y\) when \(|x| > |y|\). Anything else, you get garbage, including in the \(x=y\) case. This hit me several times and trying to deal with all the sign issues, combined with the dealing with 1s-complement sign issues, was really painful. I ended up with a massive hand-written helper routine for dividing two numbers to try to work around it; I’m sure it’s wrong, and I haven’t written the equivalent modules one yet.
Modern machines use 2s-complement numbers:
+2 = 000 0000 0000 0010 +1 = 000 0000 0000 0001 0 = 000 0000 0000 0000 -1 = 111 1111 1111 1111 -2 = 111 1111 1111 1110
However, the AGC uses 1s-complement instead:
+2 = 000 0000 0000 0010 +1 = 000 0000 0000 0001 +0 = 000 0000 0000 0000 -0 = 111 1111 1111 1111 -1 = 111 1111 1111 1110 -2 = 111 1111 1111 1101
This has two big advantages over 2s-complement arithmetic, in that \(~x = -x\), and that it also allows both positive and negative zeroes. However, it also has two huge disadvantages: the gap between +3 and +1 is not the same as the gap between +1 and -1, …and that it allows both positive and negative zeroes.
I spent a great deal of time wrestling with this, trying to encode opcodes for the DSKY properly. It didn’t help that I got it mixed up with Signed Magnitude Representation:
+2 = 000 0000 0000 0010 +1 = 000 0000 0000 0001 +0 = 000 0000 0000 0000 -0 = 100 0000 0000 0000 -1 = 100 0000 0000 0001 -2 = 100 0000 0000 0010
My several explanations of 1s-complement in the video are simply wrong. Please disregard.
The registers are exposed on I/O ports! So reading from I/O port 0 gives you the accumulator.
This is useful, because the AGC has no OR or XOR instructions! But you can read from an I/O port and logically or or xor into the accumulator, so this works:
CA rhs ; A := rhs TS L ; L := A CA lhs ; A := LHS ROR L ; A := A | L
(There is a
MASKinstruction for doing ANDs.)
L or Q can’t be directly read from or written to memory! But you can swap them with memory (using
EXTEND QXCH. (You can swap A, too, with
There are no shift instructions! But there are memory-mapped shift registers. Write a value to address 22 and it’ll be rotated left. But some instructions (like
INDEX) will read and then write back their operand, thus resulting in it being shifted again!
There is one instruction called
EDRUPTwhich is so called because it was requested by Ed Smally, and used only by him. It appears to read but then ignore its operand and then fake an interrupt to the next address, first executing the instruction whose opcode is in A. I think. Its purpose is obscure.
It is possible to call the accumulator as a subroutine. This writes the return address into Q, at address 2. As the encoding for
TCis 0, that address is a valid jump instruction. So, you get to execute the two instructions in A and L and then return to where you were. This is… useful, I guess?
This is the only CPU I’ve ever encountered with a dedicated radar interrupt. Plus, I want to know what that memory-mapped register simply called ‘thrust’ does.
These are also on the video itself, for clickability.
- 00:00:00 Introduction
- 00:04:27 DAY 1 START
- 00:28:00 I manage to crash VirtualAGC
- 00:32:24 First subroutine call (with no parameters)
- 00:35:50 We add our first (and only) type — an int15
- 00:39:12 Start working on expression handling and the virtual stack
- 00:45:40 We assign a value to a variable! Pity we can’t represent values yet
- 00:47:50 Start implementing the constant pool
- 00:53:50 Actually run code for loading our variable
- 00:59:15 We can load variables too
- 00:59:42 Start adding numbers
- 01:09:30 Planning how to write to the DSKY
- 01:14:50 Passing parameters to subroutines
- 01:32:35 Discovery that YUL can’t store addresses as constants
- 01:36:49 DEBUGGING BREAK — and solve previous problem through creative evil
- 01:43:30 Subroutine calls with input parameters work
- 01:47:00 Adventures with multiplication
- 01:56:30 Wrestling with the DSKY
- 02:08:15 Actually get something displayed; conditionals and loops work
- 02:19:40 Setting individual DSKY digits works; start work on outputting numbers
- 02:30:10 More problems with labels
- 02:42:15 Wrestling with division
- 03:22:55 TEA BREAK
- 03:33:28 First successful division!
- 03:37:00 Discover that my virtual AGC keeps resetting; interrupt system debuggin
- 03:48:30 More fighting with the DSKY
- 03:58:00 Discover that my DSKY parameter generation is overflowing
- 04:22:35 1s-complement arithmetic is hard (doesn’t help that I got it completely wrong)
- 04:24:00 Start implementing logic operations
- 04:33:45 Start go off on a total wild goose chase due to not understanding 1s complement
- 05:10:00 Realise my mistake and give up
- 05:24:10 SLEEP. DAY 2 START
- 05:26:10 Realise I need to keep my own copy of the DSKY state
- 05:54:35 Revenge of the labels!
- 06:09:00 Finally, I’m writing numbers to all the DSKY displays! The right numbers!
- 06:20:20 And now we can display negative numbers!
- 06:20:55 Start writing the actual game
- 06:26:00 First attempt to read the Rotational Hand Controller pitch control. This fails
- 06:45:00 DEBUGGING BREAK — I give up wrestling with the RHC and the DSKY and do some offline debugging. All better now
- 06:67:12 Writing the main game loop
- 06:53:30 First run of our prototype game. It’s very broken
- 07:01:50 Discover a nasty arithmetic bug that just happened not to show up in our tests
- 07:08:10 The bride of division… from the swamp!
- 07:28:30 Division works now. Maybe?
- 07:29:33 Start debugging memory corruption issues
- 07:35:45 Octal: the cause of, and definitely not the solution to, all of life’s problems.
- 07:38:30 Our game now looks like a bit like a game
- 07:40:00 Turns out that 15 bits isn’t enough precision for our game, so fake some more
- 07:45:15 Our game is now playable, although it needs lots of tuning. Also being able to control the engine would be nice
- 07:48:05 The player’s throttle control (the RHC) works. It’s a game!
- 07:52:40 The game has a bit of challenge to it. I still land successfully due to leet skills
- 07:53:30 Start implementing the new game / replay loop, controlled from the DSKY
- 08:06:34 You can crash and land and be prompted for a new game
- 08:08:50 The game is done!
- 08:09:00 I decide to add one more effect, and immediately run out of block 2 fixed memory space
- 08:38:14 The moment when I realise I’ve run out of block 2 fixed memory space
- 08:50:50 The moment when I discover that block 3 fixed memory doesn’t kick in automatically
- 09:00:33 I manage to shrink the program to just fit in block 2 (using all the available space)
- 09:00:50 THE GAME IS FINISHED