Linux Binaries on Windows

Published: 2016 November 3

Important note. I am, unfortunately, unlikely to ever work on this again: Microsoft have killed Interix, I no longer have a machine which will run it, and it was a vile hack anyway that only really worked by accident.

Incidentally, we've migrated from SourceForge to GitHub. 

What?

LBW is a Linux system call translator for Windows. It allows you to run unmodified Linux applications on top of Windows.

It is not virtualisation; only one operating system is running, which is Windows. It is not emulation; Linux applications run directly on the processor, resulting in (theoretically) full native performance.

Consider it as being like WINE, but in reverse.

Right now LBW is in a proof-of-concept stage. It's in no way suitable for doing real work on as it's full of bugs. On the other hand it's adequate for running a Debian chroot, downloading and installing packages with apt and dpkg, compiling and running programs with gcc, connecting to remote servers with ssh, and even running some basic X applications.

Everybody loves screenshots:

[Screenshot: LBW in action]

Right now LBW runs on 32-bit Windows XP only.

Danger!

LBW is evil. It is about fifteen different hacks all balanced precariously on top of each other. Lots of things in LBW don't work. Lots of things will never work.

It will crash.

Back up your data, keep your system secure, and above all else:

You have been warned.

Important links

The Frequently Asked Questions list

How to install Interix (which LBW needs)

A list of things that I would like help fixing

The bug tracker on GitHub (probably the best place to get in touch with me)

Where?

LBW is hosted on GitHub.

Important! LBW requires Interix, so you'll need to install this first. Please see the installing Interix page for instructions.

How?

LBW works by running Linux code and intercepting the page faults that occur when the Linux code does something that Windows doesn't like --- such as making a system call into the Linux kernel.

When this happens it simply (ha!) looks at the registers, determines what system call the application is trying to make, does it, and returns to the application. The application thinks the Linux kernel did the work; instead it was LBW.
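As a rough illustration of that look-at-the-registers-and-dispatch step, here is a minimal sketch. It is not LBW's actual code: the Registers struct, the handler table and sys_getpid_stub are all hypothetical names made up for this example. The only facts it relies on are the 32-bit x86 Linux conventions: the system call number arrives in eax, the arguments in ebx, ecx, edx, esi, edi and ebp, and the result (or a negative errno value) goes back in eax.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <map>

// Register state captured when the fault is intercepted (hypothetical layout).
// On 32-bit x86 Linux the syscall number is in eax and the arguments follow
// in ebx, ecx, edx, esi, edi and ebp.
struct Registers {
    uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0, esi = 0, edi = 0, ebp = 0;
};

using SyscallHandler = int32_t (*)(const Registers&);

// Hypothetical handler: a real one would ask the host for the process ID.
inline int32_t sys_getpid_stub(const Registers&) { return 1234; }

// Hypothetical table mapping Linux syscall numbers to handlers.
inline const std::map<uint32_t, SyscallHandler>& syscall_table() {
    static const std::map<uint32_t, SyscallHandler> table = {
        {20, sys_getpid_stub},  // 20 is getpid on i386 Linux
    };
    return table;
}

// Look up the syscall named by eax, run it, and put the result back in eax.
// Unknown syscalls return -ENOSYS, as the Linux kernel would.
inline void dispatch_syscall(Registers& regs) {
    const auto& table = syscall_table();
    auto it = table.find(regs.eax);
    int32_t result = -ENOSYS;
    if (it != table.end()) {
        assert(it->second != nullptr);
        result = it->second(regs);
    }
    regs.eax = static_cast<uint32_t>(result);
}
```

After dispatch_syscall() returns, the caller would restore the registers and resume the application just past the instruction that faulted.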

LBW relies on Interix to do the heavy lifting. Interix provides a Unix file system, process model, groups, users, pipes, sockets, etc. LBW's job therefore becomes vastly easier. Unfortunately, Interix is a rather old-school Unix, and Linux has a lot of functionality that Interix simply can't do; so we have to emulate it where possible, and fail where not.

Currently LBW implements, with varying degrees of success, 132 different system calls out of a total of about 350. That's enough to run a lot of programs, although I regularly come across new ones that need adding.

In addition, LBW contains an ELF binary loader for getting applications into memory in the first place, although luckily we can use Linux' own dynamic loader for dealing with shared libraries.
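The first thing any ELF loader has to do is decide whether the file is something it can load at all. The sketch below shows that check for a 32-bit little-endian x86 image, using the field offsets from the ELF specification. The function name and the choice to return only the entry point are assumptions for this example; it is not taken from LBW's loader.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read little-endian fields from a byte buffer.
inline uint16_t read_u16(const std::vector<uint8_t>& b, size_t at) {
    assert(at + 2 <= b.size());
    return static_cast<uint16_t>(b[at] | (b[at + 1] << 8));
}

inline uint32_t read_u32(const std::vector<uint8_t>& b, size_t at) {
    assert(at + 4 <= b.size());
    return static_cast<uint32_t>(b[at]) |
           (static_cast<uint32_t>(b[at + 1]) << 8) |
           (static_cast<uint32_t>(b[at + 2]) << 16) |
           (static_cast<uint32_t>(b[at + 3]) << 24);
}

// Returns true and fills in `entry` if `image` starts with a header for a
// 32-bit little-endian x86 ELF executable or shared object.
inline bool elf32_entry_point(const std::vector<uint8_t>& image, uint32_t& entry) {
    const size_t kHeaderSize = 52;  // size of an ELF32 file header
    if (image.size() < kHeaderSize) return false;
    if (image[0] != 0x7f || image[1] != 'E' || image[2] != 'L' || image[3] != 'F')
        return false;                               // bad magic
    if (image[4] != 1) return false;                // EI_CLASS: ELFCLASS32
    if (image[5] != 1) return false;                // EI_DATA: little-endian
    const uint16_t type = read_u16(image, 16);      // e_type
    if (type != 2 && type != 3) return false;       // ET_EXEC or ET_DYN
    if (read_u16(image, 18) != 3) return false;     // e_machine: EM_386
    entry = read_u32(image, 24);                    // e_entry
    return true;
}
```

A real loader would go on to walk the program headers and map each loadable segment; this only covers the sanity check.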

There's more to it than that, of course: Linux uses registers rather differently to Windows, so we have to do some really unpleasant things to make that work, which unfortunately are currently badly hurting performance; and the Interix chroot() isn't useful to us, so I have to implement my own VFS layer; etc.
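To give a flavour of what a home-made chroot involves, here is a minimal sketch of the path translation such a VFS layer needs: take the absolute path the Linux program asked for, normalise it so that ".." can never climb out of the fake root, and glue it onto the real directory. The function name and the example root are assumptions for illustration, and it ignores symlinks entirely, which a real implementation could not.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Map an absolute path inside the fake root onto the host file system.
// `root` is the host directory standing in for "/" (hypothetical example:
// "/lbw/root"). Symlinks are not resolved in this sketch.
inline std::string translate_path(const std::string& root,
                                  const std::string& linux_path) {
    assert(!linux_path.empty() && linux_path[0] == '/');
    std::vector<std::string> parts;
    std::istringstream stream(linux_path);
    std::string component;
    while (std::getline(stream, component, '/')) {
        if (component.empty() || component == ".") continue;
        if (component == "..") {
            // ".." at the root stays at the root, as in a real chroot.
            if (!parts.empty()) parts.pop_back();
            continue;
        }
        parts.push_back(component);
    }
    std::string result = root;
    for (const std::string& part : parts) {
        result += '/';
        result += part;
    }
    return result;
}
```

With this, a request for /etc/../../bin/sh inside a root of /lbw/root resolves to /lbw/root/bin/sh rather than escaping to the host's /bin/sh.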

Can you help? Are you an Interix guru? Do you know more about the Windows NT kernel than any sane person should? If so, check out the technical wishlist page for things that I could really use help on!

LBW is all my own work, and contains no Linux kernel code (apart from trivial lists of symbol names). It's written in C++ with a tiny bit of inline assembler.

What's new?

Version 0.1, 2010-04-01: First version released to an unsuspecting world. Many thanks to Jayson Smith for being the first guinea-pig to try this ever. He's a brave man and a lucky one.

Who?

LBW was written by David Given. The program is freely distributable under the terms of the MIT License.

Frequently Asked Questions

Haven't I seen something like this before?

Possibly. LBW's main inspirations are LINE and LOW. However, they never really got off the drawing board; LBW is considerably more capable. But I'd never have done LBW without them, so credit is due.

Does this replace Cygwin?

No. LBW is incredibly immature. Right now it works just enough to be interesting. It should under no circumstances be used for doing real work. Cygwin is written by real programmers who really know what they're doing and it really works.

(That said, the screenshot on the front page was resized using netpbm installed into an LBW Debian chroot. And I check all the source code into Git using a static binary of Linux' Git running under LBW.)

Does this replace virtualisation?

No. Virtualisation solves a different problem to LBW. Virtualisation lets you run two complete operating systems on the same machine, with isolation between them. LBW lets you run Linux programs using the Windows operating system; there is no isolation. LBW's file system is the Windows file system. LBW processes appear in the Windows task list. They use the Windows TCP/IP stack.

How complete is LBW?

Terrible. I run into new unimplemented system calls on a daily basis. Plus, some of the existing system calls aren't implemented properly, or are stubbed out.

Worse, some system calls cannot be implemented under Windows --- futex() and clone(), for example, require Windows kernel support (as far as I can tell). This means that LBW can't run Linux programs that use threads. I think I have a plausible workaround, but it'll be a while yet.

What's the performance like?

Lousy --- much poorer than it ought to be. I think I know why; see the technical wishlist page for details.

I've tried to run this program and I get a register dump and 'unimplemented syscall 123 (sys_foo_bar)'!

You've tried to use a syscall that LBW does not yet implement. Please let me know on the mailing list and I'll add it, or at least try to.

I've tried to run this program and I get a memory dump and 'unable to interpret above instruction sequence'!

See the technical wishlist for full details, but the short summary is that you've tried to execute an instruction that the processor cannot run natively, and LBW has failed to analyse it correctly to translate it. Please let me know on the mailing list and I'll fix it, or at least try to.

I've tried to run this program and I get some other crash!

I did mention LBW was full of bugs, right?

Please let me know on the mailing list and I'll try to look into it... but if you really want to make me happy, it would be utterly awesome if you could try and figure out exactly why it's crashing and let me know.

Does LBW contain any Linux source code?

No. As of writing, every line of code in LBW is mine. (Apart from trivial lists of symbol definitions taken from the Linux kernel headers.)

Does LBW contain any Microsoft source code?

No. It does call a few undocumented Windows NT kernel entry points, but it's based entirely on publicly available documentation.

Is it a clean-room reimplementation of the Linux kernel?

No. I've used the Linux kernel source extensively to try and figure out how the various system calls work --- the Linux system call documentation is not great.

Installing Interix

LBW requires Interix, a.k.a. Services for Unix, a.k.a. Subsystem for Unix Applications, a.k.a. Microsoft Unix.

This is a Microsoft product that nobody's ever heard of. It provides a rather decent if dated Unix system that runs side-by-side with win32, on top of the Windows NT kernel. It comes with all the usual development tools, like gcc, make, a full set of Unix command line utilities and daemons, etc. LBW development is done on Interix.

It's free and easy to install, but it is irritatingly fiddly.

Windows XP

Download Services for Unix 3.5. It's about 220MB, but we're only going to install a small part of it.

Run the executable. It'll decompress into a folder. Then run the installer in that folder. It'll ask you a series of questions:

  • You want a custom installation.
  • When prompted as to which features you want to install, disable everything except 'Base Utilities' (inside 'Utilities'). [If you want to do LBW development, you'll need to install more than this --- ask on the mailing list.]
  • Leave case sensitive file system and setuid binaries turned off.
  • You want to use local user name mapping.
  • You want to use password and group files, not NIS. When prompted for filenames, just press NEXT.
  • You want to install to the default location (C:\SFU).
  • Leave cron and the other daemons turned off.

It may take a while at the 'configuring security services' stage --- let it run, it'll get there in the end.

Once finished, it'll make you reboot.

Windows XP Home

Interix does not install out of the box on XP Home, because Microsoft apparently think you're too cheap.

However, it's trivially easy to hack the installer to work. You will need a hex editor.

First, download and decompress the Services for Unix installer as described above. Then, load the SfuSetup.msi file into your hex editor. Search for:

NOT (VersionNT = 501 AND MsiNTSuitePersonal)

Change the 501 to 510 and save.
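If you'd rather not do the edit by hand, the same search-and-replace is easy to script. This is only a sketch of the idea, working on the file's contents already read into memory (in binary mode); the function name is made up, and it assumes the condition is stored in the file exactly as the plain text shown above, which is what the hex-editor search relies on too.

```cpp
#include <cassert>
#include <string>

// Find the XP Home version check in the installer's bytes and change the
// 501 to 510. The replacement is the same length, so nothing else in the
// file moves. Returns false if the check was not found.
inline bool patch_version_check(std::string& data) {
    const std::string needle = "NOT (VersionNT = 501 AND MsiNTSuitePersonal)";
    const std::string::size_type digits = needle.find("501");
    assert(digits != std::string::npos);
    const std::string::size_type at = data.find(needle);
    if (at == std::string::npos) return false;
    data.replace(at + digits, 3, "510");
    return true;
}
```

Keep a copy of the original file before writing the patched bytes back out.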

You can now proceed with the installation as described above.

Windows Vista

I don't know --- I don't have access to a Windows Vista machine. I think it's like Windows 7 (see below). I do know that Interix is only available on Vista Pro and Vista Ultimate systems.

If you get Interix working on Vista, please let me know so I can update this page!

Windows 7

Note: LBW does not work on Windows 7 yet!

Interix is built in, but disabled.

To enable it, go to Control Panel -> Programs and Features, find 'Subsystem for Unix-based Applications' and turn it on. You'll have to reboot.

[If you want to do LBW development, you'll need to do more than this --- ask on the mailing list.]

Stuff That Is Broken

Lots of stuff in LBW is broken. Quite a lot of it I don't know how to fix. Can you help?

Windows Vista & Windows 7

Right now LBW only works on Windows XP with Interix 3.5.

LBW is known not to work on Windows 7 (with Interix 6.0). I do not know why. I have done some debugging, and it would appear that a whole bunch of stuff doesn't work --- mmap() producing EIO errors randomly but consistently, memory corruption when starting new processes, etc.

My only real development machine is Windows XP. I would dearly love for someone to look into why LBW doesn't work on other versions. Given that Interix is a pain to install on Windows XP and trivial to install on Windows 7, it's a pity it doesn't work there.

Plus, I have no access to any Vista machine, so have no idea how it stands there...

%gs and segmentation

Linux uses the %gs register to identify the currently running thread. It does this by creating a GDT entry for a 4GB-long segment whose base address is the thread's descriptor block.

This allows the process to do things like:

mov eax, [gs:0]

...to load the quad at the start of the descriptor block into %eax. On the register-starved ia32 architecture this improves performance drastically over other ways of doing it.

Unfortunately Windows won't let me create GDTs. It will let me create LDTs using miscellaneous undocumented Windows NT kernel calls, but they're not quite good enough --- Windows enforces a size limit on them to stop them extending above about 0x7ff00000.

The issue here is that Linux processes also do things like:

mov eax, [gs:0xfffffffc]

...to load the quad immediately before the start of the thread descriptor block. It can do this because address arithmetic in a 4GB-long segment wraps round, so adding 0xfffffffc is equivalent to subtracting 4.
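The arithmetic is easy to check. The sketch below models the address calculation with 32-bit unsigned integers, which wrap modulo 2^32 just as the segment does, and adds a limit check showing why a size-limited segment is no use here. The function names and the example base address are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Linear address of an access at `offset` within a segment starting at
// `base`. Unsigned 32-bit arithmetic wraps modulo 2^32, like the hardware.
inline uint32_t linear_address(uint32_t base, uint32_t offset) {
    return base + offset;
}

// Whether a `size`-byte access at `offset` fits inside a segment whose last
// valid offset is `limit`. Accesses beyond the limit fault.
inline bool within_limit(uint32_t offset, uint32_t size, uint32_t limit) {
    assert(size > 0);
    const uint64_t last = static_cast<uint64_t>(offset) + size - 1;
    return last <= limit;
}
```

With a base of 0x40001000, linear_address(0x40001000, 0xfffffffc) comes out as 0x40000ffc, four bytes below the base. But within_limit(0xfffffffc, 4, limit) is only true when the limit is the full 0xffffffff, so any smaller segment faults on exactly this kind of access.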

But Windows won't let me create a 4GB LDT.

What I'm doing instead is leaving %gs set to 0. This causes a page fault to occur every time the Linux process tries to execute an instruction that involves %gs. I can examine the code that it tried to execute, generate a fragment of equivalent code that does not use %gs, and run that instead, before returning to the process.
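The first step in handling such a fault is working out whether the faulting instruction uses %gs at all. On x86 that means looking for the GS segment override prefix, byte 0x65, among the instruction's prefix bytes. A minimal sketch of that check follows; the function names are invented for this example, and a real translator then has to decode the rest of the instruction, which is where the hard work is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True for the x86 legacy prefix bytes that may precede an opcode.
inline bool is_legacy_prefix(uint8_t byte) {
    switch (byte) {
        case 0xF0: case 0xF2: case 0xF3:             // lock, repne, rep
        case 0x2E: case 0x36: case 0x3E: case 0x26:  // cs, ss, ds, es overrides
        case 0x64: case 0x65:                        // fs, gs overrides
        case 0x66: case 0x67:                        // operand and address size
            return true;
        default:
            return false;
    }
}

// Scan the prefix bytes at the start of an instruction and report whether
// one of them is the GS segment override (0x65).
inline bool has_gs_override(const uint8_t* code, size_t length) {
    assert(code != nullptr);
    for (size_t i = 0; i < length && is_legacy_prefix(code[i]); ++i) {
        if (code[i] == 0x65) return true;
    }
    return false;
}
```

For example, the bytes 65 A1 00 00 00 00 (a load of the word at gs:0 into eax) are reported as using %gs, while A1 00 00 00 00 on its own is not.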

This works, but it's dog slow --- page faults are not fast, and cripplingly, Linux assumes that %gs references are fast, so it thinks nothing of doing them in inner loops.

I am attempting to patch the code with jumps to the translated fragments where possible, but I need five bytes to make this possible (the size of a jump instruction), and frequently the original instruction is shorter than that.
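For the curious, the five bytes are the opcode 0xE9 followed by a 32-bit little-endian displacement, measured from the end of the jump. Here is a sketch of how such a patch could be written, assuming the instruction being replaced is at least five bytes long; the function names are hypothetical and this is not LBW's actual patching code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

const size_t kJumpSize = 5;  // 0xE9 opcode plus a 32-bit displacement

// A site can only be patched if the original instruction leaves enough room.
inline bool can_patch(size_t instruction_length) {
    return instruction_length >= kJumpSize;
}

// Encode "jmp rel32" located at address `site`, jumping to `target`.
// The displacement is relative to the address just after the jump.
inline std::array<uint8_t, 5> encode_jmp_rel32(uint32_t site, uint32_t target) {
    const uint32_t rel = target - (site + 5);
    return {{0xE9,
             static_cast<uint8_t>(rel),
             static_cast<uint8_t>(rel >> 8),
             static_cast<uint8_t>(rel >> 16),
             static_cast<uint8_t>(rel >> 24)}};
}

// Overwrite one instruction with a jump; leftover bytes become NOPs (0x90).
inline void patch_site(uint8_t* code, size_t instruction_length,
                       uint32_t site, uint32_t target) {
    assert(code != nullptr && can_patch(instruction_length));
    const std::array<uint8_t, 5> jump = encode_jmp_rel32(site, target);
    for (size_t i = 0; i < kJumpSize; ++i) code[i] = jump[i];
    for (size_t i = kJumpSize; i < instruction_length; ++i) code[i] = 0x90;
}
```

A two- or three-byte %gs instruction fails can_patch(), and overwriting the instructions that follow it is not safe in general, because something else may jump into the middle of them.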

Does anyone know a way to make Windows create a 4GB GDT or LDT? Preferably one that doesn't involve a custom kernel driver.

mmap()

Linux makes huge use of the mmap() system call. This attaches a file to the VM, causing a section of memory to become a view of the file. It's used all over the place, from loading code to copying files.

Interix supports mmap() --- I would not even have attempted this if it hadn't. Unfortunately, Windows and Linux have rather different mmap() semantics.

The big issue is: Linux allows mmap()ing on 4kB boundaries. Windows requires 64kB boundaries.

This becomes a big problem when it comes to loading code. Linux applications are loaded at 0x08048000, which is not 64kB-aligned. Therefore I cannot mmap() it. It gets worse when it comes to shared libraries; ld.so assumes that it can map a file to an arbitrary address and then map a 4kB page immediately after it.

What I've got, therefore, is a ghastly mess of code that attempts to work out whether it's possible to mmap() the file directly or whether it has to allocate RAM and physically load the file data into it. While it currently appears to work, there are certain combinations of flags that won't work --- MAP_SHARED|MAP_FIXED to an address that is not 64kB-aligned, for example. It's also slow and uses lots of RAM.
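Stripped of all the special cases, the decision looks something like the sketch below. This is a simplification written for this page, not the real code: the enum, the function names and the exact rules are assumptions. In particular it treats every shared mapping that can't be mapped directly as unsupported, on the grounds that a private copy in RAM would not be shared with anyone.

```cpp
#include <cassert>
#include <cstdint>

const uint64_t kWindowsGranularity = 64 * 1024;  // Windows mapping alignment

enum class MapStrategy { DirectMap, CopyIntoMemory, Unsupported };

// True if `value` is a multiple of `alignment` (which must be a power of two).
inline bool is_aligned(uint64_t value, uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value & (alignment - 1)) == 0;
}

// Decide how to satisfy a Linux mmap() request on a host that can only map
// files on 64kB boundaries. `fixed` and `shared` stand for MAP_FIXED and
// MAP_SHARED.
inline MapStrategy choose_strategy(uint32_t address, uint64_t file_offset,
                                   bool fixed, bool shared) {
    // Without MAP_FIXED we are free to pick a suitably aligned address.
    const bool address_ok = !fixed || is_aligned(address, kWindowsGranularity);
    const bool offset_ok = is_aligned(file_offset, kWindowsGranularity);
    if (address_ok && offset_ok) return MapStrategy::DirectMap;
    if (shared) return MapStrategy::Unsupported;  // a copy would not be shared
    return MapStrategy::CopyIntoMemory;           // allocate RAM and read the file in
}
```

A private, fixed mapping at a load address such as 0x08048000 falls into the copy case, because 0x08048000 is 0x8000 bytes past the nearest 64kB boundary.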

Does anyone know a way to make Windows map files using 4kB granularity?

clone() and futex()

Linux' threading primitives all boil down to just two system calls: clone(), which starts a new thread or process, and futex(), which is a basic synchronisation primitive.

Interix has neither of these.

Right now I only support clone() enough to make fork() work. Trying to create a thread will fail. futex() contains just enough stub support to make glibc start up, and no more.

I believe that it is not possible to implement futex() on Windows, simply due to mismatches between the way synchronisation works on the two platforms --- there is no Windows NT primitive that is equivalent, and I cannot emulate futex() due to needing to be able to do stuff atomically.

Can anyone prove me wrong?

I do have a backup plan, which is to provide a replacement Linux pthreads library that calls out to the Interix pthreads library to do the work; but this is ugly and won't help with static binaries.

Signals

LBW's signal handling is a broken mess. Right now it only works by accident.

Linux supports 64 signals (32 conventional ones and 32 real-time signals). Interix supports only 32, and what's more Interix doesn't support any of the signal-handling extensions that Linux does such as sigaltstack() or SA_SIGINFO.
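One small piece of that mismatch can be shown with signal masks. A Linux mask needs 64 bits, one per signal, and only half of it can be handed to a 32-signal host. The sketch below splits a mask into the part a host could take and the part that would have to be emulated. The names are made up, and it assumes signal numbers mean the same thing on both sides, which I have not checked and which is probably not true.

```cpp
#include <cassert>
#include <cstdint>

// Bit for signal number `signo` in a 64-bit Linux-style signal mask.
// Signal 1 is bit 0, signal 64 is bit 63.
inline uint64_t signal_bit(int signo) {
    assert(signo >= 1 && signo <= 64);
    return uint64_t{1} << (signo - 1);
}

// A Linux mask split into the 32 signals a 32-signal host could carry and
// the 32 it could not.
struct SplitMask {
    uint32_t native;    // signals 1 to 32
    uint32_t emulated;  // signals 33 to 64, which need an emulation layer
};

inline SplitMask split_signal_mask(uint64_t linux_mask) {
    return {static_cast<uint32_t>(linux_mask),
            static_cast<uint32_t>(linux_mask >> 32)};
}
```

Anything that ends up in the emulated half has no host signal to ride on, which is why a separate delivery mechanism would be needed.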

I have a horrible feeling I'm going to have to implement a complete interprocess signal handling layer on top of Interix'.

However, I don't actually know much about signals. Can anyone offer insight?

File handles

Linux supports large files, with 64-bit lengths and offsets.

Interix does not, even though Windows NT does. As a result, trying to use files bigger than 4GB (and probably 2GB) is going to work very badly.
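To make the limit concrete: Linux's _llseek() system call passes a 64-bit file position as two 32-bit halves, and anything that doesn't fit into a signed 32-bit offset can't be passed on. Here is a sketch of the check, assuming the host offset type is a signed 32-bit integer (hence the 2GB guess above); the function names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Rebuild the 64-bit position from the two halves that _llseek() supplies.
inline int64_t combine_offset(uint32_t high, uint32_t low) {
    return static_cast<int64_t>((static_cast<uint64_t>(high) << 32) | low);
}

// Whether an absolute file position can be represented in a signed 32-bit
// offset (assumed here to be what the host provides).
inline bool fits_in_32bit_offset(int64_t offset) {
    return offset >= 0 && offset <= std::numeric_limits<int32_t>::max();
}

// Narrow a position that is already known to fit.
inline int32_t narrow_offset(int64_t offset) {
    assert(fits_in_32bit_offset(offset));
    return static_cast<int32_t>(offset);
}
```

A seek to the 4GB mark arrives as high = 1, low = 0, and fits_in_32bit_offset() rejects it; the best a translator can do in that case is fail the call with an error rather than seek to the wrong place.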

I could use the Windows NT kernel file manipulation functions directly, thus working around the Interix limit... if I knew the Windows NT file handle.

Does anyone know how to get the Windows NT file handle from an Interix file descriptor?