James Stanley


Play with SCAMP from the comfort of your browser

Mon 9 August 2021
Tagged: cpu, software

Today I ported the SCAMP emulator to the web, using emscripten to compile C to WebAssembly, and Xterm.js to provide the terminal emulator.

You can play with it here: SCAMP Emulator ».

Fun things for you to do with the emulator include playing Hamurabi, writing a "Hello, world" program in SLANG using kilo, finding new and interesting ways to cause kernel panics, and fixing whatever bug is causing half the files that should be in /src/ to go missing.

(If you actually do manage to cause a kernel panic, and it's not one of: deleting files from /proc/, filling up the disk, tinkering with kernel memory - then please email me and let's try and fix it!).

The disk image is not persistent, which means you lose all your work as soon as you close the tab, and there's also no way to upload or download files. Maybe that'll come later, maybe not. There's also no documentation, so... good luck.

This was actually my 2nd attempt at putting together a web emulator. The first time I gave up because I couldn't think of a quick and easy way to share the C code between the CLI emulator and the web-based version, because the CLI version's profiling and debugging hooks get in the way. This time I decided to just copy and paste the code and throw away the parts I don't want.

You might think that having 2 near-identical copies of the same code is distasteful: I agree. But I'd rather have an ugly program in the computer than an elegant one in my head. Perfect is the enemy of good, etc. Maybe one day I'll tidy it all up so that the common parts are shared. (Ha, good joke).

Emscripten

Setting up emscripten is much easier than you might expect. You can just copy and paste ~5 commands from the getting started guide, and then you're ready to go. After that, I didn't bother with the rest of the official guide, and instead followed a much shorter introduction from Mozilla.

I've been really impressed with emscripten. It seems that at every turn, if it's possible for emscripten to do some magic to make your life easier, then it does it. It even does a little bit of magic that doesn't seem possible. I'd like to use emscripten more often.

Xterm.js

Xterm.js is also really good, but its documentation leaves a lot to be desired. (Yeah, yeah, that's a bit rich, I know).

It has the kind of documentation that you get if you think comprehensive auto-generated descriptions of internal interfaces are preferable to short sections of typical example usage.

Apart from the auto-generated interface documentation, you're in good hands if you want 4000 words on Parser Hooks & Terminal Sequences, but if you want to know how to resize the terminal to 80x25, you have to read the source and figure it out yourself.

(Protip: term.resize(80,25). If this is documented then I certainly couldn't find it).

Apart from the documentation, Xterm.js is actually quite easy to use. Just make a <div id="terminal"></div>, run let term = new Terminal(); term.open(document.getElementById('terminal'));, hook onKey() to find out about user input, and call term.write(...) to provide output. It pretty much handles being a terminal emulator exactly the way you'd expect.

Building the emulator

Emulating the CPU itself is mostly simple. I just simulate all the changes to the control signals on a negative clock edge, and then a positive clock edge, and then repeat. The annoying parts are interfacing with the terminal, populating the disk contents, and populating the boot ROM and microcode ROM contents.
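As a rough illustration of that negative-edge-then-positive-edge loop, here is a minimal sketch of a toy machine. It is not the SCAMP emulator's code: the struct, the two control signals (acc_in, pc_inc) and the two-step "microcode" are all made up for the example. The falling edge decides which control signals are asserted, and the rising edge lets the registers latch according to those signals.

```c
#include <stdint.h>

/* Hypothetical toy machine state; these are NOT SCAMP's real registers or signals. */
struct toy_cpu {
    uint16_t pc;     /* program counter */
    uint16_t acc;    /* accumulator */
    uint8_t  t;      /* micro-step within the current instruction (0 or 1) */
    int      acc_in; /* control signal: latch mem[pc] into acc */
    int      pc_inc; /* control signal: increment pc */
};

/* Falling edge: work out the control signals for the current micro-step. */
static void negedge(struct toy_cpu *c) {
    c->acc_in = (c->t == 0);
    c->pc_inc = (c->t == 1);
}

/* Rising edge: registers latch according to the signals set up on the falling edge. */
static void posedge(struct toy_cpu *c, const uint16_t *mem) {
    if (c->acc_in) c->acc = mem[c->pc];
    if (c->pc_inc) c->pc++;
    c->t = (uint8_t)((c->t + 1) & 1);
}

/* Run n full clock cycles: one falling edge, then one rising edge, repeated. */
void run_cycles(struct toy_cpu *c, const uint16_t *mem, int n) {
    while (n-- > 0) {
        negedge(c);
        posedge(c, mem);
    }
}
```

Splitting the work this way keeps "decide what should happen" separate from "make it happen", which mirrors how control signals settle before registers clock in real hardware.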

To populate the boot ROM and microcode ROM, I wrote a Perl script to convert the ROM files into C source files that contain their contents. I initially tried to use the same technique for the disk image, but at 32 megabytes it was too big: emcc ran out of memory on my laptop trying to compile it.
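The ROM-to-C conversion can be sketched as follows. The original is a Perl script which isn't shown here, so this is the same idea written in C instead, and the function name emit_c_array, the byte-wise layout and the 12-bytes-per-line formatting are all assumptions made for illustration. It expects a non-empty input, since a zero-length array isn't valid standard C.

```c
#include <stdio.h>
#include <stddef.h>

/* Write `len` bytes of `data` to `out` as a C array definition called `name`.
 * Hypothetical helper: the real conversion was done by a Perl script.
 * Returns 0 on success, -1 if a write failed. */
int emit_c_array(FILE *out, const char *name,
                 const unsigned char *data, size_t len) {
    if (fprintf(out, "const unsigned char %s[%zu] = {", name, len) < 0)
        return -1;
    for (size_t i = 0; i < len; i++) {
        /* Start a new indented line every 12 bytes, otherwise separate with a space. */
        const char *sep = (i % 12 == 0) ? "\n    " : " ";
        if (fprintf(out, "%s0x%02x,", sep, data[i]) < 0)
            return -1;
    }
    if (fprintf(out, "\n};\n") < 0)
        return -1;
    return 0;
}
```

The generated file can then be compiled and linked like any other source file, which is fine for small ROMs but, as noted above, doesn't scale to a 32 megabyte disk image.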

I subsequently learnt that emcc has an option, --preload-file, that allows you to include existing files inside the emscripten environment. This turned out to be perfect: just compile with --preload-file os.disk and then read the disk contents from os.disk at runtime using normal C filesystem interaction. This is really incredible technology. I expected emscripten to compile C code to WebAssembly; I didn't expect it to provide a simulated operating system!
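"Normal C filesystem interaction" here just means ordinary stdio calls. A minimal sketch of reading a preloaded file into memory might look like this; load_disk is a hypothetical helper name, and it uses fopen()/fread() for simplicity rather than whatever the real emulator does.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read the whole file at `path` into a freshly malloc'd buffer.
 * Hypothetical helper for illustration; not taken from the SCAMP source.
 * On success returns the buffer (caller frees) and stores its size in *len_out.
 * Returns NULL on any failure. */
unsigned char *load_disk(const char *path, size_t *len_out) {
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;

    /* Find the file size by seeking to the end. */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return NULL; }
    rewind(f);

    /* Allocate at least 1 byte so an empty file still yields a valid pointer. */
    unsigned char *buf = malloc(size > 0 ? (size_t)size : 1);
    if (!buf) { fclose(f); return NULL; }

    if (fread(buf, 1, (size_t)size, f) != (size_t)size) {
        free(buf);
        fclose(f);
        return NULL;
    }
    fclose(f);
    *len_out = (size_t)size;
    return buf;
}
```

The same function works unchanged in a native build and under emscripten, where a call like load_disk("os.disk", &len) would be served from the preloaded file rather than a real disk.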

My complete command line is:

emcc -o scamp.js scamp.c ucode.c rom.c --preload-file os.disk -s WASM=1 -O3 -s NO_EXIT_RUNTIME=1 -s EXPORTED_RUNTIME_METHODS=['ccall'] -s ALLOW_MEMORY_GROWTH=1

I think NO_EXIT_RUNTIME means your program maintains its state even after main() returns. Putting ccall in EXPORTED_RUNTIME_METHODS lets you call C functions from JavaScript, and ALLOW_MEMORY_GROWTH is required because I want to allocate 32 megabytes of memory to store the disk image, which is more than the heap that emscripten starts out with by default. (I actually load the disk image with mmap() because that's how I did it for the CLI emulator, and amazingly this "just works" thanks to yet more incredible emscripten technology).

Part of the point of using emscripten rather than writing the emulator in Javascript is that I don't trust Javascript to be sufficiently performant to run the CPU at 1 MHz. To that end, I want to keep Javascript out of the loop as much as possible, so my "please run a clock cycle" function takes an argument that says how many clock cycles to run, and that way we don't need to initiate 1 million function calls per second from Javascript:

EMSCRIPTEN_KEEPALIVE
char *tick(int N, char *input) {

(EMSCRIPTEN_KEEPALIVE prevents the compiler from optimising the function out on the basis that it is never called - we need it because we'll be calling it from Javascript).

The idea of tick() is that console input characters are passed as an argument, and console output is returned. This is kind of weird and unpleasant, but it does the trick. Calling this function from Javascript looks like:

let pending_output = ccall('tick', 'string', ['number', 'string'], [10000, pending_input]);

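The ccall() arguments are: the name of the C function ('tick'), its return type ('string'), an array of its argument types (['number', 'string']), and an array of the argument values to pass ([10000, pending_input]).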

Having spent most of this year writing low-level code for SCAMP, where the computer can't even work out if you've passed the correct number of arguments to a function, it blows my mind that emscripten can just magically turn Javascript types into the appropriate C types for the specific function you're calling, and it all just works. Very impressive stuff.
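To make the input-as-argument, output-as-return-value shape of tick() concrete, here is a minimal sketch of a function with that signature. Everything inside it is an assumption: toy_cycle() is a made-up stand-in for an emulated clock cycle that simply echoes one input character to the output, and the buffer size is arbitrary. The real tick() drives the emulated CPU instead.

```c
#include <stddef.h>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#else
#define EMSCRIPTEN_KEEPALIVE /* expands to nothing in a native build */
#endif

#define OUT_MAX 4096 /* arbitrary output buffer size for this sketch */

/* Static so that the pointer returned by tick() stays valid after it returns. */
static char out_buf[OUT_MAX];
static size_t out_len;

/* Hypothetical stand-in for one emulated clock cycle: consume one pending
 * input character (if any) and append it to the output buffer. */
static void toy_cycle(const char **in) {
    if (**in != '\0' && out_len < OUT_MAX - 1) {
        out_buf[out_len++] = **in;
        (*in)++;
    }
}

/* Run N cycles in one call, so the caller doesn't need one call per cycle.
 * Returns the console output produced during those cycles as a C string. */
EMSCRIPTEN_KEEPALIVE
char *tick(int N, char *input) {
    const char *in = input ? input : "";
    out_len = 0;
    for (int i = 0; i < N; i++) {
        toy_cycle(&in);
    }
    out_buf[out_len] = '\0';
    return out_buf;
}
```

Returning a pointer to a static buffer works here because a 'string' return type makes ccall() copy the C string into a JavaScript string straight away, before the next call overwrites it. Note that this toy version drops any input it didn't get round to consuming within the N cycles; a real implementation would need to queue it.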

To get IO to/from Xterm.js, keypresses and text output are sent between the main web page and the web worker that runs the emulator using postMessage().

Web stuff

Charlie complained that the page took a long time to load over a mobile connection. This was almost entirely down to having to load the 32 megabyte disk image, which seems a bit wasteful given that the disk is almost completely empty. The preloaded file is served as scamp.data, so I asked nginx to compress it:

location /scamp/scamp.data {
    gzip on;
    gzip_types application/octet-stream;
}

And that solved the problem: it now only transfers 362 kilobytes for the disk image, roughly a 90x reduction.

kilo originally used Ctrl-S to save the file, but this collides with software flow control, which means I can't save anything if I'm using the real hardware via a USB serial cable in screen. So I changed it to Ctrl-W to write the file. Unfortunately Ctrl-W collides with "please close my tab and don't let the page intercept the keypress" in Firefox, which means it's impossible to save anything in the web-based emulator. So now I've changed it to Ctrl-O, which matches nano and stands for "output". I can't wait to find out what sort of UI Ctrl-O is going to interact poorly with.


