Collapse OS Documentation Browser

Assembling binaries

Collapse OS features many assemblers. Each of them have their
specificities, but they are very similar in the way they work.

This page describes common behavior. Some assemblers stray from
it. Refer to arch-specific documentation for details.

Initial setup

Assemblers live in their arch-specific blkfs. To load it, you
first need to run "ARCHM" to have arch-specific loaders, and
then call your assembler loader (for example, "Z80A").

Loaded alone, an assembler will spit opcodes for a "live" tar-
get, that is, the computer it's running on.

As long as you don't relocate the code, you will be able to run
it just fine, but if you need to relocate it, you will need to
load XCOMP *before* you load your assembler so that you have
the necessary tooling to craft relocatable binaries.

See doc/cross for details.

Wrapping native code

You will often want to wrap your native code in such a way that
it can be used from within forth. You do that with CODE.

CODE allows you to create a new word, but instead of compiling
references to other words, you write native code directly.
Example:

CODE 1+ BC INC, ;CODE

This word can then be used like any other (and is of course
very fast).

Unlike the regular compiling process, you don't go in "compile
mode" when you use CODE. You stay in regular INTERPRET mode.
All CODE does is spit the proper word header.

Be sure to read about your target platform in doc/code. These
documents specify which registers are assigned to what role.

Usage

To spit binary code, use opcode words, such as "LD,", preceded
by proper arguments. For example, "A B LD," in the z80 assembler
spits the "LD" opcode in it's "rr" (8b register transfer) form.

Each assembler has its own little specificities as to how to
spit ops in their different forms. See arch-specific asm docs.

In all assemblers, arguments precede the opcode word ("A B LD,"
instead of "LD A B").

Some assemblers do basic argument checking and will let you know
if you're trying to spit something that doesn't make sense, but
none of them has complete error checking, so they will let you
spit nonsensical opcodes.

Why insist on prefix notation?

Compared to regular assemblers, which place their arguments
after the opcode mnemonic, Collapse OS assemblers are a bit mind
bending. Why do we adopt this notation?

First and foremost, simplicity. By using a notation that is the
same as forth's, we get the parsing part for free.

However, one can think of many simple ways of achieving regular
notation and these ways, compared the the overall system
complexity, wouldn't be such a complexity burden. Why insist
in prefix?

Macros. By sticking to regular forth, we have macro-ability for
free. If you add a postfix parsing mechanism in there, you need
to add special provisions for macros, and then things get icky.

So, that's why.

Labels and flow

Assemblers, of course, implement their "flow" ops (jumps) but
these are often awkward to use directly. To help with that,
Collapse OS has a unified "flow" interface:

IFZ, .. ELSE, .. THEN, \ part 1 if Z is set, part 2 otherwise
IFNZ, .. THEN, \ execute if Z is unset
IFC, .. THEN, \ execute if C is set
IFNC, .. THEN, \ execute if C is unset
BEGIN, .. BR JRi, \ loop forever
BEGIN, .. BR JRZi, \ loop if Z is set
FJR JRi, .. THEN, \ unconditional forward jump

This unified flow layer lives at B007 and is loaded with
assemblers. This layer requires the assembler to supply these
words (which are ofter simple aliases):

JRi,    off --    relative unconditional jump
JRZ,    off --    relative conditional jump if Z is set
JRNZ,   off --    relative conditional jump if Z is unset
JRC,    off --    relative conditional jump if C is set
JRNC,   off --    relative conditional jump if C is unset

This is not related to flow, but for xcomp, these words are
also defined by every assembler:

JMPi,   addr --   unconditional absolute jump
JMP(i), addr --   unconditional indirect jump
CALLi,  addr --   unconditional absolute call
i>,     n --      push n to Parameter stack
(i)>,   addr --   push value at addr to Parameter Stack

These words generate the appropriate native code to perform the
described actions.

These structured flow are elegant, but limited because they need
to be symmetric. There is no way, for example, to jump out of
an infinite loop using only those words.

Labels can also be used with those flow words for more
flexibility:

LSET L1 .. L1 BR JRi, .. L1 JMPi, \ backward jumps
FJR JRi, TO L1 .. L1 FMARK \ forward jump
BEGIN, FJR JRi, TO L1 .. BR JRi, .. L1 FMARK \ exiting loop

Labels are simple VALUEs. For example, you can create a label
with "0 VALUE lblmylabel". If in an XCOMP context, make sure you
declare your labels before XSTART. XCOMP pre-declares L1 L2 and
L3 which can be used in local contexts.
This page generated at 2025-03-09 21:05:02 from documentation in CollapseOS snapshot 20230427.