duskos

dusk os fork
git clone git://git.alexwennerberg.com/duskos

hal.txt


# Harmonized Assembly Layer

The Harmonized Assembly Layer is a set of words implemented by all Dusk kernels
which have the same semantics and compile native code that has consistent
results on all architectures. For example, "RSP) 2 +) 16b) +," will, on all
arches, compile a set of instructions that adds the 16-bit value at RSP+2 to
the Work register. On i386, this is the same as
"ax sp 2 d) 16b) add,".

This layer allows us to generate performant code in a cross-arch manner. It is
also what compilers such as the C compiler rely on to generate code.

Of course, as with any abstraction, we sometimes lose a little bit in speed and
binary space compared to direct assembler instructions, but in general, the
result is pretty good and direct assembler should be needed only in the tightest
of loops.

The HAL is implemented at the kernel level and is available from the very
beginning of the boot sequence, which makes extensive use of it to bootstrap
into a usable system.

The HAL is always for the "live" system. It has not been designed with
cross-compiling in mind.

## Concepts

### Register allocation

The HAL has 4 virtual registers: W, A, PSP, RSP. Each architecture implementing
the HAL will need to map those virtual registers to actual registers. For
example, on i386, W=eax A=edi PSP=esi and RSP=esp.

### W and A registers

The HAL mainly operates over 3 locations: the W register, the A register, and
memory addresses.

The W register is the "work" register and the default destination of all HAL
instructions. When we say that "@," means "fetch", we mean "fetch into the
destination", which is the W register by default.

The A register is a second register that can be used as a target, exactly like
the W register. To do so, we use the "A>)" operand modifier.

### Operands

All HAL instructions take either no operand (inherent) or one operand parameter.
That operand parameter is a 32-bit number with an arch-specific bit structure
that contains all the information the instruction needs to know its source and
destination.

Operand words all end with ")". For example, "A) +," means "add the 32-bit value
at the location the A register points to, to the W register".

Some operand words are not directly operands, but operand modifiers. For
example, "+)" adds a numerical offset to an operand. "W) 4 +)" refers to the
memory location where W points to, with a 4-byte displacement. The "8b)"
modifier transforms the operand into an 8-bit operand.

By default, all operands refer to a memory location. Only through the "&)"
operand modifier (see below) can we refer directly to a value in a register.

### &) operand modifier

The &) word takes an input operand and returns its dereferenced counterpart. For
example, m) becomes i), W) becomes a direct reference to W, etc. This also works
with displacements. For example, "RSP) 4 +) &)" yields an operand whose value
is RSP+4 itself, not the memory it points to.

This operand might not be addressable directly by the host CPU. In that case,
the HAL operator will compile two instructions. For example,
"RSP) 4 +) &) +," under i386 would yield "bx sp 4 +) lea, ax bx add,".

The "&)" word never writes instructions itself; only operator words do. The
"lea," above wouldn't be written when "&)" is called, but when "+," is.

The &) operand always results in a 32-bit operation. Don't try to apply 16b) or
8b) afterwards: this results in undefined behavior.

&) can't be used with i).

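The effect of "&)" can be sketched with a toy Python model. Everything in it
(the Operand tuple, the helper names, the register and memory contents) is
hypothetical and only mirrors the semantics described above. Real HAL operands
are packed 32-bit numbers with an arch-specific layout.

```python
from collections import namedtuple

# Hypothetical operand model: base is a register name, "m" (absolute
# address) or "i" (immediate); deref says whether memory is accessed.
Operand = namedtuple("Operand", "base disp deref")

REGS = {"W": 0x2000, "A": 0x3000, "PSP": 0x8000, "RSP": 0x9000}
MEM = {0x2000: 111, 0x9004: 222, 0x1234: 333}  # made-up memory contents

def reg(name):            # models W) A) PSP) RSP)
    return Operand(name, 0, True)

def m(addr):              # models m)
    return Operand("m", addr, True)

def i(n):                 # models i)
    return Operand("i", n, False)

def plus(op, disp):       # models +)
    return op._replace(disp=op.disp + disp)

def amp(op):              # models &)
    if op.base == "i":
        raise ValueError("&) can't be used with i)")
    if op.base == "m":
        return i(op.disp)  # m) becomes i)
    return op._replace(deref=False)

def resolve(op):
    """Return the 32-bit source value the operand designates."""
    if op.base == "i":
        return op.disp
    addr = op.disp if op.base == "m" else REGS[op.base] + op.disp
    return MEM[addr] if op.deref else addr
```

In this model, resolve(plus(reg("RSP"), 4)) reads the value stored at RSP+4,
while resolve(amp(plus(reg("RSP"), 4))) gives the address RSP+4 itself.
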
### <>) operand modifier

The <>) word inverts the destination of the HAL instruction, allowing an
arithmetic result to be stored directly in memory. For example,
"$1234 m) 8b) <>) +," adds the 8-bit value at address $1234 to W and stores the
result directly in address $1234 without affecting W.

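As a rough illustration of the direction switch, here is a small Python model
of "+," with and without "<>)". The function name and the state layout are
made up for this sketch, and two details are assumptions not stated above:
the memory read is zero-extended, and an inverted result is truncated to the
operand width when written back.

```python
MASKS = {8: 0xFF, 16: 0xFFFF, 32: 0xFFFFFFFF}

def hal_add(state, addr, bits=32, inverted=False):
    """Toy model of "addr m) <bits>b) [<>)] +,".

    state is a dict with "W" and "mem" (address -> value).
    Assumptions: zero-extended read, truncated write when inverted.
    """
    src = state["mem"][addr] & MASKS[bits]
    result = (state["W"] + src) & 0xFFFFFFFF
    if inverted:
        state["mem"][addr] = result & MASKS[bits]  # memory is the destination
    else:
        state["W"] = result                        # W is the destination
    return state
```

With W=$10 and the byte $05 at $1234, the inverted form leaves W untouched and
stores $15 at $1234, while the plain form leaves memory untouched and sets W
to $15.
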
### 8b) and 16b) arithmetics

8b) and 16b) modifiers only apply to memory access and all arithmetics are
"upscaled" to 32-bit with regard to flag settings and carry management
(the C flag is never set in 16b) or 8b) mode).

This also applies to compare, which means that, for example,
"$4242 i) @, RSP) 8b) compare," will never set the Z flag because even if the
byte at RSP) is $42, comparison is done on the whole W register.

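The compare example above can be checked with a few lines of Python. This is
only a model of the described behavior; the function name is hypothetical and
zero-extension of the 8-bit source is an assumption.

```python
def compare_sets_z(w, mem_byte):
    """Z flag after "compare," against an 8b) operand, per the text above.

    The 8-bit source is widened to 32 bits (zero-extension is an
    assumption) and compared against the whole W register.
    """
    src = mem_byte & 0xFF
    return (w & 0xFFFFFFFF) == src

# W=$4242 against a byte of $42: the low bytes match, the registers don't.
low_bytes_match = (0x4242 & 0xFF) == 0x42
z_flag = compare_sets_z(0x4242, 0x42)
```

Here low_bytes_match is true but z_flag is false, which is why an 8-bit compare
only works as expected when the upper bits of W are already clear.
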
### RSP) and [rcnt]

The only HAL operation that automatically adjusts [rcnt] (see "Local variables"
in doc/usage) is rs+,. Other HAL operations don't touch [rcnt]. Therefore,
special care must be taken when using the RSP) operand.

If you're inside of a regular "code" word, you don't care about [rcnt], so you
can ignore this warning.

However, if you're writing HAL as part of a macro that could be used in a word
that has local variables, then every time you write a HAL operation that
modifies RSP ("RSP) @+," for example), you need to adjust [rcnt] accordingly or
else you'll break local variables.

### Branching and flags

The HAL can generate branching, conditional or not, through its "branch"
instructions. "branchC,", the conditional branching generator, takes a "cond"
argument. This argument is generated by words like "Z)", ">)", etc. and the
number it yields is arch-specific. The idea is that through this number, the
"branchC," instruction knows the kind of native branch instruction to generate.

These conditions depend on flags being set or not and the conditions under which
these flags get set are not exactly the same across architectures.

To be able to rely on consistent conditional branching, HAL instructions make
guarantees on the flags set by certain instructions. If an instruction has a "Z"
next to it in the listing below, it's safe to conditionally branch using "Z)" or
"NZ)" right after having called it. Even if the native instruction for a
particular HAL word doesn't supply that flag, the HAL instruction will generate
the necessary native instructions to make it so, at the cost of speed. For this
reason, we minimize flag guarantees in HAL words.

Arithmetic conditions (">)", "<=)", etc.) have no associated flag and can only
be used after a "compare,".

If you look at branching words' signatures, you'll notice something weird: they
take an address parameter and yield an address result. This is because those
words can be used for both backward branching and forward branching. What they
do is write down a branch to the supplied address, but also yield the address of
the memory location that can then be used by "branch!".

Therefore, a backward branch looks like "begin .. branch, drop" and a forward
branch looks like "0 branch, .. here swap branch!".

All addresses passed to branching words are absolute addresses. If the native
instructions use relative branching addressing, the HAL takes care of the
translation.

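The backward and forward patterns can be sketched with a toy code buffer in
Python. The ToyHAL class, its one-cell-per-instruction layout and its
relative-offset encoding are all hypothetical. The sketch only shows why
"branch," both takes and yields an address, and how "branch!" retargets a
branch written earlier.

```python
class ToyHAL:
    """Hypothetical one-cell-per-instruction code buffer."""

    def __init__(self):
        self.code = []

    @property
    def here(self):
        return len(self.code)

    def emit(self, name):
        self.code.append((name,))

    def branch(self, target):
        """Models "branch," ( a -- a ): write a branch, yield its address."""
        braddr = self.here
        # The toy machine branches relative to the branch cell itself, so
        # the absolute target is translated here, as the HAL does.
        self.code.append(("jmp", target - braddr))
        return braddr

    def branch_store(self, tgtaddr, braddr):
        """Models "branch!" ( tgtaddr braddr -- ): retarget a branch."""
        self.code[braddr] = ("jmp", tgtaddr - braddr)

    def target_of(self, braddr):
        return braddr + self.code[braddr][1]

hal = ToyHAL()

# Backward branch: "begin .. branch, drop"
begin = hal.here
hal.emit("body")
back = hal.branch(begin)      # yielded address is simply dropped

# Forward branch: "0 branch, .. here swap branch!"
fwd = hal.branch(0)           # placeholder target
hal.emit("skipped")
hal.branch_store(hal.here, fwd)
```

After this runs, the first branch targets the cell recorded in begin and the
second one targets the cell right after "skipped", even though its target was
unknown when it was written.
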
## pushret, and popret,

In Dusk, "Call" means "Push the address of the instruction following the current
one to RSP, and then jump to the address being called". "Return" means "Pop RSP
and jump to that address".

On "traditional" CPU architectures, this maps exactly to the behavior of the
native "call" and "return" instructions, so we can live a happy life of
blissful ignorance when using these CPUs.

On some CPUs such as ARM, the native "call" model is to save the return address
in a register and leave the task of pushing to and popping from RSP to the
programmer.

Of course, one thing we could do is to simply wrap all calls and returns in Dusk
into RSP push/pop operations, but that would squander a wonderful speedup
opportunity: with the register-based call model, we can avoid one push and one
pop on each "leaf" routine call, that is, on each call to a routine that
doesn't call any other routine. That adds up to quite a lot of pushes and pops.

To grab this opportunity, the HAL has two words: "pushret," and "popret,".

On "traditional" CPUs, these are no-ops. On ARM, these words push and pop the
return address register to and from RSP.

Words defined through "high level" mechanisms such as ":" call those words
automatically, no need to worry. However, words created with "code" don't.

This means that if you create such a word and this word calls another word,
it needs to call "pushret," as a prelude and to call "popret," before it
returns. Leaf words don't need to do that, which makes them faster.

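To see why a non-leaf "code" word needs the prelude, here is a toy Python model
of a CPU that saves the return address in a register. The class, the routine
names and the string "addresses" are hypothetical; the sketch only illustrates
the rule stated above.

```python
class LinkRegisterCPU:
    """Toy CPU whose call saves the return address in a register (lr)."""

    def __init__(self):
        self.lr = None
        self.rs = []          # return stack (RSP)
        self.pushes = 0

    def call(self, routine, return_to):
        self.lr = return_to   # native "call": no push happens here
        return routine(self)

    def pushret(self):        # models "pushret," on such a CPU
        self.rs.append(self.lr)
        self.pushes += 1

    def popret(self):         # models "popret,"
        self.lr = self.rs.pop()

def leaf(cpu):
    return cpu.lr             # "return": jump to lr, no stack traffic

def caller(cpu):
    cpu.pushret()             # prelude: lr is about to be overwritten
    cpu.call(leaf, "inside caller")
    cpu.popret()              # restore our own return address
    return cpu.lr

def broken_caller(cpu):
    cpu.call(leaf, "inside caller")   # clobbers lr, nothing saved
    return cpu.lr
```

Calling leaf directly costs no push at all, caller returns to the right place
at the cost of one push and one pop, and broken_caller "returns" into itself
because its own return address was overwritten by the nested call.
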
## HAL API

Operand words:

W)    -- op          Indirect W register
A)    -- op          Indirect A register
PSP)  -- op          Indirect PSP register
RSP)  -- op          Indirect RSP register
i)    n -- op        Immediate operand. Can't use with <>)
m)    addr -- op     Absolute address
+)    op disp -- op  Apply displacement to op. Can be applied multiple times.
                     Displacement can be negative.
A>)   op -- op       A register is the destination instead of W
&)    op -- op       Dereference operand (see above)
<>)   op -- op       Direction of the operation is inverted (see above)
8b)   op -- op       Make op 8-bit
16b)  op -- op       Make op 16-bit
32b)  op -- op       Make op 32-bit (default)

Branching and conditions:

Z)
NZ)
<)
<=)
>)
>=)
s<)   Signed comparison
s<=)
s>)
s>=)

C>W,       cond --
  If cond is met, W=1. Otherwise, W=0.

branch,    a -- a
  Branch to address a, yielding a "forward" address for "branch!"
branchC,   a cond -- a
  Branch to address a if condition is met, yielding "a" for "branch!"
branch!    tgtaddr braddr --
  Given "braddr" yielded by a previous "branch" instruction, change the
  reference at the address so that it targets "tgtaddr". Used for forward
  branching.
branchR,   a --
  Compile a branch to address a while at the same time setting the "return
  address" (commonly, that means pushing to RSP, but not always) to the
  instruction directly following this one. This is commonly called a "call".
branchA,   --
  Branch to the address held in the A register.
exit,      --
  Compile a return from a call.
pushret,   --
  Push the current return address to RSP (on relevant CPUs)
popret,    --
  Pop RSP into the return address register (on relevant CPUs)

Instructions:

@,       op --      Read source into dest
!,       op --      Write dest to source. Shortcut for "<>) @,"
@!,      op --      Swap dest and source
+,       op --   Z  dest + source
-,       op --   Z  dest - source
*,       op --      dest * source
/mod,    op --      Divide dest by source and put remainder in A register.
                    Can't be used with A>) or <>).
<<,      op --      dest lshift source
>>,      op --      dest rshift source
&,       op --   Z  dest and source
|,       op --   Z  dest or source
^,       op --   Z  dest xor source
@+,      op --      Read source into dest and then add 4/2/1 to operand's
                    dereferenced source. Cannot be used with m) i) &)
                    If source is the same as dest, behavior is undefined.
!+,      op --      Equivalent to "<>) @+,". Source==dest is weird, but fine.
-@,      op --      Subtract 4/2/1 from operand's dereferenced source and then
                    read source into dest. Cannot be used with m) i) &)
-!,      op --      Equivalent to "<>) -@,".
compare, op --   *  Compare source to dest (all flags set)
+n,      n op -- Z  Add n to source without affecting dest
                    Can't use with i) or <>)

ps+,    n --        Add n to PSP
rs+,    n --        Add n to RSP
-W,     --          W = -W

## Examples

To give a better idea of how the HAL works, here are examples with their
corresponding i386 instructions (W=EAX A=EBX RSP=ESP PSP=ESI):

PSP) @,                                ax si 0 d) mov,
A) 8b) !,                              bx 0 d) al mov,
RSP) 4 +) A>) +,                       bx sp 4 d) add,
PSP) &) A>) @!,                        bx si xchg,
PSP) <>) <<,                           cx ax mov,
                                       si 0 d) cl shl,
RSP) @+,                               ax sp 0 d) mov,
                                       sp 4 i) add,
A) 16b) !+,                            bx 0 d) 16b) ax mov,
                                       bx 2 i) add,
A) 16 +) &) @,                         bx 16 d) lea,
42 $1234 m) +n,                        $1234 m) 42 i) add,
42 PSP) &) +n,                         si 42 i) add,
54 i) -,                               ax 54 i) sub,

## HAL number bank

Numbers supplied to i) m) and +) can be any number in the 32-bit range.
Nevertheless, as per HAL API constraints, all operands occupy only one PS slot.

Therein lies a problem: how can a 32-bit operand include its necessary metadata
along with a possible offset that can be anything in the 32-bit range? It does
so through a number bank mechanism.

The number bank is a global and static rolling buffer of 16 slots of 4 bytes
each. This allows us to assign arbitrary numbers to slots numbered from 0 to 15.
This slot number occupies only 4 bits in our HAL operand, which is much more
manageable.

This allows up to 16 operands associated with numbers to coexist at once on PS,
making HAL and assemblers (which piggy-back on this API) pretty macro-able.

Every kernel implements this number bank and exposes this API:

hbank' ( slot -- a )
  Get address associated to bank slot.

hbank! ( n -- slot )
  Reserve a new slot and write "n" to it. Yield the ID of the new slot.

hbank@ ( slot -- n )
  Yield number in slot.
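
The rolling behavior of the number bank can be modeled in a few lines of
Python. This is a sketch, not the kernel implementation: the class and method
names are made up, and handing slots out round-robin starting from slot 0 is an
assumption. hbank' is left out because this model has no real addresses.

```python
class NumberBank:
    """Toy model of the 16-slot rolling number bank."""

    SLOTS = 16

    def __init__(self):
        self.cells = [0] * self.SLOTS
        self.next_slot = 0    # assumption: slots are handed out round-robin

    def hbank_store(self, n):
        """Models hbank! ( n -- slot )."""
        slot = self.next_slot
        self.cells[slot] = n & 0xFFFFFFFF   # each slot holds 4 bytes
        self.next_slot = (slot + 1) % self.SLOTS
        return slot

    def hbank_fetch(self, slot):
        """Models hbank@ ( slot -- n )."""
        return self.cells[slot & 0xF]       # slot IDs fit in 4 bits

bank = NumberBank()
slots = [bank.hbank_store(1000 + k) for k in range(16)]
recycled = bank.hbank_store(9999)   # 17th reservation reuses the oldest slot
```

In this model the 17th reservation overwrites the number held by the oldest
slot, which is why no more than 16 number-carrying operands can safely coexist
on PS.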