# Harmonized Assembly Layer

The Harmonized Assembly Layer is a set of words implemented by all Dusk kernels
which have the same semantics and compile native code that has consistent
results on all architectures. For example, "RSP) 2 +) 16b) +," will, on all
arches, compile a set of instructions that will result in the 16-bit addition
of RSP+2 into the Work register. On i386, this is the same as
"ax sp 2 d) 16b) add,".

This layer allows us to generate performant code in a cross-arch manner. It is
also what compilers such as the C compiler rely on to generate code.

Of course, as with any abstraction, we sometimes lose a little bit in speed and
binary space compared to direct assembler instructions, but in general, the
result is pretty good and direct assembler should be needed only in the tightest
of the loops.

The HAL is implemented at the kernel level and is available from the very
beginning of the boot sequence, which makes extensive use of it to bootstrap
into a usable system.

The HAL is always for the "live" system. It has not been designed with
cross-compiling in mind.

## Concepts

### Register allocation

The HAL has 4 virtual registers: W, A, PSP, RSP. Each architecture implementing
the HAL will need to map those virtual registers to actual registers. For
example, on i386, W=eax A=edi PSP=esi and RSP=esp.

### W and A registers

The HAL mainly operates over 3 locations: the W register, the A register, and
memory addresses.

The W register is the "work" register and the default destination of all HAL
instructions. When we say that "@," means "fetch", we mean "fetch into the
destination", which is the W register by default.

The A register is a second register that can be used as a target, exactly like
the W register. To do so, we use the "A>)" operand modifier.

### Operands

All HAL instructions take either no operand (inherent) or one operand parameter.
That operand parameter is a 32-bit number with an arch-specific bit structure
that contains all the information the instruction needs to know the source
and destination of the instruction.

Operand words all end with ")". For example, "A) +," means "add the 32-bit value
at the location the A register points to into the W register".

Some operand words are not directly operands, but operand modifiers. For
example, "+)" adds a numerical offset to an operand. "W) 4 +)" refers to the
memory location that W points to, with a 4-byte displacement. The "8b)"
modifier transforms the operand into an 8-bit operand.

By default, all operands refer to a memory location. Only through the "&)"
operand modifier (see below) can we refer directly to a value in a register.

### &) operand modifier

The &) word takes an input operand and returns its dereferenced counterpart. For
example, m) becomes i), W) becomes a direct reference to W, etc. This also works
with displacements. For example, "RSP) 4 +) &)" yields an operand that points
to RSP+4.

This operand might not be addressable directly by the host CPU. In that case,
the HAL operator will compile two instructions. For example, "RSP) 4 +) &) +,"
under i386 would yield "bx sp 4 +) lea, ax bx add,".

The "&)" word never writes instructions itself; only operator words do. The
"lea," above wouldn't be written when "&)" is called, but when "+," is.

The &) operand always results in a 32-bit operation. Don't try to apply 16b) or
8b) afterwards: this results in undefined behavior.

&) can't be used with i).

### <>) operand modifier

The <>) word inverts the destination of the HAL instruction, allowing
arithmetic results to be stored directly in memory.
For example,
"$1234 m) 8b) <>) +," adds the 8-bit value at address $1234 to W and stores the
result directly in address $1234 without affecting W.

### 8b) and 16b) arithmetics

8b) and 16b) modifiers only apply to memory access and all arithmetics are
"upscaled" to 32-bit with regards to flags settings and carry management
(the C flag is never set in 16b) or 8b) mode).

This also applies to compare, which means that, for example,
"$4242 i) @, RSP) 8b) compare," will never set the Z flag because even if RSP)
is $42, comparison is done on the whole W register.

### RSP) and [rcnt]

The only HAL operation that automatically adjusts [rcnt] (see "Local variables"
in doc/usage) is rs+,. Other HAL operations don't touch [rcnt]. Therefore,
special care must be taken when using the RSP) operand.

If you're inside of a regular "code" word, you don't care about [rcnt], so you
can ignore this warning.

However, if you're writing HAL as part of a macro that could be used in a word
that has local variables, then every time you write a HAL operation that
modifies RSP ("RSP) @+," for example), you need to adjust [rcnt] accordingly or
else you'll break local variables.

### Branching and flags

The HAL can generate branching, conditional or not, through its "branch"
instructions. "branchC,", the conditional branching generator, takes a "cond"
argument. This argument is generated by words like "Z)", ">)", etc. and the
number it yields is arch-specific. The idea is that through this number, the
"branchC," instruction knows the kind of native branch instruction to generate.

These conditions depend on flags being set or not, and the conditions under
which these flags are set are not exactly the same across architectures.

To be able to rely on consistent conditional branching, HAL instructions make
guarantees on the flags set by certain instructions. If an instruction has a "Z"
next to it in the listing below, it's safe to conditionally branch using "Z)" or
"NZ)" right after having called it. Even if the native instruction for a
particular HAL word doesn't supply that flag, the HAL instruction will generate
the necessary native instructions to make it so, at the cost of speed. For this
reason, we minimize flag guarantees in HAL words.

Arithmetic conditions (">)", "<=)", etc.) have no associated flag and can only
be used after a "compare,".

If you look at the branching words' signatures, you'll notice something weird:
they take an address parameter and yield an address result. This is because
those words can be used for both backward branching and forward branching. What
they do is write down a branch to the supplied address, but also yield the
address of the memory location that can then be used by "branch!".

Therefore, a backward branch looks like "begin .. branch, drop" and a forward
branch looks like "0 branch, .. here swap branch!"

All addresses passed to branching words are absolute addresses. If the native
instructions use relative branching addressing, the HAL takes care of the
translation.

## pushret, and popret,

In Dusk, "Call" means "Push the address of the instruction following the current
one to RSP, and then jump to the address being called". "Return" means "Pop RSP
and jump to that address".

On "traditional" CPU architectures, this maps exactly to the behavior of the
native "call" and "return" instructions, so we can live a happy life of
blissful ignorance when using these CPUs.

On some CPUs such as ARM, the native "call" model is to save the return address
in a register and leave the task of pushing it to and popping it from RSP to
the programmer.

Of course, one thing we could do is to simply wrap all calls and returns in Dusk
into RSP push/pop operations, but that would squander a wonderful speedup
opportunity: with such an approach to calling, we can avoid one push and one pop
on each "leaf" routine call, that is, on each call to a routine that doesn't
call any other routine. That adds up to quite a lot of pushes and pops.

To grab this opportunity, the HAL has two words: pushret, and popret,

On "traditional" CPUs, these are no-ops. On ARM, these words push and pop the
return address register to and from RSP.

Words defined through "high level" mechanisms such as ":" call those words
automatically, no need to worry. However, words created with "code" don't.

This means that if you create such a word and this word calls another word,
it needs to call "pushret," as a prelude and to call "popret," before it
returns. Leaf words don't need to do that, which makes them faster.

## HAL API

Operand words:

W)    -- op           Indirect W register
A)    -- op           Indirect A register
PSP)  -- op           Indirect PSP register
RSP)  -- op           Indirect RSP register
i)    n -- op         Immediate operand. Can't use with <>)
m)    addr -- op      Absolute address
+)    op disp -- op   Apply displacement to op. Can be applied multiple times.
                      Displacement can be negative.
A>)   op -- op        A register is the destination instead of W
&)    op -- op        Dereference operand (see above)
<>)   op -- op        Direction of the operation is inverted (see above)
8b)   op -- op        Make op 8-bit
16b)  op -- op        Make op 16-bit
32b)  op -- op        Make op 32-bit (default)

Branching and conditions:

Z)
NZ)
<)
<=)
>)
>=)
s<)   Signed comparison
s<=)
s>)
s>=)

C>W, cond --
  If cond is met, W=1. Otherwise, W=0.

branch, a -- a
  Branch to address a, yielding a "forward" address for "branch!"
branchC, a cond -- a
  Branch to address a if condition is met, yielding "a" for "branch!"
branch! tgtaddr braddr --
  Given "braddr" yielded by a previous "branch" instruction, change the
  reference at the address so that it targets "tgtaddr". Used for forward
  branching.
branchR, a --
  Compile a branch to address a while at the same time setting the "return
  address" (commonly, that means pushing to RSP, but not always) to the
  instruction directly following this one. This is commonly called a "call".
branchA, --
  Branch to the address held in the A register.
exit, --
  Compile a return from a call.
pushret, --
  Push the current return address to RSP (on relevant CPUs)
popret, --
  Pop RSP into the return address register (on relevant CPUs)

Instructions:

@,       op --        Read source into dest
!,       op --        Write dest to source. Shortcut for "<>) @,"
@!,      op --        Swap dest and source
+,       op --    Z   dest + source
-,       op --    Z   dest - source
*,       op --        dest * source
/mod,    op --        divide dest by source and put remainder in A register.
                      Can't be used with A>) or <>).
<<,      op --        dest lshift source
>>,      op --        dest rshift source
&,       op --    Z   dest and source
|,       op --    Z   dest or source
^,       op --    Z   dest xor source
@+,      op --        Read source into dest and then add 4/2/1 to operand's
                      dereferenced source.
                      Cannot be used with m) i) &)
                      If source is the same as dest, behavior is undefined.
!+,      op --        Equivalent to "<>) @+,". Source==dest is weird, but fine.
-@,      op --        Subtract 4/2/1 from operand's dereferenced source and then
                      read source into dest. Cannot be used with m) i) &)
-!,      op --        Equivalent to "<>) -@,".
compare, op --    *   Compare source to dest (all flags set)
+n,      n op --  Z   Add n to source without affecting dest
                      Can't use with i) or <>)

ps+,     n --         Add n to PSP
rs+,     n --         Add n to RSP
-W,      --           W = -W

## Examples

To give a better idea of how the HAL works, here are examples with their
corresponding i386 instructions (W=EAX A=EBX RSP=ESP PSP=ESI):

PSP) @,              ax si 0 d) mov,
A) 8b) !,            bx 0 d) al mov,
RSP) 4 +) A>) +,     bx sp 4 d) add,
PSP) &) A>) @!,      bx si xchg,
PSP) <>) <<,         cx ax mov,
                     si 0 d) cl shl,
RSP) @+,             ax sp 0 d) mov,
                     sp 4 i) add,
A) 16b) !+,          bx 0 d) 16b) ax mov,
                     bx 2 i) add,
A) 16 +) &) @,       bx 16 d) lea,
42 $1234 m) +n,      $1234 m) 42 i) add,
42 PSP) &) +n,       si 42 i) add,
54 i) -,             ax 54 i) sub,

## HAL number bank

Numbers supplied to i) m) and +) can be any number of the 32-bit range.
Nevertheless, as per HAL API constraints, all operands occupy only one PS slot.

Therein lies a problem: how can a 32-bit operand include its necessary metadata
along with a possible offset that can be anything in the 32-bit range? It does
so through a number bank mechanism.

The number bank is a 4b * 16 global and static rolling buffer (16 slots of 4
bytes each). This allows us to assign arbitrary numbers to slots numbered from
0 to 15. This slot number occupies only 4 bits in our HAL operand, which is
much more manageable.

This allows up to 16 operands associated with numbers to coexist at once on PS,
making HAL and assemblers (which piggy-back on this API) pretty macro-able.

Every kernel implements this number bank and exposes this API:

hbank' ( slot -- a )
  Get address associated to bank slot.

hbank! ( n -- slot )
  Reserve a new slot and write "n" to it. Yield the ID of the new slot.

hbank@ ( slot -- n )
  Yield number in slot.
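
To make the rolling behavior of the number bank more concrete, here is a small
model of it in C. It is only a sketch and is not taken from any Dusk kernel: the
C names hbank_addr, hbank_store and hbank_fetch are hypothetical stand-ins for
hbank' hbank! and hbank@, and handing out slots round-robin is an assumption
about what "rolling buffer" means in practice.

```c
#include <stdint.h>

/* Illustrative model only: 16 slots of 4 bytes, as described in the text. */
#define HBANK_SLOTS 16

static uint32_t hbank[HBANK_SLOTS];
static unsigned hbank_next = 0; /* assumption: next slot to hand out, round-robin */

/* Models hbank' ( slot -- a ): address associated with a bank slot. */
uint32_t *hbank_addr(unsigned slot)
{
    /* Only the low 4 bits of the slot number are meaningful. */
    return &hbank[slot & (HBANK_SLOTS - 1)];
}

/* Models hbank! ( n -- slot ): reserve a slot, write n to it, yield its ID. */
unsigned hbank_store(uint32_t n)
{
    unsigned slot = hbank_next;
    hbank[slot] = n;
    /* Wrap around after slot 15, overwriting the oldest entry next time. */
    hbank_next = (hbank_next + 1) & (HBANK_SLOTS - 1);
    return slot;
}

/* Models hbank@ ( slot -- n ): number currently held in a slot. */
uint32_t hbank_fetch(unsigned slot)
{
    return *hbank_addr(slot);
}
```

Under these assumptions, the 17th reservation reuses the slot handed out by the
first one, which is why no more than 16 number-carrying operands should be alive
on PS at the same time.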