dusk os fork
git clone git://git.alexwennerberg.com/duskos

commit 9d3056effbcccbc4e4db16d1b26eb7068b3e2c47
parent 53fcf840cc1df53246b8270ab70cabc21d8b7ddd
Author: Virgil Dupras <hsoft@hardcoded.net>
Date:   Thu, 29 Dec 2022 09:29:13 -0500

doc: add xcomp/bootlo walkthrough in doc/code

M fs/doc/code.txt | 108 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 108 insertions(+), 0 deletions(-)

diff --git a/fs/doc/code.txt b/fs/doc/code.txt
@@ -1,5 +1,18 @@
 # Understanding Dusk's code
 
+Dusk's code is not the easiest to read. Before a piece of code can be read, it
+has to be properly contextualized and that context is large.
+
+First of all, a solid knowledge of doc/usage and doc/dict is assumed. You're
+supposed to know the basic mechanisms of Dusk (for example, structs and width
+modulation).
+
+Then come general patterns in code that only make sense when you're aware of
+them. This document can help.
+
+Then, when you want to understand a particular unit, you'll also want to read
+documentation about it which is generally in doc/ (not in code comments).
+
 ## Patterns in code
 
 @ means "fetch". Its presence in a word indicates that we fetch a value from
@@ -60,3 +73,98 @@ context. For example, calling "(?br)" makes no sense. "(?br)" is compiled by
 
 & means "create doer" and is given to "does words" compilers. For example,
 "42 &+" means "create an adder with a 42 constant".
+
+## Understanding a unit
+
+Units are generally accompanied by a documentation page and this is where you'll
+have high level information about that code. The code is often commented, but
+it's generally for something specific to those few lines of code, not macro
+information.
+
+## Understanding drivers
+
+To understand a driver, it is absolutely necessary to understand the target
+hardware, otherwise the code is gibberish. This is true not only of Dusk, but of
+a whole bunch of operating systems. Don't expect driver code to teach you about
+hardware. Dusk doesn't contain this information; you'll have to get it
+elsewhere.
+
+## Understanding C/Forth interoperability
+
+One of Dusk's primary goals is to elegantly marry C and Forth, so having C and
+Forth together is common in the code. All pieces of C code are wrapped in Forth.
+Invoking "cc<<" directly is a bit awkward for the user, so even the purest C
+unit, for example "foo.c", will not exist alone. It will always be accompanied
+by a "foo.fs" unit that takes care of compiling the unit. But more often than
+not, that's not the only thing that "foo.fs" will do. It will also include words
+that are easier to implement in Forth than in C. If some structures in C are
+useful to have in Forth, then that unit will also export those structures.
+
+## Understanding bootlo
+
+xcomp/bootlo.fs is one of the more complex units of Dusk and its trickiest
+part, the beginning, can't be commented because we don't have comments yet.
+Here's a walkthrough of that code.
+
+First, the context. When we begin running bootlo, all we have is the kernel and
+that's quite limited. We have literal parsing and memory (@ ! , etc.) words and
+constants to important memory areas (HERE and sysdict), we have the very
+important and arch-specific "exit," and "execute," and we have the extremely
+important flow words [ and ] , but we don't have : or ; yet!
+
+The first task of bootlo is thus to implement those 2 very important words.
+These first few words are a good lesson in what constitutes a word in Dusk.
+The very first word we want is the equivalent of ": w>e 5 - ;"
+
+First, it's a dictionary entry, which we create manually: a stream of
+characters, a null metadata, a link to prev, which is contained in "sysdict @",
+and finally, the length field.
+
+Now, we want to update sysdict to register this new word. sysdict is a linked
+list of *entries*. The address of the entry we've just added is easy to obtain,
+it's "here w>e"... but we don't have w>e yet. So we go with a little code
+duplication.
+
+Alright, our entry is set up and "here" points to the word-to-be. Now we need to
+compile the code for "5 - ;". We *could* be hardcore and go with something like
+"5 litn, ' - execute, ret," but the kernel already contains the mechanism to
+compile words, so let's just use it and switch to compile mode with ].
+
+The following "5 -" has the exact same effect as if it were in a regular
+definition.
+
+We don't have ; yet, so we need to switch back to interpret mode and compile a
+native "ret" with the word "exit,".
+
+Whew, our first word! The next one will be a little easier because we have w>e.
+It's ; . That one is useful too!
+
+Same principle as before, but length is $81 instead of 1 because it's an
+immediate. This time, we don't compile the word in compile mode because it's
+tricky: we have to write the equivalent of "[compile] [ compile exit, ;" but
+we don't have [compile] or compile yet. We do this all in interpret mode.
+
+Now, creating entries manually is fun and all, but it's getting repetitive, so
+let's implement "entry". This one is created as before, but this time the
+implementation is a bit more complex because it has to align the *word* part
+(not the beginning of the entry) to 32 bits. This is the home of the "nextmeta"
+mechanism which allows metadata to be defined before entries themselves (for
+docstrings, for example). Also, notice the "0 [rcnt] !" part which is crucial to
+the local variables mechanism.
+
+From here, things will get easier. "code" is nothing more than "entry" hardcoded
+on sysdict and taking its name from the input stream. The all-important ":" is
+nothing but "code ]".
+
+The rest is self-explanatory. The first few words are implemented on a "need
+first" basis, but soon enough, critical words have been implemented and we can
+begin organizing words in themes.
+
+We try to keep these words to a minimum. Some of these words might appear like
+they don't belong in bootlo. The reason why they're there is one of the
+following:
+
+1. They're needed in storage drivers or fatlo.
+2. They're too close to one of the words already there and it's awkward to put
+   them elsewhere. For example, we might need "0<" and not "0>=", but do we
+   really want those words not to live close?
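
Not part of the commit: the entry layout described in the bootlo walkthrough can be modelled in a few lines of Python. This is only a sketch under stated assumptions, not Dusk code. The walkthrough gives the field order (name, metadata, link to prev, length), the 5-byte distance between a word and its entry ("w>e"), the $81 length byte of the immediate ";" and the 32-bit alignment of the word part. The 4-byte little-endian metadata and link fields, the treatment of $80 as an immediate flag, and every name below (Dict, entry, find, w_to_e) are hypothetical choices made for illustration.

```python
import struct

META_SIZE = 4     # assumption: 32-bit metadata pointer, 0 means "null metadata"
LINK_SIZE = 4     # assumption: 32-bit link to the previous entry
LEN_SIZE = 1      # 1-byte length; LINK_SIZE + LEN_SIZE is the 5 of "w>e"
IMMEDIATE = 0x80  # inferred from the $81 length of the immediate ";"


def w_to_e(word):
    """Model of ': w>e 5 - ;': step back from a word to its entry (link field)."""
    return word - (LINK_SIZE + LEN_SIZE)


class Dict:
    """Hypothetical model of a sysdict-like linked list of entries."""

    def __init__(self):
        self.mem = bytearray()
        self.sysdict = 0  # address of the latest entry, 0 when empty

    def entry(self, name, immediate=False):
        raw = name.encode("ascii")
        header = len(raw) + META_SIZE + LINK_SIZE + LEN_SIZE
        # Pad *before* the name so that the word part, not the entry start,
        # ends up aligned to 32 bits.
        self.mem += bytes((-(len(self.mem) + header)) % 4)
        self.mem += raw                              # stream of characters
        self.mem += struct.pack("<I", 0)             # null metadata
        entry_addr = len(self.mem)
        self.mem += struct.pack("<I", self.sysdict)  # link to prev
        self.mem.append(len(raw) | (IMMEDIATE if immediate else 0))
        self.sysdict = entry_addr                    # register the new entry
        return len(self.mem)                         # address of the word-to-be

    def find(self, name):
        raw = name.encode("ascii")
        e = self.sysdict
        while e:
            length = self.mem[e + LINK_SIZE] & 0x7F
            start = e - META_SIZE - length
            if bytes(self.mem[start:e - META_SIZE]) == raw:
                return e + LINK_SIZE + LEN_SIZE
            (e,) = struct.unpack_from("<I", self.mem, e)
        return None


if __name__ == "__main__":
    d = Dict()
    first = d.entry("w>e")
    second = d.entry(";", immediate=True)
    print(first, second, d.find("w>e"), hex(d.mem[w_to_e(second) + LINK_SIZE]))
```

Keeping the link and length fields immediately before the word is what lets "w>e" be a plain subtraction in this model, and putting the padding in front of the name is one way to align the word part without disturbing that fixed offset.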