commit 9d3056effbcccbc4e4db16d1b26eb7068b3e2c47
parent 53fcf840cc1df53246b8270ab70cabc21d8b7ddd
Author: Virgil Dupras <hsoft@hardcoded.net>
Date: Thu, 29 Dec 2022 09:29:13 -0500
doc: add xcomp/bootlo walkthrough in doc/code
Diffstat:
M fs/doc/code.txt | 108 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 108 insertions(+), 0 deletions(-)
diff --git a/fs/doc/code.txt b/fs/doc/code.txt
@@ -1,5 +1,18 @@
# Understanding Dusk's code
+Dusk's code is not the easiest to read. Before a piece of code can be read, it
+has to be properly contextualized and that context is large.
+
+First of all, a solid knowledge of doc/usage and doc/dict is assumed. You're
+supposed to know the basic mechanisms of Dusk (for example, structs and width
+modulation).
+
+Then come general patterns in code that only make sense when you're aware of
+them. This document can help.
+
+Then, when you want to understand a particular unit, you'll also want to read
+its documentation, which is generally in doc/ (not in code comments).
+
## Patterns in code
@ means "fetch". Its presence in a word indicates that we fetch a value from
@@ -60,3 +73,98 @@ context. For example, calling "(?br)" makes no sense. "(?br)" is compiled by
& means "create doer" and is given to "does words" compilers. For example,
"42 &+" means "create an adder with a 42 constant".
+
+## Understanding a unit
+
+Units are generally accompanied by a documentation page and this is where you'll
+find high-level information about that code. The code is often commented, but
+it's generally for something specific to those few lines of code, not macro
+information.
+
+## Understanding drivers
+
+To understand a driver, it is absolutely necessary to understand the target
+hardware, otherwise the code is gibberish. This is true not only of Dusk, but of
+a whole bunch of operating systems. Don't expect driver code to teach you about
+hardware. Dusk doesn't contain this information; you'll have to get it
+elsewhere.
+
+## Understanding C/Forth interoperability
+
+One of Dusk's primary goals is to elegantly marry C and Forth, so having C and
+Forth together is common in the code. All pieces of C code are wrapped in Forth.
+Invoking "cc<<" directly is a bit awkward for the user, so even the purest of C
+units, for example "foo.c", will not exist alone. It will always be accompanied
+by a "foo.fs" unit that takes care of compiling the unit. But more often than
+not, that's not the only thing that "foo.fs" will do. It will also include words
+that are easier to implement in Forth than in C. If some structures in C are
+useful to have in Forth, then that unit will also export those structures.
+
+## Understanding bootlo
+
+xcomp/bootlo.fs is one of the more complex units of Dusk and its trickiest
+part, the beginning, can't be commented because we don't have comments yet.
+Here's a walkthrough of that code.
+
+First, the context. When we begin running bootlo, all we have is the kernel and
+that's quite limited. We have literal parsing, memory words (@ ! , etc.) and
+constants to important memory areas (HERE and sysdict), we have the very
+important and arch-specific "exit," and "execute,", and we have the extremely
+important flow words [ and ] , but we don't have : or ; yet!
+
+The first task of bootlo is thus to implement those 2 very important words.
+These first few words are a good lesson for what constitutes a word in Dusk.
+The very first word we want is the equivalent of ": w>e 5 - ;"
+
+First, it's a dictionary entry, which we create manually: a stream of
+characters, a null metadata, a link to prev, which is contained in "sysdict @",
+and finally, the length field.
+
+Now, we want to update sysdict to register this new word. sysdict is a linked
+list of *entries*. The address of the entry we've just added is easy to obtain,
+it's "here w>e"... but we don't have w>e yet. So we go with a little code
+duplication.
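+
The entry layout and the sysdict update described above can be modeled outside
of Dusk. The following Python sketch is not Dusk code. It assumes 4-byte
little-endian cells (so that the link and length fields add up to the 5 bytes
that w>e subtracts). The helper names (c_comma, comma, add_entry, names), the
start address and the second entry "foo" are all hypothetical.

```python
import struct

CELL = 4             # assumed cell size: 4-byte link + 1-byte length = 5
mem = bytearray(256)  # toy memory
here = 16            # arbitrary start address for this sketch
sysdict = 0          # address of the latest entry; 0 means "no entry yet"

def c_comma(byte):
    """Model of "c,": write one byte at here and advance."""
    global here
    mem[here] = byte & 0xFF
    here += 1

def comma(value):
    """Model of ",": write one cell (assumed little-endian) and advance."""
    global here
    struct.pack_into("<I", mem, here, value)
    here += CELL

def w_to_e(word_addr):
    """Model of "w>e": the entry address is 5 bytes before the word."""
    return word_addr - 5

def add_entry(name):
    """Write name, null metadata, prev link and length, then update sysdict."""
    global sysdict
    for ch in name.encode("ascii"):
        c_comma(ch)
    comma(0)                # null metadata
    comma(sysdict)          # link to the previous entry
    c_comma(len(name))      # length field
    sysdict = w_to_e(here)  # "here" now points to the word-to-be
    return here

def entry_name(entry):
    """Read a name back: it ends where the metadata cell begins."""
    length = mem[entry + CELL]
    end = entry - CELL
    return mem[end - length:end].decode("ascii")

def names():
    """Walk the linked list from sysdict down to the oldest entry."""
    out, entry = [], sysdict
    while entry:
        out.append(entry_name(entry))
        (entry,) = struct.unpack_from("<I", mem, entry)
    return out

first_word = add_entry("w>e")
second_word = add_entry("foo")
print(names())  # ['foo', 'w>e']
```

In this model, an entry address points at the link field, which is why walking
the list only needs to read one cell at each entry address.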
+
+Alright, our entry is set up and "here" points to the word-to-be. Now we need to
+compile the code for "5 - ;". We *could* be hardcore and go with something like
+"5 litn, ' - execute, ret," but the kernel already contains the mechanism to
+compile words, so let's just use it and switch to compile mode with ].
+
+The following "5 -" has the exact same effect as if it were in a regular
+definition.
+
+We don't have ; yet, so we need to switch back to interpret mode and compile a
+native "ret" with the word "exit,".
+
+Whew, our first word! The next one will be a little easier because we have w>e.
+It's ; . That one is useful too!
+
+Same principle as before, but length is $81 instead of 1 because it's an
+immediate. This time, we don't compile the word in compile mode because that's
+tricky: we'd have to write the equivalent of "[compile] [ compile exit, ;" but
+we don't have [compile] or compile yet. We do this all in interpret mode.
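+
The $81 value can be read as a flag bit set on top of a name length of 1. Here
is a small Python sketch of that reading. It assumes that the immediate flag is
bit 7 ($80) and that the remaining 7 bits hold the length; the text above only
states that an immediate ";" gets $81 instead of 1. The names IMMEDIATE,
length_field and decode are hypothetical.

```python
IMMEDIATE = 0x80  # assumed: top bit of the length field marks an immediate

def length_field(name, immediate=False):
    """Build the length byte that closes a dictionary entry."""
    return len(name) | (IMMEDIATE if immediate else 0)

def decode(field):
    """Split a length byte into (is_immediate, name_length)."""
    return bool(field & IMMEDIATE), field & 0x7F

print(hex(length_field(";", immediate=True)))  # 0x81
print(decode(0x81))                            # (True, 1)
```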
+
+Now, creating entries manually is fun and all, but it's getting repetitive, so
+let's implement "entry". This one is created as before, but this time the
+implementation is a bit more complex because it has to align the *word* part
+(not the beginning of the entry) to 32-bit. This is the home of the "nextmeta"
+mechanism which allows metadata to be defined before entries themselves (for
+docstrings for example). Also, notice the "0 [rcnt] !" part which is crucial to
+the local variables mechanism.
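+
The alignment constraint is easier to see with numbers. The Python sketch below
computes how many padding bytes are needed so that the word part, not the start
of the entry, lands on a 32-bit boundary. It assumes the layout described
earlier (name, 4-byte metadata, 4-byte link, 1-byte length) and that padding is
emitted before the name. It does not model "nextmeta" or "[rcnt]", and the
function names are hypothetical.

```python
def entry_padding(here, name_len, align=4):
    """Bytes to skip before the name so that the word ends up aligned."""
    header = name_len + 4 + 4 + 1  # name + metadata + link + length
    return -(here + header) % align

def word_address(here, name_len, align=4):
    """Address of the word part once padding and header are written."""
    return here + entry_padding(here, name_len, align) + name_len + 9

print(entry_padding(0x1000, 5))      # 2
print(hex(word_address(0x1000, 5)))  # 0x1010
```

Aligning the start of the entry instead would not be enough, because the name
length varies from one entry to the next.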
+
+From here, things will get easier. "code" is nothing else than "entry" hardcoded
+on sysdict and taking its name from the input stream. The all-important ":" is
+nothing but "code ]".
+
+The rest is self-explanatory. The first few words are implemented on a "need
+first" basis, but soon enough, critical words have been implemented and we can
+begin organizing words in themes.
+
+We try to keep these words to a minimum. Some of these words might appear like
+they don't belong in bootlo. The reason why they're there is one of the
+following:
+
+1. They're needed in storage drivers or fatlo.
+2. They're too close to one of the words already there and it's awkward to put
+ them elsewhere. For example, we might need "0<" and not "0>=", but do we
+ really want those words not to live close?