duskos

dusk os fork
git clone git://git.alexwennerberg.com/duskos

commit c791a72f8738a107de38c902fbb60f39c747de7d
parent d34f22b8a0f7c3b66c6aecc158b0e72995bdf53a
Author: Virgil Dupras <hsoft@hardcoded.net>
Date:   Mon, 26 Dec 2022 20:44:59 -0500

Improve doc/arch slightly

Diffstat:
M fs/doc/arch.txt         | 124 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------
M fs/doc/code.txt         |   2 +-
A fs/doc/hw/i386/arch.txt |  50 ++++++++++++++++++++++++++++++++++++++++++++++++++
D fs/doc/impl.txt         |  24 ------------------------
M fs/doc/index.txt        |   4 +---
D fs/doc/x86.txt          |  50 --------------------------------------------------
M fs/xcomp/bootlo.fs      |   2 +-
M fs/xcomp/i386/kernel.fs |   2 +-
8 files changed, 139 insertions(+), 119 deletions(-)

diff --git a/fs/doc/arch.txt b/fs/doc/arch.txt
@@ -1,5 +1,10 @@
 # Dusk OS Architecture
 
+This document describes the hardware-independent part of Dusk OS. To have a
+complete picture, you'll also want to read the hardware-dependent part:
+
+* i386: doc/hw/i386/arch
+
 ## Subroutine Threaded Code
 
 This Forth is a Subroutine Thread Code (STC) Forth, that is, each reference to
@@ -15,7 +20,7 @@ first element is a 4 bytes pointer to the next element.
 ## Dictionary structure
 
 Words in this Forth are embedded in a dictionary, which is a list of entries
-each pointing to the previous entry. We keep that last added entry in "current".
+each pointing to the previous entry. We keep that last added entry in "sysdict".
 
 Dictionaries are a form of linked list. The structure of each entry is:
 
@@ -44,6 +49,41 @@ from a given address.
 Except for words specifically made for manipulating dictionary entries, we
 rarely deal with "entry" pointers. We most often deal with word pointers.
 
+Entries are created with the "entry" word, which creates the structure described
+below in a way that ensures that the *word* is 32-bit aligned. This means that
+metadata and link fields are necessarily out of alignment. But the callable word
+is aligned.
+
+"sysdict" hold a pointer to the latest entry of the system dictionary, "current"
+yields the word associated to that entry.
+
+The most important dictionary in Dusk is the system dictionary, but there are
+other dictionaries (for example, each struct has its own dictionary) and they
+all share the same terminology.
+
+## Dictionary links
+
+Dictionary links, such as the "sysdict" variable, are not only linked lists, but
+also "noop" entries, that is, they have a 0 "len" field following the "next"
+field.
+
+This peculiarity opens interesting possibilities as it allows an entry to use it
+as an "indirect" link.
+
+For example, let's say that you create a "fork" entry (not part of the system
+dict, such as is the case in a struct[ definition) and link it to "sysdict @",
+that is, the last entry to have been added to the system dict. It's going to
+work, but your new "dictionary branch" will be frozen in time: new entries added
+to the system dictionary will not be present in the fork.
+
+If, instead, you link your new forked entry to "sysdict" (without the @
+dereferencing), then your "fork" becomes fluid: entries added to the system
+dictionary will also be part of the "fork" (but "under" it). However, if the
+byte following the "next" link has an undefined value, it's possible (although
+very unlikely), that this "fake" entry matches a real "find" operation. This is
+why we make sure that dictionary links such as the "sysdict" variable are always
+followed by a 0 len byte.
+
 ## Cross-compilation
 
 When we use the word "cross-compiled" below, it means that the binary that is
@@ -73,9 +113,10 @@ for the i386 kernel.
 At this point, we're done with cross-compiled binaries and we're now entirely on
 our own. Let's pick ourselves up by the bootstraps!
 
-The boot layer is Forth code that has been embedded in the binary and is
-directly available in Dusk's memory. Its content depends on the target system
-and is assembled at build time. It has this structure:
+The boot layer is Forth code that has been embedded in the binary. loaded in
+memory by the bootloader and thus directly available in Dusk's memory. Its
+content depends on the target system and is assembled at build time. It has this
+structure:
 
 1. bootlo
 2. boot storage driver
@@ -116,10 +157,18 @@ The 2 first layers are machine-dependent and will not change unless something
 fundamental changes with your machine. The "init" layer, however, is where you
 shape the system you want as a user.
 
-It lives in xcomp/init.fs and is loaded at the end of the "boot" layer. Its
-responsibility is to define an "init" word which will then be called at the very
-end of "boothi". You can call "init" in xcomp/init.fs, but by doing so, you'll
-monopolize the file handle that was opened for xcomp/init.fs.
+The file is called "/init.fs". It lives at the root of your filesystem and is
+loaded at the end of "boothi".
+
+The role of this file is to load what it has to load and then define a word
+called "init". When "boothi" has finished running "init.fs", its last act will
+be to call the word "init".
+
+If you do your own deployment yourself, it is likely that you'll write the whole
+"init.fs" file yourself. If your image came from one of the pre-defined builds
+in the POSIX package, then this file has been constructed from two parts. A
+machine- specific part and a machine-independent part. For the PC platform,
+"init.fs" is constructed from xcomp/i386/pc/init.fs and xcomp/init.fs.
 
 One of the things that your "init" word has to do is to set up ConsoleIn (see
 doc/sys/io), which until now is still on BootIn, a structure that reads the
@@ -129,37 +178,34 @@ is interactive, you'll typically want to load sys/rdln and have init call
 
 The system is yours.
 
-## lib or sys?
-
-What's in /lib? What's in /sys? This question can sometimes lead to confusion.
-
-/lib is for "libraries", a set of mostly stateless logic. /sys is for
-subsystems, also a set of logic, but often stateful, centered around one or
-more resources.
-
-Libraries will typically be loaded by apps, subsystems and other libraries with
-the help of ?f<< (load file if it's not already loaded).
-
-Subsystems will be loaded during init.fs. If a unit requires the subsystem, it
-will indicate it with "require", which doesn't load anything, but errors out if
-the unit isn't loaded.
-
-Sometimes, the line between the two is fuzzy. These questions will help draw
-the line.
-
-1. Who will decide when we want to load it, the system operator or the
-application writer? If it's the sysop, it's a subsystem.
-
-For example, some applications might needs some words from sys/rdln, but it
-doesn't make sense to automatically load it when needed: if the sysop hasn't
-specifically decided to load it in its system, this dependency has a strong
-chance of being nonsensical.
-
-2. Is the unit centered around a resource that needs configuration? If yes, it's
-a subsystem.
-
-For example, sys/scratch is centered around a buffer which can vary in size
-depending on what the sysop wants.
+## What is a subsystem?
+
+What's the difference between a unit in /sys and a unit in other directories?
+For example, why isn't sys/screen in gr/screen?
+
+Units that aren't subsystems or drivers are mostly pieces of logic that aren't
+tied to any particular resource. A subsystem, however, is a piece of logic that
+wraps a specific resource, such as a device. Unlike a driver, a subsystem is
+hardware-independent. A subsystem will often sit on top of a device driver and
+provide wider pieces of logic for it.
+
+For example, sys/screen is a subsystem because it's tied to something very
+tangible: the screen (or "a" screen if we ever support multiple screens). It
+sits on a driver with a specific interface and has the responsibility of
+managing its resource. The gr/rect unit might draw on the screen, but it doesn't
+hold any resource and might as well draw anywhere.
+
+Another particularity of subsystems is that they're under the direct control of
+the system operator. Applications don't auto-load subsystems, they require them
+(with the word "require" instead of "?f<<"). An application might need a screen
+to work, but having it auto-load sys/screen doesn't make any sense: how do we
+configure that screen? If the sysop didn't load the subsystem, then something is
+wrong and we must abort.
+
+Sometimes, the resource that a subsystem manages isn't a device, but a
+singleton. For example, sys/scratch. It's the system scratchpad. Some
+applications require it, but there can only be one and it's the sysop job to
+configure its size.
 
 ## Parens words ()
 
diff --git a/fs/doc/code.txt b/fs/doc/code.txt
@@ -1,4 +1,4 @@
-# Code conventions
+# Understanding Dusk's code
 
 ## Patterns in code
 
diff --git a/fs/doc/hw/i386/arch.txt b/fs/doc/hw/i386/arch.txt
@@ -0,0 +1,50 @@
+# i386 architecture
+
+The i386 kernel source code is xcomp/i386/kernel.fs. Register roles:
+
+PSP: EBP
+RSP: ESP
+
+All other registers are free.
+
+## EBP and PS
+
+Here is a schema of PS with ( 3 2 1 ) in it, 1 being the top
+     ebp>|
+|--------|--------|--------|--------|
+|<ebp-4  |<ebp+0  |<ebp+4  |<ebp+8  |
+|--------|--------|--------|--------|
+| ???    | 1      | 2      | 3
+
+Here is a schema of an 8 bytes stack frame in C
+
+     ebp>|
+|--------|--------|--------|--------|
+|<ebp-4  |<ebp+0  |<ebp+4  |<ebp+8  |
+|--------|--------|--------|--------|
+| ???    | int x  | int y  | ???
+         ^                 ^
+         |-----------------|
+             Stack frame
+
+## Memory layout
+
+PS lives at the end of x86 conventional memory, that is $80000.
+
+RS lives at $7c00. The idea is that it lives below the $10000 line so that we
+don't have to save/restore ESP when doing the int13h call in real mode. If we
+have it live over this line, the 64K wrap-up makes int13h push to unexpected
+places in memory.
+
+The bootloader is loaded at $7c00, and it then loads the kernel along with its
+boot code, which are about $6000 bytes long. We load them at address $500, which
+is the beginning of conventional memory. Then, we jump to the kernel which makes
+"here" start at $8000. From there, the boot code begins interpreting itself.
+
+We have to make sure that initial filesystem buffers are all under $10000 so
+that INT13h can access it without problems (we don't play with segments in the
+int13h routine, so we're limited to 64K).
+
+During higher level initialization, we're expected to deal with the A20 gate,
+stop needing int13h, and then make "here" jump to its final playground, that is
+$100000 and above.
diff --git a/fs/doc/impl.txt b/fs/doc/impl.txt
@@ -1,24 +0,0 @@
-# Implementation details
-
-## Dictionary links
-
-Dictionary links, such as the "sysdict" variable, are not only linked lists, but
-also "noop" entries, that is, they have a 0 "len" field following the "next"
-field.
-
-This peculiarity opens interesting possibilities as it allows an entry to use it
-as an "indirect" link.
-
-For example, let's say that you create a "fork" entry (not part of the system
-dict, such as is the case in a struct[ definition) and link it to "sysdict @",
-that is, the last entry to have been added to the system dict. It's going to
-work, but your new "dictionary branch" will be frozen in time: new entries added
-to the system dictionary will not be present in the fork.
-
-If, instead, you link your new forked entry to "sysdict" (without the @
-dereferencing), then your "fork" becomes fluid: entries added to the system
-dictionary will also be part of the "fork" (but "under" it). However, if the
-byte following the "next" link has an undefined value, it's possible (although
-very unlikely), that this "fake" entry matches a real "find" operation. This is
-why we make sure that dictionary links such as the "sysdict" variable are always
-followed by a 0 len byte.
diff --git a/fs/doc/index.txt b/fs/doc/index.txt
@@ -23,9 +23,7 @@ install  Deploy Dusk to another machine
 terms    Terminology
 dirs     Directory structure
 arch     Architecture details
-impl     Implementation details
-code     Code conventions
-x86      i386 implementation details
+code     Understanding Dusk's code
 asm/     Assemblers documentation
 cc/      C compiler documentation
 design/  Description and justification of design decisions
diff --git a/fs/doc/x86.txt b/fs/doc/x86.txt
@@ -1,50 +0,0 @@
-# x86 architecture
-
-The x86 kernel source code is xcomp/i386/kernel.fs. Register roles:
-
-PSP: EBP
-RSP: ESP
-
-All other registers are free.
-
-## EBP and PS
-
-Here is a schema of PS with ( 3 2 1 ) in it, 1 being the top
-     ebp>|
-|--------|--------|--------|--------|
-|<ebp-4  |<ebp+0  |<ebp+4  |<ebp+8  |
-|--------|--------|--------|--------|
-| ???    | 1      | 2      | 3
-
-Here is a schema of an 8 bytes stack frame in C
-
-     ebp>|
-|--------|--------|--------|--------|
-|<ebp-4  |<ebp+0  |<ebp+4  |<ebp+8  |
-|--------|--------|--------|--------|
-| ???    | int x  | int y  | ???
-         ^                 ^
-         |-----------------|
-             Stack frame
-
-## Memory layout
-
-PS lives at the end of x86 conventional memory, that is $80000.
-
-RS lives at $7c00. The idea is that it lives below the $10000 line so that we
-don't have to save/restore ESP when doing the int13h call in real mode. If we
-have it live over this line, the 64K wrap-up makes int13h push to unexpected
-places in memory.
-
-The bootloader is loaded at $7c00, and it then loads the kernel along with its
-boot code, which are about $6000 bytes long. We load them at address $500, which
-is the beginning of conventional memory. Then, we jump to the kernel which makes
-"here" start at $8000. From there, the boot code begins interpreting itself.
-
-We have to make sure that initial filesystem buffers are all under $10000 so
-that INT13h can access it without problems (we don't play with segments in the
-int13h routine, so we're limited to 64K).
-
-During higher level initialization, we're expected to deal with the A20 gate,
-stop needing int13h, and then make "here" jump to its final playground, that is
-$100000 and above.
diff --git a/fs/xcomp/bootlo.fs b/fs/xcomp/bootlo.fs
@@ -257,7 +257,7 @@ does> ( 'struct )
 
 struct[ Struct
   sfield dict
-  1 sallot \ 1b that is always zero after dict link. See doc/impl
+  1 sallot \ 1b that is always zero after dict link. See doc/arch
   sfield size
   sfield lastfield \ pointer to field *word*
 ]struct
diff --git a/fs/xcomp/i386/kernel.fs b/fs/xcomp/i386/kernel.fs
@@ -113,7 +113,7 @@ xcode HERE
   lblhere pspushN, ret,
 
-pc to lblsysdict 0 , 0 c, \ 1b zero len field. see doc/impl
+pc to lblsysdict 0 , 0 c, \ 1b zero len field. see doc/arch
 xcode sysdict
   lblsysdict pspushN, ret,