commit c8aa95ec0f86efd6c360cdf4246da10e3ba6804f
parent 7ab521ddf5dcd20e904644954e7cf37f5cc4aa6f
Author: Virgil Dupras <hsoft@hardcoded.net>
Date: Mon, 1 May 2023 17:01:00 -0400
Simplify iterator boilerplate
Sometimes I have good ideas! See doc/iter.
Diffstat:
3 files changed, 26 insertions(+), 33 deletions(-)
diff --git a/fs/doc/hal.txt b/fs/doc/hal.txt
@@ -214,6 +214,7 @@ addr, op -- Store the effective address of the operand in dest
ps+, n -- Add n to PSP
rs+, n -- Add n to RSP
+LIT>W, n -- Set W to n
W+n, n -- Z Add n to W
A+n, n -- Z Add n to A
W>A, -- Copy W to A
diff --git a/fs/doc/iter.txt b/fs/doc/iter.txt
@@ -9,7 +9,7 @@ Before diving into the gory details of creating new iterators (because that's
what's great about them: it's easy to create new ones), let's see how to use the
two iterators that are built-in: "for" and "for2".
-"for" takes a single arguments and counts down to 0 from that argument. In the
+"for" takes a single argument and counts down to 0 from that argument. In the
loop body, you can refer to the value-like word "i":
: foo 5 for i . spc> next ;
@@ -35,20 +35,26 @@ The heavy lifting is done by ":iterator", which is a does word generating
immediate compiling words (in this instance, "for"). When that word is called, a
few things happen:
-1. 12 bytes are reserved on RS for "i" and "j". It is always 12 bytes for all
- iterators, and it's always "i", "j" and "k", even when they aren't used.
-2. A call to "for"'s body is written.
-3. Two intertwined ahead jumps are written in a way that allow "unyield" to exit
- the loop in cases where the iterator has no yield.
+1. 12 bytes are reserved on RS for "i", "j" and "k". It is always 12 bytes for
+ all iterators, and it's always "i", "j" and "k", even when they aren't used.
+2. Push "for"'s address to RS.
+3. Write a forward jump that targets the "yield" that the "next" word is about
+ to write.
4. We continue compiling the loop body.
-5. When "next" (an immediate too) is called, a "yield" is compiled, followed by
- a backward jump to the beginning of the loop, followed by a forward target
- for the exit jump compiled at "for".
+5. When "next" (an immediate too) is called, we close the forward jump opened at
+ step 3, then a "yield" is compiled, followed by a backward jump to the
+ beginning of the loop.
6. De-allocate the 12 bytes reseved for i/j/k.
Iterators are expected to keep PS and RS balanced between yields. For this
reason, iteration values should exclusively be passed through i/j/k
+One can wonder why we push the iterator's address to RS and defer its call to
+the following "yield" rather than calling it directly. In most cases, it would
+work fine, but in cases where no iteration takes place, we end up returning in
+the middle of the loop body rather than at the end of it. For this reason, we
+always begin an iterator loop by jumping to the end of it.
+
## i, j and k
"i", "j" and "k" are value-like words (obey "to" semantics) that live on RS.
@@ -65,13 +71,13 @@ We use RS for those variables for multiple reasons:
and it can't push directly to it because RS+0, the coroutine swapping
address, has to stay there. It's awkward.
-The easiest and simplest way to be directly on RS.
+The easiest and simplest way is to be directly on RS through i/j/k.
## Breaking
It's a common pattern to break from an iterator early. To exit the loop early,
-you can use the "break" word (again, an immediate). This words de-allocates "i"
-and "j" RS slots, the coroutine address RS slot and then jumps out of the loop
+you can use the "break" word (again, an immediate). This word de-allocates
+i/j/k RS slots, the coroutine address RS slot and then jumps out of the loop
in a way that is similar to a begin..repeat, that is, to a following then. Yes,
when you use a "break", you need to add a "then" (and optionally a "else")
after the "next". This allows you to conditionally execute code based on whether
@@ -91,7 +97,7 @@ a global variable. This means that:
loop that's going to process the "break" and it won't have the expected
results.
3. These limitations, of course, are at compile time, which means that "break"
- works fine when the look calls a word that has a "next" loop inside it.
+ works fine when the loop calls a word that has a "next" loop inside it.
4. Break only works in "next" loop, not other loops.
## unyield
@@ -111,16 +117,3 @@ We already know how many bytes such a jump takes: CALLSZ + CELLSZ.
Therefore, if we want to exit the iterator loop, all we need to do is to add
CALLSZ + CELLSZ to RS+0. That's what "unyield" does. Then, we exit the loop,
execute the loop cleanup code and go on with our lives.
-
-Simple right? I'm glad you agree! ... but there's a caveat, a small wart in this
-otherwise gorgeous scheme: it's possible that the iterator was empty and no
-actual yield ever took place. In that case, at the time "unyield" is called,
-RS+0 point to the address right after the initial iterator call, right before
-the loop body. If we add CALLSZ + CELLSZ to that, we'll end in the middle of
-nowhere.
-
-To that end, ":iterator" compiles a forward jump right after the initial call
-(which has a size of... CALLSZ + CELLSZ!). That jump goes to the loop body.
-However, in between that jump and the loop body is another forward jump, but
-this time to the loop's exit. This way, if "unyield" is called without a yield,
-we end up on that jump, and then at loop cleanup. All good.
diff --git a/fs/xcomp/bootlo.fs b/fs/xcomp/bootlo.fs
@@ -221,14 +221,13 @@ alias execute | immediate
: xtcomp [compile] ] begin word runword compiling not until ;
: ivar, ( off -- ) RSP) swap +) toptr@ execute ;
: i 4 ivar, ; immediate : j 8 ivar, ; immediate : k 12 ivar, ; immediate
-: :iterator doer immediate xtcomp does>
- -12 rs+, execute, -4 [rcnt] +!
- [compile] ahead \ jump to loop
- [compile] ahead \ exit jump
- swap [compile] then [compile] begin ( loop ) ;
+: :iterator doer immediate xtcomp does> ( w -- yieldjmp loopaddr )
+ -16 rs+, RSP) !, LIT>W, RSP) @!,
+ [compile] ahead \ jump to yield
+ [compile] begin ( loop ) ;
0 value _breaklbl
-: next
- [compile] yield [compile] again [compile] then
+: next ( yieldjmp loopaddr -- )
+ swap [compile] then [compile] yield [compile] again
12 rs+, 4 [rcnt] +! 0 to@! _breaklbl ?dup drop ; immediate
: unyield BRSZ RSP) [+n], ; immediate
: break 16 rs+, [compile] ahead to _breaklbl ; immediate