Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Word Wrap Code Kata in Forth - please review

147 views
Skip to first unread message

Peter Kofler

unread,
Dec 18, 2010, 5:25:31 PM12/18/10
to
Dear experienced Forthers ;-)

Please have a look at the following code. I am new to Forth and to
improve my Forth skills I did the Word Wrap Code Kata. The task is:

Write a single static function named wrap that takes two arguments,
a string, and a column number. The function returns the string, but
with line breaks inserted at just the right places to make sure that
no line is longer than the column number. You try to break lines at
word boundaries.
see http://thecleancoder.blogspot.com/2010/10/craftsman-62-dark-path.html

I did the kata TDD styled using "T{" and "}T") and tried to implement
the minimal functionality to have the tests pass in the given order:

T{ S" " 1 wrap S" " COMPARE -> 0 }T
T{ S" this" 10 wrap S" this" COMPARE -> 0 }T
T{ S" word" 2 wrap S" wo<cr>rd" COMPARE -> 0 }T
T{ S" abcdefghij" 3 wrap S" abc<cr>def<cr>ghi<cr>j" COMPARE -> 0 }
T
T{ S" word word" 5 wrap S" word<cr>word" COMPARE -> 0 }T
T{ S" word word" 6 wrap S" word<cr>word" COMPARE -> 0 }T
T{ S" word word" 3 wrap S" wor<cr>d<cr>wor<cr>d" COMPARE -> 0 }T
T{ S" word word" 4 wrap S" word<cr>word" COMPARE -> 0 }T
T{ S" word word word" 6 wrap S" word<cr>word<cr>word" COMPARE -> 0 }T
T{ S" word word word" 11 wrap S" word word<cr>word" COMPARE -> 0 }T

I'm using ANS Forth and had to come up with all the string handling
myself. I did not want to play around with "\n", so I used the string
"<cr>" instead, but this makes no difference to the kata. My final
implementation after the last test passes is

\ character/string "primitives"

: cdup ( c-addr1 u1 -- c-addr1 u1 c-addr1 u1 )
2DUP ;

: cover ( c-addr1 u1 c-addr2 u2 -- c-addr1 u1 c-addr2 u2 c-addr1 u1 )
2OVER ;

: cdrop ( c-addr1 u1 -- )
2DROP ;

: cswap ( c-addr1 u1 c-addr2 u2 -- c-addr2 u2 c-addr1 u1 )
2SWAP ;

: 2cdup ( c-addr1 u1 c-addr2 u2 -- c-addr1 u1 c-addr2 u2 c-addr1 u1 c-
addr2 u2 )
cover cover ;

: ctuck ( c-addr1 u1 c-addr2 u2 -- c-addr2 u2 c-addr1 u1 c-addr2 u2 )
cswap cover ;


\ character/string heap functions

: cinit ( c-addr u -- )
\ initialize given string with 0s
0 FILL ;

: callocate ( u -- c-addr u )
\ allocate new string of u characters and initialize it
DUP CHARS ALLOCATE THROW
SWAP cdup cinit ;

: c2heap ( c-addr -- c-addr1 u1 )
\ convert hardcoded/build-in string (C" ") into heap string
COUNT DUP callocate ( c-addr u1 c-addr1 u1 )
cswap cover ROT DROP CMOVE ;

: cfree ( c-addr1 u1 -- )
DROP FREE DROP ; \ ignore status


\ character "literals"

: "" ( -- c-addr 0 )
\ push an empty string
C" " c2heap ;

: crlf ( -- c-addr u )
\ push crlf (in fact simulate crlf)
C" <cr>" c2heap ;


\ character/string helpers functions

: cmove_head ( c-addr1 u1 c-addr2 u2 -- )
\ copy min(u1,u2) characters from c-addr1 to c-addr2
ROT MIN ( c-addr1 c-addr2 u )
CMOVE ;

: clength ( c-addr u -- u )
NIP ;

: 2clength ( c-addr1 u1 c-addr2 u2 -- u1 u2 )
\ return the lengths of two strings
clength ROT DROP ;

: csplit ( c-addr u n -- c-addr1 u1 c-addr2 u2 )
\ split given string into two strings at given position n
DUP 0 = IF DROP "" cswap
ELSE
2DUP 1 + < IF DROP ""
ELSE
callocate ( c-addr u c-addr1 u1 )
\ c-addr1 is empty
2cdup cmove_head ( c-addr u c-addr1 u1 )
\ c-addr1 is now filled

\ reduce remaining number of chars u-u1
DUP >R cswap R> ( c-addr1 u1 c-addr u u1 )
/STRING
THEN
THEN ;

: cappend ( c-addr1 u1 c-addr2 u2 -- c-addr u )
\ concat the 2 strings into a new string (and try to free the old
ones).
DUP 0 = IF cfree \ second is empty, free it
ELSE
cswap
DUP 0 = IF cfree \ first is empty, free it
ELSE
\ create the new char array
( c-addr2 u2 c-addr1 u1 )
2cdup 2clength + callocate ( c-addr2 u2 c-addr1 u1 c-addr u )
2cdup cmove_head
\ c-addr is now half way filled

cswap cdup clength ( c-addr2 u2 c-addr u c-addr1 u1 u1 )
>R cfree ctuck ( c-addr u c-addr2 u2 c-addr u )
R> /STRING ( c-addr u c-addr2 u2 c-addr' u2 )
cmove_head
\ c-addr is now filled
THEN
THEN ;

: rpick ( -- w ) ( R: w -- w )
\ pick top level element from return stack
POSTPONE R> POSTPONE DUP POSTPONE >R ; IMMEDIATE

: clastindexof ( c-addr1 u1 c u2 -- u flag )
\ if flag is T return index u <= u2 of last occurance of c in c-
addr1
\ if flag is F return 0
1 + ROT MIN ( c-addr1 c u )
BEGIN
DUP 0 > IF
\ check char at current pos u against c
1 - 2 PICK OVER CHARS + C@
2 PICK = IF
TRUE \ found same char
TRUE \ end loop
ELSE
FALSE \ continue with loop
THEN
ELSE
FALSE \ return false after
TRUE \ end loop
THEN
UNTIL
ROT DROP ROT DROP ;

: wrap ( c-addr1 u1 u -- c-addr2 u2 )
\ return the given string with line breaks inserted so no line is
\ longer than u columns.
2DUP 1 + < IF
DROP
ELSE
>R
\ is there a space somewhere in c-addr1 up to u?
cdup 32 rpick clastindexof IF
csplit
\ drop first character of second string which is space
1 /STRING
ELSE
\ no space here, just split word
DROP rpick csplit
THEN
crlf cswap
R> RECURSE \ repeat for remaining part of string
cappend cappend
THEN ;

As I said I am new to Forth and definitely not in the right "Forth
mind".
So what should I do different?

Regards,
Peter

Howerd

unread,
Dec 18, 2010, 8:35:35 PM12/18/10
to
Hi Peter,

Its late and I may have missed something but ...

: wrap ( c-addr n n -- ) rot rot over + swap do i c@ emit i over
mod 0= if cr then loop drop ;

outputs the result to the console. If you want it outputted as a
string, you could vector the output to a buffer.

Best regards,
Howerd

On Dec 18, 10:25 pm, Peter Kofler <peter.codecop.kof...@gmail.com>
wrote:


> Dear experienced Forthers ;-)
>
> Please have a look at the following code. I am new to Forth and to
> improve my Forth skills I did the Word Wrap Code Kata. The task is:
>
> Write a single static function named wrap that takes two arguments,
> a string, and a column number.  The function returns the string, but
> with line breaks inserted at just the right places to make sure that
> no line is longer than the column number.  You try to break lines at
> word boundaries.

> seehttp://thecleancoder.blogspot.com/2010/10/craftsman-62-dark-path.html

Marcel Hendrix

unread,
Dec 19, 2010, 4:06:17 AM12/19/10
to
Peter Kofler <peter.code...@gmail.com> writes Re: Word Wrap Code Kata in Forth - please review
[..]

> Please have a look at the following code. I am new to Forth and to
> improve my Forth skills I did the Word Wrap Code Kata. The task is:

> Write a single static function named wrap that takes two arguments,
> a string, and a column number. The function returns the string, but
> with line breaks inserted at just the right places to make sure that
> no line is longer than the column number. You try to break lines at
> word boundaries.
> see http://thecleancoder.blogspot.com/2010/10/craftsman-62-dark-path.html

: wrap ( c-addr1 u1 column -- c-addr2 u2 )
0 0 0 locals| ix cnt break? column |
bounds ?do
i C@ dup BL = IF cnt column = IF drop ^J -1 TO cnt
ELSE ix TO break?
ENDIF
ENDIF ( char)
cnt column = IF break? IF ^J PAD break? + C!
ELSE ^J PAD ix + C! 1 +TO ix
ENDIF CLEAR cnt
ENDIF
( char ) PAD ix + C! 1 +TO ix 1 +TO cnt
loop PAD ix ;

Hmm, multiple consecutive blanks ... embedded TABs and CR/LFs ... need work.
But it passes your tests.

-marcel

Elizabeth D Rather

unread,
Dec 19, 2010, 2:20:36 PM12/19/10
to
On 12/18/10 12:25 PM, Peter Kofler wrote:
> Dear experienced Forthers ;-)
>
> Please have a look at the following code. I am new to Forth and to
> improve my Forth skills I did the Word Wrap Code Kata. The task is:
>
> Write a single static function named wrap that takes two arguments,
> a string, and a column number. The function returns the string, but
> with line breaks inserted at just the right places to make sure that
> no line is longer than the column number. You try to break lines at
> word boundaries.
> see http://thecleancoder.blogspot.com/2010/10/craftsman-62-dark-path.html
...

>
> I'm using ANS Forth and had to come up with all the string handling
> myself. I did not want to play around with "\n", so I used the string
> "<cr>" instead, but this makes no difference to the kata. My final
> implementation after the last test passes is
>
> \ character/string "primitives"
>
> : cdup ( c-addr1 u1 -- c-addr1 u1 c-addr1 u1 )
> 2DUP ;
>
> : cover ( c-addr1 u1 c-addr2 u2 -- c-addr1 u1 c-addr2 u2 c-addr1 u1 )
> 2OVER ;
>
> : cdrop ( c-addr1 u1 -- )
> 2DROP ;
>
> : cswap ( c-addr1 u1 c-addr2 u2 -- c-addr2 u2 c-addr1 u1 )
> 2SWAP ;
>
> : 2cdup ( c-addr1 u1 c-addr2 u2 -- c-addr1 u1 c-addr2 u2 c-addr1 u1 c-
> addr2 u2 )
> cover cover ;
>
> : ctuck ( c-addr1 u1 c-addr2 u2 -- c-addr2 u2 c-addr1 u1 c-addr2 u2 )
> cswap cover ;

Sorry I don't have time to go over your code in detail, but for
starters, the above definitions don't add anything except bulk and
overhead, and I think you should toss them.

>
> \ character "literals"
>
> : "" ( -- c-addr 0 )
> \ push an empty string
> C" " c2heap ;
>
> : crlf ( -- c-addr u )
> \ push crlf (in fact simulate crlf)
> C"<cr>" c2heap ;

The word CR is defined to do a crlf. If what you want is a string, a
better approach is:

CREATE CRLF HEX 2 C, 0D C, 0A C, DECIMAL

( in code: ) CRLF COUNT \ returns addr len

> \ character/string helpers functions
>
> : cmove_head ( c-addr1 u1 c-addr2 u2 -- )
> \ copy min(u1,u2) characters from c-addr1 to c-addr2
> ROT MIN ( c-addr1 c-addr2 u )
> CMOVE ;
>
> : clength ( c-addr u -- u )
> NIP ;
>
> : 2clength ( c-addr1 u1 c-addr2 u2 -- u1 u2 )
> \ return the lengths of two strings
> clength ROT DROP ;

Again, I don't think these buy you anything except added overhead and
making your code more opaque to readers who know the Forth words thus
concealed.

Also, please note that C" strings return the address of a "counted
string" (common internal string format), whereas S" returns an addr len
pair, which is generally preferable.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

Marcel Hendrix

unread,
Dec 19, 2010, 1:50:22 PM12/19/10
to
Peter Kofler <peter.code...@gmail.com> writes Re: Word Wrap Code Kata in Forth - please review
[..]
> Please have a look at the following code. I am new to Forth and to
> improve my Forth skills I did the Word Wrap Code Kata. The task is:

> Write a single static function named wrap that takes two arguments,
> a string, and a column number. The function returns the string, but
> with line breaks inserted at just the right places to make sure that
> no line is longer than the column number. You try to break lines at
> word boundaries.
> see http://thecleancoder.blogspot.com/2010/10/craftsman-62-dark-path.html

This is version 2, which passes two additional tests.
Multiple consecutive blanks and embedded TABs/CR/LF still not done.

-marcel

-- vsn2 ------------------------------------------


: wrap ( c-addr1 u1 column -- c-addr2 u2 )

0 0 0 -1 locals| break? ix cnt offs column |
bounds ?do i C@ BL
= IF
cnt column = IF ^J PAD ix + C! 1 +TO ix -1 TO break? CLEAR cnt CLEAR offs
ELSE BL PAD ix + C! 1 +TO ix ix 1- TO break? 1 +TO cnt CLEAR offs
ENDIF
ELSE cnt column
= IF break? -1
= IF ^J PAD ix + C! 1 +TO ix
i C@ PAD ix + C! 1 +TO ix 1 TO cnt 1 TO offs
ELSE ^J PAD break? + C! -1 TO break? offs 1+ TO cnt 1 TO offs
i C@ PAD ix + C! 1 +TO ix
ENDIF
ELSE i C@ PAD ix + C! 1 +TO ix 1 +TO cnt 1 +TO offs
ENDIF
ENDIF
loop PAD ix ;

T{ S" " 1 wrap S\" " COMPARE -> 0 }T

T{ S" this" #10 wrap S\" this" COMPARE -> 0 }T
T{ S" word" 2 wrap S\" wo\lrd" COMPARE -> 0 }T
T{ S" abcdefghij" 3 wrap S\" abc\ldef\lghi\lj" COMPARE -> 0 }T
T{ S" word word" 5 wrap S\" word\lword" COMPARE -> 0 }T
T{ S" word word" 6 wrap S\" word\lword" COMPARE -> 0 }T
T{ S" word word" 3 wrap S\" wor\ld\lwor\ld" COMPARE -> 0 }T
T{ S" word word" 4 wrap S\" word\lword" COMPARE -> 0 }T
T{ S" word word word" 6 wrap S\" word\lword\lword" COMPARE -> 0 }T
T{ S" word word word" #11 wrap S\" word word\lword" COMPARE -> 0 }T

T{ S" word wordmacro" 6 wrap S\" word\lwordma\lcro" COMPARE -> 0 }T
T{ S" word wordmacro" 6 wrap S\" word \lwordma\lcro" COMPARE -> 0 }T

Josh Grams

unread,
Dec 20, 2010, 1:04:14 PM12/20/10
to
Peter Kofler wrote:
> Dear experienced Forthers ;-)
>
> Please have a look at the following code. I am new to Forth and to
> improve my Forth skills I did the Word Wrap Code Kata. The task is:

> I'm using ANS Forth and had to come up with all the string handling


> myself. I did not want to play around with "\n", so I used the string
> "<cr>" instead, but this makes no difference to the kata.

If you're using gforth, you can use S\" which is like S" but allows
escape sequences using backslash.

> My final implementation after the last test passes is

> \ character/string "primitives"

I'd agree with Elizabeth -- leave out these words which are just
wrappers for stack manipulation. They don't add anything, and actually
make it more difficult to read for anyone who is comfortable with Forth.

>: rpick ( -- w ) ( R: w -- w )
> \ pick top level element from return stack
> POSTPONE R> POSTPONE DUP POSTPONE >R ; IMMEDIATE

There is a standard word for this, named R@ ("r-fetch").

>: wrap ( c-addr1 u1 u -- c-addr2 u2 )
> \ return the given string with line breaks inserted so no line is
> \ longer than u columns.
> 2DUP 1 + < IF
> DROP
> ELSE
> >R
> \ is there a space somewhere in c-addr1 up to u?
> cdup 32 rpick clastindexof IF

I'd use the standard word BL instead of 32.

> csplit
> \ drop first character of second string which is space
> 1 /STRING
> ELSE
> \ no space here, just split word
> DROP rpick csplit
> THEN
> crlf cswap
> R> RECURSE \ repeat for remaining part of string
> cappend cappend
> THEN ;
>
> As I said I am new to Forth and definitely not in the right "Forth
> mind".
> So what should I do different?

Your top-level word (wrap) doesn't look too bad to me. Mainly I'd say
you should think more carefully about memory management. I would treat
the input as a static string, and build a separate output string, either
ALLOCATEd or ALLOTted. Then csplit can simply do address arithmetic,
and skip all the memory-management mess.

Here's a version like that, with ALLOCATE/RESIZE. Note that
empty-string and append-to could easily be rewritten to use ALLOT.

--Josh


( blanks )

: blank? ( char -- flag ) bl 1+ < ;
\ remove trailing characters up to and including last blank
: last-blank ( c-addr u -- c-addr u' )
begin dup while 1- 2dup chars + c@ blank? until then ;
\ remove leading blanks
: -leading ( c-addr u -- c-addr' u' )
begin dup while over c@ blank? while
1 /string
repeat then ;

( cell-counted strings )

: @+ ( a-addr -- a-addr' x ) dup cell+ swap @ ;
: lplace ( c-addr u a-addr -- )
2dup ! cell+ swap chars move ;
: +lplace ( c-addr u a-addr -- )
2dup 2>r @+ chars + swap chars move 2r> +! ;


( dynamically allocated cell-counted strings )

: empty-string ( -- addr ) 1 cells allocate throw 0 over ! ;
: append-to ( c-addr u addr -- addr' )
2dup @ + cell+ resize throw dup >r +lplace r> ;
create newline 1 c, 10 c,
: append-newline ( addr -- addr' ) newline count rot append-to ;


( split and wrap )

: split-at ( c-addr u u2 -- c-addr' u' c-addr u2 )
>r over r@ swap >r /string r> r> ;
\ wrap C-ADDR U at U2, appending first line to dynamic string at ADDR
: wrap-at ( addr c-addr u u2 -- addr' c-addr' u' )
split-at 2swap 2>r rot append-to append-newline 2r> ;

\ assumes c-addr u contains no line breaks.
: wrap-line ( c-addr u u.width -- c-addr2 u2 )
>r empty-string -rot begin dup r@ > while
over r@ 1+ last-blank -trailing nip
?dup if wrap-at -leading else r@ wrap-at then
repeat r> drop rot append-to @+ ;

w_a_x_man

unread,
Dec 20, 2010, 6:13:21 PM12/20/10
to
On Dec 19, 12:50 pm, m...@iae.nl (Marcel Hendrix) wrote:
> Peter Kofler <peter.codecop.kof...@gmail.com> writes Re: Word Wrap Code Kata in Forth - please review

> [..]
>
> > Please have a look at the following code. I am new to Forth and to
> > improve my Forth skills I did the Word Wrap Code Kata. The task is:
> > Write a single static function named wrap that takes two arguments,
> > a string, and a column number.  The function returns the string, but
> > with line breaks inserted at just the right places to make sure that
> > no line is longer than the column number.  You try to break lines at
> > word boundaries.
> > seehttp://thecleancoder.blogspot.com/2010/10/craftsman-62-dark-path.html

>
> This is version 2, which passes two additional tests.
> Multiple consecutive blanks and embedded TABs/CR/LF still not done.
>
> -marcel
>
> -- vsn2 ------------------------------------------
> : wrap ( c-addr1 u1 column -- c-addr2 u2 )
>   0 0 0 -1 locals| break? ix cnt offs column |
>   bounds ?do  i C@ BL
>                = IF
>                     cnt column = IF  ^J   PAD ix + C! 1 +TO ix     -1 TO break?  CLEAR cnt  CLEAR offs
>                                ELSE  BL   PAD ix + C! 1 +TO ix  ix 1- TO break?  1 +TO cnt  CLEAR offs
>                               ENDIF
>                ELSE cnt column
>                       = IF  break? -1  
>                                = IF  ^J   PAD ix + C! 1 +TO ix                                
>                                      i C@ PAD ix + C! 1 +TO ix  1 TO cnt  1 TO offs
>                                ELSE  ^J   PAD break? + C!  -1 TO break?  offs 1+ TO cnt  1 TO offs
>                                      i C@ PAD ix + C! 1 +TO ix                
>                               ENDIF
>                       ELSE  i C@ PAD ix + C!  1 +TO ix  1 +TO cnt  1 +TO offs  
>                      ENDIF
>               ENDIF
>         loop  PAD ix ;
>

You ought to learn something about factoring.

Sp...@controlq.com

unread,
Dec 21, 2010, 2:00:34 PM12/21/10
to
On Mon, 20 Dec 2010, w_a_x_man wrote:
>> -- vsn2 ------------------------------------------
>> : wrap ( c-addr1 u1 column -- c-addr2 u2 )
>>   0 0 0 -1 locals| break? ix cnt offs column |
>>   bounds ?do  i C@ BL
>>                = IF
>>                     cnt column = IF  ^J   PAD ix + C! 1 +TO ix     -1 TO break?  CLEAR cnt  CLEAR offs
>>                                ELSE  BL   PAD ix + C! 1 +TO ix  ix 1- TO break?  1 +TO cnt  CLEAR offs
>>                               ENDIF
>>                ELSE cnt column
>>                       = IF  break? -1  
>>                                = IF  ^J   PAD ix + C! 1 +TO ix                                
>>                                      i C@ PAD ix + C! 1 +TO ix  1 TO cnt  1 TO offs
>>                                ELSE  ^J   PAD break? + C!  -1 TO break?  offs 1+ TO cnt  1 TO offs
>>                                      i C@ PAD ix + C! 1 +TO ix                
>>                               ENDIF
>>                       ELSE  i C@ PAD ix + C!  1 +TO ix  1 +TO cnt  1 +TO offs  
>>                      ENDIF
>>               ENDIF
>>         loop  PAD ix ;
>>
>
> You ought to learn something about factoring.
>
And seldom does one see such a perfect example of why this is so!

Marcel Hendrix

unread,
Dec 21, 2010, 2:00:46 PM12/21/10
to
Sp...@ControlQ.com wrote Re: Word Wrap Code Kata in Forth - please review
[..]

> And seldom does one see such a perfect example of why this is so!

The OP already showed the factored solution.
A link was posted that didn't agree with my stomach, so I produced
some gas in the general direction.

-marcel

Peter Kofler

unread,
Jan 8, 2011, 4:28:26 AM1/8/11
to
On 19 Dez. 2010, 10:06, m...@iae.nl (Marcel Hendrix) wrote:
> : wrap ( c-addr1 u1 column -- c-addr2 u2 )
> 0 0 0 locals| ix cnt break? column |
> bounds ?do

> i C@ dup BL = IF cnt column = IF drop ^J -1 TO cnt
> ELSE ix TO break?
> ...
> ENDIF CLEAR cnt
> ENDIF
> ( char ) PAD ix + C! 1 +TO ix 1 +TO cnt
> loop PAD ix ;
I tried your code, but +TO and CLEAR are not defined in gforth. What
Forth are you using?

On 19 Dez. 2010, 20:20, Elizabeth D Rather <erat...@forth.com> wrote:
> ...


> the above definitions don't add anything except bulk

> ...


> making your code more opaque to readers who know the Forth words thus
> concealed.

This makes sense. Propably a beginner mistake, as I am not as used to
the Forth words. I removed all these little words. Thank you.

> If what you want is a string, a better approach is:
> CREATE CRLF HEX 2 C, 0D C, 0A C, DECIMAL

> ...


> Also, please note that C" strings return the address of a "counted
> string" (common internal string format), whereas S" returns an addr len
> pair, which is generally preferable.

Thank you for these, too. I changed my code accordingly.

On 20 Dez. 2010, 19:04, Josh Grams <j...@qualdan.com> wrote:
> ...


> If you're using gforth, you can use S\" which is like S" but allows
> escape sequences using backslash.

I knew about S\", but wanted to use only ANS words. Still, for the
tests S\" is much more convenient.

> >: rpick ( -- w ) ( R: w -- w )

> There is a standard word for this, named R@ ("r-fetch").

> ...


> I'd use the standard word BL instead of 32.

Thank you for reminding me that I should always rtfm before defining
new words or using magic numbers.

> ... I would treat


> the input as a static string, and build a separate output string, either
> ALLOCATEd or ALLOTted. Then csplit can simply do address arithmetic,
> and skip all the memory-management mess.

That's a great idea. It simplified the code considerably, still I
could keep the structure I came up during the kata.

> Here's a version like that, with ALLOCATE/RESIZE. Note that
> empty-string and append-to could easily be rewritten to use ALLOT.

Thank you a lot for that code example, it was esp. helpful as its
structure is similar to my own one, I got several ideas from it.

On 21 Dez. 2010, 00:13, w_a_x_man <w_a_x_...@yahoo.com> wrote:
> ... factoring.
I take it that factoring is very important. I will keep it in mind.

Regards,
Peter

fyi I appended the improved version:

CREATE crlf 1 C, 10 C,

: split-at ( c-addr1 u1 n -- c-addr1 n c-addr2 u2=u1-n )
\ split string into two strings at given position n
SWAP >R
2DUP R> SWAP /STRING ;

: cresize ( addr1 u1 – addr2 )
\ resize the (allocated) string and update its length, may return a
different addr
SWAP OVER 1 + CHARS
RESIZE THROW ( u1 addr2 )
TUCK C! ;

: append-to ( addr1 c-addr2 u2 -- addr3 )
\ append second string to the (allocated) first, may return a
different addr
DUP 0 = IF
\ second string is empty, do nothing
2DROP
ELSE
\ get old size of string
ROT DUP C@ ( c-addr2 u2 addr1 u1 )
\ calculate new size of string
>R OVER R@ + ( c-addr2 u2 addr1 u3=u1+u2 )
cresize ( c-addr2 u2 addr3 )
\ determine target to copy second string
DUP 1 R> + CHARS + ( c-addr2 u2 addr3 addr3' )
\ copy remaining characters
SWAP >R SWAP CMOVE
R>
THEN ;

: last-blank ( c-addr1 u1 u2 -- u flag )
\ if flag is TRUE u <= u2 is the last occurance of blank inside c-
addr1
\ else there was no blank inside c-addr1 and u=0 is returned
1 + MIN ( c-addr1 u )
BEGIN
DUP WHILE
\ check char at current pos u-1 against BL
1 -
2DUP CHARS + C@
BL =
UNTIL
TRUE
ELSE
FALSE
THEN
ROT DROP ;

: insert-linebreaks ( addr1 c-addr2 u2 u -- addr3 )
\ insert linebreaks into the second string so no line is longer
than u columns and append to first
2DUP 1 + < IF
\ string is shorter than u characters, just append it
DROP
append-to


ELSE
>R
\ is there a space somewhere in c-addr1 up to u?

2DUP R@ last-blank IF
\ there is a space, so split at found position
split-at


\ drop first character of second string which is space
1 /STRING
ELSE

\ no space here, just split word at u
DROP R@ split-at
THEN ( addr1 c-addr2 u2' c-addr3 u3 )

2>R append-to
\ add crlf after first string
crlf COUNT append-to

\ repeat for the remaining 2nd string
2R> R> RECURSE
THEN ;

: empty-string ( -- addr )

\ creates an empty (allocated) string
1 CHARS ALLOCATE THROW
0 OVER C! ;

: wrap ( c-addr1 u1 u -- c-addr2 u2 )

\ return the given string with line breaks inserted so no line is
longer than u columns.
>R 2>R
empty-string
2R> R>
insert-linebreaks
COUNT ;

Marcel Hendrix

unread,
Jan 9, 2011, 1:14:20 PM1/9/11
to
Peter Kofler <peter.code...@gmail.com> writes Re: Word Wrap Code Kata in Forth - please review

> On 19 Dez. 2010, 10:06, m...@iae.nl (Marcel Hendrix) wrote:
>> : wrap ( c-addr1 u1 column -- c-addr2 u2 )
>> 0 0 0 locals| ix cnt break? column |
>> bounds ?do

>> i C@ dup BL =3D IF cnt column =3D IF drop ^J -1 TO cnt


>> ELSE ix TO break?
>> ...
>> ENDIF CLEAR cnt
>> ENDIF
>> ( char ) PAD ix + C! 1 +TO ix 1 +TO cnt
>> loop PAD ix ;
> I tried your code, but +TO and CLEAR are not defined in gforth. What
> Forth are you using?

I guess you don't need the name of my Forth, but code to patch your
version of it.

-marcel

-- wrap.fs -------------------------------
[defined] -wrap [if] -wrap [then] marker -wrap

include ttester.fs

#10 CONSTANT ^J

: wrap ( c-addr1 u1 column -- c-addr2 u2 )

0 0 0 -1 locals| break? ix cnt offs column |
bounds ?do i C@ BL
= IF
cnt column = IF ^J PAD ix + C! ix 1+ TO ix -1 TO break? 0 TO cnt 0 TO offs
ELSE BL PAD ix + C! ix 1+ TO ix ix 1- TO break? cnt 1+ TO cnt 0 TO offs
THEN

ELSE cnt column
= IF break? -1

= IF ^J PAD ix + C! ix 1+ TO ix
i C@ PAD ix + C! ix 1+ TO ix 1 TO cnt 1 TO offs


ELSE ^J PAD break? + C! -1 TO break? offs 1+ TO cnt 1 TO offs

i C@ PAD ix + C! ix 1+ TO ix
THEN
ELSE i C@ PAD ix + C! ix 1+ TO ix cnt 1+ TO cnt offs 1+ TO offs
THEN
THEN
loop PAD ix ;

T{ S" " 1 wrap S\" " COMPARE -> 0 }T
T{ S" this" #10 wrap S\" this" COMPARE -> 0 }T
T{ S" word" 2 wrap S\" wo\lrd" COMPARE -> 0 }T
T{ S" abcdefghij" 3 wrap S\" abc\ldef\lghi\lj" COMPARE -> 0 }T
T{ S" word word" 5 wrap S\" word\lword" COMPARE -> 0 }T
T{ S" word word" 6 wrap S\" word\lword" COMPARE -> 0 }T
T{ S" word word" 3 wrap S\" wor\ld\lwor\ld" COMPARE -> 0 }T
T{ S" word word" 4 wrap S\" word\lword" COMPARE -> 0 }T
T{ S" word word word" 6 wrap S\" word\lword\lword" COMPARE -> 0 }T
T{ S" word word word" #11 wrap S\" word word\lword" COMPARE -> 0 }T

T{ S" word wordmacro" 6 wrap S\" word\lwordma\lcro" COMPARE -> 0 }T
T{ S" word wordmacro" 6 wrap S\" word \lwordma\lcro" COMPARE -> 0 }T

0 [IF]

Gforth 0.7.0, Copyright (C) 1995-2008 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
include wrap.fs ok
S" word word word" #11 wrap TYPE word word
word ok
S" word word word" #11 wrap cr TYPE
word word
word ok

[THEN]
-- -------------------------------

0 new messages