Google CodeJam?

Ian Osgood

unread,

Apr 11, 2012, 7:23:28 PM4/11/12

to

Four years, and never a Forth entry. Anyone want to put Forth on the
scoreboard this year? Registration is open and the qualification round
is on Friday.

http://code.google.com/codejam/

Hugh Aguilar

unread,

Apr 12, 2012, 3:43:19 AM4/12/12

to

Armed with my novice package, I'm the only Forther on the planet who
has any chance at all. All of the problems seem to involve reading in
a data file, processing it, and writing out a solution file. My LIST.
4TH would work very well (especially the SEQ lists). I also have
ordered associative arrays, which could come in handy. I've got a lot
of code in there --- no Forther is even close to me.
http://www.forth.org/novice.html

Without regular expressions though, I don't have a very good chance
--- that was something I meant to add to the novice package, but never
got around to doing --- the only CodeJam example problem I looked at
closely was "alien language," and having regular expressions would
have helped with it. Of course, it is possible to write code manually
to do string pattern-matching, but with the tight time limits that
wouldn't be competitive against the myriad scripting languages in use
nowadays.

I'm hesitant to take Friday off, as Friday is my big money day driving
the cab --- so I won't enter.

Just for fun, lets have our own comp.lang.forth contest to write the
best Forth solution to the "alien language" problem.
http://code.google.com/codejam/contest/90101/dashboard#s=p0
This will be a loser's consolation contest, as none of us have any
chance at the real contest.

hughag...@yahoo.com

unread,

Apr 21, 2012, 6:13:43 PM4/21/12

to

On Thursday, April 12, 2012 1:43:19 AM UTC-6, Hugh Aguilar wrote:
> Just for fun, lets have our own comp.lang.forth contest to write the
> best Forth solution to the "alien language" problem.
> http://code.google.com/codejam/contest/90101/dashboard#s=p0
> This will be a loser's consolation contest, as none of us have any
> chance at the real contest.

I've waited and waited, but nobody has come forward. To qualify for the CodeJam contest, the program had to be written in under 8 minutes. That is very fast programming; I think that a typical programmer using a modern language would take about 1/2 hour. My own Forth program took me about 3 hours, so I am 6 times slower than pretty much everybody. This is a big part of why Forth is not used in the work world. No employer is going to pay anybody to program in Forth when it takes 6 times longer to write a program than it does in any other language. Also, I had the advantage of having my novice package available. Without the novice package, I think most Forth programmers would take maybe 3 days to write a program like this (that is why nobody responded to my challenge).

It seems extremely unlikely that any Forther is going to come up with a Forth program to compete against mine. I would like to see programmers of other languages, such as Lisp and Ruby and so forth, present their own programs along with a mention of how much time was required. It is okay to post non-Forth code on comp.lang.forth --- nobody is posting Forth code --- if we are going to get any code posted, it will have to be in other languages.

Here is my own Forth code:

\ This is the solution to the "alien language" example problem from Google CodeJam.
\ http://code.google.com/codejam/contest/90101/dashboard#s=p0

\ Written by Hugh Aguilar --- copyright (c) 2012 --- BSD license

\ requires novice.4th and list.4th

marker alien.4th

\ ******
\ ****** input and output
\ ******

variable #letters
variable #words
variable #patterns

: seqs ( name-str -- word-seq pattern-seq )
read-seq
dup .line @ count evaluate
#patterns ! #words ! #letters !
.fore @ \ -- word-seq
dup #words @ nth \ -- word-seq pattern-seq
delink
dup #patterns @ nth \ -- word-seq pattern-seq extraneous-seq
delink kill-seq \ -- word-seq pattern-seq
\ error checking
over length #words @ <> abort" *** bad word-seq ***"
dup length #patterns @ <> abort" *** bad pattern-seq ***"
over each[ .line @ c@ #letters @ <> abort" *** string in word-seq is wrong length ***" ]each ;

: dump-result ( pattern-list name-str -- )
<cstr +cstr c" .result" +cstr cstr> write-seq ;

\ ******
\ ****** convert pattern string into any-seq list
\ ****** each node in the list represents one char in the target string
\ ****** the .LINE string of each node contains all of the character that would match
\ ******

: check-any-str { pattern-str head -- }
head length #letters @ <> if
cr ." *** any-seq is the wrong length ***"
cr pattern-str count type
cr true abort" *** aborting ***
then ;

\ If CHECK-ANY-STR fails, this is usually because the pattern-str is longer than 255 characters and it got truncated.
\ This happens in the file: A-large-practice.in
\ I could upgrade the program to deal with this problem, but doing so would involve rewriting the SEQ code in LIST.4TH.

: <make-any-seq> ( head str -- head )
dup c@ if new-seq link \ str has characters in it
else drop then ;

: make-any-seq { pattern-str | group? -- any-seq }
nil <cstr
pattern-str count bounds ?do \ -- head
group? if \ if inside of ( ) group
I c@ [char] ) = if false to group? cstr> <make-any-seq> <cstr
else I c@ char+cstr then
else \ else outside of ( ) group
I c@ [char] ( = if true to group?
else I c@ char+cstr cstr> <make-any-seq> <cstr then
then
loop
cstr> <make-any-seq>
pattern-str over check-any-str ;

\ ******
\ ****** generate pattern matcher
\ ******

char & comment \ this is an example of what will get generated

:noname ( char-adr -- match? ) \ this is for: a(bc)

false \ -- char-adr any?
over c@ 97 = if true or then
0= if drop false exit then
1+ \ -- new-char-adr

false \ -- char-adr any?
over c@ 98 = if true or then
over c@ 99 = if true or then
0= if drop false exit then
1+ \ -- new-char-adr

drop true ;

&

: <generate-pattern> { node -- }
s" false " evaluate
node .line @ count bounds do
s" over c@ " evaluate
I c@ lit,
s" = if true or then " evaluate
loop
s" 0= if drop false exit then 1+ " evaluate ;

: generate-pattern ( any-seq -- xt )
>r
s" :noname ( char-adr -- match? ) " evaluate
r> ['] <generate-pattern> each
s" drop true ; " evaluate ;

\ ******
\ ****** upgrade pattern-seq
\ ******

seq \ .LINE starts out as pattern-str, later gets changed to result-str
w field .any \ pointer to any-seq
w field .xt \ xt of generated pattern matcher
w field .matches \ count of matches for this pattern
constant pattern

: <kill-pattern> ( node -- )
dup .any @ kill-seq
<kill-seq> ;

: kill-pattern ( head -- )
each[ <kill-pattern> ]each ;

: init-pattern ( pattern-str node -- node )
init-seq >r
r@ .line @ make-any-seq r@ .any !
r@ .any @ generate-pattern r@ .xt !
0 r@ .matches !
r> ;

: new-pattern ( str -- node )
pattern alloc
init-pattern ;

: upgrade-pattern ( pattern-seq -- pattern-list ) \ create a PATTERN list given a SEQ list
nil swap \ -- pattern-list pattern-seq
each[ .line @ new-pattern link ]each ;

\ ******
\ ****** pattern-match
\ ******

: <check-word> ( pattern-list-node word-seq-node -- )
.line @ count drop \ -- pattern-list-node char-adr \ assume str size is correct
over .xt @ execute \ -- pattern-list-node match?
if 1 over .matches +! then ; \ -- pattern-list-node

: <check-pattern> ( word-seq pattern-list-node -- word-seq )
over ['] <check-word> each \ -- word-seq pattern-list-node
drop ;

: check-pattern ( word-seq pattern-list -- )
['] <check-pattern> each \ -- word-seq
drop ;

\ ******
\ ****** make result strings
\ ******

: u>str ( u -- adr cnt )
u>d <# #s #> ;

: <fill-result> ( pattern# pattern-list-node -- new-pattern# )
dup .line @ dealloc \ get rid of pattern-str in .LINE
<cstr
c" Case #" +cstr
over u>str <+cstr>
c" : " +cstr
dup .matches @ u>str <+cstr>
cstr> hstr swap .line ! \ -- pattern# \ set result-str to .LINE
1+ ; \ -- new-pattern#

: fill-result ( pattern-list -- )
1 swap ['] <fill-result> each \ -- pattern#
drop ;

\ ******
\ ****** main program
\ ******

: alien ( name-str -- )
dup seqs { name-str word-seq pattern-seq | pattern-list -- }
s" marker upgrade-pattern-stuff " evaluate \ so we can get rid of the UPGRADE-PATTERN words
pattern-seq upgrade-pattern to pattern-list
word-seq pattern-list check-pattern
pattern-list fill-result
pattern-list name-str dump-result
\ clean up
word-seq kill-seq
pattern-seq kill-seq
pattern-list kill-pattern
s" upgrade-pattern-stuff " evaluate ;

Bernd Paysan

unread,

Apr 21, 2012, 8:19:54 PM4/21/12

to

hughag...@yahoo.com wrote:
> It seems extremely unlikely that any Forther is going to come up with
> a Forth program to compete against mine.

Probably. This is a task screaming for a language where regexps are
already implemented in a very similar form used in the quest data. I.e.
you replace the ( with a [ and then use the regexp comparison operation,
and be almost done. It *is* an 8 minute task.

I wouldn't use Forth for that. And I even have a regexp package for my
Forths, which makes it actually easier than using your novice package,
but still not as easy as with awk or Perl or Python or whatever regexp-
based language of the year you want to use for that (you could do it
with the shell and grep and tr). Well, maybe I really should invest the
time to add a Perl- or Python-compatible regexp compiler as frontend,
because *then*, it is just as easy as with the competition.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/

WJ

unread,

Apr 22, 2012, 12:03:33 AM4/22/12

to

Ruby:

_, numwords, numpatterns = gets().split.map{|s| s.to_i}

words = (1 .. numwords).map{ gets().strip }
patterns = (1 .. numpatterns).map{ gets().strip }

patterns.each_with_index{|pat,i|
regex = Regexp.new( pat.gsub("(", "[").gsub(")", "]") )
printf "Case #%d: %d\n", i, words.grep( regex ).size
}

Alex Wegel

unread,

Apr 22, 2012, 6:12:16 AM4/22/12

to

<hughag...@yahoo.com> wrote:

> It seems extremely unlikely that any Forther is going to come up with
> a Forth program to compete against mine.

OK, now that you managed to escape my kill filter, i take the challenge.

After 1 hour and 15 minutes, i got this (ugly but working) solution (with the
sample data coming from the file "alien-sample"):

#! /usr/local/bin/gforth
variable ifh
s" alien-sample" r/o open-file throw ifh !

variable #ib
create ifb 256 allot
: getl ifb 256 ifh @ read-line throw drop #ib ! ;

variable L
variable D
variable N
: parse-headline ifb #ib @ evaluate N ! D ! L ! ;

getl parse-headline

D @ L @ * constant #ad
variable adp
: c>msk 1 swap [char] a - lshift ;
: add-achar c>msk adp @ ! 4 adp +! ;
: add-aword ifb #ib @ bounds do i c@ add-achar loop ;
create adic #ad 4 * allot
: read-adic adic adp ! D @ 0 do getl add-aword loop ;
read-adic

create pat L @ 4 * allot
variable pp

: ccl 0 swap begin 1+ dup c@ dup [char] ) <> while c>msk rot or swap
repeat drop swap ;

: make-pat
pat pp !
ifb
L @ 0 do
dup c@
dup [char] ( = if
drop ccl
else
c>msk
then
pp @ ! 4 pp +!
1+
loop drop ;

: chk-1match L @ 4 * 0 do dup i + @ i pat + @ and 0= if unloop drop false exit
then 4 +loop drop true ;
: matches 0 adic #ad 4 * bounds do i chk-1match 1 and + L @ 4 * +loop ;
: chk-all
N @ 1+ 1 do getl make-pat matches ." Case #" i . ." : " . cr loop ;

chk-all

ifh @ close-file throw

Alex Wegel

unread,

Apr 22, 2012, 8:11:36 AM4/22/12

to

Alex Wegel <awe...@arcor.de> wrote:

Small correction: The input buffer must be much larger for the big
example, so change

create ifb 256 allot
: getl ifb 256 ifh @ read-line throw drop #ib ! ;

to sth. like

create ifb 10000 allot
: getl ifb 10000 ifh @ read-line throw drop #ib ! ;

(and take care that gforth has enough dict-mem available, eg. by using
"gforth -m 4M")

Also, i didn't care to make the output format match the question exactly
(there are 2 superfluous spaces in every line) - well..

Alex Wegel

unread,

Apr 22, 2012, 8:17:43 AM4/22/12

to

Another remark: In my even more rusty perl, it was much simpler to do
quick & dirty (by following the suggestion to change the ()'s to []'s):

#!/usr/bin/perl
my ($l, $d, $n) = split " ",<>;

my @d=();
foreach my $i (1..$d) { push @d, scalar <>; }

my @p=();
foreach my $i (1..$n) { push @p, scalar <>; }

map { $_ =~ s/$/[/g; } @p;
map { $_ =~ s/$/]/g; } @p;

map { print "".$_."\n"; } @p;

my $c=0;
map {
my $m=0;
for ($i=0; $i<$d; ++$i) {
if ( $d[$i] =~ m($_) ) {++$m;}
}
print "Case #".++$c.": $m\n";
} @p;

Marcel Hendrix

unread,

Apr 22, 2012, 12:23:19 PM4/22/12

to

awe...@arcor.de (Alex Wegel) writes Re: Google CodeJam?
[..]

> After 1 hour and 15 minutes, i got this (ugly but working) solution (with the
> sample data coming from the file "alien-sample"):

It took me somewhat longer because I started on the wrong foot (generating
all possible strings :-) and encountered 'performance problems.'

-marcel

-- ---------------------------------

'z' 'a' - 1+ =: CMAXI
#15 =: LMAXI
#5000 =: DMAXI

0 VALUE L
0 VALUE D
0 VALUE N
0 VALUE #matches

CREATE clists LMAXI CMAXI * CHARS ALLOT
CREATE dict LMAXI DMAXI * CHARS ALLOT

: clear-clists ( -- ) clists LMAXI CMAXI * CHARS ERASE ;
: cl-row ( row# -- addr ) CMAXI * CHARS clists + ;
: dict-row ( row# -- addr ) LMAXI * CHARS dict + ;
: char! ( addr char -- ) 'a' - CHARS + 1 SWAP C! ;
: char? ( addr char -- ) 'a' - CHARS + C@ ;
: next-line ( -- c-addr u ) REFILL 0= IF QUIT ENDIF ;
: GET-L,D,N ( -- ) next-line 0 WORD COUNT EVALUATE TO N TO D TO L ;
: create-dictword ( index -- ) next-line SOURCE ROT dict-row PACK DROP SOURCE >IN ! DROP ;
: BUILD-DICT ( -- ) D 0 ?DO I create-dictword LOOP ;

: next-char ( -- char )
SOURCE >IN @ <= IF DROP 0 EXIT ENDIF
( addr) >IN @ + C@ 1 >IN +! ;

: collect-chars ( addr -- )
>R BEGIN next-char DUP ')' <>
WHILE R@ SWAP char!
REPEAT DROP R> DROP ;

: create-clists ( -- )
clear-clists
L 0 ?DO I cl-row next-char DUP '(' = IF DROP collect-chars ELSE char! ENDIF LOOP ;

: cl-test? ( char pos -- TRUE=match ) cl-row SWAP char? ;
: $possible? ( c-addr u -- n ) 0 LOCAL #ok 0 ?DO C@+ I cl-test? +TO #ok LOOP DROP #ok ;
: matches ( -- ) D 0 ?DO I dict-row COUNT $possible? L = IF 1 +TO #matches ENDIF LOOP ;
: one-case ( -- ) next-line create-clists matches ;
: CASES ( -- ) N 0 ?DO CLEAR #matches one-case CR ." Case #" I 1+ 0 .R SPACE #matches . LOOP ;
: MATCH ( -- ) GET-L,D,N BUILD-DICT CASES ;

-- Output -----------------------------------------------

Case #1 1
Case #2 0
Case #3 1
Case #4 6
Case #5 8
Case #6 1
Case #7 8
Case #8 1
Case #9 9
Case #10 0

Marcel Hendrix

unread,

Apr 22, 2012, 12:29:27 PM4/22/12

to

m...@iae.nl (Marcel Hendrix) writes Re: Google CodeJam?

> awe...@arcor.de (Alex Wegel) writes Re: Google CodeJam?
> [..]

>> After 1 hour and 15 minutes, i got this (ugly but working) solution (with the
>> sample data coming from the file "alien-sample"):

> It took me somewhat longer because I started on the wrong foot (generating
> all possible strings :-) and encountered 'performance problems.'

Before I forget, there is an error in the original input file
A-small-practice.in :

---
10 25 10
nwlrbbmqbh
cdarzowkky
...
---

That should be 10 26 10. It may influence the results.

-marcel

Alex Wegel

unread,

Apr 22, 2012, 3:36:20 PM4/22/12

to

Marcel Hendrix <m...@iae.nl> wrote:

> Before I forget, there is an error in the original input file
> A-small-practice.in :
>
> ---
> 10 25 10
> nwlrbbmqbh
> cdarzowkky
> ...
> ---
>
> That should be 10 26 10. It may influence the results.

Nope - it just *looks* like it should be ..26.. because the first of the
10 patterns is a plain literal (with no parentheses)!

Alex Wegel

unread,

Apr 22, 2012, 4:24:11 PM4/22/12

to

Marcel Hendrix <m...@iae.nl> wrote:

> awe...@arcor.de (Alex Wegel) writes Re: Google CodeJam?
> [..]
>
> > After 1 hour and 15 minutes, i got this (ugly but working) solution
> > (with the sample data coming from the file "alien-sample"):
>
> It took me somewhat longer because I started on the wrong foot (generating
> all possible strings :-) and encountered 'performance problems.'

Wouldn't have happened with a quantum computer, i guess ;)

> -- Output -----------------------------------------------
> Case #1 1
> Case #2 0
> Case #3 1
> Case #4 6
> Case #5 8
> Case #6 1
> Case #7 8
> Case #8 1
> Case #9 9
> Case #10 0

I think the result is wrong (mostly because you were too smart about the
input - see other message. Not sure why your case #9 got 1 more than my
case 10, though).

I got

Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 1
Case #5: 6
Case #6: 7
Case #7: 1
Case #8: 8
Case #9: 1
Case #10: 8

..and it was accepted as correct.

BTW (Just for boasting):
Runtime (gforth on powerpc mac @1.8GHz) for the large set was 2.0s (with
1.5s utime), or 0.4s (0.3s utime) using gforth-fast :-)

Anyway, my solution is the best, because the aliens would have no chance
to decipher the program :-)

Alex

Bernd Paysan

unread,

Apr 22, 2012, 5:08:56 PM4/22/12

to

Alex Wegel wrote:

> <hughag...@yahoo.com> wrote:
>
>> It seems extremely unlikely that any Forther is going to come up with
>> a Forth program to compete against mine.
>
> OK, now that you managed to escape my kill filter, i take the
> challenge.
>
> After 1 hour and 15 minutes, i got this (ugly but working) solution
> (with the sample data coming from the file "alien-sample"):

If you learn how to use CELL, CELL+ and CELLS instead of 4, 4 + and 4 *,
your program will work on a 64 bit CPU as well.

I like this solution, because it shows that this subset of pattern
matching can be done without too much hassle (and good performance) in
plain Forth.

Hugh Aguilar

unread,

Apr 23, 2012, 2:06:53 AM4/23/12

to

On Apr 22, 3:08 pm, Bernd Paysan <bernd.pay...@gmx.de> wrote:
> Alex Wegel wrote:

> > <hughaguila...@yahoo.com> wrote:
>
> >> It seems extremely unlikely that any Forther is going to come up with
> >> a Forth program to compete against mine.
>
> > OK, now that you managed to escape my kill filter, i take the
> > challenge.
>
> > After 1 hour and 15 minutes, i got this (ugly but working) solution
> > (with the sample data coming from the file "alien-sample"):
>
> If you learn how to use CELL, CELL+ and CELLS instead of 4, 4 + and 4 *,
> your program will work on a 64 bit CPU as well.
>
> I like this solution, because it shows that this subset of pattern
> matching can be done without too much hassle (and good performance) in
> plain Forth.

Mine isn't plain Forth??? What does that term mean?

When I wrote mine, I originally got off on the "wrong foot" by writing
a program that used <SPLIT> to convert the pattern-string into a SEQ
list, and would then interpret that list for every pattern match. This
became complicated, as it involved a lot of nested loops. I think that
it would have been slow too, because it involved doing the same work
redundantly for every pattern match (I don't know what the speed would
have been though, as I never got this version running). Switching over
to generating a pattern-matching function simplified the program
considerably and also made it faster.

So my 3 hour effort actually included writing *two* programs --- one
that didn't work and was badly designed, and a second that you see
above. With better planning, I could have possibly gotten the time
down to 1 hour --- but, of course, you have to still consider the time
required to think up the plan, which would put me back at the 3 hour
figure.

Message has been deleted

Hugh Aguilar

unread,

Apr 23, 2012, 2:17:20 AM4/23/12

to

On Apr 21, 10:03 pm, "WJ" <w_a_x_...@yahoo.com> wrote:
> Ruby:
>
> _, numwords, numpatterns = gets().split.map{|s| s.to_i}
>
> words = (1 .. numwords).map{ gets().strip }
> patterns = (1 .. numpatterns).map{ gets().strip }
>
> patterns.each_with_index{|pat,i|
> regex = Regexp.new( pat.gsub("(", "[").gsub(")", "]") )

> printf "Case #%d: %d\n", i, words.grep(...
>

I don't know anything about Ruby, but I can certainly admire the
conciseness of your program. How does it compare in speed to mine?

Marcel Hendrix

unread,

Apr 23, 2012, 2:59:16 PM4/23/12

to

awe...@arcor.de (Alex Wegel) writes Re: Google CodeJam?

> Marcel Hendrix <m...@iae.nl> wrote:

>> awe...@arcor.de (Alex Wegel) writes Re: Google CodeJam?
[..]

> I think the result is wrong (mostly because you were too smart about the
> input - see other message. Not sure why your case #9 got 1 more than my
> case 10, though).

Yes, the off-by-one caused the mismatch. However, the large sample exposed
a really shameful bug in my program: it overwrote the dictionary strings
because I had forgotten the count bytes (size 15 instead of 16).

[..]

> BTW (Just for boasting):
> Runtime (gforth on powerpc mac @1.8GHz) for the large set was 2.0s (with
> 1.5s utime), or 0.4s (0.3s utime) using gforth-fast :-)

Thanks for posting this number, it prompted me to get the bug out
so I could beat that :-)

Using iForth (64bit) on my old Core i7 920 @2.67 GHz machine, it ran the
large sample in 139 ms.

> Anyway, my solution is the best, because the aliens would have no chance
> to decipher the program :-)

No discussion with that.

-marcel

WJ

unread,

Apr 23, 2012, 3:27:55 PM4/23/12

to

It takes 6.5 seconds for the large input file on my laptop.

The program is run this way:

ruby code-jam.rb Code-jam.in

I'll explain a few things.

_, numwords, numpatterns = gets().split.map{|s| s.to_i}

gets() reads a line from stdin or the file given on the
command-line; .split splits that string on whitespace,
yielding an array or list of strings;
.map works as in Lisp, in this case converting each string
in the array or list into an integer.

words = (1 .. numwords).map{ gets().strip }

(1 .. numwords) is a range; some examples of ranges
(the .to_a expands the range into an array or list):
(1 .. 5).to_a
==>[1, 2, 3, 4, 5]
("b" .. "f").to_a
==>["b", "c", "d", "e", "f"]
Here I'm simply using the range to read numwords lines
from the file. I could have done something like
words = []
numwords.times{ words << gets().strip }
.strip removes all whitespace from beginning and end
of the string.

patterns.each_with_index{|pat,i|

Iterating through the patterns; pat, of course, is
the pattern. i is assigned the index of the current
item, starting with 0 (which means my output has an
off-by-1 error).

printf "Case #%d: %d\n", i, words.grep( regex ).size

.grep works on an array or list of strings, selecting
only the items that match the regular expression.

Paul Rubin

unread,

Apr 23, 2012, 4:56:10 PM4/23/12

to

m...@iae.nl (Marcel Hendrix) writes:
> Using iForth (64bit) on my old Core i7 920 @2.67 GHz machine, it ran the
> large sample in 139 ms.

Pretty impressive. I tried a simple Python script using Python's regexp
module and it took about 2.1 sec to do the large sample on my laptop,
2.5 ghz Core 2 Duo using a single core. Coding time including fixing a
couple of errors was probably in the 8 minute range (I didn't time it).
I got a bit careless by trying to code too fast, which of course slowed
me down.

Alex Wegel

unread,

Apr 23, 2012, 5:19:45 PM4/23/12

to

Bernd Paysan <bernd....@gmx.de> wrote:

> If you learn how to use CELL, CELL+ and CELLS instead of 4, 4 + and 4 *,
> your program will work on a 64 bit CPU as well.

That's wrong - i know them already, but didn't use them in this
quick&dirty approach (that's part of what i meant by "ugly"?).

One thing keeping me from using CELLxy words is that the number 4
doesn't really relate to the systems/cpus cell size, but rather to the
size (in AU) of the bit-set representation (26 bits).
So, using CELLxy wouldn't be the right answer for smaller cpus or larger
bit-sets (what if scientists find out that the aliens use 65 characters
on tuesday?).

I think it would be more worthwhile to use a more compact representation
for the alien-dictionary (e.g. by storing the chars themselves instead
of bitsets having exactly one bit set) and thereby getting rid of the 4*
stuff alltogether (at least concerning the dictionary).

> I like this solution, because it shows that this subset of pattern
> matching can be done without too much hassle (and good performance) in
> plain Forth.

Thanks.
The alien example was perfect for such.

Cheers,
Alex

Hugh Aguilar

unread,

Apr 24, 2012, 1:08:17 AM4/24/12

to

Ruby is pretty impressive; I should learn more about it. BTW, do you
know Icon?

As I've said before, my Straight Forth will only be for micro-
controllers. I will have a "sister language" that will be used for
desktop-computer programs that are related in some way to the micro-
controller programs. I had been planning on Racket, but maybe I should
consider Ruby instead. Ruby is a lot more popular, although I think
that Racket has more novice-oriented documentation (a good thing, as
most micro-controller enthusiasts are more interested in hardware than
in software; they don't want to learn something complicated even if it
is more powerful, especially for desktop-computer programming which is
only peripherally related to micro-controllers).

Over on comp.lang.lisp, I've heard Ruby described as "Matz-Lisp" ---
meaning that Ruby is just Lisp with a more friendly syntax (for people
accustomed to infix). Would you consider that to be an accurate
description of Ruby?

Hugh Aguilar

unread,

Apr 24, 2012, 1:35:33 AM4/24/12

to

On Apr 23, 3:19 pm, awe...@arcor.de (Alex Wegel) wrote:

> Bernd Paysan <bernd.pay...@gmx.de> wrote:
> > If you learn how to use CELL, CELL+ and CELLS instead of 4, 4 + and 4 *,
> > your program will work on a 64 bit CPU as well.
>
> That's wrong - i know them already, but didn't use them in this
> quick&dirty approach (that's part of what i meant by "ugly"?).
>
> One thing keeping me from using CELLxy words is that the number 4
> doesn't really relate to the systems/cpus cell size, but rather to the
> size (in AU) of the bit-set representation (26 bits).
> So, using CELLxy wouldn't be the right answer for smaller cpus or larger
> bit-sets (what if scientists find out that the aliens use 65 characters
> on tuesday?).

I understand how your program works. I hadn't thought of this
technique at all. So I've learned something!

Your program is really hard to read. Now that we are no longer speed-
programming on a stop-watch, can you provide a cleaner version with
some comments?

It is possible to use your technique, and to also generate a function
the way that I did.

> I think it would be more worthwhile to use a more compact representation
> for the alien-dictionary (e.g. by storing the chars themselves instead
> of bitsets having exactly one bit set) and thereby getting rid of the 4*
> stuff alltogether (at least concerning the dictionary).

That would only be true if the patterns were mostly absolute chars. If
the patterns contain a lot of () sets, and the () sets typically
contain more than 4 chars, you'll save memory and have a faster
program by sticking with your current technique.

BTW, isn't anybody going to criticize my program because it fails to
work at all on the large input set? This is due to the fact that I
used my SEQ lists from LIST.4TH, and they are limited to 255 char
strings, whereas most of the pattern strings in the large input file
are upwards of 1K in size. This is easy to fix --- all I have to do is
write a new version of SEQ that allows big strings (I'll do this in
the next novice-package upgrade). When I was speed-programming
however, I just used my existing SEQ rather than write all of that low-
level stuff from scratch. I didn't realize this was a problem until
after the program was written and I noticed that it worked fine on the
small input file but not the large input file (I hadn't even visually
inspected the large file at that time, but had just vaguely assumed
that it was the same as the small input file just with more patterns
and more words).

Paul Rubin

unread,

Apr 24, 2012, 3:51:55 AM4/24/12

to

awe...@arcor.de (Alex Wegel) writes:
> One thing keeping me from using CELLxy words is that the number 4
> doesn't really relate to the systems/cpus cell size, but rather to the
> size (in AU) of the bit-set representation (26 bits).

I think the Forth spirit is "solve the problem you have". The 32-bit
representation is just fine for the problem as stated.

> So, using CELLxy wouldn't be the right answer for smaller cpus or larger
> bit-sets (what if scientists find out that the aliens use 65 characters
> on tuesday?).

And if there were a million patterns to check instead of 500, you would
have wanted a totally different structure such as a decision tree, if
you cared about speed. Similarly the aliens might skip over 65 chars
and go directly to Unicode (Aliencode?) so again you'd need a totally
different approach. That's ok too.

WJ

unread,

Apr 24, 2012, 4:30:50 AM4/24/12

to

Years ago, Icon was my favorite language. (I have the book by Griswold
& Griswold.) Then I switched to Ruby.

>
> As I've said before, my Straight Forth will only be for micro-
> controllers. I will have a "sister language" that will be used for
> desktop-computer programs that are related in some way to the micro-
> controller programs. I had been planning on Racket, but maybe I should
> consider Ruby instead.

I'm trying to learn some Racket, too.

Ruby lacks 2 things (compared to Racket):

1. High-speed looping. (Don't try to generate the Mandelbrot Set
with it.) Most of the other "scripting" languages are faster
than Ruby, I believe.
2. Macros.

> Ruby is a lot more popular,

I wonder which language would have more users if one didn't count
those who just use Ruby to power web sites (Ruby on Rails, etc.).

> although I think
> that Racket has more novice-oriented documentation (a good thing, as
> most micro-controller enthusiasts are more interested in hardware than
> in software; they don't want to learn something complicated even if it
> is more powerful, especially for desktop-computer programming which is
> only peripherally related to micro-controllers).

When people like that learn Ruby, they should skip the advanced
object-oriented features, as I did. I basically use Ruby as one
would use Perl, Awk, Lua, Python, or Scheme. I very seldom create
a new class; I just write some functions. Ruby on Rails is Greek
to me.

>
> Over on comp.lang.lisp, I've heard Ruby described as "Matz-Lisp" ---
> meaning that Ruby is just Lisp with a more friendly syntax (for people
> accustomed to infix). Would you consider that to be an accurate
> description of Ruby?

Perhaps. The creator of Ruby wrote this:

* Ruby is a language designed in the following steps:

* take a simple lisp language (like one prior to CL).
* remove macros, s-expression.
* add simple object system (much simpler than CLOS).
* add blocks, inspired by higher order functions.
* add methods found in Smalltalk.
* add functionality found in Perl (in OO way).

So, Ruby was a Lisp originally, in theory.
Let's call it MatzLisp from now on. ;-)
matz.

Paul Graham, the Lisp guru, has advised those who cannot
use Lisp at work to see if they can use Ruby, which he
considers somewhat Lisp-like.

No matter how many other languages I dabble in, I'll probably
keep using Ruby for many things. It often makes programming
so easy that it's almost boring.

Gerry Jackson

unread,

Apr 24, 2012, 7:50:31 AM4/24/12

to

On 21/04/2012 23:13, hughag...@yahoo.com wrote:
> On Thursday, April 12, 2012 1:43:19 AM UTC-6, Hugh Aguilar wrote:
>> Just for fun, lets have our own comp.lang.forth contest to write the
>> best Forth solution to the "alien language" problem.
>> http://code.google.com/codejam/contest/90101/dashboard#s=p0
>> This will be a loser's consolation contest, as none of us have any
>> chance at the real contest.
>
> I've waited and waited, but nobody has come forward. To qualify for the CodeJam contest, the program had to be written in under 8 minutes. That is very fast programming; I think that a typical programmer using a modern language would take about 1/2 hour. My own Forth program took me about 3 hours, so I am 6 times slower than pretty much everybody. This is a big part of why Forth is not used in the work world. No employer is going to pay anybody to program in Forth when it takes 6 times longer to write a program than it does in any other language. Also, I had the advantage of having my novice package available. Without the novice package, I think most Forth programmers would take maybe 3 days to write a program like this (that is why nobody responded to my challenge).
>
> It seems extremely unlikely that any Forther is going to come up with a Forth program to compete against mine. I would like to see programmers of other languages, such as Lisp and Ruby and so forth, present their own programs along with a mention of how much time was required. It is okay to post non-Forth code on comp.lang.forth --- nobody is posting Forth code --- if we are going to get any code posted, it will have to be in other languages.
>

Yet another solution using simple character comparisons. Not a robust
solution as it expects perfectly formatted data, which is OK for the two
test files. Just stick the word alien as the first line of a test file
and include it.

0 value L 0 value D 0 value N
variable awords

: next-line ( -- ) refill 0= if cr ." Finished" cr quit then ;
: get-LDN ( -- ) next-line source evaluate to N to D to L ;
: load-words ( -- )
here awords !
D 0 do next-line source here over chars allot swap cmove loop
;

\ Match alternatives inside parentheses, ca points to a ( character
\ Returns ca2 which points to a ) character

: match-alts ( ch ca -- ca2 f ) \ f is true for a match
begin
char+ 2dup c@ dup [char] ) <> ( -- ch ca' ch ch2 f )
while
=
until
nip 1000 s" )" search nip \ match, skip to closing )
else
2drop nip 0 \ no match
then
;

\ ca1 is test case, ca2 is word of length L
\ returns 1 for a match else 0
: match-word ( ca1 ca2 -- 0|1 )
L chars over + swap
do ( -- ca1 )

dup c@ dup [char] ( <>

if
i c@ <> if 0= leave then
else
drop i c@ swap match-alts 0= if 0= leave then
then
char+ 1 chars
+loop
0<> negate
;

: match-all-words ( ca -- n )
0 awords @ D L chars * over + swap
do ( -- ca n )
over i match-word + ( -- ca n' )
L chars
+loop nip
;

: .case ( n i -- ) cr ." Case #" 0 .r ." : " . ;

: match-all-cases ( -- n )
N 1+ 1
do next-line source drop match-all-words i .case loop
;

: alien
cr get-LDN
load-words
match-all-cases
next-line
;

Results for the large file - do others agree?
\ -----------------

Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 1

Case #5: 577
Case #6: 577
Case #7: 1
Case #8: 384
Case #9: 1
Case #10: 375
Case #11: 0
Case #12: 1
Case #13: 0
Case #14: 1
Case #15: 264
Case #16: 457
Case #17: 1
Case #18: 378
Case #19: 1
Case #20: 478
Case #21: 0
Case #22: 1
Case #23: 0
Case #24: 1
Case #25: 537
Case #26: 419
Case #27: 1
Case #28: 499
Case #29: 1
Case #30: 483
Case #31: 0
Case #32: 1
Case #33: 0
Case #34: 1
Case #35: 388
Case #36: 720
Case #37: 1
Case #38: 280
Case #39: 1
Case #40: 546
Case #41: 0
Case #42: 1
Case #43: 0
Case #44: 1
Case #45: 561
Case #46: 666
Case #47: 1
Case #48: 501
Case #49: 1
Case #50: 533
Case #51: 0
Case #52: 1
Case #53: 0
Case #54: 1
Case #55: 522
Case #56: 517
Case #57: 1
Case #58: 690
Case #59: 1
Case #60: 494
Case #61: 0
Case #62: 1
Case #63: 0
Case #64: 1
Case #65: 548
Case #66: 533
Case #67: 1
Case #68: 555
Case #69: 1
Case #70: 643
Case #71: 0
Case #72: 1
Case #73: 0
Case #74: 1
Case #75: 763
Case #76: 637
Case #77: 1
Case #78: 367
Case #79: 1
Case #80: 801
Case #81: 0
Case #82: 1
Case #83: 0
Case #84: 1
Case #85: 534
Case #86: 769
Case #87: 1
Case #88: 627
Case #89: 1
Case #90: 594
Case #91: 0
Case #92: 1
Case #93: 0
Case #94: 1
Case #95: 900
Case #96: 346
Case #97: 1
Case #98: 398
Case #99: 1
Case #100: 423
Case #101: 0
Case #102: 1
Case #103: 0
Case #104: 1
Case #105: 414
Case #106: 441
Case #107: 1
Case #108: 759
Case #109: 1
Case #110: 473
Case #111: 0
Case #112: 1
Case #113: 0
Case #114: 1
Case #115: 502
Case #116: 678
Case #117: 1
Case #118: 572
Case #119: 1
Case #120: 441
Case #121: 0
Case #122: 1
Case #123: 0
Case #124: 1
Case #125: 389
Case #126: 430
Case #127: 1
Case #128: 665
Case #129: 1
Case #130: 397
Case #131: 0
Case #132: 1
Case #133: 0
Case #134: 1
Case #135: 646
Case #136: 324
Case #137: 1
Case #138: 636
Case #139: 1
Case #140: 623
Case #141: 0
Case #142: 1
Case #143: 0
Case #144: 1
Case #145: 529
Case #146: 526
Case #147: 1
Case #148: 531
Case #149: 1
Case #150: 496
Case #151: 0
Case #152: 1
Case #153: 0
Case #154: 1
Case #155: 336
Case #156: 421
Case #157: 1
Case #158: 456
Case #159: 1
Case #160: 336
Case #161: 0
Case #162: 1
Case #163: 0
Case #164: 1
Case #165: 473
Case #166: 563
Case #167: 1
Case #168: 323
Case #169: 1
Case #170: 327
Case #171: 0
Case #172: 1
Case #173: 0
Case #174: 1
Case #175: 650
Case #176: 528
Case #177: 1
Case #178: 427
Case #179: 1
Case #180: 459
Case #181: 0
Case #182: 1
Case #183: 0
Case #184: 1
Case #185: 525
Case #186: 579
Case #187: 1
Case #188: 533
Case #189: 1
Case #190: 833
Case #191: 0
Case #192: 1
Case #193: 0
Case #194: 1
Case #195: 472
Case #196: 400
Case #197: 1
Case #198: 604
Case #199: 1
Case #200: 529
Case #201: 0
Case #202: 1
Case #203: 0
Case #204: 1
Case #205: 708
Case #206: 337
Case #207: 1
Case #208: 519
Case #209: 1
Case #210: 596
Case #211: 0
Case #212: 1
Case #213: 0
Case #214: 1
Case #215: 416
Case #216: 599
Case #217: 1
Case #218: 663
Case #219: 1
Case #220: 420
Case #221: 0
Case #222: 1
Case #223: 0
Case #224: 1
Case #225: 467
Case #226: 649
Case #227: 1
Case #228: 571
Case #229: 1
Case #230: 417
Case #231: 0
Case #232: 1
Case #233: 0
Case #234: 1
Case #235: 751
Case #236: 381
Case #237: 1
Case #238: 460
Case #239: 1
Case #240: 278
Case #241: 0
Case #242: 1
Case #243: 0
Case #244: 1
Case #245: 409
Case #246: 636
Case #247: 1
Case #248: 320
Case #249: 1
Case #250: 644
Case #251: 0
Case #252: 1
Case #253: 0
Case #254: 1
Case #255: 603
Case #256: 289
Case #257: 1
Case #258: 461
Case #259: 1
Case #260: 322
Case #261: 0
Case #262: 1
Case #263: 0
Case #264: 1
Case #265: 747
Case #266: 417
Case #267: 1
Case #268: 676
Case #269: 1
Case #270: 393
Case #271: 0
Case #272: 1
Case #273: 0
Case #274: 1
Case #275: 414
Case #276: 450
Case #277: 1
Case #278: 432
Case #279: 1
Case #280: 481
Case #281: 0
Case #282: 1
Case #283: 0
Case #284: 1
Case #285: 676
Case #286: 581
Case #287: 1
Case #288: 464
Case #289: 1
Case #290: 530
Case #291: 0
Case #292: 1
Case #293: 0
Case #294: 1
Case #295: 425
Case #296: 483
Case #297: 1
Case #298: 433
Case #299: 1
Case #300: 416
Case #301: 0
Case #302: 1
Case #303: 0
Case #304: 1
Case #305: 514
Case #306: 434
Case #307: 1
Case #308: 480
Case #309: 1
Case #310: 396
Case #311: 0
Case #312: 1
Case #313: 0
Case #314: 1
Case #315: 462
Case #316: 623
Case #317: 1
Case #318: 426
Case #319: 1
Case #320: 356
Case #321: 0
Case #322: 1
Case #323: 0
Case #324: 1
Case #325: 543
Case #326: 507
Case #327: 1
Case #328: 433
Case #329: 1
Case #330: 602
Case #331: 0
Case #332: 1
Case #333: 0
Case #334: 1
Case #335: 459
Case #336: 248
Case #337: 1
Case #338: 650
Case #339: 1
Case #340: 357
Case #341: 0
Case #342: 1
Case #343: 0
Case #344: 1
Case #345: 489
Case #346: 520
Case #347: 1
Case #348: 481
Case #349: 1
Case #350: 313
Case #351: 0
Case #352: 1
Case #353: 0
Case #354: 1
Case #355: 359
Case #356: 440
Case #357: 1
Case #358: 475
Case #359: 1
Case #360: 642
Case #361: 0
Case #362: 1
Case #363: 0
Case #364: 1
Case #365: 434
Case #366: 470
Case #367: 1
Case #368: 322
Case #369: 1
Case #370: 498
Case #371: 0
Case #372: 1
Case #373: 0
Case #374: 1
Case #375: 385
Case #376: 744
Case #377: 1
Case #378: 465
Case #379: 1
Case #380: 382
Case #381: 0
Case #382: 1
Case #383: 0
Case #384: 1
Case #385: 784
Case #386: 654
Case #387: 1
Case #388: 671
Case #389: 1
Case #390: 481
Case #391: 0
Case #392: 1
Case #393: 0
Case #394: 1
Case #395: 357
Case #396: 422
Case #397: 1
Case #398: 526
Case #399: 1
Case #400: 418
Case #401: 0
Case #402: 1
Case #403: 0
Case #404: 1
Case #405: 371
Case #406: 526
Case #407: 1
Case #408: 606
Case #409: 1
Case #410: 1045
Case #411: 0
Case #412: 1
Case #413: 0
Case #414: 1
Case #415: 495
Case #416: 496
Case #417: 1
Case #418: 540
Case #419: 1
Case #420: 506
Case #421: 0
Case #422: 1
Case #423: 0
Case #424: 1
Case #425: 372
Case #426: 601
Case #427: 1
Case #428: 575
Case #429: 1
Case #430: 450
Case #431: 0
Case #432: 1
Case #433: 0
Case #434: 1
Case #435: 450
Case #436: 507
Case #437: 1
Case #438: 589
Case #439: 1
Case #440: 390
Case #441: 0
Case #442: 1
Case #443: 0
Case #444: 1
Case #445: 441
Case #446: 447
Case #447: 1
Case #448: 397
Case #449: 1
Case #450: 296
Case #451: 0
Case #452: 1
Case #453: 0
Case #454: 1
Case #455: 226
Case #456: 407
Case #457: 1
Case #458: 509
Case #459: 1
Case #460: 619
Case #461: 0
Case #462: 1
Case #463: 0
Case #464: 1
Case #465: 517
Case #466: 467
Case #467: 1
Case #468: 483
Case #469: 1
Case #470: 569
Case #471: 0
Case #472: 1
Case #473: 0
Case #474: 1
Case #475: 481
Case #476: 509
Case #477: 1
Case #478: 451
Case #479: 1
Case #480: 348
Case #481: 0
Case #482: 1
Case #483: 0
Case #484: 1
Case #485: 713
Case #486: 424
Case #487: 1
Case #488: 391
Case #489: 1
Case #490: 640
Case #491: 0
Case #492: 1
Case #493: 0
Case #494: 1
Case #495: 650
Case #496: 479
Case #497: 1
Case #498: 470
Case #499: 1
Case #500: 826

--
Gerry

Paul Rubin

unread,

Apr 24, 2012, 11:22:56 AM4/24/12

to

Gerry Jackson <ge...@jackson9000.fsnet.co.uk> writes:
> Yet another solution using simple character comparisons.

Very nice and compact.

> Results for the large file - do others agree?

I did a quick eyeball check of a dozen or so values and didn't spot any
discrepancies with my own output.

Marcel Hendrix

unread,

Apr 24, 2012, 4:28:03 PM4/24/12

to

Gerry Jackson <ge...@jackson9000.fsnet.co.uk> writes Re: Google CodeJam?
[..]

> Yet another solution using simple character comparisons. Not a robust
> solution as it expects perfectly formatted data, which is OK for the two
> test files. Just stick the word alien as the first line of a test file
> and include it.

[..]

Results for the large file - do others agree?

[..]

The results match mine, modulo spaces, tabs, and eol characters.
Google has a checker that you could have used.

-marcel

Alex Wegel

unread,

Apr 24, 2012, 7:59:20 PM4/24/12

to

Hugh Aguilar <hughag...@yahoo.com> wrote:

> I understand how your program works. I hadn't thought of this
> technique at all. So I've learned something!
>
> Your program is really hard to read. Now that we are no longer speed-
> programming on a stop-watch, can you provide a cleaner version with
> some comments?

Yes i can :-)

But i did sth. else instead: Reducing the dict size by a factor of 4,
getting the source to less than 1K (this means: still no comments),
while roughly maintaining execution speed (still at 0.3sec for the large
set).

> It is possible to use your technique, and to also generate a function
> the way that I did.
>
> > I think it would be more worthwhile to use a more compact representation
> > for the alien-dictionary (e.g. by storing the chars themselves instead
> > of bitsets having exactly one bit set) and thereby getting rid of the 4*
> > stuff alltogether (at least concerning the dictionary).
>
> That would only be true if the patterns were mostly absolute chars.

I meant the dictionary of words, not the patterns (the 5000 words are
stored real wasteful in my first version, while there's only 1 pattern
stored in memory at any time).
The runtime penalty of the new form is a more or less a "lshift" for
each comparison (about 40.000.000 times), which isn't so very long on a
GHz-class cpu.

> BTW, isn't anybody going to criticize my program because it fails to
> work at all on the large input set?

It didn't work here (on gforth/powerpc) even for the small set, and i
didn't feel like debugging either of these packages (novice, list).

> This is due to the fact that I
> used my SEQ lists from LIST.4TH, and they are limited to 255 char
> strings, whereas most of the pattern strings in the large input file
> are upwards of 1K in size.

The patterns can't be larger than (26+2)*15 = 420 characters,
and actually the longest one has 382 chars.

> This is easy to fix --- all I have to do is
> write a new version of SEQ that allows big strings (I'll do this in
> the next novice-package upgrade). When I was speed-programming
> however, I just used my existing SEQ rather than write all of that low-
> level stuff from scratch. I didn't realize this was a problem until
> after the program was written and I noticed that it worked fine on the
> small input file but not the large input file (I hadn't even visually
> inspected the large file at that time, but had just vaguely assumed
> that it was the same as the small input file just with more patterns
> and more words).

It seems to me that the novice package was more an obstacle, rather than
a helpful tool (at least in this case).

Alex Wegel

unread,

Apr 24, 2012, 7:59:21 PM4/24/12

to

Gerry Jackson <ge...@jackson9000.fsnet.co.uk> wrote:

> Results for the large file - do others agree?

I ran it myself, and checked the output - it was correct (after deleting
the input which had got echoed into the output).

So - you won on program size, memory usage and probably on
clarity/simplicity/straightforwardness.

It took >10 sec utime to process the large example, though ;-)

Cheers,
Alex

Alex Wegel

unread,

Apr 24, 2012, 7:59:21 PM4/24/12

to

For those interested: I took some more time to make some changes to the
first version of my program.

Summary:

- Runs in pipe now, invoke like:

alien.fs <A-large-practice.in.txt >out.txt

or

gforth-fast alien.fs <A-large-practice.in.txt >out.txt

- Uses <80KB dictionary space now (was >300KB before), and even
de-allots the dict after running

- The source is smaller than 1000 bytes now, still using only plain
forth (ok, i admit that stdin isn't standard)

- Processes the large example in around 0.3sec, with apx. 0.27sec utime
on an 1.8GHz powerpc.

- Produces correct output (i.e. accepted by google codejam result check)

- Removed some of the 4*ish stuff, or changed to CELL..

- Switched from variables to values, mostly to make the code less
@-cluttered

- No comments (but that's no news)

Here it comes, enjoy:

#! /usr/local/bin/gforth-fast
create B 424 allot
: getl B dup 422 stdin read-line throw drop ;

0 value L
0 value D
0 value N

: read-head getl evaluate to N to D to L ;

: ord [char] a - ;
: word, bounds do i c@ ord c, loop ;
: read-dic align here D 0 do getl word, loop ;

create P 16 cells allot
-1 dup 1 rshift xor constant msb
: >msk msb swap rshift ;
: ccl) 0 >r
begin
char+ dup c@ ord
dup [ char z ord 1+ ] literal u<
while
>msk r> or >r
repeat
drop r> ;
: read-pat
P swap 0 do
>r dup c@ dup [char] ( = if
drop ccl)
else
ord >msk
then
invert r@ !
char+ r> cell+
loop 2drop ;

: check1
cells P + P do
i @ over c@ lshift
0< if unloop drop false exit then
char+
cell +loop
drop true ;
: matches 0 over D L * bounds do i L check1 1 and + L +loop ;
: .m ." Case #" 1 .r ." : " 1 .r cr ;
: chk-all N 0 do getl drop L read-pat matches i 1+ .m loop ;

' noop is bootmessage
read-head read-dic chk-all
here - allot
bye

Alex Wegel

unread,

Apr 24, 2012, 7:59:20 PM4/24/12

to

Paul Rubin <no.e...@nospam.invalid> wrote:

> I think the Forth spirit is "solve the problem you have". The 32-bit
> representation is just fine for the problem as stated.

After some more thinking, i tend to agree. :-)

Hugh Aguilar

unread,

Apr 25, 2012, 1:16:34 AM4/25/12

to

On Apr 24, 5:59 pm, awe...@arcor.de (Alex Wegel) wrote:

> Hugh Aguilar <hughaguila...@yahoo.com> wrote:
> > BTW, isn't anybody going to criticize my program because it fails to
> > work at all on the large input set?
>
> It didn't work here (on gforth/powerpc) even for the small set, and i
> didn't feel like debugging either of these packages (novice, list).

That is grossly unfair, to say that my novice.4th and list.4th
packages need debugging. They and the alien.4th program all work fine.

When I run alien.4th under SwiftForth I get the correct (according to
Google validation) result:

Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 1

Case #5: 6
Case #6: 7
Case #7: 1
Case #8: 8
Case #9: 1
Case #10: 8

When I run alien.4th under Gforth, Gforth aborts with this message:
gforth: ./main.c.1188 optimize_bb: Assertion `ninsts<128' failed

I have no idea what that means, and I don't really care --- it is not
my job to debug Gforth --- even when Gforth does work, it is too slow
to be useful in any way, so I have no motivation to coax Gforth into
working.

The last time that I had software that didn't work under Gforth, Anton
Ertl criticized me for not knowing that some words don't have an xt
value. WTF??? How was I supposed to know that? Gforth is full of weird
quirks that have to be worked around. This is just a bug in Gforth ---
but no doubt Anton Ertl will blame me for it, just as you have done.

You know, there is a reason why I'm writing Straight Forth --- it is
because Gforth is garbage.

I really believe that Anton Ertl purposely crippled Gforth so that it
wouldn't be competitive against SwiftForth. He is just a stooge of
Forth Inc.. Most likely this bug is part of that effort --- he
crippled Gforth in some way so that it would crash on programs that
SwiftForth handles okay, in order to make SwiftForth look good --- so
that people wouldn't notice how bad SwiftForth is in regard to its
complete lack of optimization.

> It seems to me that the novice package was more an obstacle, rather than
> a helpful tool (at least in this case).

Your program is incredibly ugly. Bernd Paysan calls this "plain Forth"
--- but that really speaks volumes in regard to how bad Forth code
quality has become since the release of ANS-Forth in 1994 --- my
novice package allows me to write beautiful Forth code quickly.

I don't think that it is fair to disparage my novice package because
some of the string stuff is limited to strings < 255 characters in
length --- this has been a common limitation in Forth for decades ---
alien.4th is the first program that I've written in which this
limitation was an issue (the A-large-practice.in file containing
pattern strings longer than 255 chars). Also, as I said, it is easy
enough for me to write a version of SEQ that allows for big strings
--- this is actually trivial --- this has nothing to do with the core
ideas of the novice package.

Hugh Aguilar

unread,

Apr 25, 2012, 2:00:51 AM4/25/12

to

On Apr 24, 2:30 am, "WJ" <w_a_x_...@yahoo.com> wrote:
> No matter how many other languages I dabble in, I'll probably
> keep using Ruby for many things. It often makes programming
> so easy that it's almost boring.

That sounds like a good thing to me. This alien.4th program was fun,
but it was also a lot more complicated and time-consuming than it
would have been in any other language. I'd rather have a language in
which programming is boring --- this is especially true in a workplace
--- it is too stressful to risk getting fired every day, because every
program assigned to me is a challenge that I may or may not succeed
at. Now I discover that my alien.4th program crashes Gforth --- I
wouldn't want to have to deal with a mess like that in a workplace
with my boss breathing down my neck. I would rather have a language
that never crashed and that allowed me to write programs in a matter
of minutes --- then my only worry would be that the boss wouldn't
catch me goofing off on the internet during work hours. ;-)

BTW, have you tried Python? What do you think of it as compared to
Ruby?

Rails is Ruby's killer app. There are some web frameworks in Python,
but they aren't nearly as popular as Rails. Python doesn't have any
killer app, but Python is more popular for general-purpose scripting
than Ruby is. Python is what Qbasic used to be --- the easy language
that everybody learns, but which isn't taken very seriously among
professionals. That is my impression anyway.

Gerry Jackson

unread,

Apr 25, 2012, 4:49:16 AM4/25/12

to

On 25/04/2012 00:59, Alex Wegel wrote:
> Gerry Jackson<ge...@jackson9000.fsnet.co.uk> wrote:
>
>> Results for the large file - do others agree?
>
> I ran it myself, and checked the output - it was correct (after deleting
> the input which had got echoed into the output).
>
> So - you won on program size, memory usage and probably on
> clarity/simplicity/straightforwardness.
>

Kind of you to say so but the control was a bit too complex for my liking.

> It took>10 sec utime to process the large example, though ;-)
>

Yes it was bound to be much slower as it does no pre-processing of the
patterns, your approach is much better for efficiency.

Out of interest I coded up a version using regular expressions. While it
took much less coding time, running it on the large data file took about
13 times longer on GForthcompared to my previous solution. That's the
price of generality.

--
Gerry

Gerry Jackson

unread,

Apr 25, 2012, 4:50:23 AM4/25/12

to

On 24/04/2012 21:28, Marcel Hendrix wrote:
> Gerry Jackson<ge...@jackson9000.fsnet.co.uk> writes Re: Google CodeJam?
> [..]
>> Yet another solution using simple character comparisons. Not a robust
>> solution as it expects perfectly formatted data, which is OK for the two
>> test files. Just stick the word alien as the first line of a test file
>> and include it.
> [..]
> Results for the large file - do others agree?
> [..]
>
> The results match mine, modulo spaces, tabs, and eol characters.

Thanks

> Google has a checker that you could have used.
>

Sorry, I'd overlooked that.

--
Gerry

Alex Wegel

unread,

Apr 25, 2012, 8:25:48 AM4/25/12

to

Alex Wegel <awe...@arcor.de> wrote:

> Another remark: In my even more rusty perl, it was much simpler to do
> quick & dirty (by following the suggestion to change the ()'s to []'s):

I wasn't happy with the clumsy perl code i wrote, so i tried again:

#!/usr/bin/perl
my ($l, $d, $n) = split " ",<>;
while ($d--) { push @d, scalar <>; }
$c=0;
while (<>) {
($p = $_) =~ tr/()/[]/;
$m=0;
map { $m += m($p) } @d;
print "Case ".++$c.": $m\n";
}

Alex Wegel

unread,

Apr 25, 2012, 8:25:47 AM4/25/12

to

Hugh Aguilar <hughag...@yahoo.com> wrote:

> On Apr 24, 5:59 pm, awe...@arcor.de (Alex Wegel) wrote:
> > Hugh Aguilar <hughaguila...@yahoo.com> wrote:
> > > BTW, isn't anybody going to criticize my program because it fails to
> > > work at all on the large input set?
> >
> > It didn't work here (on gforth/powerpc) even for the small set, and i
> > didn't feel like debugging either of these packages (novice, list).
>
> That is grossly unfair, to say that my novice.4th and list.4th
> packages need debugging. They and the alien.4th program all work fine.

Well - i now took the time (the lack of which was the main reason i
didn't investigate the failure earlier) to try again (using a different
invocation), and guess what: The small example worked.

> You know, there is a reason why I'm writing Straight Forth --- it is
> because Gforth is garbage.

So you are the god of fairness - LOL!

> I really believe that Anton Ertl purposely crippled Gforth so that it
> wouldn't be competitive against SwiftForth. He is just a stooge of
> Forth Inc.. Most likely this bug is part of that effort --- he
> crippled Gforth in some way so that it would crash on programs that
> SwiftForth handles okay, in order to make SwiftForth look good --- so
> that people wouldn't notice how bad SwiftForth is in regard to its
> complete lack of optimization.

You're obviously paranoid (which is not to say that they're not after
you).

But i know that you're on an agenda (as you repeatedly stated here): To
bring down Forth Inc.

Seriously: The offensive (and untrue) off-topic stuff that you write
here every other day is going on my nerves big time - you're close to
ending up in the filter once again.

> > It seems to me that the novice package was more an obstacle, rather than
> > a helpful tool (at least in this case).
>
> Your program is incredibly ugly.

Yes, that's what i wrote about it.

> Bernd Paysan calls this "plain Forth"

Because it doesn't have to include anything on top of forth.
Is that so hard to get?

Your code first loads >64KB (!!) of stuff before doing anything.

> --- but that really speaks volumes in regard to how bad Forth code
> quality has become since the release of ANS-Forth in 1994 --- my
> novice package allows me to write beautiful Forth code quickly.

We just saw that. :-)
I still say: It brought you tools that you didn't realls need for the
task, which lead you to writing a program whose source is 5 times as big
as need to be (not including the novice packages source), just in order
to solve problems (eg. complaining about invalid input data and looking
overly neat) that were not part of the task.
Then it even turned out that your SEQ basically was mis-applied in it's
current form, requiring a package update (which might be easy to do for
you and me, but surely not for a novice who would be stuck right there,
waiting for you to come up with the update - and then you pretend that
you're so much better than some company X).

> I don't think that it is fair to disparage my novice package because
> some of the string stuff is limited to strings < 255 characters in
> length --- this has been a common limitation in Forth for decades ---

The fact that you have to rewrite parts of SEQ just means that it's not
very flexible. (Which matters because it obviously pretends to be sth.
to build upon: It even fooled you, the author, and kept you from solving
the large example.)

> alien.4th is the first program that I've written in which this
> limitation was an issue (the A-large-practice.in file containing
> pattern strings longer than 255 chars).

You see: It happens just when you would have to deliver in 8 minutes.

> Also, as I said, it is easy
> enough for me to write a version of SEQ that allows for big strings
> --- this is actually trivial ---

...So, have you fixed it now?

> this has nothing to do with the core
> ideas of the novice package.

Which are?

Alex

Anton Ertl

unread,

Apr 25, 2012, 9:39:38 AM4/25/12

to

Hugh Aguilar <hughag...@yahoo.com> writes:
>When I run alien.4th under Gforth, Gforth aborts with this message:
>gforth: ./main.c.1188 optimize_bb: Assertion `ninsts<128' failed

That's a limitation in Gforth 0.6.*. Gforth 0.7.* does not have that
limitation. BTW, this means that your program has a sequence of 128
primitives without intervening control flow (no call to a colon
definition or somesuch). Not many people complained about that
limitation while 0.6.* was current, but I came across (IIRC) one
program where it caused a problem, so I removed the limitation.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2011: http://www.euroforth.org/ef11/

Hugh Aguilar

unread,

Apr 25, 2012, 1:05:28 PM4/25/12

to

On Apr 25, 7:39 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

What possible purpose could there be in limiting the length of the
functions? That seems like an arbitrary and contrived limitation.

Your limitation was in the number of primitives without any
intervening control-flow (calls or jumps). Well, this is pretty common
with generated functions. The idea is to unwind loops. The loop is
executed at compile-time rather than run-time and it generates a long
function without any internal jumps. This makes my program faster, as
jumps are typically the slowest aspect of any program. Also, it makes
my program simpler because I have fewer nested loops, which a lot of
programmers (me anyway) find confusing. I have a two-stage process:
first I generate the pattern-check functions, then I execute all of
them on all of the target strings. I'm a big fan of the idea of
simplifying a program by making it multi-stage in order to avoid a lot
of nested complexity --- this is also why we have pipes in Unix (plus
they save memory, which was an issue on the 64K PDP-11).

If your students weren't complaining about this limitation, it is
because they are bad Forthers --- but I can't be blamed for that!

Well, you got rid of this limitation in the latest version of Gforth
(I haven't actually tested this, but I'll believe you) --- so that is
a big step forward --- good job!

Hugh Aguilar

unread,

Apr 25, 2012, 2:58:06 PM4/25/12

to

On Apr 25, 6:25 am, awe...@arcor.de (Alex Wegel) wrote:
> But i know that you're on an agenda (as you repeatedly stated here): To
> bring down Forth Inc.

The world is full of sleazy salespeople selling garbage to
unsuspecting customers. I don't interfere with them --- caveat emptor.
Elizabeth Rather has spent her entire career telling everybody about
how she knew Chuck Moore back in the 1970s. That is just name-
dropping. Now she tells us that she personally knows many members of
Barack Obama's family. That is just more name-dropping. If people are
dumb enough to believe this baloney, then they deserve to get ripped
off.

Elizabeth Rather attacked my code. When Passaniti said that it "sucks"
and was "pulled out of my ass," she described this as: "perceptive and
technically sound." She herself said that my novice package was
"written by a novice." She stepped over the line! She can't expect me
to not interfere with her sleazy SwiftForth scam after saying such
things about my code. Of course I'm going to point out the gross
problems with SwiftForth --- she shouldn't have attacked my code ---
people who live in glass houses shouldn't throw stones.

When I'm driving my cab, people often tell me about how they hob-nob
with celebrities, and then when we get to the destination they say
that they have to go inside to get the money (presumably from their
buddy Barack) --- I don't let them get away with this --- I have put
people in jail for this --- and I have given them a beating first,
before handing them over to the police. I can spot phonies a mile away
--- Elizabeth Rather is a complete phony, attaching herself like a
leach to Chuck Moore --- I am amazed that the comp.lang.forth crowd
fails to see this (or, pretends not to see it).

SwiftForth really is a glass house. I paid almost $500 for it, and it
has serious bugs. For example, (LOCAL) doesn't work and will crash the
system. Also, it does essentially no optimization at all --- it just
pastes the code from sub-functions into the calling function --- all
that this saves is the CALL and RET, which isn't much, and it bloats
out the code so that it no longer fits in the cache. In SwiftForth,
ALIGN and ALIGNED are no-ops, so we don't even get aligned data, which
is a very basic optimization that even the most novice assembly-
language programmers know about. SwiftForth gives the Forth community
a bad name --- this is especially true because it comes from Forth
Inc. --- casual observers assume that Forth Inc. defines Forth,
because Forth Inc. owns the name "Forth" (and because Elizabeth Rather
learned Forth directly from Chuck Moore in the 1970s, yada yada).

Message has been deleted

Alex Wegel

unread,

Apr 25, 2012, 6:23:17 PM4/25/12

to

Hugh Aguilar <hughag...@yahoo.com> wrote:

> On Apr 25, 6:25 am, awe...@arcor.de (Alex Wegel) wrote:

> > But i know that you're on an agenda (as you repeatedly stated here): To
> > bring down Forth Inc.
>

soso

Alex Wegel

unread,

Apr 25, 2012, 6:23:17 PM4/25/12

to

Hugh Aguilar <hughag...@yahoo.com> wrote:

> On Apr 25, 6:25 am, awe...@arcor.de (Alex Wegel) wrote:

> > But i know that you're on an agenda (as you repeatedly stated here): To
> > bring down Forth Inc.

Maybe you could get a job as a software insultant?

Alex Wegel

unread,

Apr 25, 2012, 6:23:17 PM4/25/12

to

Hugh Aguilar <hughag...@yahoo.com> wrote:

> On Apr 25, 6:25 am, awe...@arcor.de (Alex Wegel) wrote:
> > But i know that you're on an agenda (as you repeatedly stated here): To
> > bring down Forth Inc.
>

aha

WJ

unread,

Apr 26, 2012, 5:19:03 AM4/26/12

to

Hugh Aguilar wrote:

> On Apr 24, 2:30 am, "WJ" <w_a_x_...@yahoo.com> wrote:
> > No matter how many other languages I dabble in, I'll probably
> > keep using Ruby for many things. It often makes programming
> > so easy that it's almost boring.
>
> That sounds like a good thing to me. This alien.4th program was fun,
> but it was also a lot more complicated and time-consuming than it
> would have been in any other language. I'd rather have a language in
> which programming is boring --- this is especially true in a workplace
> --- it is too stressful to risk getting fired every day, because every
> program assigned to me is a challenge that I may or may not succeed
> at. Now I discover that my alien.4th program crashes Gforth --- I
> wouldn't want to have to deal with a mess like that in a workplace
> with my boss breathing down my neck. I would rather have a language
> that never crashed and that allowed me to write programs in a matter
> of minutes --- then my only worry would be that the boss wouldn't
> catch me goofing off on the internet during work hours. ;-)
>
> BTW, have you tried Python? What do you think of it as compared to
> Ruby?

I have barely looked at Python. The significance of indentation
doesn't appeal to me. Also repellent is the fact that the language
tries to allow you to do a thing only one way: Guido's way.
His tyrannical attitude inspired this revision of Stephen Crane's
poem "Think as I Think":

Code as I Code

"Code as I code," said Guido,
"Or you are abominably wicked;
"You are a toad."

And after I had thought of it,
I said: "I will, then, be a toad."

(Off topic)
Another Crane poem that has never been more apropos than it is
today:

The wayfarer
Perceiving the pathway to truth
Was struck with astonishment.
It was thickly grown with weeds.
``Ha,'' he said,
``I see that none has passed here
In a long time.''
Later he saw that each weed
Was a singular knife.
``Well,'' he mumbled at last,
``Doubtless there are other roads.''

Hugh Aguilar

unread,

Apr 26, 2012, 6:17:25 AM4/26/12

to

I agree that being Pythonic isn't a much of a goal to aspire to. On
the other hand though, Python has become so common that it is worth
being familiar with.

What do you think of Lua?

I never heard of Stephen Crane and had to google him. His poems were
pretty good from what I read. I'm a fan of Alexander Pope myself. This
is one of my favorites, and very apropos for comp.lang.forth:

The coxcomb bird, so talkative and grave,
That from his cage, cries Cuckold, Whore and Knave,
Though many a passenger he rightly call,
You hold him no philosopher at all.

Albert van der Horst

unread,

Apr 26, 2012, 8:36:28 AM4/26/12

to

In article <1kj4t7c.1j1gctm8krqygN%awe...@arcor.de>,

Rolling on the floor with laughter!

Groetjes Albert

--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Anton Ertl

unread,

Apr 26, 2012, 12:30:26 PM4/26/12

to

Hugh Aguilar <hughag...@yahoo.com> writes:
>On Apr 25, 7:39=A0am, an...@mips.complang.tuwien.ac.at (Anton Ertl)

>wrote:
>> Hugh Aguilar <hughaguila...@yahoo.com> writes:
>> >When I run alien.4th under Gforth, Gforth aborts with this message:
>> >gforth: ./main.c.1188 optimize_bb: Assertion `ninsts<128' failed
>>

>> That's a limitation in Gforth 0.6.*. =A0Gforth 0.7.* does not have that
>> limitation. =A0BTW, this means that your program has a sequence of 128

>> primitives without intervening control flow (no call to a colon
>> definition or somesuch).

...

>What possible purpose could there be in limiting the length of the
>functions? That seems like an arbitrary and contrived limitation.

This limitation makes the storage management of optimal selection of
static superinstructions and static stack caching easier. IIRC it's
still there, but now it just breaks too-long basic blocks into parts
that are 128 primitives in length.

van...@vsta.org

unread,

Apr 26, 2012, 12:45:34 PM4/26/12

to

WJ <w_a_...@yahoo.com> wrote:
> I have barely looked at Python. The significance of indentation
> doesn't appeal to me. Also repellent is the fact that the language
> tries to allow you to do a thing only one way: Guido's way.

Having coded heavily in C, Forth, and Python, I can say that the indentation
approach has not turned out to be a big deal. And although Guido is, in
fact, "Benevolent Dictator for Life", his guidance has evolved Python into a
very effective language. I suggest you move beyond "barely looking" before
reaching a conclusion on the language.

One of my most recent "aha" moments was this code for doing breadth traversal
of collections:

def breadthfirst(tree, children=iter):
yield tree
last = tree
for node in breadthfirst(tree, children):
for child in children(node):
yield child
last = child
if last == node:
return

It's used as:

for member in breadthfirst(aCollection):
...

And "member" gets first the things at the top, then things one level down,
and so forth. It works generically across any collection.

Although the language does not have, as such, lazy evaluation semantics, this
function uses generators to get that effect. I'm not sure I've seen a more
elegant breadth-first traversal in any language, and I don't know that
there's any other language which could use this implementation technique.

I really hope that, some day, a next generation Forth will marry its elegance
to semantics as powerful as these.

--
Andy Valencia
Home page: http://www.vsta.org/andy/
To contact me: http://www.vsta.org/contact/andy.html

Paul Rubin

unread,

Apr 26, 2012, 1:47:03 PM4/26/12

to

van...@vsta.org writes:
> Having coded heavily in C, Forth, and Python, I can say that the indentation
> approach has not turned out to be a big deal.

I agree with this. "One and only one way" just means Python tries to
avoid creating multiple syntaxes or libraries to do the same thing. The
general feeling of programming Python is sort of like Lisp. There are
still lots of styles you can use. I haven't used Ruby in part because
from what I can tell, it's similar enough to Python that I wouldn't
really gain anything new from it.

> def breadthfirst(tree, children=iter):
> yield tree

> last = tree ...

> Although the language does not have, as such, lazy evaluation semantics, this
> function uses generators to get that effect. I'm not sure I've seen a more
> elegant breadth-first traversal in any language, and I don't know that
> there's any other language which could use this implementation technique.

I actually find that code pretty confusing, and I can't tell quite what
it's supposed to do. I suspect it would be simpler in Haskell. You
could also look at Lua coroutines.

> I really hope that, some day, a next generation Forth will marry its
> elegance to semantics as powerful as these.

Perhaps: http://factorcode.org/
But, I think Forth itself is intended to be quite low-level.

Rugxulo

unread,

Apr 26, 2012, 7:00:19 PM4/26/12

to

Hi,

On Apr 23, 2:27 pm, "WJ" <w_a_x_...@yahoo.com> wrote:
> Hugh Aguilar wrote:

> > On Apr 21, 10:03 pm, "WJ" <w_a_x_...@yahoo.com> wrote:
> > > Ruby:
>
> > > _, numwords, numpatterns = gets().split.map{|s| s.to_i}
>
> > > words = (1 .. numwords).map{ gets().strip }
> > > patterns = (1 .. numpatterns).map{ gets().strip }
>
> > > patterns.each_with_index{|pat,i|
> > > regex = Regexp.new( pat.gsub("(", "[").gsub(")", "]") )
> > > printf "Case #%d: %d\n", i, words.grep(...
>
> > I don't know anything about Ruby, but I can certainly admire the
> > conciseness of your program. How does it compare in speed to mine?
>
> It takes 6.5 seconds for the large input file on my laptop.
>
> The program is run this way:
>
> ruby code-jam.rb Code-jam.in

This may or may not be interesting to you (Ruby 1.8.7 pl352), just
comparing to itself on this one x86 machine:

= Lucid Puppy Linux: time ruby code-jam.rb Code-jam.in > linux.out
(3.7 s.)
= DJGPP build atop DOSEMU: redir -t ruby code-jam.rb Code-jam.in >
djgpp-a.out (4.2 s.)
= DJGPP build atop DOSEMU: redir -t miniruby code-jam.rb Code-jam.in >
djgpp-b.out (2.8 s.)

So timings for one version of Ruby (or any language) depend on various
things, even on the same exact machine, so take such measurements with
a big grain of salt.

Paul Rubin

unread,

Apr 27, 2012, 5:26:08 AM4/27/12

to

van...@vsta.org writes:
> def breadthfirst(tree, children=iter):
> yield tree
> last = tree
> for node in breadthfirst(tree, children):
> for child in children(node):
> yield child
> last = child
> if last == node:
> return

Maybe that could be written something like (untested):

from itertools import chain
def breadthfirst(tree, children=iter):
def t(fifo):
n1 = list(islice(fifo,1))
return chain(n1,fifo,children(n1) if n1 else [])
return t([tree])

That's the more traditional way to do it. It's prettier in Haskell, IMO:

data Tree a = Tree {value :: a, children :: [Tree a] }

breadthFirst t = bf1 [t] where
bf1 [] = []
bf1 (x:xs) = value x : bf1 (xs ++ children x)

Alex Wegel

unread,

Apr 27, 2012, 8:15:49 AM4/27/12

to

For those still there - here's a final(?) version of my approach to the
alien program.
Now it's down to 150 words of source - i think that's how i leave it.

#! /usr/local/bin/gforth-fast
\ : key pad dup 1 stdin read-file throw drop c@ ; ( fast vsn. of KEY)
: <n> 0 begin key [char] 0 - dup 10 u< while swap 10 * + repeat drop ;
0 value L
: <dims> <n> to L <n> <n> ;

: ord [char] a - ;
: az? ord dup 26 u< ;
: az* begin key az? while c, repeat drop ;
: <dic> 0 do az* loop align ;

: >msk [ -1 dup 1 rshift xor ] literal swap rshift invert ;
: az*) -1 begin key az? while >msk and repeat drop ;
: -pat -1 cells L * allot ;
: <pat>
begin key
az? if
>msk
else
[char] ( ord = if az*) else exit then
then ,
again ;

: m?
here dup rot cells - do

i @ over c@ lshift
0< if unloop drop false exit then
char+
cell +loop
drop true ;

: #m 0 -rot L * bounds do i L m? - L +loop ;

: .m ." Case #" 1 .r ." : " 1 .r cr ;

: chk 0 do 2dup <pat> #m -pat i 1+ .m loop 2drop ;

here <dims> over <dic> chk bye

Bernd Paysan

unread,

Apr 27, 2012, 9:21:22 AM4/27/12

to

Hugh Aguilar wrote:

> Your program is incredibly ugly. Bernd Paysan calls this "plain Forth"

Do you know what a plain bread is? Can you even speak plain English?
Alex Wegel immediately understood what it means: Without anything on top
of it. But then, Alex is just as German as I am, so we both speak the
same dialect of English: The one taught in German schools. He's pretty
good at it, coining terms like "Software insultant". :-)

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/

Alex Wegel

unread,

Apr 27, 2012, 12:28:37 PM4/27/12

to

Ian Osgood <ia...@quirkster.com> wrote:

> Four years, and never a Forth entry. Anyone want to put Forth on the
> scoreboard this year? Registration is open and the qualification round
> is on Friday.
>
> http://code.google.com/codejam/

I tried another one of the examples for practicing - im still way too
slow. Fortunately, i wouldn't feel like going on a trip to NY anyway ;-)

The example i tried was "all your base..", the logical sequel to the
alien example :-)

http://code.google.com/codejam/contest/189252/dashboard#s=p0

It took me about 1:30 hrs to come up with a fully correct solution
(guess what - it's ugly).

In case someone wants to try too, i don't post my solution or further
discussions (i.e. any spoilers) yet.

Cheers,
Alex

van...@vsta.org

unread,

Apr 27, 2012, 12:37:42 PM4/27/12

to

Paul Rubin <no.e...@nospam.invalid> wrote:
> Maybe that could be written something like (untested):
> from itertools import chain
> def breadthfirst(tree, children=iter):
> def t(fifo):
> n1 = list(islice(fifo,1))
> return chain(n1,fifo,children(n1) if n1 else [])
> return t([tree])

No, even fixing errors (islice is also from itertools) it's not right.
I don't think itertools.chain is giving you the semantics you want.

> That's the more traditional way to do it. It's prettier in Haskell, IMO:
>
> data Tree a = Tree {value :: a, children :: [Tree a] }
>
> breadthFirst t = bf1 [t] where
> bf1 [] = []
> bf1 (x:xs) = value x : bf1 (xs ++ children x)

My (uninformed--I don't know Haskell) reading sees this as a depth-first
traversal. How does it achieve breadth-first semantics?

Paul Rubin

unread,

Apr 27, 2012, 1:17:50 PM4/27/12

to

van...@vsta.org writes:
>> data Tree a = Tree {value :: a, children :: [Tree a] }
>>
>> breadthFirst t = bf1 [t] where
>> bf1 [] = []
>> bf1 (x:xs) = value x : bf1 (xs ++ children x)
>
> My (uninformed--I don't know Haskell) reading sees this as a depth-first
> traversal. How does it achieve breadth-first semantics?

The data declaration says that a tree has a value and a list of
subtrees. I'm not sure if that's what your original Python code did,
but I'll assume it is.

The usual way to do breadth-first traversal of a tree is:

1. Put the tree onto a FIFO queue (so the queue starts with one element)
2. Iterate over the queue like this:
while (queue is not empty):
t <- pop first tree from queue
yield (value associated with root of t)
append all children of root of t to the queue

Maybe I got too clever with itertools.chain in my python example, and
should have used collections.deque instead. Anyway, the code

breadthFirst t = bf1 [t] where
bf1 [] = []
bf1 (x:xs) = value x : bf1 (xs ++ children x)

says "breadthFirst t" calls a helper function bf1 on a list [t], i.e. a
one-element list containing t, like in Python. bf1 is defined
internally to breadthFirst with the "where" clause, but it could have
been written as a separate top-level function.

bf1 returns immediately if given an empty list. If given a nonempty
list, it splits it into the first element (x), and the rest (xs). It
then returns a new list containing the value from x, followed by what
comes from recursively calling bf1 on the result of appending x's
children to the original fifo. ++ concatenates two lists together.
This is all done with lazy evaluation (Haskell uses lazy evaluation for
everything), and the recursive call to bf1 is in tail position so you
shouldn't get a lot of list bloat or stack buildup if you're going to
just iterate through the traversal and do something with each element.
Also, ++ is lazy (like itertools.chain) so it doesn't go and traverse
the two lists and allocate storage for a new one each time you call it.

For depth-first, you just process in LIFO instead of FIFO order:

depthFirst t = df1 [t] where
df1 [] = []
df1 (x:xs) = value x : df1 (children x ++ xs)

Example (*Main> is ghci's interactive prompt):

*Main> let x = Tree 1 [Tree 2 [Tree 3 []], Tree 4 []]
*Main> breadthFirst x
[1,2,4,3]
*Main> depthFirst x
[1,2,3,4]

It is of course possible that I missed something and am doing stuff in
the wrong order, or that the code has some efficiency problem that could
be avoided some simple way. I'm nowhere near Haskell guru level.

van...@vsta.org

unread,

Apr 27, 2012, 2:15:01 PM4/27/12

to

Paul Rubin <no.e...@nospam.invalid> wrote:
> ...

> It is of course possible that I missed something and am doing stuff in
> the wrong order, or that the code has some efficiency problem that could
> be avoided some simple way. I'm nowhere near Haskell guru level.

Thank you!

hughag...@yahoo.com

unread,

May 1, 2012, 2:54:43 AM5/1/12

to

On Friday, April 27, 2012 6:15:49 AM UTC-6, Alex Wegel wrote:
> For those still there - here's a final(?) version of my approach to the
> alien program.
> Now it's down to 150 words of source - i think that's how i leave it.
>
> #! /usr/local/bin/gforth-fast
> \ : key pad dup 1 stdin read-file throw drop c@ ; ( fast vsn. of KEY)
> : <n> 0 begin key [char] 0 - dup 10 u< while swap 10 * + repeat drop ;
> 0 value L
> : <dims> <n> to L <n> <n> ;
>
> : ord [char] a - ;
> : az? ord dup 26 u< ;
> : az* begin key az? while c, repeat drop ;
> : <dic> 0 do az* loop align ;
>
> : >msk [ -1 dup 1 rshift xor ] literal swap rshift invert ;
> : az*) -1 begin key az? while >msk and repeat drop ;
> : -pat -1 cells L * allot ;
> :
>

> begin key
> az? if
> >msk
> else
> [char] ( ord = if az*) else exit then
> then ,
> again ;
>
> : m?
> here dup rot cells - do
> i @ over c@ lshift
> 0< if unloop drop false exit then
> char+
> cell +loop
> drop true ;
>
> : #m 0 -rot L * bounds do i L m? - L +loop ;
> : .m ." Case #" 1 .r ." : " 1 .r cr ;
> : chk 0 do 2dup

> #m -pat i 1+ .m loop 2drop ;
>
> here <dims> over <dic> chk bye

Here is a new version of my program, including these upgrades:
1.) I implemented a SSEQ data type that is similar to my SEQ data type except that it allows for big (>255 char) strings.
2.) I replaced SEQ with SSEQ throughout the ALIEN program as needed so that the program will now handle the large sample file.
3.) I switched to bit-masks similar to Alex's program.
4.) I fixed a stack-picture comment that was incorrect.

This is not plain Forth --- I've never written a plain Forth program in my life. I find plain Forth to be largely unreadable (although I can *decipher* it given enough time). If I were offered a job maintaining plain Forth software, I would refuse and just stick with cab driving --- unless it paid a *lot* of money, but that is unrealistic because the employer would just have the program rewritten from scratch in a readable language for less money.

My program seems rather slow on the large sample file. I think this is due to my heavy use of the heap, which is quite slow on all Forth systems (I don't know why). I also compile functions at run-time, which may be slow depending upon which compiler is being used and how much optimization is being done.

This is SSEQ which will go into the novice package:

list
w field .cnt
w field .chars \ pointer to heap
constant sseq

: sseqable ( adr cnt -- new-adr cnt ) \ put string in heap for SSEQ so <KILL-SSEQ> will work on the SSEQ
>r
r@ alloc \ -- adr new-adr \r: -- cnt
tuck r@ cmove> \ -- new-adr
r> ;

: init-sseq ( adr cnt node -- node )
init-list >r
sseqable r@ .cnt ! r@ .chars !
r> ;

: new-sseq ( adr cnt -- node )
sseq alloc
init-sseq ;

: <kill-sseq> ( node -- )
dup .chars @ dealloc
dealloc ;

: kill-sseq ( head -- )
each[ <kill-sseq> ]each ;

macro: <sseq> ( node -- adr cnt )
dup .chars @ swap .cnt @ ;

: <show-sseq> ( node -- )
<sseq> type cr ;

: show-sseq ( head -- )
cr
['] <show-sseq> each ;

1000 value sseq-size \ making this too big could be a problem if such a large block can't be found in the heap

: <read-sseq> ( adr cnt -- head )
r/o <open-file> >r \ r: -- file-id
nil begin \ -- head
sseq-size 2+ alloc \ -- head chars
dup sseq-size r@ read-line abort" *** READ-SSEQ failed to read line ***"
while \ -- head chars cnt
dup >r realloc r> \ -- head new-chars cnt
new-sseq \ -- head node
link repeat 2drop
r> <close-file> ;

: read-sseq ( name -- head )
count <read-sseq> ;

: <write-sseq> ( head adr cnt -- )
w/o <create-file> swap \ -- file-id head
each[ <sseq> rover write-line abort" *** <WRITE-SSEQ> failed to write ***" ]each
<close-file> ;

: write-sseq ( head name -- )
count <write-sseq> ;

This is the new ALIEN program:

\ This is the solution to the "alien language" example problem from Google CodeJam.
\ http://code.google.com/codejam/contest/90101/dashboard#s=p0

\ Written by Hugh Aguilar --- copyright (c) 2012 --- BSD license

\ requires novice.4th and list.4th

marker alien.4th

\ ******
\ ****** input and output
\ ******

variable #letters
variable #words
variable #patterns

: sseqs ( name-str -- word-sseq pattern-sseq )
read-sseq
dup <sseq> evaluate
#patterns ! #words ! #letters !
.fore @ \ -- word-seq
dup #words @ nth \ -- word-seq pattern-seq
delink
dup #patterns @ nth \ -- word-seq pattern-seq extraneous-seq
delink kill-sseq \ -- word-seq pattern-seq
\ error checking
over length #words @ <> abort" *** bad word-sseq ***"
dup length #patterns @ <> abort" *** bad pattern-sseq ***"
over each[ .cnt @ #letters @ <> abort" *** string in word-sseq is wrong length ***" ]each ;

: dump-result ( pattern-list name-str -- )
<cstr +cstr c" .result" +cstr cstr> write-sseq ;

\ ******
\ ****** convert pattern string into any-seq list
\ ****** each node in the list represents one char in the target string
\ ****** the .LINE string of each node contains all of the character that would match
\ ******

: check-any-str { pattern-adr pattern-cnt head -- }
head length #letters @ <> if
cr ." *** any-seq is the wrong length ***"
cr pattern-adr pattern-cnt type
cr true abort" *** aborting ***
then ;

: <make-any-seq> ( head str -- head )
dup c@ if new-seq link \ str has characters in it
else drop then ;

: make-any-seq { pattern-adr pattern-cnt | group? -- any-seq }
nil <cstr
pattern-adr pattern-cnt bounds ?do \ -- head
group? if \ if inside of ( ) group
I c@ [char] ) = if false to group? cstr> <make-any-seq> <cstr
else I c@ char+cstr then
else \ else outside of ( ) group
I c@ [char] ( = if true to group?
else I c@ char+cstr cstr> <make-any-seq> <cstr then
then
loop
cstr> <make-any-seq>
pattern-adr pattern-cnt rover check-any-str ;

\ ******
\ ****** generate pattern matcher
\ ******

char & comment \ this is an example of what will get generated

:noname ( mask-adr -- match? ) \ this is for: a(bc)

dup @ 1 and 0= if drop false exit then
w + \ -- new-mask-adr

dup @ 6 and 0= if drop false exit then
w + \ -- new-mask-adr

drop true ;

&

: char>mask ( char -- mask ) \ this assumes that there are < 32 chars, and they start with 'a'
[char] a - \ -- ordinal
1 swap lshift ;

: <generate-pattern> ( node -- )
>r
s" dup @ " evaluate
0 r> .line @ count bounds do \ -- mask
I c@ char>mask or loop lit, \ --
s" and 0= if drop false exit then w + " evaluate ;

: generate-pattern ( any-seq -- xt )
>r
s" :noname ( char-adr -- match? ) " evaluate
r> ['] <generate-pattern> each
s" drop true ; " evaluate ;

\ ******
\ ****** upgrade word-seq
\ ******

list
w field .mask \ pointer to array of words
constant mask

: <kill-mask> ( node -- )
dup .mask @ dealloc
dealloc ;

: kill-mask ( head -- )
each[ <kill-mask> ]each ;

: init-mask ( word-adr word-cnt node -- node )
init-list >r
dup w * alloc \ -- word-adr word-cnt mask-array
dup r@ .mask !
-rot bounds do \ -- mask-element
I c@ char>mask over !
w + loop drop \ --
r> ;

: new-mask ( word-adr word-cnt -- )
mask alloc
init-mask ;

: upgrade-word ( word-sseq -- mask-list ) \ create a MASK list given a SSEQ list
nil swap \ -- mask-list word-sseq
each[ <sseq> new-mask link ]each ;

\ ******
\ ****** upgrade pattern-seq
\ ******

sseq \ starts out as pattern-str, later gets changed to result-str
w field .any \ pointer to any-seq
w field .xt \ xt of generated pattern matcher
w field .matches \ count of matches for this pattern
constant pattern

: <kill-pattern> ( node -- )
dup .any @ kill-seq
<kill-sseq> ;

: kill-pattern ( head -- )
each[ <kill-pattern> ]each ;

: init-pattern ( pattern-adr pattern-cnt node -- node )
init-sseq >r
r@ .chars @ r@ .cnt @ make-any-seq r@ .any !
r@ .any @ generate-pattern r@ .xt !
0 r@ .matches !
r> ;

: new-pattern ( adr cnt -- node )
pattern alloc
init-pattern ;

: upgrade-pattern ( pattern-sseq -- pattern-list ) \ create a PATTERN list given a SSEQ list
nil swap \ -- pattern-list pattern-sseq
each[ <sseq> new-pattern link ]each ;

\ ******
\ ****** pattern-match
\ ******

: <check-word> ( pattern-list-node mask-node -- pattern-list-node )
.mask @ \ -- pattern-list-node mask-adr
over .xt @ execute \ -- pattern-list-node match?
if 1 over .matches +! then ; \ -- pattern-list-node

: <check-pattern> ( mask-list pattern-list-node -- mask-list )
over ['] <check-word> each \ -- word-sseq pattern-list-node
drop ;

: check-pattern ( mask-list pattern-list -- )
['] <check-pattern> each \ -- mask-list
drop ;

\ ******
\ ****** make result strings
\ ******

: u>str ( u -- adr cnt )
u>d <# #s #> ;

: <fill-result> ( pattern# pattern-list-node -- new-pattern# )
dup .chars @ dealloc \ get rid of pattern-str
<cstr
c" Case #" +cstr
over u>str <+cstr>
c" : " +cstr
dup .matches @ u>str <+cstr>
cstr> count sseqable rover .cnt ! swap .chars ! \ -- pattern# \ set result-str
1+ ; \ -- new-pattern#

: fill-result ( pattern-list -- )
1 swap ['] <fill-result> each \ -- pattern#
drop ;

\ ******
\ ****** main program
\ ******

: alien ( name-str -- )
dup sseqs { name-str word-sseq pattern-sseq | mask-list pattern-list -- }
s" marker upgrade-pattern-stuff " evaluate \ so we can get rid of the UPGRADE-PATTERN words
word-sseq upgrade-word to mask-list
pattern-sseq upgrade-pattern to pattern-list
mask-list pattern-list check-pattern
pattern-list fill-result
pattern-list name-str dump-result
\ clean up
word-sseq kill-sseq
pattern-sseq kill-sseq
mask-list kill-mask
pattern-list kill-pattern
s" upgrade-pattern-stuff " evaluate ;

Bruno Gauthier

unread,

May 1, 2012, 11:59:41 AM5/1/12

to

hi Alex,
sticked with a bug severals days, here my solution.
curious to see your.
Bruno

\ AllYourBase bruno gauthier
decimal
create in$ maxstring allot create inner1$ maxstring allot
create inner2$ maxstring allot create out$ maxstring allot
create tvals 64 allot create Single-Char-I/O-Buffer 0 C, align
10 constant NL
variable ifh variable ofh variable #ib
0 value T 0 value highestindex 0 value basetested

: Fill-tvals
'1' tvals c! '0' tvals 1+ c!
10 2 do '0' i + tvals i + c! loop
26 0 do 'A' i + tvals i + 10 + c! loop
;
fill-tvals

: CHECKED ( f -- ) ABORT" File Access Error. " ;

: read-char ( file -- char|f )

Single-Char-I/O-Buffer 1 ROT READ-FILE CHECKED

IF Single-Char-I/O-Buffer C@ ELSE -1 THEN
;

: next-word->in$ ( -- )
begin
ifh @ read-char
NL <>
while
Single-Char-I/O-Buffer c@ dup dup
'0' '9' between swap 'a' 'z' between or
if in$ C+place
else drop
then
repeat
;

: parse-headline ( -- ) next-word->in$ in$ count evaluate to T ;
: open-files ( -- )
bl word count r/o open-file throw ifh !
bl word count 2dup file-status nip 0=
if 2dup delete-file drop
then r/w create-file throw ofh !
;
: inits ( -- )
0 to highestindex 0 to basetested
in$ maxstring erase out$ maxstring erase inner1$ maxstring erase
inner2$ maxstring erase
next-word->in$ in$ count 2dup upper inner1$ place
;
: available-chars ( -- )
inner1$ count 2dup bounds
do
2dup i c@ scan drop i = if i c@ inner2$ c+place then
loop
2drop
;
: chars-exchanges ( -- )
in$ count nip 0
do
inner2$ dup 1+ swap
count in$ count drop i + c@ scan drop swap -
dup highestindex max to highestindex
tvals + c@ out$ C+place
loop
;
: .outputs ( i -- )
in$ maxstring erase
cr s" Case #" in$ place
1+ 0 <# #s #> in$ +place s" : " in$ +place
<# #s #> in$ +place in$ count type
crlf$ count in$ +place in$ count ofh @ write-file drop
;

: allyourbase ( -- )
cls open-files parse-headline
T 0
do

inits available-chars chars-exchanges

highestindex 1+ 2 max to basetested
out$ count basetested base-tonum
i .outputs
loop
ofh @ close-file drop
ifh @ close-file drop
;

\ allyourbase A-small-practice.IN A-small-practice.OUT
allyourbase A-large-practice.IN A-large-practice.OUT

Alex Wegel

unread,

May 1, 2012, 1:12:25 PM5/1/12

to

Bruno Gauthier <bgau...@free.fr> wrote:

> Le 27/04/2012 18:28, Alex Wegel a écrit :
> > Ian Osgood<ia...@quirkster.com> wrote:
> >
> >> Four years, and never a Forth entry. Anyone want to put Forth on the
> >> scoreboard this year? Registration is open and the qualification round
> >> is on Friday.
> >>
> >> http://code.google.com/codejam/
> >
> > I tried another one of the examples for practicing - im still way too
> > slow. Fortunately, i wouldn't feel like going on a trip to NY anyway ;-)
> >
> > The example i tried was "all your base..", the logical sequel to the
> > alien example :-)
> >
> > http://code.google.com/codejam/contest/189252/dashboard#s=p0
> >
> > It took me about 1:30 hrs to come up with a fully correct solution
> > (guess what - it's ugly).
> >
> > In case someone wants to try too, i don't post my solution or further
> > discussions (i.e. any spoilers) yet.
> >
> > Cheers,
> > Alex
>
> hi Alex,
> sticked with a bug severals days, here my solution.
> curious to see your.
> Bruno

As i said, it's another ugly solution - my "philosophy" was to leave
everything out that wasn't really needed for the question. (This way,
there's also less space for bugs to hide.)

I didn't put comments into the code (quel surprise!), but the basic
concept is: The alien digits are already ordered (1,0,2,3,4,5,...), so
it's easy to build a table for translating them. After the table has
been built, the alien number-base is known, and i can feed the
transliterated alien numeral to forth to get the numerical value.

The program uses stdin (run in a pipe), so if your forth doesn't have
that, you just need a replacement for getl, to get a line (e.g. based on
refill).

Here it comes - cheers,
Alex

#! /usr/local/bin/gforth-fast
0 value hi
create d 36 allot
: wipe d 36 erase 0 to hi ;
: +dig d hi + c! 1 hi + to hi ;
: >dig?
hi begin dup while
1 - 2dup d + c@ = if nip true exit then
repeat drop false ;
: twiddle dup >r c@ r@ char+ c@ r@ c! r> char+ c! ;
: pars
bounds do
i c@ >dig? if drop else +dig then
loop d twiddle hi 2 max to hi ;
: dig> dup 9 > if 7 + then [char] 0 + ;
: eval
2dup pars 2dup bounds do i c@ >dig? drop dig> i c! loop
hi base ! s>unumber? drop decimal ;
: getl pad dup 64 stdin read-line throw drop ;
: .. ." Case #" 1 .r ." : " 1 ud.r cr ;
: app getl evaluate 0 do getl eval i 1+ .. wipe loop ;
' noop is bootmessage
app bye

Bruno Gauthier

unread,

May 1, 2012, 2:39:13 PM5/1/12

to

while have to study it :)
thanks for your post
bruno

m...@iae.nl

unread,

May 1, 2012, 7:00:18 PM5/1/12

to

awe..cor.de (Alex Wegel) writes Re: Google CodeJam?

> As i said, it's another ugly solution - my "philosophy" was to leave
> everything out that wasn't really needed for the question. (This way,
> there's also less space for bugs to hide.)

> I didn't put comments into the code (quel surprise!), but the basic
> concept is: The alien digits are already ordered (1,0,2,3,4,5,...), so
> it's easy to build a table for translating them. After the table has
> been built, the alien number-base is known, and i can feed the
> transliterated alien numeral to forth to get the numerical value.

So why not add the above small explanation and some stack comments
to your code? I almost suspect that you deliberately deleted them :-)

Just a observation: your program works with iForth when I use the
standard >NUMBER and REFILL words. The runtime for the large example
is dominated by I/O time: 1 ms without output and 120 ms with writing
of the data to file.

-marcel

Alex Wegel

unread,

May 1, 2012, 7:44:05 PM5/1/12

to

After adding the following words...

256 constant maxstring
: between 1+ within ;
: C+place ( c saddr) dup >r count dup >r + c! r> 1+ r> c! ;
: upper bounds ?do i c@ toupper i c! loop ;
create crlf$ 1 c, 10 c, align
: cls cr cr cr cr ;
: base-tonum base ! s>unumber? 0= throw decimal ;

...and "0 in$ c!" somewhere before the call to parse-headline (because
allot didn't clear memory, and your inits word is only called later), it
worked:-)

Most of all i like that there's a lot of variety in the style of the
solutions shown in this thread, even though the two CodeJam questions
were of a real simple nature.

Unfortunately, nobody of us could understand the aliens or stop their
attack in time.

Cheers, Alex

Alex Wegel

unread,

May 1, 2012, 7:44:05 PM5/1/12

to

<hughag...@yahoo.com> wrote:

> I've never written a plain Forth program in my life.

Now i'm puzzled (though not really surprised by the fact itself) - this
is coming quite a long way from:

> I'm the only Forther on the planet who has any chance at all.

I wish you much fun with or without your package, and good luck with
your copyrights. (iBrows raised..)

Alex

Alex Wegel

unread,

May 1, 2012, 8:29:10 PM5/1/12

to

<..x.nl> wrote:

> awe..cor.de (Alex Wegel) writes Re: Google CodeJam?
>
> > As i said, it's another ugly solution - my "philosophy" was to leave
> > everything out that wasn't really needed for the question. (This way,
> > there's also less space for bugs to hide.)
>
> > I didn't put comments into the code (quel surprise!), but the basic
> > concept is: The alien digits are already ordered (1,0,2,3,4,5,...), so
> > it's easy to build a table for translating them. After the table has
> > been built, the alien number-base is known, and i can feed the
> > transliterated alien numeral to forth to get the numerical value.
>
> So why not add the above small explanation and some stack comments
> to your code? I almost suspect that you deliberately deleted them :-)

Actually i think i had 2 of them at some point in time, but that was
about it, and then they went stale. They not so interesting anyway :-)
See:

#! /usr/local/bin/gforth-fast
0 value hi \ highest digit-value found so far
create d 36 allot \ digit string containing all alien digits
: wipe ( ) d 36 erase 0 to hi ; \ housekeeping
: +dig ( c) d hi + c! 1 hi + to hi ; \ add newly learned digit to d
: >dig? ( c -- +n true|c false) \ lookup alien digit

hi begin dup while
1 - 2dup d + c@ = if nip true exit then
repeat drop false ;

: twiddle ( ca) dup >r c@ r@ char+ c@ r@ c! r> char+ c! ; \ exch 2 bytes
: pars ( ca u) \ read alien numeral to determine digits & number base

bounds do
i c@ >dig? if drop else +dig then
loop d twiddle hi 2 max to hi ;

: dig> (+n--c) dup 9 > if 7 + then [char] 0 + ; \ convert 1 dig to text
: eval ( ca u) \ evaluate alien number string

2dup pars 2dup bounds do i c@ >dig? drop dig> i c! loop
hi base ! s>unumber? drop decimal ;
: getl pad dup 64 stdin read-line throw drop ;
: .. ." Case #" 1 .r ." : " 1 ud.r cr ;
: app getl evaluate 0 do getl eval i 1+ .. wipe loop ;
' noop is bootmessage
app bye

Main thing to remember when reading the source is that in bottom up
programming, the top is down.

> Just a observation: your program works with iForth when I use the
> standard >NUMBER

Yes, i should have used that one.

> and REFILL words.

Also yes, this time there were no overlong input lines, so this would do
the job too.
My defense is that what i posted is the original - the first working
version, so changing these would be part of some clean-up to come.
(I fathom that using SCAN in >dig? and maybe getting rid of hi as a
value could also get on the todo-list).

> The runtime for the large example
> is dominated by I/O time: 1 ms without output and 120 ms with writing
> of the data to file.

Well - strictly speaking, the 4 minutes(?) granted by the codejam rules
are way too much time, considering some of the attack-dates ;-)

Cheers,
Alex

Paul Rubin

unread,

May 1, 2012, 10:18:24 PM5/1/12

to

awe...@arcor.de (Alex Wegel) writes:
> I didn't put comments into the code (quel surprise!), but the basic
> concept is: The alien digits are already ordered (1,0,2,3,4,5,...), so

It was a pretty easy problem (I used Python) and my test output
looks ok to me, but the upload says my answer was incorrect.
Did you get 290762935202 for case #35 and 10^18 for case #36?

Did they really expect people to complete this task in 4 minutes? I
guess that's doable if absolutely nothing goes wrong. Make a small
mistake or two and you're sunk. I didn't time myself but I probably
took between 5 and 10 minutes.

I should try doing some of these in C, which should be more directly
comparable to Forth. I had trouble attempting using Forth because of
the complexity of doing anything with strings or files. In languages
like Python, this stuff is too easy because the built-in libraries do
all the work.

Alex Wegel

unread,

May 1, 2012, 11:14:46 PM5/1/12

to

Paul Rubin <no.e...@nospam.invalid> wrote:

> awe...@arcor.de (Alex Wegel) writes:
> > I didn't put comments into the code (quel surprise!), but the basic
> > concept is: The alien digits are already ordered (1,0,2,3,4,5,...), so
>
> It was a pretty easy problem (I used Python) and my test output
> looks ok to me, but the upload says my answer was incorrect.
> Did you get 290762935202 for case #35 and 10^18 for case #36?

Case #1: 201
Case #2: 75
Case #3: 11
Case #4: 17419143
Case #5: 1801622
Case #6: 47225
Case #7: 17273
Case #8: 866022
Case #9: 2
Case #10: 42
Case #11: 44317196
Case #12: 511
Case #13: 1
Case #14: 44317196
Case #15: 1023456789
Case #16: 29480883458974409
Case #17: 26432593615
Case #18: 35180798355218
Case #19: 102334506713879
Case #20: 398821148
Case #21: 102345156378290
Case #22: 1002342562744892
Case #23: 674293938766347782
Case #24: 4256386811230819
Case #25: 515096463571317029
Case #26: 3589692911
Case #27: 102034056733893387
Case #28: 187812613000849559
Case #29: 108686242308947
Case #30: 2616885866937
Case #31: 35181782102483
Case #32: 674303048939557361
Case #33: 102345678950
Case #34: 64897047
Case #35: 290762935202
Case #36: 1000000000000000000
Case #37: 575985757797280145
Case #38: 319635304277606399
Case #39: 316555023193359374
Case #40: 921615989741647091
Case #41: 23219822358886405
Case #42: 11986027137890515
Case #43: 153855365171289881
Case #44: 262144
Case #45: 244593359538403
Case #46: 3005211782346
Case #47: 1859030134286095
Case #48: 211967741386084
Case #49: 4256889290508007
Case #50: 23551152047651
Case #51: 23551946083611
Case #52: 102343256672849
Case #53: 23550744836764
Case #54: 14018075135
Case #55: 10234456789102345
Case #56: 29480883458974409
Case #57: 50069
Case #58: 4
Case #59: 1244777988
Case #60: 64881574
Case #61: 3
Case #62: 107
Case #63: 398819
Case #64: 1145122416
Case #65: 1342284513
Case #66: 287
Case #67: 12194234
Case #68: 60589237
Case #69: 1177315138
Case #70: 83
Case #71: 115
Case #72: 2430248
Case #73: 5
Case #74: 10843881
Case #75: 53409068
Case #76: 302
Case #77: 431345
Case #78: 29
Case #79: 3
Case #80: 22094
Case #81: 637
Case #82: 4825
Case #83: 1315
Case #84: 64776197
Case #85: 4297977
Case #86: 17194
Case #87: 11
Case #88: 2163798
Case #89: 1802680
Case #90: 1664354412
Case #91: 62874659
Case #92: 290313057
Case #93: 13
Case #94: 427257653
Case #95: 138
Case #96: 893251
Case #97: 301
Case #98: 1
Case #99: 17349
Case #100: 1924398475

> Did they really expect people to complete this task in 4 minutes? I
> guess that's doable if absolutely nothing goes wrong. Make a small
> mistake or two and you're sunk. I didn't time myself but I probably
> took between 5 and 10 minutes.

I would have failed there too (both times), but with re-reading, and
properly understanding the given limits *before* trying with the real
data (i.e. before the 4- or 8-min timer would have started), i would
have had a better chance (because that's what i stumbled over twice:
first buffer size, then number precision).

> I should try doing some of these in C, which should be more directly
> comparable to Forth. I had trouble attempting using Forth because of
> the complexity of doing anything with strings or files. In languages
> like Python, this stuff is too easy because the built-in libraries do
> all the work.

Well - the first example almost being a regexp was an especially fitting
case for string-heavy languages, but even there, i'd say that forth had
it's strengths, and if it's just by encouraging to not regard the input
as strings, but as something a little bit simpler, in the limits of the
posed question.

Cheers,
Alex

Hugh Aguilar

unread,

May 1, 2012, 11:45:08 PM5/1/12

to

On May 1, 5:44 pm, awe...@arcor.de (Alex Wegel) wrote:

> <hughaguila...@yahoo.com> wrote:
> > I've never written a plain Forth program in my life.
>
> Now i'm puzzled (though not really surprised by the fact itself) - this
> is coming quite a long way from:
>
> > I'm the only Forther on the planet who has any chance at all.

Your "plain Forth" would more accurately be called "obfuscated Forth"
--- I've never written that in my life --- and I never will.

I don't think that I could write a Forth program without stack-picture
comments, as I would confuse myself. I think that you used stack-
picture comments just like everybody else when you wrote your program,
but then removed them afterward to obfuscate your code.

We've been quoting poetry in this thread --- here is what another
German said on the subject:
"Nor are poets clean enough for me --- they muddy the water to make it
appear deep."

> I wish you much fun with or without your package, and good luck with
> your copyrights. (iBrows raised..)

I stick that copyright notice on the top of all my stuff. I don't
think there is a market for alien-alphabet software. This will go into
the next novice package upgrade as yet another example program --- all
of that stuff is BSD license --- anybody can use it freely.

Actually, there isn't any market for any kind of Forth. That was the
point that I was making. It is mildly amusing to write a string
pattern-matching program from scratch in Forth, but the rest of the
world just uses the myriad scripting languages available for this kind
of stuff --- they are trivial. It took me about 3 hours (and I screwed
it up by not supporting pattern strings longer than 255 chars) --- no
employer in the world is going to pay for 3 hours of work to write a
program that even the office intern working for free could knock out
in 30 minutes.

Alex Wegel

unread,

May 2, 2012, 5:38:06 PM5/2/12

to

You don't really want an answer to that crap?

WJ

unread,

Mar 4, 2013, 12:16:27 AM3/4/13

to

hughag...@yahoo.com wrote:

> On Thursday, April 12, 2012 1:43:19 AM UTC-6, Hugh Aguilar wrote:
> > Just for fun, lets have our own comp.lang.forth contest to write the
> > best Forth solution to the "alien language" problem.
> > http://code.google.com/codejam/contest/90101/dashboard#s=p0
> > This will be a loser's consolation contest, as none of us have any
> > chance at the real contest.
>
> I've waited and waited, but nobody has come forward. To qualify for the CodeJam contest, the program had to be written in under 8 minutes. That is very fast programming; I think that a typical programmer using a modern language would take about 1/2 hour. My own Forth program took me about 3 hours, so I am 6 times slower than pretty much everybody. This is a big part of why Forth is not used in the work world. No employer is going to pay anybody to program in Forth when it takes 6 times longer to write a program than it does in any other language. Also, I had the advantage of having my novice package available. Without the novice package, I think most Forth programmers would take maybe 3 days to write a program like this (that is why nobody responded to my challenge).
>
> It seems extremely unlikely that any Forther is going to come up with a Forth program to compete against mine. I would like to see programmers of other languages, such as Lisp and Ruby and so forth, present their own programs along with a mention of how much time was required. It is okay to post non-Forth code on comp.lang.forth --- nobody is posting Forth code --- if we are going to get any code posted, it will have to be in other languages.
>
> Here is my own Forth code:

>
> \ This is the solution to the "alien language" example problem from Google CodeJam.
> \ http://code.google.com/codejam/contest/90101/dashboard#s=p0
>
> \ Written by Hugh Aguilar --- copyright (c) 2012 --- BSD license
>
> \ requires novice.4th and list.4th
>
> marker alien.4th
>
>
> \ ******
> \ ****** input and output
> \ ******
>
> variable #letters
> variable #words
> variable #patterns
>

> : seqs ( name-str -- word-seq pattern-seq )
> read-seq
> dup .line @ count evaluate

> #patterns ! #words ! #letters !
> .fore @ \ -- word-seq
> dup #words @ nth \ -- word-seq pattern-seq
> delink
> dup #patterns @ nth \ -- word-seq pattern-seq extraneous-seq

> delink kill-seq \ -- word-seq pattern-seq
> \ error checking
> over length #words @ <> abort" *** bad word-seq ***"
> dup length #patterns @ <> abort" *** bad pattern-seq ***"
> over each[ .line @ c@ #letters @ <> abort" *** string in word-seq is wrong length ***" ]each ;

>
> : dump-result ( pattern-list name-str -- )

> <cstr +cstr c" .result" +cstr cstr> write-seq ;

>
>
> \ ******
> \ ****** convert pattern string into any-seq list
> \ ****** each node in the list represents one char in the target string
> \ ****** the .LINE string of each node contains all of the character that would match
> \ ******
>

> : check-any-str { pattern-str head -- }

> head length #letters @ <> if
> cr ." *** any-seq is the wrong length ***"

> cr pattern-str count type

> cr true abort" *** aborting ***
> then ;
>

> \ If CHECK-ANY-STR fails, this is usually because the pattern-str is longer than 255 characters and it got truncated.
> \ This happens in the file: A-large-practice.in
> \ I could upgrade the program to deal with this problem, but doing so would involve rewriting the SEQ code in LIST.4TH.

>
> : <make-any-seq> ( head str -- head )
> dup c@ if new-seq link \ str has characters in it
> else drop then ;
>

> : make-any-seq { pattern-str | group? -- any-seq }
> nil <cstr
> pattern-str count bounds ?do \ -- head

> group? if \ if inside of ( ) group
> I c@ [char] ) = if false to group? cstr> <make-any-seq> <cstr
> else I c@ char+cstr then
> else \ else outside of ( ) group
> I c@ [char] ( = if true to group?
> else I c@ char+cstr cstr> <make-any-seq> <cstr then
> then
> loop
> cstr> <make-any-seq>

> pattern-str over check-any-str ;

>
>
> \ ******
> \ ****** generate pattern matcher
> \ ******
>
> char & comment \ this is an example of what will get generated
>

> :noname ( char-adr -- match? ) \ this is for: a(bc)
>
> false \ -- char-adr any?
> over c@ 97 = if true or then

> 0= if drop false exit then

> 1+ \ -- new-char-adr
>
> false \ -- char-adr any?
> over c@ 98 = if true or then
> over c@ 99 = if true or then

> 0= if drop false exit then

> 1+ \ -- new-char-adr
>
> drop true ;
>
> &
>
> : <generate-pattern> { node -- }
> s" false " evaluate
> node .line @ count bounds do
> s" over c@ " evaluate
> I c@ lit,
> s" = if true or then " evaluate
> loop
> s" 0= if drop false exit then 1+ " evaluate ;

>
> : generate-pattern ( any-seq -- xt )
> >r
> s" :noname ( char-adr -- match? ) " evaluate
> r> ['] <generate-pattern> each
> s" drop true ; " evaluate ;
>
>
> \ ******

> \ ****** upgrade pattern-seq
> \ ******
>
> seq \ .LINE starts out as pattern-str, later gets changed to result-str

> w field .any \ pointer to any-seq
> w field .xt \ xt of generated pattern matcher
> w field .matches \ count of matches for this pattern
> constant pattern
>
> : <kill-pattern> ( node -- )
> dup .any @ kill-seq

> <kill-seq> ;

>
> : kill-pattern ( head -- )
> each[ <kill-pattern> ]each ;
>

> : init-pattern ( pattern-str node -- node )
> init-seq >r
> r@ .line @ make-any-seq r@ .any !

> r@ .any @ generate-pattern r@ .xt !
> 0 r@ .matches !
> r> ;
>

> : new-pattern ( str -- node )
> pattern alloc
> init-pattern ;
>
> : upgrade-pattern ( pattern-seq -- pattern-list ) \ create a PATTERN list given a SEQ list
> nil swap \ -- pattern-list pattern-seq
> each[ .line @ new-pattern link ]each ;
>
>
> \ ******
> \ ****** pattern-match
> \ ******
>
> : <check-word> ( pattern-list-node word-seq-node -- )
> .line @ count drop \ -- pattern-list-node char-adr \ assume str size is correct

> over .xt @ execute \ -- pattern-list-node match?
> if 1 over .matches +! then ; \ -- pattern-list-node
>

> : <check-pattern> ( word-seq pattern-list-node -- word-seq )
> over ['] <check-word> each \ -- word-seq pattern-list-node
> drop ;
>
> : check-pattern ( word-seq pattern-list -- )
> ['] <check-pattern> each \ -- word-seq

> drop ;
>
>
> \ ******
> \ ****** make result strings
> \ ******
>
> : u>str ( u -- adr cnt )
> u>d <# #s #> ;
>
> : <fill-result> ( pattern# pattern-list-node -- new-pattern# )

> dup .line @ dealloc \ get rid of pattern-str in .LINE

> <cstr
> c" Case #" +cstr
> over u>str <+cstr>
> c" : " +cstr
> dup .matches @ u>str <+cstr>

> cstr> hstr swap .line ! \ -- pattern# \ set result-str to .LINE

> 1+ ; \ -- new-pattern#
>
> : fill-result ( pattern-list -- )
> 1 swap ['] <fill-result> each \ -- pattern#
> drop ;
>
>
> \ ******
> \ ****** main program
> \ ******
>
> : alien ( name-str -- )

> dup seqs { name-str word-seq pattern-seq | pattern-list -- }

> s" marker upgrade-pattern-stuff " evaluate \ so we can get rid of the UPGRADE-PATTERN words

> pattern-seq upgrade-pattern to pattern-list
> word-seq pattern-list check-pattern

> pattern-list fill-result
> pattern-list name-str dump-result
> \ clean up

> word-seq kill-seq
> pattern-seq kill-seq

> pattern-list kill-pattern
> s" upgrade-pattern-stuff " evaluate ;
>

Factor:

USING: locals io.encodings.ascii splitting math.parser regexp formatting ;

:: run ( -- )
"Code-jam.txt" ascii
[ readln " " split [ string>number ] map
first3 :> numpatterns :> numwords drop
numwords iota [ drop readln ] map :> words
numpatterns iota [ drop readln ] map :> patterns
patterns
[| pat i |
pat { { CHAR: ( CHAR: [ } { CHAR: ) CHAR: ] } } substitute
<regexp> :> regex
i 1 + words [ regex matches? ] count "Case #%d: %d\n" printf
] each-index
] with-file-reader
;

WJ

unread,

Mar 5, 2013, 2:42:18 AM3/5/13

to

Bruno Gauthier wrote:

> Le 27/04/2012 18:28, Alex Wegel a �crit :

Factor:

USING: locals vectors io.encodings.ascii math.parser formatting ;
QUALIFIED: assocs

:: min-val ( str -- num )
36 iota >vector :> pool 0 1 pool exchange pool reverse! drop
H{ } clone :> digit-values
str
[ digit-values [ [ pool pop ] unless* ] assocs:change-at
] each
digit-values keys length 2 max :> base
str 0 [ digit-values at swap base * + ] reduce
;

: do-it ( -- )
"A-large-practice.in" ascii
[ readln string>number iota
[| i |
i 1 + readln min-val "Case #%d: %d\n" printf
] each
] with-file-reader
;

do-it

WJ

unread,

Mar 5, 2013, 1:33:36 PM3/5/13

to