We recently saw nice solutions to the `nine 9's' problem here. I was
wondering if anybody has solved the (easier) first problem at
http://www.itasoftware.com/careers/programmers.php
(the ``add-a-gram'' problem: find a long dictionary word that can be
built from a three-letter word by adding one letter at a time,
rearranging freely, so that every intermediate string is also a word).
They ask you to make the program as efficient as possible, but I am
not interested in stupid little micro-optimizations; rather, I wonder
whether there are better, but still simple, algorithms than mine. That's
also why I am not posting my solution (not yet, at least): I don't
want to influence other solutions if anybody wants to give it a
shot. To give an idea of the orders of magnitude involved: my
solution has about 60 LOC and runs, with no declarations and no
optimizations turned on, in about two minutes. Note that I changed
the spec for my solution a little: it tries to find /all/ solutions
of maximum length. IIRC, when it quit as soon as the first solution
was found, it took about 40 secs. Much fun!
Regards,
--
Nils Goesche
"Don't ask for whom the <CTRL-G> tolls."
PGP key ID 0x42B32FC9
Oops, make that three minutes...
Sorry,
I tried my luck and it looks like we're in about the same ball-park. I
arrived at 38 LOC and around 27 seconds for the first solution.
[CMUCL 18c on Linux, PIII 850MHz, 256 MB RAM]
Cheers,
Edi.
edi@bird:~/add-a-gram> time ./add-a-gram.x86f < WORD.LST
("ton" "tone" "tonne" "tonnes" "tension" "sonatine" "nominates" "antinomies"
"nitrosamine" "terminations" "antimodernist" "determinations"
"underestimation" "underestimations")
real 0m27.196s
user 0m26.240s
sys 0m0.550s
Mine is similar in length. It takes about 40 seconds to find
all solutions, where by "find all solutions" I mean "build
a data structure that represents all addagrams in the dictionary,
and output enough information to reconstruct them all trivially".
(I avoid repeating words I've already output; this is quite
effective in avoiding a combinatorial explosion.) It would be
only marginally quicker if I only wanted one solution.
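(The deduplication is nothing exotic, by the way. A minimal sketch
of the idea, without giving away anything about the real program:

(defvar *reported* (make-hash-table :test #'equal))

(defun report-once (word)
  ;; print WORD only the first time it is seen
  (unless (gethash word *reported*)
    (setf (gethash word *reported*) t)
    (format t "~A~%" word)))

Once a word has been reported, chains passing through it can point at
earlier output rather than being expanded all over again.)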
My program conses far too much: about 280MB. This wouldn't be
very hard to fix. I don't know whether doing so would improve
the speed or make it worse.
Breakdown:
Reading the dictionary           0.01s    1.75MB
Getting it into a usable form    7.75s   75.37MB
Finding addagrams               29.52s  203.43MB
Reporting the results            0.17s    0.94MB
All timings are for CMUCL on a 1GHz Athlon running FreeBSD.
No optimization flags and no declarations. I tried adding
a few, but it made no measurable difference.
Writing the code took about 80 minutes, which feels like
approximately a factor of 2 too long. A lot of that time
was spent looking things up in the Hyperspec; not enough
of the spec is swapped into my brain at the moment. (Work
is busy, and CL is almost entirely recreation for me.)
For comparison, I implemented something similar in Python.
It's not identical; one implementation choice seemed to be
more natural one way in CL and another way in Python.
The timings for three of the four stages were very similar;
the other stage ("Finding addagrams") was more than twice as
fast in Python. This isn't a major surprise. It would
probably be even faster in Perl.
... So, I tried doing it the "Pythonic" way in CL. It turns
out that it's better in CL as well, though it's still slightly
faster in Python. So now the figures look like this:
Reading the dictionary           0.01s    1.75MB
Getting it into a usable form    5.98s   77.45MB
Finding addagrams               17.77s  261.40MB
Reporting the results            0.13s    1.20MB
Total: about 24s. The Python version takes 23.4s; the
first two stages are slower, the third is quicker. The
Python version is also somewhat shorter. (I also wrote
it more quickly, but that's not interesting because
I'd already written the CL version.)
Like Nils, I'm deliberately not saying more than I
have to about just what the program does, so as not
to spoil the fun for anyone else.
--
Gareth McCaughan Gareth.M...@pobox.com
.sig under construc
Well, I'd cut down the size of the search space by lumping together
words that are anagrams of one another because I know I'm not going to
have more success adding a letter to "spat" than to "taps". And I'd
treat this problem as a graph search problem.
The "word lumps" are the nodes in the graph and the edges are the
letters added to form longer words. I'd use a breadth first and weed
out nodes that are either already enqueued or nodes that don't
represent words in the dictionary.
I think that's about as optimal as one can get, no?
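Roughly, and only as a sketch of what I mean (this assumes WORDS is a
list of lowercase strings and that chains start from three-letter
words; it is not a tuned solution):

(defun anagram-key (word)
  ;; anagrams share a key: the word's letters in sorted order
  (sort (copy-seq word) #'char<))

(defun insertions (key)
  ;; all keys reachable from KEY by adding one letter a-z
  (loop for c across "abcdefghijklmnopqrstuvwxyz"
        collect (anagram-key (concatenate 'string key (string c)))))

(defun longest-add-a-gram (words)
  (let ((dict (make-hash-table :test #'equal))    ; key -> example word
        (parent (make-hash-table :test #'equal))) ; key -> predecessor key
    (dolist (w words)                             ; lump anagrams together
      (setf (gethash (anagram-key w) dict) w))
    (let ((frontier '()) (last-key nil))
      (maphash (lambda (k w)
                 (declare (ignore w))
                 (when (= (length k) 3) (push k frontier)))
               dict)
      ;; breadth-first, one word length per level
      (loop while frontier
            do (setf last-key (first frontier))
               (let ((next '()))
                 (dolist (k frontier)
                   (dolist (k2 (insertions k))
                     (when (and (gethash k2 dict)          ; is a word
                                (not (gethash k2 parent))) ; not enqueued yet
                       (setf (gethash k2 parent) k)
                       (push k2 next))))
                 (setf frontier next)))
      ;; walk back from a deepest node to recover one chain
      (reverse (loop for k = last-key then (gethash k parent)
                     while k
                     collect (gethash k dict))))))

Since every edge goes from length n to length n+1, a node's depth is
fixed no matter which path reaches it first, so the weeding loses
nothing.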
-dj
> ... So, I tried doing it the "Pythonic" way in CL. It turns
> out that it's better in CL as well, though it's still slightly
> faster in Python. So now the figures look like this:
>
> Reading the dictionary           0.01s    1.75MB
> Getting it into a usable form    5.98s   77.45MB
> Finding addagrams               17.77s  261.40MB
> Reporting the results            0.13s    1.20MB
>
> Total: about 24s. The Python version takes 23.4s; the
> first two stages are slower, the third is quicker. The
> Python version is also somewhat shorter. (I also wrote
> it more quickly, but that's not interesting because
> I'd already written the CL version.)
I just did a Dylan version. It's 33 lines and takes 16.5 seconds on my
700 MHz Athlon on linux using Gwydion 2.3.6.
Reading the dictionary and building data structure: 13.5s
Finding longest add-a-gram: 3.0s
Final heap size is 29 MB. I don't know the GC stats.
I could double the speed with another ten lines of code, mostly by
doing my own parsing of the input file instead of using sucky library
routines.
-- Bruce
Evaluation took:
8.64 seconds of real time
6.06 seconds of user run time
0.87 seconds of system run time
[Run times include 1.47 seconds GC run time]
0 page faults and
40242400 bytes consed.
The insane amount of consing is due to my choice of data structure,
which really should be changed, but I don't feel like bothering at
this point. The code is quite long, partly because it uses lots of
small wrapper functions to abstract over some of the details of the
data structure used, and partly because I have commented-out code using
LOOP alongside the code that's actually run, which uses SERIES. The
performance difference between the LOOP and SERIES solutions is
essentially nothing.
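(If you haven't seen SERIES: here's the flavor of the equivalence on
a toy example rather than my actual code, assuming the SERIES package
is loaded:

;; LOOP version
(defun total-length-loop (words)
  (loop for w in words sum (length w)))

;; SERIES version
(defun total-length-series (words)
  (series:collect-sum
   (series:map-fn 'fixnum #'length (series:scan words))))

SERIES compiles the second into essentially the same loop as the
first, which is presumably why I see no difference between the two.)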
--
-> -/- - Rahul Jain - -\- <-
-> -\- http://linux.rice.edu/~rahul -=- mailto:rj...@techie.com -/- <-
-> -/- "I never could get the hang of Thursdays." - HHGTTG by DNA -\- <-
|--|--------|--------------|----|-------------|------|---------|-----|-|
Version 11.423.999.221020101.23.50110101.042
(c)1996-2002, All rights reserved. Disclaimer available upon request.
> I just did a Dylan version. It's 33 lines and takes 16.5 seconds on my
> 700 MHz Athlon on linux using Gwydion 2.3.6.
>
> Reading the dictionary and building data structure: 13.5s
> Finding longest add-a-gram: 3.0s
>
> Final heap size is 29 MB. I don't know the GC stats.
Now I have some internal instrumentation:
$ time ./add-a-gram <WORD.LST
16: #("indeterminations", "intermediations", "determinations",
"antimodernist", "terminations", "nitrosamine", "antinomies",
"nominates", "mentions", "tension", "nitons", "niton", "into", "ton")
reading file: 13.27d0 sec, 172.029688d0 MB
searching: 3.24d0 sec, 5.387312d0 MB
real 0m16.531s
user 0m16.430s
sys 0m0.100s
> I tried my luck and it looks like we're in about the same
> ball-park. I arrived at 38 LOC and around 27 seconds for the first
> solution.
> [CMUCL 18c on Linux, PIII 850MHz, 256 MB RAM]
I changed the algorithm a little bit and I'm now at 10 seconds
(including load time which is less than 1 second) - again without
declarations, same platform. LOC have increased to about 55.
Edi.
Evaluation took:
0.9 seconds of real time
0.82 seconds of user run time
0.08 seconds of system run time
[Run times include 0.52 seconds GC run time]
410 page faults and
17980360 bytes consed.
("ton" "tone" "tonne" "tonnes" "tension" "sonatine" "nominates" "antinomies"
"nitrosamine" "terminations" "antimodernist" "determinations"
"underestimation" "underestimations")
Evaluation took:
9.05 seconds of real time
8.73 seconds of user run time
0.26 seconds of system run time
[Run times include 0.99 seconds GC run time]
9 page faults and
41866368 bytes consed.
Some people apparently found great solutions :-) Maybe we should all
exchange solutions by email...
> Some people apparently found great solutions :-) Maybe we should all
> exchange solutions by email...
I'd be interested to see the other solutions as well. As you started
this thread - would you mind assembling them and sending them out to
the other participants (if they agree)? [I already sent you a link to
my solution (updated this morning).]
Thanks in advance,
Edi.
I've got a couple of solutions, but the one that's competitive in time
is substantially longer than some of the others. It's hard to benchmark,
since I'm using a different compiler on a different machine running
a different OS: about 50 secs on a 500 MHz P-III Vaio running
Windows 98 (don't ask) using FunO Dylan.
I think my data structure is over-engineered (but you can do other
cool things with it), so I'm slashing down a version now to see what
a minimal solution might do.
Since so many people are writing up solutions to ITA's problems I thought
it would be interesting to do a little informal followup to
http://www.flownet.com/gat/papers/lisp-java.pdf
Anyone who has written up an ITA problem in any language who would like to
be a data point in a new study please email me a copy of your code. If I
get enough submissions I'll run a set of benchmarks on some standardized
environments and compile the results.
Please put [ITA] in the subject line of your submission email to make it
easier for me to sort them out. Also, please include the following
information:
1. How long you spent writing the code (to the best of your recollection)
2. How long you've been programming
3. How long you've been programming in the language your code is written in.
Multiple submissions from the same person are fine.
Thanks!
Erann
g...@jpl.nasa.gov
I'll mail you and anybody else who asks me a tar file with all solutions
I got so far. I hope everybody who mailed one to me agrees with that;
if not, post soon or sue me ;-)
To make things easier for you, we should first clarify the problem
a bit: ITA asks for ``the'' solution, but there is actually more than
one. Should the programs find /all/ solutions or stop when they have
found one? I think searching for /all/ solutions makes more sense,
so that's what my version does.
I mean all solutions of maximum length, of course.
Sorry,
69 LOC, finding a single longest soln.
; cpu time (non-gc) 6,810 msec user, 200 msec system
; cpu time (gc) 920 msec user, 70 msec system
; cpu time (total) 7,730 msec user, 270 msec system
; real time 9,726 msec
; space allocation:
; 3,165,182 cons cells, 39,652,424 other bytes, 0 static bytes
on ACL 6.1, Red Hat Linux 7.2, Intel 700 MHz
Andrew
--
Dunderhead
> To make things easier for you, we should first clarify the problem
> a bit: ITA asks for ``the'' solution, but there is actually more than
> one. Should the programs find /all/ solutions or stop when they have
> found one? I think searching for /all/ solutions makes more sense,
> so that's what my version does.
That still needs clarification.
There are a number of maximum-length words which can be derived. It is
trivial for me to modify my program to print them all, complete with a
derivation. The runtime will be identical (but it will be a couple more
lines of code).
The problem is that for any given word there can be quite a number of
derivations -- possibly as many as a couple of thousand. In some cases
the alternatives are anagrams, in other cases they result from
adding/deleting letters in different orders.
Finding (and printing) all these would be a very large change to my
program.
-- Bruce
That sure was a lot of fun!
--Matthieu
Indeed; sorry.
> There are a number of maximum-length words which can be derived. It is
> trivial for me to modify my program to print them all, complete with a
> derivation. The runtime will be identical (but it will be a couple more
> lines of code).
>
> The problem is that for any given word there can be quite a number of
> derivations -- possibly as many as a couple of thousand. In some cases
> the alternatives are anagrams, in other cases they result from
> adding/deleting letters in different orders.
>
> Finding (and printing) all these would be a very large change to my
> program.
New suggestion: Find all words of maximum length that can be reached
by an add-a-gram, together with one /example/ solution for each of
those.
Darn, I'm a little dense today. Thank God it's the weekend.
> time addagram <WORD.LST
addagram starting at length: 16
indeterminations
intermediations
determinations
antimodernist
terminations
nitrosamine inseminator
antinomies antimonies
nominates
mentions
tension intones
nitons
niton
into
ton not
3.79 real, 3.42 user, 0.18 sys
I might have gotten it wrong though ...
cheers,
doug
OK, I did another under-engineered version that is 1/3 the length
and runs faster. Here's the output (with some debug code trimmed out).
Loading dictionary... took 12.692 seconds
Finding longest from 3... took 5.427 seconds
Longest addagrams go from {"mon", "nom"} to {"underestimations"}
I actually think my initial solution is more "interesting", but it really
doesn't work as well. It lends itself to some interesting all-pairs
least-paths algorithms, tho, which could make it capable of doing
lots of other stuff instantly.
Looks great... Could you mail me the code? I'll put it into the tar
file I am going to mail around when I stop getting new solutions...
Nertz, I just tried sending my *new* results, but my net connection
burped, so if you see them twice, oops. This is what I now have (BTW,
I'm using the OCaml native compiler on Linux).
VmSize: 14944 kB
2.33 real, 2.07 user, 0.12 sys
> Looks great... Could you mail me the code? I'll put it into the tar
> file I am going to mail around when I stop getting new solutions...
Sure will.
cheers,
doug
>
> OK, I did another under-engineered version that is 1/3 the length
> and runs faster. Here's the output (with some debug code trimmed out).
>
> Loading dictionary... took 12.692 seconds
> Finding longest from 3... took 5.427 seconds
> Longest addagrams go from {"mon", "nom"} to {"underestimations"}
>
I'm reduced to talking to myself and hacking at home on a Friday
night. Pathetic.
I added 1/2-dozen lines of code to eliminate incidental consing, and
got the search down from 5.4sec to about 0.75sec. This is in keeping
with my long-time claim that, in languages like Lisp, consing and time
have a correlation of nearly 1. (I'll send this program to Nils, so he
can publish it with the other ones.)
If I was less consy in the dictionary phase, the time on that would
probably go down by 5x as well.
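To give the flavor of the change, here's a sketch in Lisp rather than
my actual Dylan (names made up): reuse one scratch buffer for all the
one-letter-shorter candidates instead of allocating a fresh string per
deleted letter.

(defun map-deletions (key fn)
  ;; call FN on KEY with each single character removed, reusing one
  ;; buffer; FN must copy the buffer if it wants to keep it
  (declare (simple-string key))
  (let* ((n (length key))
         (buf (make-string (1- n))))
    (dotimes (i n)
      (loop for j from 0 below n
            unless (= j i)
              do (setf (schar buf (if (< j i) j (1- j)))
                       (schar key j)))
      (funcall fn buf))))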
> My version takes 63 LOC. On my Sparc Ultra-10, it takes:
> - 18.5 seconds to load the file
> - 1.58 seconds to find one of the longest add-a-grams.
>
> That sure was a lot of fun!
Matthieu, Nils sent me the tar file, which included your solution
(one of the faster ones). I also had some fun tweaking it to run
faster still on Allegro CL.
First, for a reference base, on a 1.2 GHz AMD Athlon with your
unmodified version:
cl-user(2): (time (tst))
; cpu time (non-gc) 6,430 msec user, 10 msec system
; cpu time (gc) 1,700 msec user, 20 msec system
; cpu time (total) 8,130 msec user, 30 msec system
; real time 8,161 msec
; space allocation:
; 2,042,370 cons cells, 21,560,760 other bytes, 0 static bytes
("indeterminations" "intermediations" "determinations" "antimodernist"
"terminations" "nitrosamine" "antinomies" "nominates" "mentions"
"tension" ...)
cl-user(3):
Next, some slight modifications to your version more than double
the speed on the same machine:
cl-user(2): (time (tst))
; cpu time (non-gc) 2,020 msec user, 120 msec system
; cpu time (gc) 1,440 msec user, 10 msec system
; cpu time (total) 3,460 msec user, 130 msec system
; real time 3,597 msec
; space allocation:
; 2,042,272 cons cells, 20,619,760 other bytes, 0 static bytes
("indeterminations" "intermediations" "determinations" "antimodernist"
"terminations" "nitrosamine" "antinomies" "nominates" "mentions"
"tension" ...)
cl-user(3):
Finally, with some experimentation on aggressive modifications to
our SORT implementation on the elements of a string, I got almost
another doubling of speed:
cl-user(3): (time (tst))
; cpu time (non-gc) 1,090 msec user, 40 msec system
; cpu time (gc) 840 msec user, 50 msec system
; cpu time (total) 1,930 msec user, 90 msec system
; real time 2,027 msec
; space allocation:
; 254 cons cells, 20,619,760 other bytes, 0 static bytes
("indeterminations" "intermediations" "determinations" "antimodernist"
"terminations" "nitrosamine" "antinomies" "nominates" "mentions"
"tension" ...)
cl-user(4):
This string-sorter used the same technique we use on simple-arrays
of element-type t, and is obviously very fast. However, I'm not
sure I would actually want to make this string-sorter part of our
product; there aren't many times that users actually want to sort
characters within a string - this kind of sorting mostly seems apropos
only to this particular problem. However, the specialized sorter
could be coded and run instead of using SORT.
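To give the idea, here is roughly what a specialized sorter for this
job could look like at user level; a sketch only, not our SORT
implementation, and it assumes the keys are lowercase a-z:

(defun nsort-letters (s)
  ;; counting sort of the characters of S, in place: one pass to
  ;; count each letter, one pass to rewrite S in sorted order
  (declare (simple-string s))
  (let ((counts (make-array 26 :initial-element 0)))
    (loop for c across s
          do (incf (aref counts (- (char-code c) (char-code #\a)))))
    (let ((k 0))
      (dotimes (i 26 s)
        (dotimes (j (aref counts i))
          (declare (ignorable j))
          (setf (schar s k) (code-char (+ i (char-code #\a))))
          (incf k))))))

For a 26-character alphabet this does no comparisons at all and conses
only the small count vector.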
For the simple changes to your code, here are contextual diffs (it
adds 3 LOC):
--- matthieu_villeneuve.lisp Sat Jan 26 03:25:49 2002
***************
*** 1,3 ****
--- 1,7 ----
+
+ #+allegro
+ (sys:resize-areas :new 10000000 :old 10000000)
+
(proclaim '(optimize (speed 3) (debug 0) (safety 0) (compilation-speed 0)))
(defvar *dictionary* (make-hash-table))
***************
*** 8,14 ****
(sort ordered #'char<)
(when (null (gethash length *dictionary*))
(setf (gethash length *dictionary*)
! (make-hash-table :test 'equal)))
(setf (gethash ordered (gethash length *dictionary*)) word))
word)
--- 12,18 ----
(sort ordered #'char<)
(when (null (gethash length *dictionary*))
(setf (gethash length *dictionary*)
! (make-hash-table :test 'equal :size 500 :rehash-size 10.0)))
(setf (gethash ordered (gethash length *dictionary*)) word))
word)
***************
*** 19,24 ****
--- 23,29 ----
(add-word-to-dictionary word))))
(defun copy-seq-except-nth (source dest n)
+ (declare (optimize speed) (type simple-string source dest))
(let ((length (length source)))
(do ((i 0) (j 0))
((>= i length) dest)
--
Duane Rettig Franz Inc. http://www.franz.com/ (www)
1995 University Ave Suite 275 Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253 du...@Franz.COM (internet)
> ----- Original Message -----
> From: "Scott McKay" <s...@mediaone.net>
> Newsgroups: comp.lang.lisp
> Sent: Friday, January 25, 2002 5:53 PM
> Subject: Re: Cute Little Problem
>
> >
> > OK, I did another under-engineered version that is 1/3 the length
> > and runs faster. Here's the output (with some debug code trimmed out).
> >
> > Loading dictionary... took 12.692 seconds
> > Finding longest from 3... took 5.427 seconds
> > Longest addagrams go from {"mon", "nom"} to {"underestimations"}
> >
>
> I'm reduced to talking to myself and hacking at home on a Friday
> night. Pathetic.
That's all right, I'll talk to you tonight :-)
> I added 1/2-dozen lines of code to eliminate incidental consing, and
> got the search down from 5.4sec to about 0.75sec. This is in keeping
> with my long-time claim that, in languages like Lisp, consing and time
> have a correlation of nearly 1. (I'll send this program to Nils, so he
> can publish it with the other ones.)
>
> If I was less consy in the dictionary phase, the time on that would
> probably go down by 5x as well.
I noticed that some have been posting separate times for the loading and
searching phases, and others have just the total. So I decided to
go back to my fastest time (on a 1.2 GHz AMD Athlon, somewhat more than
twice as fast as your machine) using Matthieu's version as modified
and the specialized sort function, and break them up into the separate
phases:
cl-user(3): (time (add-words-from-file "~/addagrams/WORD.LST"))
; cpu time (non-gc) 970 msec user, 50 msec system
; cpu time (gc) 850 msec user, 60 msec system
; cpu time (total) 1,820 msec user, 110 msec system
; real time 2,639 msec
; space allocation:
; 80 cons cells, 20,465,016 other bytes, 0 static bytes
t
cl-user(4): (time (original-words (longest-add-a-gram)))
; cpu time (non-gc) 90 msec user, 0 msec system
; cpu time (gc) 0 msec user, 0 msec system
; cpu time (total) 90 msec user, 0 msec system
; real time 91 msec
; space allocation:
; 176 cons cells, 154,744 other bytes, 0 static bytes
("indeterminations" "intermediations" "determinations" "antimodernist"
"terminations" "nitrosamine" "antinomies" "nominates" "mentions"
"tension" ...)
cl-user(5):
This is probably on the same order as you are intending. How much are
you consing, anyway?
It is nonetheless interesting how many idioms seem to be shared by
the old version, the new version, and, most entertainingly, _your
patches_.
In particular, (sort something #'char<) seems to be a rather crucial
factor in it :-).
On the other hand, my present irritation is that reading in the
171K-odd entries seems to take a fair bit more time than y'all are
describing. That's pretty much to be expected, with CLISP; a little
disappointing that it's not faster than it is with CMU/CL...
--
(concatenate 'string "cbbrowne" "@canada.com")
http://www3.sympatico.ca/cbbrowne/spreadsheets.html
"Now they show you how detergents take out bloodstains, a pretty
violent image there. I think if you've got a T-shirt with a bloodstain
all over it, maybe laundry isn't your biggest problem. Maybe you
should get rid of the body before you do the wash." --Jerry Seinfeld
[...]
> This string-sorter used the same technique we use on simple-arrays
> of element-type t, and is obviously very fast. However, I'm not
> sure I would actually want to make this string-sorter part of our
> product; there aren't many times that users actually want to sort
> characters within a string - this kind of sorting mostly seems apropos
> only to this particular problem. However, the specialized sorter
> could be coded and run instead of using SORT.
How complicated would it be for a CLA end-user to pull off a
hack like that? I saw the exact same problem when writing
up a Perl solution:
while (<>) {
    chomp;
    my $key = join '', sort m/./g;   # anagram key: sort the word's letters
    $word{$key} ||= $_;              # keep one representative word per key
}
That's about as fast as I could load the file into an appropriate
Perl hash, but that loop's performance swamps everything else,
taking up ~35 of the ~45 seconds it takes to generate a solution
on my box*.
But a simple Inline::C hack of the bottleneck here is trivial:
+ use Inline 'C' => <<'';
+ static int cpr(void *s, void *t) {return *(char *)s - *(char *)t;}
+ void csort( char *s ) { qsort(s, strlen(s), sizeof(char), cpr); }
  while (<>) {
      chomp;
-     my $key = join '', sort m/./g;
+     csort(my $key = $_);
      $word{$key} ||= $_;
  }
This 4-line change cuts the runtime by more than half, and also evens
the labor division between setting up the hash and searching for the
longest add-a-gram (now ~9 sec each for me). It's roughly identical in
overall performance (both in run time and memory footprint) to the
optimized nine 9's solution I came up with using perl, which of course
needed lots of inlined C code. Also, by dropping perl altogether and
using ~100 lines of straight C (and using GNU libavl for the data tree),
I reduced my nine 9's runtime to 3-4 seconds.
[*] - I'm running linux 2.4 on an AMD K6 225 w/ ~60ns EDO RAM
--
Joe Schaefer
> On the other hand, my present irritation is that reading in the
> 171K-odd entries seems to take a fair bit more time than y'all are
> describing. That's pretty much to be expected, with CLISP; a little
> disappointing that it's not faster than it is with CMU/CL...
Ironically, it was CLISP's I/O speed (they weren't using Gray streams
at the time), and our failure to make our Gray streams and printer
as fast as CLISP's, that lost us the potential of the Viaweb contract to
CLISP and subsequently caused us to invent a new stream substrate.
Apparently, our new simple-stream substrate is now fast enough ...
> Duane Rettig <du...@franz.com> writes:
>
> [...]
>
> > This string-sorter used the same technique we use on simple-arrays
> > of element-type t, and is obviously very fast. However, I'm not
> > sure I would actually want to make this string-sorter part of our
> > product; there aren't many times that users actually want to sort
> > characters within a string - this kind of sorting mostly seems apropos
> > only to this particular problem. However, the specialized sorter
> > could be coded and run instead of using SORT.
>
> How complicated would it be for a CLA end-user to pull off a
> hack like that? I saw the exact same problem when writing
> up a Perl solution:
CLA? You mean CL? (My best guess is that you mean CL-ANSI, but that
is redundant, since CL implies ANSI already.)
I assume you are asking if I had to resort to non-CL code in order to
get the speed. And the answer is "no".
The algorithm is simply a merge sort, using for the temporary workspace
a stack-allocated array of 1024 characters for problems which fit, or
a heap-allocated array if the problem ever gets larger than 1K (the
simple-vector sorter chooses between several strategies and stack-allocated
sizes varying from 1k elements to 1m elements, backed by a heap-allocation
fallback, or an optional but undocumented pre-allocated workspace
supplied by the caller; I simply pared this all down to a
single trivial case for strings).
The only non-portable hack I did was to pun the string to an array
of (unsigned-byte 8) elements; this would only work for our 8-bit
lisps, and one would have to use (unsigned-byte 16) elements for
our 16-bit-character lisps. The punning I did was based on the fact
that our implementation defines (char> x y) as
(> (char-code x) (char-code y)), and simple-strings store the char-code
into each element of the string. Therefore
(let ((s string))
  (declare (type simple-string s))
  (char> (schar s i) (schar s j)))
is equivalent to this in an 8-bit lisp (and if you are willing to lie
to the compiler):
(let ((v string))
  (declare (type (simple-array (unsigned-byte 8) (*)) v))
  (> (aref v i) (aref v j)))
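Stripped of the internals, the shape of the strategy is something
like this; a sketch with a portable bottom-up merge, not our actual
code:

(defun merge-sort-string (s tmp)
  ;; bottom-up merge sort of S using workspace TMP (at least as long)
  (declare (simple-string s tmp))
  (let ((n (length s)))
    (do ((width 1 (* 2 width)))
        ((>= width n) s)
      (do ((lo 0 (+ lo (* 2 width))))
          ((>= lo n))
        (let* ((mid (min (+ lo width) n))
               (hi (min (+ lo (* 2 width)) n))
               (i lo) (j mid) (k lo))
          ;; merge s[lo,mid) and s[mid,hi) into tmp[lo,hi)
          (loop while (and (< i mid) (< j hi))
                do (if (char<= (schar s i) (schar s j))
                       (progn (setf (schar tmp k) (schar s i)) (incf i))
                       (progn (setf (schar tmp k) (schar s j)) (incf j)))
                   (incf k))
          (loop while (< i mid)
                do (setf (schar tmp k) (schar s i)) (incf i) (incf k))
          (loop while (< j hi)
                do (setf (schar tmp k) (schar s j)) (incf j) (incf k))
          (replace s tmp :start1 lo :end1 hi :start2 lo))))))

(defun sort-string (s)
  ;; the workspace strategy: stack-allocate the scratch array when the
  ;; input fits, fall back to the heap when it doesn't
  (declare (simple-string s))
  (if (<= (length s) 1024)
      (let ((tmp (make-string 1024)))
        (declare (dynamic-extent tmp))
        (merge-sort-string s tmp))
      (merge-sort-string s (make-string (length s)))))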
> I assume you are asking if I had to resort to non-CL code in order to
> get the speed. And the answer is "no".
[..nice description...]
That's what I was really asking about- thanks!
--
Joe Schaefer
>>>>> "CB" == Christopher Browne <cbbr...@acm.org> writes:
[...]
CB> On the other hand, my present irritation is that reading in
CB> the 171K-odd entries seems to take a fair bit more time than
CB> y'all are describing. That's pretty much to be expected, with
CB> CLISP; a little disappointing that it's not faster than it is
CB> with CMU/CL...
AFAIK there have been recent changes to CMUCL read-line, and Paul Foley
has implemented Franz's simple-streams for it. I am using PvE's Debian
package 3.0.8 18c+ 31 December 2001 build 3030, and I am pretty sure
they are not in there. But I figured out how to get some speedup
in string reading a while back w/o going deep into syscalls and other
non-ANSI features.
I wrapped all this in a with-fast-string-reader macro, callable as
follows:
(defmacro with-fast-string-reader
    ((stream                      ; the stream to read from
      reader                      ; what you pass here will be bound to a fast
                                  ; reader function that takes no arguments
                                  ; and returns a string or nil
      eof-p                       ; if the stream is at eof the var you pass
                                  ; here will be bound to non-nil
      &key
      (avg-size 80)               ; average size of the strings you'll read
      (buffer-size 4096)          ; what size to use for eventual read syscalls
      (max-string-size 100000))   ; max size of string you'll read
     &body body)
I use versions of this macro that do fancier things, like processing the
string in place and returning that result, etc., by slightly modifying it
and passing it lambdas instead of a reader symbol to be bound. You can
optimize this further by reading things as (byte 8) etc., but that time is
probably better spent improving the CMUCL compiler to produce better code.
Here's a comparison on my P-III 933 with the file off NFS but
in OS buffer cache.
;; ord is with regular read-line, wf is with the fast read-line macro
;; eyeballing the results over multiple runs I get about 5x speedup
;; or so my notes say anyway!
* (time (with-open-file (stream "/home/mucit/WORD.LST")
(setf gork1 (ord stream))))
Compiling LAMBDA NIL:
Compiling Top-Level Form:
Evaluation took:
0.9 seconds of real time
0.69 seconds of user run time
0.17 seconds of system run time
[Run times include 0.51 seconds GC run time]
0 page faults and
18040320 bytes consed.
("zyzzyvas" "zyzzyva" "zymurgy" "zymurgies" "zymotic" ...)
* (time (with-open-file (stream "/home/mucit/WORD.LST")
(setf gork2 (wf stream))))
Compiling LAMBDA NIL:
Compiling Top-Level Form:
Evaluation took:
0.11 seconds of real time
0.09 seconds of user run time
0.01 seconds of system run time
0 page faults and
5738200 bytes consed.
("zyzzyvas" "zyzzyva" "zymurgy" "zymurgies" "zymotic" ...)
*
Here's the code and test harness. The only CMUCL extension I am using
is lisp::shrink-vector, I think.
(declaim (optimize (speed 3) (debug 0) (safety 0) (space 0) (compilation-speed 0)))
(defmacro with-fast-string-reader
    ((stream reader eof-p
      &key (avg-size 80) (buffer-size 4096) (max-string-size 100000))
     &body body)
  (let ((start (gensym))          ; next unread char index of the internal buffer
        (end (gensym))            ; last valid char index of the internal buffer
        (current (gensym))        ; current character
        (dim-limit (gensym))      ; current limit of the output buffer
        (fill-ptr (gensym))       ; fill pointer of the output buffer
        (pending-buffer (gensym)) ; the output buffer itself
        (buffer (gensym))         ; internal buffer
        (sb-size (gensym))        ; output buffer size
        (mb-size (gensym))        ; internal buffer size
        (push-to-str (gensym))    ; these two are internal procs, gensymed
        (get-char-proc (gensym))) ; to not confuse the expansion env
    `(let* ((,start 0)
            (,end 0)
            (,current #\X)        ;; bogus char to help type inference
            (,sb-size ,avg-size)
            (,dim-limit ,sb-size)
            (,fill-ptr 0)
            (,eof-p nil)
            (,pending-buffer (make-string ,sb-size))
            (,mb-size ,buffer-size))
       (declare (type (integer 0 ,buffer-size) ,start ,end)
                (type (integer 0 ,max-string-size) ,sb-size ,dim-limit ,fill-ptr)
                (simple-string ,pending-buffer)
                (inline make-string read-sequence schar subseq
                        lisp::shrink-vector))
       (let ((,buffer (make-string ,mb-size)))
         (labels ((,get-char-proc ()
                    (when (= ,start ,end)
                      (setf ,start 0)
                      (setf ,end (read-sequence ,buffer ,stream))
                      (when (zerop ,end)
                        (setf ,eof-p t)
                        (return-from ,get-char-proc nil)))
                    (setf ,current (schar ,buffer ,start))
                    (incf ,start))
                  (,push-to-str ()
                    (when (= ,fill-ptr ,dim-limit)
                      (let ((new-buffer
                              (make-string
                               (setf ,dim-limit (+ ,fill-ptr ,sb-size)))))
                        (setf (subseq new-buffer 0 ,fill-ptr)
                              ,pending-buffer)
                        (setf ,pending-buffer new-buffer)))
                    (setf (schar ,pending-buffer ,fill-ptr) ,current)
                    (incf ,fill-ptr))
                  (,reader ()
                    (loop while (and (,get-char-proc)
                                     (not (char= ,current #\newline)))
                          do (,push-to-str))
                    (lisp::shrink-vector ,pending-buffer ,fill-ptr)
                    (prog1 ,pending-buffer
                      (setf ,fill-ptr 0)
                      (setf ,dim-limit ,sb-size)
                      (setf ,pending-buffer (make-string ,sb-size)))))
           (declare (inline ,reader ,push-to-str ,get-char-proc))
           ,@body)))))
(defun wf (stream &aux lines)
  (with-fast-string-reader (stream reader eof :avg-size 15 :buffer-size 8192)
    (loop
      (let ((str (reader)))
        (if (not eof)
            (push str lines)
            (return))))
    lines))

(defun ord (stream &aux lines)
  (do ((l (read-line stream) (read-line stream nil 'eof)))
      ((eq l 'eof) "Reached end of file.")
    (push l lines))
  lines)
I bounced the code at "clisp -c", and the only thing it gripes about
is shrink-vector.
That's a pretty neat macro!
--
(reverse (concatenate 'string "gro.gultn@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/sap.html
"Who is General Failure and why is he reading my hard disk?"
-- <lei...@inf.fu-berlin.de>, Felix von Leitner
CB> I bounced the code at "clisp -c", and the only thing it gripes
CB> about is shrink-vector. [...]
Yeah, actually it is possible to write this macro w/o shrink-vector,
using adjust-array instead. Unfortunately it would have
been too much work at the time to write CMUCL compiler optimizers for
the adjust-array case. The point is that there is nothing in the
language itself that prevents things from being fast.
Adjusting the string was mostly a non-issue, however, because I don't
use this macro to return lines. At the point where shrink-vector is
called, you have a buffer and a pointer to the end of the delimited
content; you can process that buffer right there to turn the string
data into your internal representation and not generate any garbage at
all. Such a modification would change the interface: you'd pass a lambda
or a function name that takes a buffer + a fill-pointer and a
delimiter. The macro would then expand them inline and have the SSC
(CMUCL) optimize the details. Something like split-string, or some
META-style parser, could fit in there. The mess lives in the macro;
the application sees a clean interface (a single magic function call
that returns the right data structure) w/o the performance penalty.
You cannot do stuff like this in C -- you'd have to expose the bloody
mess. In fairness, of course, if you take the bloody mess as a given
you don't need to get this clever in C to get performance either.
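Here's a simplified, portable sketch of that interface, written as a
plain function rather than the macro (hypothetical name; the real
thing would inline the consumer and keep all the type declarations):

(defun map-buffer-lines (stream fn &key (buffer-size 8192))
  ;; call FN with (BUFFER START END) for each newline-delimited line;
  ;; FN must not retain BUFFER.  Assumes no line exceeds BUFFER-SIZE.
  (let ((buf (make-string buffer-size))
        (carry 0))
    (loop
      (let ((end (read-sequence buf stream :start carry)))
        (when (= end carry)             ; no new data
          (when (plusp carry) (funcall fn buf 0 carry))
          (return))
        (let ((start 0))
          (loop for nl = (position #\Newline buf :start start :end end)
                while nl
                do (funcall fn buf start nl)
                   (setf start (1+ nl)))
          ;; keep the partial trailing line at the buffer's front
          (replace buf buf :start2 start :end2 end)
          (setf carry (- end start)))))))

The consumer parses buf[start,end) in place and builds its own
representation directly, so no per-line string is ever consed.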
cheers,
BM
I have put all the solutions into a tar file now, together with timings
and a Lisp port I made of Doug's OCaml version. It is much slower
than the OCaml version, about 19 vs. 0.88 seconds... but it doesn't
contain any declarations and I didn't try to optimize it at all.
Everything is available here:
http://www.cartan.de/addagram.tar.gz
Somebody named Yuan made a very fast and simple solution. Thanks to
everybody!
Regards,
--
Nils Goesche
Ask not for whom the <CONTROL-G> tolls.
PGP key ID 0xC66D6E6F
s:{x@<x}'w:0:`word.txt / read and sort each word: 1400 ms
a:|2_ s(1_'=i,n)-#i:1+!|/n:#:'s / group words by length: 130 ms
d:{:[#x;y@&y _lin,/x@\:h _di/:h:!#*x;x]} / back transitions
#{~#*r::|d\x}(1_)/a / brute force: 370 ms
e:{y@*&y _lin x:{x@<x}'x,/:_ci 97+!26} / forward transitions
w s?/:(**r)e\1_ r / e.g. first root: 0 ms
timings (in milliseconds) are for my 400 mhz pentium.
the first line reads the wordlist from disk and sorts each word.
the second line, given the sorted wordlist, groups words into same-length
sublists. e.g. a[0] is the list of all 28-length words, a[1] all 27-length
words, &c.
the third line defines the function for computing back transitions: given
lists x and y, x containing words of length n, y containing words of length
n-1, return the sublist of y which has continuations in x.
the fourth line computes all the add-a-grams, and prints the length of the
longest found.
the fifth line defines the function for computing forward transitions: given
x and y, y containing words of length n, x containing words of length n-1,
return the sublist of y which has continuations of words in x.
the sixth line finds and prints a sample longest addagram; e.g. the first:
14
("aim"
"aims"
"amies"
"amines"
"anomies"
"amniotes"
"nominates"
"antimonies"
"inseminator"
"terminations"
"antimodernist"
"determinations"
"intermediations"
"indeterminations")
note that if we were concerned only to find the length of longest addagram,
the code is even shorter:
w:{x@<x}'0:`word.txt / read and sort
w@:(1_'=i,n)-#i:1+!|/n:#:'w / group
w:|2_ w / drop and reverse
d:{:[#x;y@&y _lin,/x@\:h _di/:h:!#*x;x]} / back transitions
#(~#d/)(1_)/w / brute force
k datatypes are atoms, lists, and dictionaries. there are 40 primitives,
20 monadic and 20 dyadic, each denoted by a single ascii symbol. syntax
is simplified iverson notation (APL/J). e.g. line 2 above is read:
w is w of (1 drop each group i join n)-i
where i is 1 + enumerate max over n
where n is count each w
the k interpreter can be downloaded from www.kx.com. this is quick, since
the interpreter is only about 170k, including native gui, communications, and
file i/o.
enjoy!
sa
> several k programmers passed this one around yesterday.
[snip]
> w:{x@<x}'0:`word.txt / read and sort
> w@:(1_'=i,n)-#i:1+!|/n:#:'w / group
> w:|2_ w / drop and reverse
> d:{:[#x;y@&y _lin,/x@\:h _di/:h:!#*x;x]} / back transitions
> #(~#d/)(1_)/w / brute force
Good Lord! I didn't think it was possible to design a language that
looked uglier than Perl, but there it is. Blech!
E.
> note that if we were concerned only to find the length of longest addagram,
> the code is even shorter:
>
> w:{x@<x}'0:`word.txt / read and sort
> w@:(1_'=i,n)-#i:1+!|/n:#:'w / group
> w:|2_ w / drop and reverse
> d:{:[#x;y@&y _lin,/x@\:h _di/:h:!#*x;x]} / back transitions
> #(~#d/)(1_)/w / brute force
>
> k datatypes are atoms, lists, and dictionaries. there are 40 primitives,
> 20 monadic and 20 dyadic, each denoted by a single ascii symbol. syntax
> is simplified iverson notation (APL/J). e.g. line 2 above is read:
Greenspun's Eleventh Rule of Programming:
Every sufficiently unreadable programming language contains a
reimplementation of APL and/or INTERCAL.
Cheers
--
Marco Antoniotti ========================================================
NYU Courant Bioinformatics Group tel. +1 - 212 - 998 3488
719 Broadway 12th Floor fax +1 - 212 - 995 4122
New York, NY 10003, USA http://bioinformatics.cat.nyu.edu
"Hello New York! We'll do what we can!"
Bill Murray in `Ghostbusters'.
Erann Gat wrote:
>
> > w:{x@<x}'0:`word.txt / read and sort
> > w@:(1_'=i,n)-#i:1+!|/n:#:'w / group
> > w:|2_ w / drop and reverse
> > d:{:[#x;y@&y _lin,/x@\:h _di/:h:!#*x;x]} / back transitions
> > #(~#d/)(1_)/w / brute force
>
> Good Lord! I didn't think it was possible to design a language that
> looked uglier than Perl, but there it is. Blech!
I had the honor of working on the same floor with the K developers for
quite a while and got to see some of the documentation. Sadly, I was a
consultant and only in-housers got tapped for K training.
Aside from speed, they actually have brevity as a language design goal.
I'll never forget the one-page spreadsheet application.
My language design preference goes in other directions than source
brevity, but I can't help getting a kick out of these guys and their
enthusiasm for K.
Imagine, a language which could not hold an obfuscated code contest
even if it wanted to.
:)
--
kenny tilton
clinisys, inc
---------------------------------------------------------------
"We have a pond and a pool. The pond would be good for you."
- Ty to Carl, Caddy Shack
[K]
> Aside from speed, they actually have brevity as a language design goal.
> I'll never forget the one-page spreadsheet application.
Didn't one of those win the last international obfuscated C
competition?
Cheers,
M.
--
Gullible editorial staff continues to post links to any and all
articles that vaguely criticize Linux in any way.
-- Reason #4 for quitting slashdot today, from
http://www.cs.washington.edu/homes/klee/misc/slashdot.html
> I'll never forget the one-page spreadsheet application.
>
ah, but do you remember the two-line version:
S..t:"S[.;`f]:.[`D;(;);{. y};S[]][]" / trigger: re-evaluate
S:D:.+(`a`b`c`d;4 5#,"");`show$`S / initialize and show
(cf. the 800-line java applet at http://java.sun.com/applets/SpreadSheet)
as you say, we like short programs which go fast. :)
stevan apter wrote:
>
> "Kenny Tilton" <kti...@nyc.rr.com> wrote in message news:3C6184F3...@nyc.rr.com...
>
> > I'll never forget the one-page spreadsheet application.
> >
>
> ah, but do you remember the two-line version:
>
> S..t:"S[.;`f]:.[`D;(;);{. y};S[]][]" / trigger: re-evaluate
> S:D:.+(`a`b`c`d;4 5#,"");`show$`S / initialize and show
>
Not sure I do, but I am going back about five years, mebbe more. Y'all
got a web page? I went looking but google just wanted to tell me about
K-5 schools. That and two specialty languages called K.
Try,
http://www.kx.com/download/documentation.htm
--------
Geoff
I finished V1 of mine Saturday night/Sunday morning.
Approx 10 hours of work.
On a 1 GHz laptop running Windows 2000 Professional, MSVC++ V6.0,
default maximum-speed optimizations.
Largest addagram 16
an
nan
anon
anion
anions
anoints
enations
antinoise
antimonies
inseminator
terminations
antimodernist
determinations
intermediations
indeterminations
[Addagram] Time: [1.71114]
That's 1.71114 seconds
Start benchmark timer
Open and load file
Format ready for navigating
Find path
End benchmark
Log the results.
I could have guessed that there was an NG discussion about this. As I
thought, I found this thread after a search for addagram. I wonder if
ITA are flooded with responses?
If I feel inclined/have the time I will get a *X version working.
Should be interesting, as the machine I would use for development
has 1/5th the raw processing power.
Actually... a web-accessible Perl version? Or how about a
JavaScript version?
M.