The sources and data files to reproduce the problem are available from
http://euridice.tue.nl/~ptutelae/TeX/patronen/performance.html
Hopefully somebody can provide help.
--Piet
e-mail: __o Piet Tutelaers
P.T.H.T...@tue.nl _`\<,_ ICTS / Room LG 1.82
phone: +31 (0)40 2474541 (_)/ (_) Eindhoven University of Technology
fax: +31 (0)40 2434438 Save nature P.O. Box 513, 5600 MB Eindhoven, NL
I'll reply directly if I find out more or need more info.
Thanks,
-Jeff
--
Jeff Sullivan | Compaq Computer Corp. | mailto:j...@zk3.dec.com
Compaq C/C++ | Nashua, NH 03062-2698 | Jeff.S...@compaq.com
What is happening is that the stack is being corrupted by a
call to the null_terminate routine. A later call to exit()
ends up in a random loop doing "free"s of bad addresses.
This loop is what is consuming most of the time in the bad
case.
I used atom third degree (see man atom, man third) to diagnose
the memory overwrite. This is what it told me when I ran it:
strpascal.c: 25: reading invalid heap at 0x140460000
null_terminate patgen, strpascal.c, line 25
make_c_string patgen, strpascal.c, line 15
xfopen_pas patgen, xfopen-pas.c, line 15
dodictionary patgen, patgen2.c, line 1398
main_body patgen, patgen2.c, line 1698
main patgen, main.c, line 30
__start patgen
When I ran this in the debugger, I found that there were many
calls to make_c_string and did not quickly find where the problem
was occurring.
However, when I made a simple "debugging" change to null_terminate,
I saw this (bad) case:
null_terminate: s=pattmp.5, len=8
null_terminate: i=528
In a "good" case, I expect that i would be len-1. It
certainly is not in this case.
The debugging change I made to null_terminate was this:
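int i = 0;
/* strlen returns size_t; cast so the argument matches %d */
printf("null_terminate: s=%s, len=%d\n", s, (int) strlen(s));
/* count how far the scan goes before it hits a space */
while (*s != ' ')
{ s++; i++; }
printf("null_terminate: i=%d\n", i);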
In the (bad) case of pattmp.5, it looks like the string does
NOT have a trailing space, so the null_terminate function will
write the NUL terminator at some unknown location. It may
sometimes work, but it looks like a source code bug to me.
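A bounded scan would avoid running past the end of the buffer. The
following is only a sketch of that idea, not the actual strpascal.c
code: the name null_terminate_bounded, its max_len parameter and its
return convention are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical bounded variant of null_terminate (name, signature and
   return convention are assumptions, not the web2c source).
   Scans at most max_len characters of s.  If a space is found it is
   replaced by '\0'.  If a '\0' is found first, the string is already
   terminated and is left alone.
   Returns the index of the terminator, or -1 if neither a space nor a
   '\0' occurs within max_len characters (nothing is written then). */
long
null_terminate_bounded (char *s, size_t max_len)
{
  size_t i;

  for (i = 0; i < max_len; i++)
    {
      if (s[i] == '\0')
        return (long) i;   /* already a valid C string */
      if (s[i] == ' ')
        {
          s[i] = '\0';     /* replace the trailing space */
          return (long) i;
        }
    }
  return -1;               /* no terminator in range: do not write */
}
```

With a buffer holding "pattmp.5 " this returns 8 and leaves "pattmp.5";
with "pattmp.5" and no trailing space it also returns 8 without writing
anything, instead of scanning on into unrelated memory.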
Let me know if this helps.
I have found a solution for the performance problem. Patgen now runs
in 10 minutes on a 600MHz Digital UNIX system. On our four year old
OSF/Alpha system (32 MBytes memory) it still takes 33 minutes. My
simple Pentium II based Linux system (64 MBytes) outperforms these
alpha's with 8 minutes and 30 seconds. For a detailed description of
the problem and the solution see my HTML page
http://euridice.tue.nl/~ptutelae/TeX/patgen
Hope the solution will find its way into the web2c sources of patgen.
> alpha's with 8 minutes and 30 seconds. For a detailed description of
> the problem and the solution see my HTML page
> http://euridice.tue.nl/~ptutelae/TeX/patgen
No mention of a bad null_terminate routine?
Donald Arseneau as...@triumf.ca
>Piet Tutelaers <P.T.H.T...@tue.nl> writes:
>> alpha's with 8 minutes and 30 seconds. For a detailed description of
>> the problem and the solution see my HTML page
>> http://euridice.tue.nl/~ptutelae/TeX/patgen
>No mention of a bad null_terminate routine?
I had some discussion with Jeff Sullivan about this problem. The
problem occurred after patgen had finished with the patterns and wanted
to write the resulting pattmp.N file (N depends on the desired
hyphenation level). For some strange reason patgen, only on a Digital
UNIX system, expects a space after the filename "pattmp.N". We could
solve this easily in the patgen2.c version but the bad performance
was still there. When I switched over to the patgen sources from the
latest TeXlive CD I did not see the null_terminate problem anymore.
That is the reason why I forgot about it.
>Donald Arseneau as...@triumf.ca
--Piet