
htdig: htmerge now running for 4500 minutes!!


Alister van Tonder

Jan 26, 1999, 3:00:00 AM


My htmerge job often runs for several DAYS!!
Even when I kill the job (after several days), it has already produced a
working, searchable database!

This particular job was started at 20h01 on Jan 22nd. The files below
were created 10 minutes later. In the meantime, htmerge keeps running,
usually taking all available CPU resources, until I eventually have to
kill it.

A directory listing of the ~/htdig/lib/db directory is as follows:

drwxr-xr-x 2 root root 11264 Jan 24 07:26 .
drwxr-xr-x 4 root root 1024 Jan 1 10:19 ..
-rw-r--r-- 1 root root 33153024 Jan 22 20:10 db.docdb
-rw-rw-r-- 1 root root 740352 Jan 1 11:04 db.docs.index
-rw-rw-r-- 1 root root 2430976 Jan 2 01:35 db.metaphone.db
-rw-rw-r-- 1 root root 1686528 Jan 2 01:35 db.soundex.db
-rw-r--r-- 1 root root 47838678 Jan 22 20:10 db.wordlist
-rw-r--r-- 1 root root 12288 Jan 22 20:12 db.wordlist.new
-rw-rw-r-- 1 root root 69552128 Jan 12 01:02 db.words.db
-rw------- 1 root root 8388368 Jan 22 20:11 sort0795500092
-rw------- 1 root root 8388371 Jan 22 20:11 sort0795500093
-rw------- 1 root root 8388365 Jan 22 20:11 sort0795500094
-rw------- 1 root root 8388309 Jan 22 20:11 sort0795500095
-rw------- 1 root root 8388340 Jan 22 20:11 sort0795500096
-rw------- 1 root root 5896925 Jan 22 20:12 sort0795500097
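
As a rough cross-check of the listing above, the sizes of the sort* temporary
files can be added up and compared with db.wordlist. This is only a sketch:
the byte counts are copied by hand from the listing, and the variable names
are illustrative.

```shell
#!/bin/sh
# Sizes in bytes and file names, copied by hand from the listing above.
listing='47838678 db.wordlist
8388368 sort0795500092
8388371 sort0795500093
8388365 sort0795500094
8388309 sort0795500095
8388340 sort0795500096
5896925 sort0795500097'

# Add up the sizes of all sort* temporary files.
sort_total=$(printf '%s\n' "$listing" |
    awk '$2 ~ /^sort/ { sum += $1 } END { printf "%d\n", sum }')

# Pick out the size of db.wordlist for comparison.
wordlist=$(printf '%s\n' "$listing" |
    awk '$2 == "db.wordlist" { print $1 }')

echo "sort* total: $sort_total bytes"
echo "db.wordlist: $wordlist bytes"

if [ "$sort_total" -eq "$wordlist" ]; then
    echo "sizes match"
else
    echo "sizes differ"
fi
```

The six sort* files add up to 47838678 bytes, the same size as db.wordlist.
One possible reading, and it is only an assumption, is that the sort had
already read its whole input by 20:12 and the time since then has gone into
a later stage.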


This job has run for 4500 minutes and is causing a heavy (unnecessary)
load on the system!
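
For scale, 4500 minutes can be converted to days and hours with plain shell
arithmetic. This is an illustrative sketch only; the variable names are not
part of ht://Dig.

```shell
#!/bin/sh
# Convert a run time in minutes to days, hours and minutes.
minutes=4500
days=$((minutes / 1440))           # 1440 minutes in a day
hours=$(((minutes % 1440) / 60))   # whole hours left after the days
mins=$((minutes % 60))             # minutes left after the hours

echo "$minutes minutes = ${days}d ${hours}h ${mins}m"
```

That works out to 3 days and 3 hours.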


The output of "top" is as follows:

 11:38pm  up 7 days,  1:45,  1 user,  load average: 1.00, 1.00, 1.00
46 processes: 43 sleeping, 3 running, 0 zombie, 0 stopped
CPU states: 99.8% user,  0.1% system,  0.0% nice,  0.1% idle
Mem:   30844K av,  29396K used,   1448K free,  22456K shrd,   3704K buff
Swap:  92732K av,    828K used,  91904K free                 17960K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMA
 7954 root      16   0   640  640   472 R       0 99.0  2.0  4509m htmerg
13970 root       2   0   588  588   456 R       0  0.9  1.9   0:00 top
    1 root       0   0   392  348   328 S       0  0.0  1.1   0:02 init
    2 root       0   0     0    0     0 SW      0  0.0  0.0   0:03 kflush
    3 root     -12 -12     0    0     0 SW<     0  0.0  0.0   0:00 kswapd
10116 nobody     0   0   872  872   764 S       0  0.0  2.8   0:00 httpd
  304 root       0   0   304  260   248 S       0  0.0  0.8   0:00 minget
10117 nobody     0   0   864  864   764 S       0  0.0  2.8   0:00 httpd
   19 root       0   0   352  332   300 S       0  0.0  1.0   0:00 kernel
  159 root       0   0   428  416   356 S       0  0.0  1.3   0:02 syslog
  168 root       0   0   532  492   324 S       0  0.0  1.5   0:00 klogd
  179 daemon     0   0   388  368   312 S       0  0.0  1.1   0:00 atd
  190 root       0   0   456  448   412 S       0  0.0  1.4   0:00 crond
  201 bin        0   0   380  360   304 S       0  0.0  1.1   0:01 portma
  212 root       0   0   736  720   496 S       0  0.0  2.3   2:09 snmpd
  224 root       0   0   384  352   316 S       0  0.0  1.1   0:00 inetd
10109 nobody     0   0   868  868   760 S       0  0.0  2.8   0:00 httpd

Is a configuration error causing this problem?

My rundig is virtually standard:

# ############### Start of rundig script ####################
#! /bin/sh

#
# rundig
#
# $Id: rundig,v 1.2 1998/06/22 04:32:23 turtle Exp $
#
# This is a sample script to create a search database for ht://Dig.
#
if [ "$1" = "-v" ]; then
    verbose=-v
fi
if [ "$2" = "-s" ]; then
    stats=-s
fi

#
# Set the TMPDIR variable if you want htmerge to put files in a location
# other than the default. This is important if you do not have enough
# disk space for the big sort that htmerge runs. Also, be aware that
# on some systems, /tmp is a memory mapped filesystem that takes away
# from virtual memory.
#
TMPDIR=/var/lib/htdig/db
export TMPDIR

/usr/sbin/htdig -i $verbose $stats
/usr/sbin/htmerge $verbose $stats
/usr/sbin/htnotify $verbose

#
# Only create the endings database if it doesn't already exist.
# This database is static, so even if pages change, this database will not
# need to be rebuilt.
#
FUZZYALGS="soundex metaphone"
if [ ! -f /var/lib/htdig/common/word2root.db ]
then
    FUZZYALGS="$FUZZYALGS endings"
fi

if [ ! -f /var/lib/htdig/common/synonyms.db ]
then
    FUZZYALGS="$FUZZYALGS synonyms"
fi
#
# Alister's comment!!
# Do not run htfuzzy for the time being!!
#
# /usr/sbin/htfuzzy $verbose $FUZZYALGS

# $######### End of Script ############################

Any help will be appreciated!


