htdig: htmerge now running for 4500 minutes!!


Alister van Tonder (vtondera@fnmail.com)
Mon, 25 Jan 1999 23:48:03 +0200


My htmerge job often runs for several DAYS!!
Even when I kill the job (after several days) it has produced a working
searchable database!

This particular job was started at 20h01 on Jan 22nd. The files below
were created 10 minutes later. In the meantime htmerge keeps running as
a job, usually taking all available CPU resources, until I eventually
have to kill it.

A directory listing of the ~/htdig/lib/db directory is as follows:

drwxr-xr-x 2 root root 11264 Jan 24 07:26 .
drwxr-xr-x 4 root root 1024 Jan 1 10:19 ..
-rw-r--r-- 1 root root 33153024 Jan 22 20:10 db.docdb
-rw-rw-r-- 1 root root 740352 Jan 1 11:04 db.docs.index
-rw-rw-r-- 1 root root 2430976 Jan 2 01:35 db.metaphone.db
-rw-rw-r-- 1 root root 1686528 Jan 2 01:35 db.soundex.db
-rw-r--r-- 1 root root 47838678 Jan 22 20:10 db.wordlist
-rw-r--r-- 1 root root 12288 Jan 22 20:12 db.wordlist.new
-rw-rw-r-- 1 root root 69552128 Jan 12 01:02 db.words.db
-rw------- 1 root root 8388368 Jan 22 20:11 sort0795500092
-rw------- 1 root root 8388371 Jan 22 20:11 sort0795500093
-rw------- 1 root root 8388365 Jan 22 20:11 sort0795500094
-rw------- 1 root root 8388309 Jan 22 20:11 sort0795500095
-rw------- 1 root root 8388340 Jan 22 20:11 sort0795500096
-rw------- 1 root root 5896925 Jan 22 20:12 sort0795500097

This job has run 4500 minutes and is causing a heavy (unnecessary) load on
the system!
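
Until the cause is found, the manual kill could at least be automated.
The following is only a sketch under stated assumptions: run_limited is
a hypothetical helper (it is not part of ht://Dig), and any time limit
passed to it is an arbitrary choice. It starts a command at low priority
with nice and sends it a TERM signal if it is still running after the
given number of seconds.

```shell
#!/bin/sh
# run_limited SECONDS COMMAND [ARGS...]
# Hypothetical helper, not part of ht://Dig: run COMMAND at the lowest
# priority and terminate it if it is still running after SECONDS.
run_limited() {
    limit=$1
    shift

    # Start the command in the background at low priority.
    # nice execs the command, so $! is the command's own PID.
    nice -n 19 "$@" &
    cmd_pid=$!

    # Watchdog: sleep for the limit, then terminate the command if it
    # is still alive. Output is discarded so nothing leaks to the caller.
    (
        sleep "$limit"
        kill "$cmd_pid" 2>/dev/null
    ) >/dev/null 2>&1 &
    dog_pid=$!

    # Wait for the command; remember its exit status
    # (non-zero if the watchdog had to kill it).
    status=0
    wait "$cmd_pid" || status=$?

    # The command is gone, so the watchdog is no longer needed.
    kill "$dog_pid" 2>/dev/null || true
    wait "$dog_pid" 2>/dev/null || true

    return "$status"
}
```

In rundig the htmerge line would then read, for example with an assumed
one-hour limit:

    run_limited 3600 /usr/sbin/htmerge $verbose $stats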

The output of "top" is as follows:

 11:38pm up 7 days, 1:45, 1 user, load average: 1.00, 1.00, 1.00
46 processes: 43 sleeping, 3 running, 0 zombie, 0 stopped
CPU states: 99.8% user, 0.1% system, 0.0% nice, 0.1% idle
Mem: 30844K av, 29396K used, 1448K free, 22456K shrd, 3704K buff
Swap: 92732K av, 828K used, 91904K free 17960K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMA
 7954 root      16   0   640  640   472 R       0 99.0  2.0  4509m htmerg
13970 root       2   0   588  588   456 R       0  0.9  1.9   0:00 top
    1 root       0   0   392  348   328 S       0  0.0  1.1   0:02 init
    2 root       0   0     0    0     0 SW      0  0.0  0.0   0:03 kflush
    3 root     -12 -12     0    0     0 SW<     0  0.0  0.0   0:00 kswapd
10116 nobody     0   0   872  872   764 S       0  0.0  2.8   0:00 httpd
  304 root       0   0   304  260   248 S       0  0.0  0.8   0:00 minget
10117 nobody     0   0   864  864   764 S       0  0.0  2.8   0:00 httpd
   19 root       0   0   352  332   300 S       0  0.0  1.0   0:00 kernel
  159 root       0   0   428  416   356 S       0  0.0  1.3   0:02 syslog
  168 root       0   0   532  492   324 S       0  0.0  1.5   0:00 klogd
  179 daemon     0   0   388  368   312 S       0  0.0  1.1   0:00 atd
  190 root       0   0   456  448   412 S       0  0.0  1.4   0:00 crond
  201 bin        0   0   380  360   304 S       0  0.0  1.1   0:01 portma
  212 root       0   0   736  720   496 S       0  0.0  2.3   2:09 snmpd
  224 root       0   0   384  352   316 S       0  0.0  1.1   0:00 inetd
10109 nobody     0   0   868  868   760 S       0  0.0  2.8   0:00 httpd

Is a configuration error causing this problem?

My rundig is virtually standard:

# ############### Start of rundig script ####################
#! /bin/sh

#
# rundig
#
# $Id: rundig,v 1.2 1998/06/22 04:32:23 turtle Exp $
#
# This is a sample script to create a search database for ht://Dig.
#
if [ "$1" = "-v" ]; then
    verbose=-v
fi
if [ "$2" = "-s" ]; then
    stats=-s
fi

#
# Set the TMPDIR variable if you want htmerge to put files in a location
# other than the default. This is important if you do not have enough
# disk space for the big sort that htmerge runs. Also, be aware that
# on some systems, /tmp is a memory mapped filesystem that takes away
# from virtual memory.
#
TMPDIR=/var/lib/htdig/db
export TMPDIR

/usr/sbin/htdig -i $verbose $stats
/usr/sbin/htmerge $verbose $stats
/usr/sbin/htnotify $verbose

#
# Only create the endings database if it doesn't already exist.
# This database is static, so even if pages change, this database will
# not need to be rebuilt.
#
FUZZYALGS="soundex metaphone"
if [ ! -f /var/lib/htdig/common/word2root.db ]
then
    FUZZYALGS="$FUZZYALGS endings"
fi

if [ ! -f /var/lib/htdig/common/synonyms.db ]
then
    FUZZYALGS="$FUZZYALGS synonyms"
fi
#
# Alister's comment!!
# Do not run htfuzzy for the time being!!
#
# /usr/sbin/htfuzzy $verbose $FUZZYALGS

# $######### End of Script ############################
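
The TMPDIR comment in the script warns that the big sort needs enough
disk space. In the directory listing above, the six sort files add up to
47838678 bytes, the same size as db.wordlist, so the temporary directory
presumably needs at least that much room. As a rough sketch, the free
space could be checked before htmerge starts. Both free_kb and has_room
are hypothetical helper names (not part of ht://Dig), and the check
assumes the filesystem name printed by df contains no spaces.

```shell
#!/bin/sh
# free_kb DIR
# Hypothetical helper: print the free space, in kilobytes, on the
# filesystem that holds DIR.
free_kb() {
    # -P forces the one-line-per-filesystem POSIX format and -k reports
    # 1024-byte units; field 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_room DIR NEEDED_KB
# Hypothetical helper: succeed only if DIR has at least NEEDED_KB free.
has_room() {
    [ "$(free_kb "$1")" -ge "$2" ]
}
```

A possible use in rundig, just before the htmerge line, with 100000 KB
as an arbitrary assumed threshold:

    has_room "$TMPDIR" 100000 || { echo "TMPDIR is too full" >&2; exit 1; }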

Any help will be appreciated!


