Re: [htdig] stmAAA* files - ???


From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Mon Jun 19 2000 - 09:59:30 PDT


On Mon, 19 Jun 2000, Andreas Hudzieczek wrote:

> indexing a couple of times. Now I had to kill the process the last
> time because cron set it off too early (my blame), and a subsequent
> time this happens:

I'm a little unclear on what you killed. The rundig process itself?
Remember that if you kill the parent process of any script, anything it
was running at the time (e.g. htmerge or htnotify or whatever) will become
orphaned and will continue to run until it exits.
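
As a rough sketch of one way around the orphan problem: if the wrapper
script traps TERM, the shell lets the step that is currently running
finish and only then acts on the signal, so nothing is left running
behind its back. The names run_step and stop_requested are hypothetical
and are not part of the stock rundig script.

```sh
#!/bin/sh
# Hypothetical wrapper logic, not the stock rundig.
# With a trap set, the shell defers handling TERM until the foreground
# command it is waiting on has completed.
stop_requested=0
trap 'stop_requested=1' TERM

# Run one step, unless a stop was requested while an earlier step ran.
run_step() {
    if [ "$stop_requested" -eq 1 ]; then
        echo "stop requested, not starting: $*" >&2
        return 1
    fi
    "$@"
}

# Intended use (program names as in this thread, options omitted):
#   run_step htdig && run_step htmerge && run_step htnotify
```

This only covers a TERM sent to the script's own PID. A kill -9 cannot
be trapped, so the children would still be orphaned in that case.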

It's also not a great idea to kill htmerge in mid-run unless you're using
alternate working files. (Killing htdig is also not good, but I've gone
into that and how you probably want to use the -l flag in the 3.1.x
series.) The code has not been written with the assumption that it might
be killed. In the case of htdig, the -l flag does a good job of cleaning
up. There's no great way to clean up in htmerge in certain places. How do
you bail when you're in the middle of assembling the db.words file?

> In the database dir, there are lots of stmAAA* files hanging around,
> all created a few minutes later than the last successful (or not
> successful ?) call of rundig. In addition to that, the same dir also
> has a db.wordlist.new empty file (i.e. 0 bytes) in it, with a
> timestamp just a minute before (!)last call of rundig.

My guess is that the "strange" files are temporary files created by your
'sort' command. If you're using a largely stock rundig script, it sets
TMPDIR to point to the database directory. I would assume the
db.wordlist.new file is from your aborted call to rundig--it's created by
htmerge.
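
If the leftovers do come from sort, one option is to give it a scratch
directory of its own, so that anything left behind by an interrupted run
is easy to spot and remove. A minimal sketch, assuming a sort that
honors TMPDIR and a system that provides mktemp; the function name
sort_in_scratch is made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: sort file $1 into file $2, keeping sort's
# temporary files in a private scratch directory.
sort_in_scratch() {
    scratch=$(mktemp -d "${TMPDIR:-/tmp}/sortwork.XXXXXX") || return 1
    # TMPDIR is set for this one sort invocation only.
    TMPDIR=$scratch sort "$1" > "$2"
    status=$?
    # Normal exit: remove the scratch directory. If the script is killed
    # before this line, the leftovers stay confined to $scratch.
    rm -rf "$scratch"
    return $status
}
```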

> I do not use the alternate working files.

This is largely irrelevant. The .new file is created when htmerge is
rewriting your db.wordlist file. It needs to remove words from deleted
documents and sort them and so on. So it does this in the .new file and
then gets rid of the old one.
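
The general "write a .new file, then replace the old one" pattern looks
roughly like this in shell. This is only an illustration of the idea,
not what htmerge actually does internally, and rewrite_sorted is a
hypothetical name.

```sh
#!/bin/sh
# Hypothetical helper: rewrite file $1 in sorted order via a .new copy.
rewrite_sorted() {
    # The shell creates "$1.new" (0 bytes) before sort produces any
    # output, so an interrupted run can leave an empty .new file behind.
    sort "$1" > "$1.new" || return 1
    # Only replace the old file once the new one is complete.
    mv -f "$1.new" "$1"
}
```

The old file stays untouched until the final mv, which is why a stray
.new file on its own does not prove the original was damaged.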

From everything you've said, I think you didn't kill off everything when
you killed rundig (and thus the htnotify process) and your databases may
or may not be hosed. This is one reason I *do* use the alternate files. If
I need to kill a process for some reason, I can rebuild the alternate
versions without worrying that I don't have backups... It's also a reason
I'm trying to get rid of the htmerge phase altogether.
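
For what it's worth, the alternate-files approach boils down to building
into .work copies and only moving them over the live files once
everything has finished. A sketch of that last step, assuming the work
files carry a .work suffix next to the live databases (check your own
rundig and the documentation for the -a option before relying on this);
promote_work_files is a hypothetical name.

```sh
#!/bin/sh
# Hypothetical helper: move every *.work file in directory $1 over the
# corresponding live file (foo.work -> foo).
promote_work_files() {
    dbdir=$1
    for f in "$dbdir"/*.work; do
        [ -e "$f" ] || continue    # glob matched nothing
        mv -f "$f" "${f%.work}"
    done
}

# Intended use, after the indexing and merging steps have both
# succeeded against the work copies:
#   promote_work_files /path/to/your/database/dir
```

If a run is killed partway through, the live files have not been
touched and the .work copies can simply be rebuilt.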

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/



