AW: [htdig] irrelevant pages in search


Subject: AW: [htdig] irrelevant pages in search
From: Hartmut Steffin (h.steffin@abi-behoerden.de)
Date: Thu Nov 25 1999 - 03:33:11 PST


Thanks for the answer,

> > htmerge does not seem to honour the TMPDIR variable which
> IS properly set
This seems to be an individual problem on my machine. There is even a
difference between running rundig from the command line (ok) and via
cron/batch (erroneous).
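
One possible explanation, and it is only a guess, is that cron starts jobs with a much smaller environment than a login shell, so TMPDIR may be missing or point somewhere unusable there. A minimal Python sketch of a check that could run at the top of the cron job; the helper name tmpdir_problem is made up for illustration and is not part of ht://Dig:

```python
import os

def tmpdir_problem(env):
    """Return a description of what is wrong with TMPDIR, or None if it looks usable.

    `env` is a mapping such as os.environ; the function name is hypothetical.
    """
    path = env.get("TMPDIR")
    if not path:
        return "TMPDIR is not set"
    if not os.path.isdir(path):
        return f"TMPDIR={path} is not a directory"
    # Creating files in a directory needs both write and search permission.
    if not os.access(path, os.W_OK | os.X_OK):
        return f"TMPDIR={path} is not writable by this user"
    return None
```

Calling tmpdir_problem(os.environ) from both the command line and the cron job, and logging the result, would show whether the two environments really differ.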

> > in ANY case,
> > 1. htmerge should do a better error message (I even used -v)
>
> We're open to suggestions, but if the problem is the sort
> program that fails
> silently, there isn't much that htmerge can do to guess at why.
Hmm, maybe this was me yelling too loudly without thinking. I think you
cannot do more than pass along the stderr of sort, plus maybe errno or the
exit value as a hint.
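
To illustrate the idea (in Python, not taken from the htmerge sources, which I have not read): run the external program, keep whatever it writes to stderr, and report it together with the exit status. The function name run_external is an assumption made up for this sketch.

```python
import subprocess

def run_external(args, env=None):
    """Run an external command and return (ok, diagnostic).

    Hypothetical helper: captures stderr and the exit status so that a
    failure can be reported with whatever hint the child process gave.
    """
    proc = subprocess.run(
        args,
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
    )
    if proc.returncode == 0:
        return True, ""
    # A negative return code means the child was killed by a signal.
    stderr = proc.stderr.strip() or "(nothing on stderr)"
    return False, f"{args[0]} exited with status {proc.returncode}: {stderr}"
```

If the child really fails silently, the message still carries the exit status, which is at least a starting point.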

> > 2. htsearch should be able to identify a corrupt db
> I too would like to see more error checking to detect such
> problems, but
> I wouldn't know where to begin in adding code, and what to
> look for in terms
> of database problems. Anyone else have any ideas?
IMHO this is the most important part. I have not looked at the sources so
far, but isn't it possible to have a flag "under_construction" somewhere (as
part of the db itself) that is set as long as the different files of the db
do not reflect the status quo? I don't know the internals, but I have the
feeling you even get bad results between running htdig and htmerge? So the
flag could even state "ok", "htdig running", "sorting", "merging" ... (and
possibly take the presence of the -i flag into account if necessary).
htsearch could read this flag and tell if a search might be unreliable right
now (or even give this wonderful message "contact the webmaster...." :( ).
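
A minimal sketch of such a flag, in Python and under loud assumptions: the status lives in a small file next to the db files, and the file name db.status, the value "unknown", and all function names are invented for illustration. None of this reflects how ht://Dig stores its databases.

```python
import os
import tempfile
from contextlib import contextmanager

STATUS_FILE = "db.status"  # hypothetical file name, not an ht://Dig file
OK = "ok"

def write_status(db_dir, status):
    """Write the status atomically so a reader never sees a half-written file."""
    fd, tmp = tempfile.mkstemp(dir=db_dir)
    with os.fdopen(fd, "w") as f:
        f.write(status + "\n")
    os.replace(tmp, os.path.join(db_dir, STATUS_FILE))

def read_status(db_dir):
    """Return the recorded status, or "unknown" if no status was ever written."""
    try:
        with open(os.path.join(db_dir, STATUS_FILE)) as f:
            return f.read().strip()
    except FileNotFoundError:
        return "unknown"

@contextmanager
def stage(db_dir, name):
    """Mark the db as being in stage `name` while the body runs.

    The status goes back to "ok" only if the body finishes without an
    exception, so a crashed run leaves the flag at the stage that failed.
    """
    write_status(db_dir, name)
    yield
    write_status(db_dir, OK)

def search_is_reliable(db_dir):
    """What a search front end could check before trusting the db."""
    return read_status(db_dir) == OK
```

The indexing side would wrap each step, for example `with stage(db_dir, "merging"): ...`, and the search side would call search_is_reliable(db_dir) before answering. Leaving the flag untouched on failure is the point: an aborted merge then shows up as "merging" instead of looking like a healthy db.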

Just ideas, I don't know how practicable they are.
Hardy




This archive was generated by hypermail 2b25 : Thu Nov 25 1999 - 03:44:58 PST