[htdig] Duration of Htsearch Processing (3.1.5)


Subject: [htdig] Duration of Htsearch Processing (3.1.5)
From: Sphboc@aol.com
Date: Sat Mar 18 2000 - 15:58:57 PST


We are running into situations where htsearch processing takes long enough,
when a fairly common word has been searched for, to cause problems (timeouts
in the invoking process).

Looking at the documentation, there does not appear to be any option, either
in the conf file or in the parameters passed to htsearch, to limit the number
of matches which are located and sorted. If "several thousand" documents
match the specified words, all of them have to participate in sorting;
there's no way to limit the number that participate.

Use of "bad_words" operates as documented, but it excludes the word entirely,
so no matches at all are processed for it.

It appears to me that I could inspect the .wordlist file produced by htdig,
locate the records which are resulting in unwanted matches, and remove them
prior to running htmerge.

I'm hoping that this will merely result in a smaller .words.db file, and that
the .words.db entries which DO get written will still be processed correctly
(i.e., that the inconsistency due to the absence of the deleted entries will
not result in any unforeseen problems).
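
The filtering step described above might be sketched roughly as follows. This
is only an illustration under stated assumptions, not a tested procedure for
3.1.5: it assumes the .wordlist file is plain text with one record per line,
that the word is the first whitespace-separated field of a record, and that
words are stored in lower case. The function name filter_wordlist is
hypothetical. Any line that does not match an unwanted word (including blank
lines and any non-word records) is passed through unchanged.

```python
from typing import Iterable, Iterator, Set


def filter_wordlist(lines: Iterable[str], unwanted: Set[str]) -> Iterator[str]:
    """Yield every input line except records whose first field is unwanted.

    ASSUMPTION: one record per line, with the word as the first
    whitespace-separated field. This layout is assumed for illustration
    and should be checked against the actual .wordlist file.
    """
    for line in lines:
        fields = line.split(None, 1)
        # Drop the record only when its first field is an unwanted word;
        # blank lines and everything else are kept exactly as read.
        if fields and fields[0].lower() in unwanted:
            continue
        yield line
```

The filtered lines would be written to a new file, keeping the original
.wordlist as a backup, before htmerge is run against the filtered copy.
Whether htmerge and htsearch tolerate the missing records is exactly the open
question above, so this would need trying on a copy of the databases first.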

  

Steven P Haver/602-242-9708



