[htdig] Hit count>0 but no URLs returned


Malcolm Austen (malcolm@sable.ox.ac.uk)
Fri, 17 Sep 1999 09:20:37 +0100 (BST)


Hi,

I have lurked for a few weeks ... but may still be seen barking up the
wrong tree 8-)

My comment below might connect with several reports I've seen just
recently on the list.

On Sun, 12 Sep 1999, Sadhunathan Nadesan wrote:

+ i've been running htdig for about a year. it's always worked
+ flawlessly. but lately it is returning the wrong url's.

I have taken over a system (it's old, running htdig 3.0.8b2; I'm slowly
building a new system with the latest release) and am not yet confident I
have got it all sussed! You can expect dumb questions from me in the
future ... 8-(

<snip>

+ the symptoms are: when you do a search, all the url's returned are
+ nothing to do with your search! they are pages on the sites alright,
+ but don't really contain any of the keywords from the search. i have
+ tried re-indexing several times and get consistent, incorrect results.

I saw a very similar result on the system I was left with. It would return
hit counts > 0 but sometimes failed to deliver any links, and sometimes it
delivered links that were clearly wrong. Eventually I linked this to the
database growing every week (until it filled the disk!). In my case htdig
was being run with the -i flag, but this only deleted and rebuilt from
scratch the files that htdig itself built. htmerge was not starting from
scratch: it was adding new words/references and leaving old
words/references to go bad. I fixed the problem by actively deleting the
contents of the database directory rather than asking htdig to
reinitialise.
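
A minimal sketch of that cleanup step, assuming the database files all live
in one dedicated directory. The function name clear_directory is
hypothetical and is not part of htdig; it only empties a directory so that
the next indexing run cannot pick up stale files.

```python
import shutil
from pathlib import Path


def clear_directory(db_dir):
    """Delete everything inside db_dir but keep the directory itself.

    Returns the names of the entries that were removed, in sorted order.
    db_dir is an assumed location; point it at the real database directory.
    """
    db_path = Path(db_dir)
    removed = []
    for entry in sorted(db_path.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            # A real subdirectory: remove it and everything below it.
            shutil.rmtree(entry)
        else:
            # A plain file or a symlink: remove just that entry.
            entry.unlink()
        removed.append(entry.name)
    return removed
```

Running something like this before the weekly index, instead of relying on
the -i flag alone, should leave no old word or reference files behind for
later steps to build on.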

I have not looked to see if the problem has gone away with the latest
version (ok, I'm not even sure it is a problem with htdig, maybe my
predecessor just failed to set it up right). Since my brief is to re-index
from scratch once a week, I just put a new empty directory in place and
afterwards rename the directories, so I always have an old directory
holding a backup database from the previous week.
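
One possible reading of that weekly rotation, as a sketch only. The
function names and the three directory roles (build, live, backup) are
assumptions made for illustration; the indexing run itself is not shown
and would happen between the two calls.

```python
import shutil
from pathlib import Path


def prepare_build_dir(build_dir):
    """Create a fresh, empty directory for this week's index."""
    build = Path(build_dir)
    if build.exists():
        shutil.rmtree(build)
    build.mkdir(parents=True)
    return build


def promote_build_dir(build_dir, live_dir, backup_dir):
    """Rename live -> backup, then build -> live.

    Any older backup is discarded first, so exactly one previous
    database is kept. All three paths are assumed to sit on the same
    filesystem so that the renames are simple moves.
    """
    build, live, backup = Path(build_dir), Path(live_dir), Path(backup_dir)
    if backup.exists():
        shutil.rmtree(backup)
    if live.exists():
        live.rename(backup)
    build.rename(live)
```

Because the new database is always built in an empty directory, nothing
from an earlier week can leak into it, and last week's database stays
available under the backup name if the new one turns out to be bad.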

regards,
        Malcolm.
+
| Malcolm Austen, Tel: +44(0) 1865 273216
| Oxford University Computing Services, Fax: +44(0) 1865 273275
| 13 Banbury Road, Email - malcolm.austen@oucs.ox.ac.uk
| Oxford, OX2 6NN, England WWW - http://users.ox.ac.uk/~malcolm/
+



