Albert Desimone jr (bdesimon@arches.uga.edu)
Fri, 14 May 1999 08:44:02 -0400 (EDT)
Hi ...
This is more of a "sharing my experience" thing than anything else.
A few observations while I sing the praises of ht://Dig.
ht://Dig is bad to the bone, make no mistake about it. I give it
high praise with one of my favorite Georgia (USA) expressions: Finer
than frog hair split nine ways. (BTW: Frogs have *very* fine hair.)
For several months now, a search box for websites has been available
right on the UGA homepage (www.uga.edu) -- powered by ht://Dig.
But surely I digress.
Anyway, I am working with 3.1.2, upgrading from 3.0.8b2. I am now
including PDF files (acroread as parser), but with no change to
max_doc_size (assuming the default is still 10K) and the same hop
count (-h 6), my db files grew by a factor of 2.5. No big deal
since I have plenty of disk space, but I was just a little surprised.
Some of the growth in the db files can *certainly* be put down to the
increased number of documents being indexed, which was itself a
little curious.
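For anyone who wants to put a number on this kind of growth, here is a minimal sketch, assuming the old and new database files are kept in two separate directories. The function names and the directory paths in the usage note are hypothetical and are not part of ht://Dig.

```python
import os


def total_size(directory):
    """Return the combined size in bytes of the regular files directly inside `directory`."""
    total = 0
    for entry in os.scandir(directory):
        # Subdirectories are skipped; only the files themselves are counted.
        if entry.is_file():
            total += entry.stat().st_size
    return total


def growth_factor(old_dir, new_dir):
    """Return how many times larger the files in `new_dir` are than those in `old_dir`."""
    old_total = total_size(old_dir)
    if old_total == 0:
        raise ValueError("old directory contains no data to compare against")
    return total_size(new_dir) / old_total
```

Something like growth_factor("/path/to/old/db", "/path/to/new/db") (both paths are placeholders) would then give the ratio directly, rather than eyeballing a directory listing.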
Search results were really slow, until I added:
backlink_factor: 0
to htdig.conf
WOW!!! What a difference; giving up the backlink weighting in the
ranking is a trade-off well worth it (IMHO).
I was wondering (if anyone has really read this far) how do you handle
upgrading ht://Dig? I have an upgrade path in mind, but it isn't
pretty. Any thoughts on this?
-bd