Re: htdig: patch for Retriever.cc


Geoff Hutchison (Geoffrey.R.Hutchison@williams.edu)
Fri, 17 Apr 1998 14:53:29 -0400


>I do not know why they weren't included in the 3.0.8b2. Unfortunately the
>line numbers may not match and you should do it the hard way, manually
>patch the changes; some of the memory leak patches have been applied;
>you'd see them in the code. There are at least two of Pasi's patches you
>should apply; they both relate to local file systems.

I don't pretend to speak for either Pasi or Andrew as to why the local
filesystem patches were not included in 3.0.8b2. I seem to remember from
talking to Pasi (I was one of the testers of the local user patch) that the
reasoning was that the new version was for critical (i.e. memory leak) bugs.
Since the local patches were neither well tested at the time nor as
critical, they would be left for the next version. I have had no problems
with the patches in their current form.

While manually applying the patches is probably best, GNU patch is very
good at detecting (and adapting to) changed line numbers.
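
As a rough illustration of that line-number tolerance, here is a small
self-contained sketch. It does not use the actual Retriever.cc patches; the
file names (orig.txt, target.txt, change.patch) are made up for the demo. It
builds a unified diff, shifts the target file down by three lines, and then
lets patch find the hunk by its context. The --dry-run option only reports
what would happen without modifying the file.

```shell
#!/bin/sh
# Demo (hypothetical file names): patch applies a hunk after line numbers shift.
set -e
work=$(mktemp -d)
cd "$work"

# Original file and a modified copy; the diff between them is our "patch".
printf 'alpha\nbeta\ngamma\ndelta\nepsilon\nzeta\neta\n' > orig.txt
sed 's/delta/DELTA/' orig.txt > new.txt
diff -u orig.txt new.txt > change.patch || true   # diff exits 1 when files differ

# Simulate a newer release: extra lines at the top shift everything down.
{ printf 'new line 1\nnew line 2\nnew line 3\n'; cat orig.txt; } > target.txt

# Check first without touching anything, then apply for real.
patch --dry-run target.txt < change.patch
patch target.txt < change.patch

# Show where the changed line ended up in the shifted file.
grep -n 'DELTA' target.txt
```

With a real patch you would run the same two patch commands against the
source file in question. If patch cannot place a hunk it leaves a .rej file,
and that hunk has to be applied by hand.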

>> On another note, I thought that I read that there is a way for htdig to
>> only dig for pages that have changed since the last dig, and then append
>> this to the database. Do

This is called update digging. Calling htdig with -i does an initial dig,
which digs everything from scratch. Without -i, it should check for changed
or new pages and rebuild the database from those. However, the whole
merging phase must still be redone. One problem with update digs is that you
need to keep the *large* db.wordlist file around (mine weighs in at around
200MB).
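
A minimal sketch of how a nightly script might choose between the two modes.
The wrapper, the DB_DIR location and the RUN dry-run switch are all
assumptions of mine, not part of htdig. It simply checks whether db.wordlist
survived from the previous run, picks an initial or an update dig
accordingly, and always runs the merge step afterwards. By default it only
prints the commands.

```shell
#!/bin/sh
# Hypothetical wrapper: pick an initial or an update dig, then always re-merge.
# DB_DIR is an assumed location; use wherever your databases actually live.
DB_DIR=${DB_DIR:-/var/lib/htdig}
# Dry run by default: commands are echoed. Run with RUN= (empty) to execute.
RUN=${RUN-echo}

dig_command() {
    if [ -f "$DB_DIR/db.wordlist" ]; then
        # Word list from the previous run is present: update dig (no -i).
        echo "htdig"
    else
        # Nothing to update from: initial dig of everything.
        echo "htdig -i"
    fi
}

# The merge has to be redone after either kind of dig.
$RUN $(dig_command)
$RUN htmerge
```

The point of the check is the one above: if db.wordlist has been deleted to
save space, an update dig has nothing to build on, so the script falls back
to a full initial dig.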

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

