Bugs in htdig-3.0.8b2

Alexander Yu. Zotov (sandy@agmar.ru)
Sun, 10 May 1998 14:38:30 +0400 (MSD)


I have downloaded the sources of htdig-3.0.8b2, installed it
as the search engine for my site, and am testing it now.
 I think this program is great. I am planning to use it
for my future projects, and I have added some new
functionality to it (obtaining limit_url_to from the links in the
first-level HTML document, converting a document's encoding based
on its charset, etc.).
 But I found some bugs. I have corrected them, but I think you
may want to fix them in future versions of your program.

The most serious bug is in the String::indexOf(char* str) method
in htlib/Strings.cc.
 It arises when the string does not have '\0' at position Data[Length]
(for example, when indexOf is called from htdig/Retriever.cc
while removing excessive "../" from a URL).
The strncmp() calls then compare characters past
Data[Length-1], which are not actually part of the string.
So the method can return wrong results.
 I corrected the problem by inserting
Data[Length] = '\0';
 before the strncmp() calls.
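
The effect can be reproduced with a small sketch. It is not the actual
htlib code: MiniString, indexOfUnterminated, indexOfTerminated and
makeStaleExample are hypothetical names made up for this illustration, and
the search loop is only assumed to resemble the real one. Only the
Data/Length layout and the use of strncmp() follow the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the htlib String: a buffer plus an explicit
// length, with no guarantee that Data[Length] holds '\0'.
struct MiniString {
    char Data[32];
    int  Length;
};

// Search without terminating the buffer first. strncmp() may compare
// bytes beyond Data[Length-1], i.e. stale data left in the buffer.
int indexOfUnterminated(const MiniString &s, const char *str) {
    std::size_t len = std::strlen(str);
    for (int i = 0; i < s.Length; i++) {
        if (std::strncmp(s.Data + i, str, len) == 0)
            return i;
    }
    return -1;
}

// Same search with the fix applied: write '\0' at Data[Length] so that
// strncmp() stops at the logical end of the string.
int indexOfTerminated(MiniString &s, const char *str) {
    assert(s.Length >= 0 && s.Length < static_cast<int>(sizeof(s.Data)));
    s.Data[s.Length] = '\0';
    std::size_t len = std::strlen(str);
    for (int i = 0; i < s.Length; i++) {
        if (std::strncmp(s.Data + i, str, len) == 0)
            return i;
    }
    return -1;
}

// Logical string is "ab.." (Length 4); the bytes "/x" after it are stale.
MiniString makeStaleExample() {
    MiniString s = {};
    std::memcpy(s.Data, "ab../x", 7);
    s.Length = 4;
    return s;
}
```

With the string from makeStaleExample(), searching for "../" with
indexOfUnterminated() reports a match at position 2, although the logical
string "ab.." does not contain "../". indexOfTerminated() returns -1.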

There are also two minor bugs (one with setlocale and one with HTML entity
preprocessing), which I can describe to you as well.

Sorry for my bad English

Alexander Zotov,
 webmaster of http://www.agmar.ru

