Re: htdig: HTML within search strings

Andrew Scherpbier
Thu, 02 Jul 1998 08:58:16 -0700

Colin Viebrock wrote:
> Say I have a website where the code:
> Sample<I>Code</I>
> is all over. That's the brandname - including the italics. If I do an
> htdig search for "SampleCode", I get no matches.
> Shouldn't htdig strip out all the HTML? Or is there a conf setting I need
> to do this?

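htdig does strip out the HTML, but it has no knowledge of the semantics of
inline markup tags such as <I>, so it treats every tag as a word break.
Sample<I>Code</I> is therefore indexed as the two words "Sample" and "Code",
which is why a search for "SampleCode" finds no matches.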
Just out of curiosity, how do other search engines deal with this problem?
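
To illustrate the behaviour described above, here is a minimal sketch in
Python. It is not htdig's code; the function names and the INLINE_TAGS list
are assumptions made up for this example. The first function treats every tag
as a word break, and the second shows one possible alternative, where inline
tags are dropped without splitting the word.

```python
import re

# Hypothetical set of inline tags, chosen for illustration only;
# it is not taken from htdig's source or configuration.
INLINE_TAGS = {"i", "b", "em", "strong", "u", "tt", "font", "span"}

# Matches an opening or closing tag and captures the tag name.
TAG_RE = re.compile(r"<\s*/?\s*([A-Za-z][A-Za-z0-9]*)[^>]*>")


def words_tags_as_breaks(html):
    """Replace every tag with a space, so each tag acts as a word break."""
    return TAG_RE.sub(" ", html).split()


def words_inline_aware(html):
    """Drop inline tags without a space; all other tags still break words."""
    def replace(match):
        return "" if match.group(1).lower() in INLINE_TAGS else " "
    return TAG_RE.sub(replace, html).split()


print(words_tags_as_breaks("Sample<I>Code</I>"))  # ['Sample', 'Code']
print(words_inline_aware("Sample<I>Code</I>"))    # ['SampleCode']
```

With the second approach a search for "SampleCode" would match, but a search
for "Code" alone would no longer match that text, so it is a trade-off rather
than a clear win.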

Andrew Scherpbier
Contigo Software
