Subject: Re: [htdig] excluding index-pages
From: Geoff Hutchison (email@example.com)
Date: Tue May 23 2000 - 09:05:12 PDT
On 23 May 2000, Andreas Vogt wrote:
> I want htdig to search the links on that index.html, because these are the
> messagexxxx.html, but I don't want to search the text of index.html
> (It's like indexing the whole book and also the index and content pages)
> If I add index.html to the exclude patterns, not only the text is gone,
> but also the text of the hyperlinks.
You say that you don't want to *search* the text of index.html, so I would
do exactly that: index normally, and filter at search time by adding one of
these to the search form:
<input type="hidden" name="exclude" value="index.html">
or (more likely to work):
<input type="hidden" name="restrict" value="message">
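To make the difference between the two inputs concrete, here is a rough Python sketch of how I'd picture them acting on result URLs. It is only a toy model, not htsearch's actual code: it assumes each input is a single plain substring matched against the URL, the example.com URLs are made up, and the index page is listed under both spellings it might be stored as (with and without the index.html on the end).

```python
def url_passes(url, restrict="", exclude=""):
    """Toy model of the restrict/exclude form inputs as substring tests on a URL.

    Assumption: one pattern per input, matched as a plain substring.
    """
    if restrict and restrict not in url:
        return False  # restrict given, but the URL does not contain it
    if exclude and exclude in url:
        return False  # exclude given, and the URL contains it
    return True


# Hypothetical URLs, for illustration only.
urls = [
    "http://www.example.com/archive/index.html",
    "http://www.example.com/archive/",
    "http://www.example.com/archive/message0001.html",
    "http://www.example.com/archive/message0002.html",
]

by_exclude = [u for u in urls if url_passes(u, exclude="index.html")]
by_restrict = [u for u in urls if url_passes(u, restrict="message")]

print(by_exclude)
print(by_restrict)
```

In this model the exclude filter only catches the index page if its URL literally ends in index.html, while the restrict filter keeps just the message pages whichever way the index URL is spelled.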
One tidbit--when the text of links is indexed, it counts as plain text for
the page it's on, but it counts as a description (i.e. description_factor)
for the page that's the target of the link. So that hyperlink text counts
for the messagexxxx.html pages automatically.
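As a rough illustration of that double counting, here is a small Python sketch. It is a simplified model and not the real indexer: the two weights are placeholder numbers standing in for the text_factor and description_factor attributes (check your own config for the actual values), and credit_link and the page names are hypothetical.

```python
from collections import defaultdict

# Placeholder weights standing in for the text_factor and
# description_factor attributes; not necessarily the real defaults.
TEXT_FACTOR = 1
DESCRIPTION_FACTOR = 150


def credit_link(scores, source_page, target_page, link_text):
    """Credit each word of a link's text to both ends of the link."""
    for word in link_text.lower().split():
        # Plain text for the page the link sits on.
        scores[source_page][word] += TEXT_FACTOR
        # Description text for the page the link points to.
        scores[target_page][word] += DESCRIPTION_FACTOR


# scores[page][word] -> accumulated weight
scores = defaultdict(lambda: defaultdict(int))
credit_link(scores, "index.html", "message0001.html", "Excluding index pages")

print(dict(scores["index.html"]))
print(dict(scores["message0001.html"]))
```

So even if index.html never shows up in the results, the words from its links have already been credited to the message pages at index time.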
--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/