Subject: [htdig] avoiding binary attachments when indexing email archives
From: Brett Dikeman (brett@artelsoft.com)
Date: Wed Feb 02 2000 - 09:03:15 PST
What's the best way to keep attachments out of the index when indexing
archived email? Right now, fuzzy searches end up matching random "words"
made up of many random characters, drawn from what htdig considered
"text"; I can trace them to emails people sent that included binary
attachments.
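One possible workaround, sketched here as an assumption rather than
anything htdig itself provides: pre-process each archived message and
keep only its text/plain parts before the indexer ever sees it. The
function name strip_attachments is hypothetical, and the sketch assumes
MIME-encoded attachments (inline uuencoded blobs would need separate
handling).

```python
import email
from email import policy


def strip_attachments(raw_message: str) -> str:
    """Return Subject/From plus only the text/plain body parts of a message.

    Hypothetical helper: anything that is not text/plain, or that is
    explicitly marked as an attachment, is dropped.
    """
    msg = email.message_from_string(raw_message, policy=policy.default)
    kept = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # container part; its children are visited by walk()
        if part.get_content_type() != "text/plain":
            continue  # skip binary and other non-text payloads
        if part.get_content_disposition() == "attachment":
            continue  # skip text files sent as attachments
        kept.append(part.get_content())
    header = f"Subject: {msg['Subject']}\nFrom: {msg['From']}\n\n"
    return header + "\n".join(kept)
```

The cleaned text could then be written out as the page that gets
indexed in place of the raw message.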
Second, if htdig is on the same machine as the site I'm indexing,
how do I avoid the overhead of using HTTP to do the indexing? Giving
full path names in "start_url" broke the rundig script and trashed my
index files.
I tried searching the archives with several different keywords; came
up with lots of stuff, but nothing I wanted :-)
Thx,
Brett
----
Brett Dikeman
Network/System Admin
Artel Software         617-451-9900
381 Congress Street    617-451-9916(fax)
Boston, MA 02210       http://www.borisfx.com
------------------------------------
To unsubscribe from the htdig mailing list, send a message to
htdig-unsubscribe@htdig.org
You will receive a message to confirm this.