[htdig] Howto delete unwanted docs?

Subject: [htdig] Howto delete unwanted docs?
From: Giancarlo (ping@alt.it)
Date: Wed Nov 10 1999 - 11:03:10 PST

After a wild run over a few sites (with some wrong parameters) I found
that the db is fouled with thousands of unwanted docs, all from the same
host. I don't want to rebuild the whole thing with -i, so I was thinking about
a trick, and was wondering if it might work:

-set in /etc/hosts something like: www.unwantedhost.com

-run htdig on a new db (-i) for only the unwanted host with

remove_bad_urls: true

-merge the newly created db into the previous (fouled) one.

Would htmerge then delete all the unreachable URLs from the db?

Any other trick?


It would be great if -v or -vv could print out the configuration
parameters as read by htdig. I could then catch any configuration error
right at the start of the run and abort it.
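
Until something like that exists, a small standalone script could do the
check before the run. The following is only a rough sketch, not part of
htdig: parse_htdig_config and format_config are made-up names, and it
assumes the config file uses "name: value" lines, "#" comment lines and a
trailing backslash for continuation. It does not handle include files or
variable expansion.

```python
def parse_htdig_config(text):
    """Parse 'name: value' lines into a dict (hypothetical helper).

    Assumptions: lines starting with '#' are comments, and a trailing
    backslash joins the line with the next one.
    """
    attrs = {}
    pending = ""  # accumulated text of a continued line
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments unless we are inside a continuation.
        if not pending and (not line or line.startswith("#")):
            continue
        if line.endswith("\\"):
            pending += line[:-1].rstrip() + " "
            continue
        line = pending + line
        pending = ""
        # Split on the first colon only, so URLs in the value stay intact.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not an attribute line; ignore it
        attrs[name.strip()] = value.strip()
    return attrs


def format_config(attrs):
    """Return the parsed attributes as sorted 'name = value' lines."""
    return "\n".join(f"{name} = {attrs[name]}" for name in sorted(attrs))


if __name__ == "__main__":
    import sys

    # Usage: python dumpconf.py /path/to/your.conf
    with open(sys.argv[1]) as handle:
        print(format_config(parse_htdig_config(handle.read())))
```

Running it on the config file before starting htdig shows each attribute
as the script read it, so a mistyped start_url or a missing
remove_bad_urls would be visible before any digging begins.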


This archive was generated by hypermail 2b25 : Wed Nov 10 1999 - 11:05:10 PST