[htdig] Regexp inclusion of pages in dig


Alok K. Dhir (adhir@forumone.com)
Wed, 30 Jun 1999 11:44:18 -0400 (EDT)


Hi - we have recently started playing with ht://dig and are extremely
impressed.

A couple of things that we think we need, but can't figure out how to do:

We'd like to have the ability to:

1. munge URLs during digging, but before they are added to the db, in order
to either change them to something that would cause them to be skipped
(e.g. because the same URL has already been indexed) or to remove
superfluous query information (e.g. change
http://www.foo.com/page.html?12345 to http://www.foo.com/page.html). One
flexible way to allow this would be to let a definable external program
munge URLs before they are included in the db.

2. use regular expressions in exclude_urls and limit_urls_to.
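
To make item 1 concrete, here is a rough sketch of the kind of external
filter we have in mind. It is not part of ht://dig, and nothing here
reflects an existing ht://dig interface: the function name munge_url, the
rule of simply dropping the query string, and the one-URL-per-line
stdin/stdout protocol are all our own assumptions for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical URL-munging filter: reads URLs on stdin, writes munged URLs on stdout."""
import sys
from urllib.parse import urlsplit, urlunsplit


def munge_url(url):
    """Return the URL with its query string and fragment removed.

    Assumption: the query information is superfluous for this site,
    so two URLs that differ only in the query are the same page.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def main():
    # One URL per input line; blank lines are ignored.
    for line in sys.stdin:
        url = line.strip()
        if url:
            print(munge_url(url))


if __name__ == "__main__":
    main()
```

With a filter like this, http://www.foo.com/page.html?12345 and
http://www.foo.com/page.html?67890 would both come out as
http://www.foo.com/page.html, so the second one could be skipped as
already indexed.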

Are these features planned for future releases of the product? Do
workarounds exist to simulate these?
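
For item 2, the behaviour we are after could be sketched roughly as
follows. This is only an illustration of the matching logic, written
outside ht://dig; the pattern lists and the should_index function are
hypothetical names of ours, and the patterns themselves are made-up
examples.

```python
import re

# Hypothetical regex equivalents of limit_urls_to and exclude_urls.
LIMIT_PATTERNS = [re.compile(r"^http://www\.foo\.com/")]
EXCLUDE_PATTERNS = [re.compile(r"/cgi-bin/"), re.compile(r"\.(gif|jpg)$")]


def should_index(url, limit=LIMIT_PATTERNS, exclude=EXCLUDE_PATTERNS):
    """Return True if the URL matches a limit pattern and no exclude pattern."""
    if not any(p.search(url) for p in limit):
        return False
    return not any(p.search(url) for p in exclude)
```

Under these example patterns, http://www.foo.com/page.html would be
indexed, while http://www.foo.com/cgi-bin/search and anything on another
host would not.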

Thanks!

Al
