Re: [htdig3-dev] Regex


Geoff Hutchison (ghutchis@wso.williams.edu)
Wed, 5 May 1999 08:32:16 -0400


At 3:43 AM -0400 5/5/99, Torsten Neuer wrote:
>We could do so simply by putting regexp in double quotes.. anything
>else will be handled as usual, e.g.
>
>start_url: http://www.foo.com/
>limit_urls_to: ${start_url} \ # as in start_url
> "\.*.html" \ # regexp match
> /bar/ # again, a normal match
>
>Internally, each entry gets a "type descriptor" that dispatches the
>value to the correct handler, i.e. a virtual method.
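
For illustration, the "type descriptor" dispatch Torsten describes could be sketched roughly as follows. This is a sketch in Python rather than ht://Dig's actual code; the class and function names (LimitEntry, SubstringEntry, RegexEntry, parse_entry, url_allowed) are hypothetical, and it assumes a "normal match" means a plain substring test.

```python
import re

class LimitEntry:
    """Base type: each entry knows how to test a URL against itself."""
    def matches(self, url):
        raise NotImplementedError

class SubstringEntry(LimitEntry):
    """A 'normal' entry (assumed here to be a plain substring test)."""
    def __init__(self, text):
        self.text = text

    def matches(self, url):
        return self.text in url

class RegexEntry(LimitEntry):
    """An entry that was written in double quotes: treated as a regexp."""
    def __init__(self, pattern):
        self.regex = re.compile(pattern)

    def matches(self, url):
        return self.regex.search(url) is not None

def parse_entry(token):
    # Double quotes mark a regexp; anything else is handled as usual.
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return RegexEntry(token[1:-1])
    return SubstringEntry(token)

def url_allowed(url, entries):
    # Dispatch goes through matches(), so the caller never checks types.
    return any(entry.matches(url) for entry in entries)

# ${start_url} is assumed to have been expanded already.
entries = [parse_entry(t) for t in
           ['http://www.foo.com/', '"\\.*.html"', '/bar/']]
```

With these entries, url_allowed("http://other.org/x.html", entries) is true because of the regexp entry alone.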

I like this idea. Quotes aren't a bad choice, but it would be nice to pick
a character that could be used in the htsearch fields too. Maybe:

start_url: http://www.foo.com/
limit_urls_to: ${start_url} \ # as in start_url
                [\.*.html] \ # regexp match
                /bar/ # again, a normal match

Internally, both types of limits would be regexps, but we'd escape the
regexp metacharacters in those that weren't enclosed in brackets (or
whatever delimiter we settle on). So in the above case, the limit becomes:

limit_urls_to: http://www\.foo\.com/|\.*.html|/bar/
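
A minimal sketch of that conversion, in Python rather than ht://Dig's own code. The names (META, escape_literal, build_limit_pattern) are hypothetical, the set of characters to escape is an assumption, and ${start_url} is assumed to be expanded before this step.

```python
import re

# Characters treated as regexp metacharacters (an assumed set).
META = set(".^$*+?()[]{}|\\")

def escape_literal(text):
    """Backslash-escape metacharacters so the text matches literally."""
    return "".join("\\" + ch if ch in META else ch for ch in text)

def build_limit_pattern(entries):
    """Join all entries into a single alternation pattern.

    Entries enclosed in [ ] are taken as regexps verbatim (brackets
    stripped); every other entry is escaped first.
    """
    parts = []
    for entry in entries:
        if len(entry) >= 2 and entry[0] == "[" and entry[-1] == "]":
            parts.append(entry[1:-1])
        else:
            parts.append(escape_literal(entry))
    return "|".join(parts)

pattern = build_limit_pattern(
    ["http://www.foo.com/", r"[\.*.html]", "/bar/"])
print(pattern)
```

The appeal of doing it this way is that matching then needs only one compiled regexp per attribute, whichever kind of entry the user wrote.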

Does this make sense?

-Geoff




This archive was generated by hypermail 2.0b3 on Wed May 05 1999 - 05:42:32 PDT