Re: [htdig] match part of URL?


Geoff Hutchison (ghutchis@wso.williams.edu)
Mon, 21 Jun 1999 12:09:29 -0400


Daniel Naber wrote:
> can you say how difficult it is to add this feature? If you point me to
> the files to change, and if it's not too difficult, I could try to add
> this.

I did send a response, and it's not too difficult. But see below.

> An example of what I mean: Someone searches for "foobar" and gets
> www.blah.com/~blubb/foobarblah.html as a result, even if that file
> doesn't contain the string "foobar".

Now the initial request was more along these lines (which is easier):

http://www.foo.com/bar/blah.html

The request was to match "foo" or "bar" or "blah". For your example,
you'd have to decide whether "~" is to be stripped out (I'd say yes) and
whether you'll just go with prefix matching to get "foobar" from
"foobarblah".

If someone submits a function that splits a URL into words, I'll finish
it. It's a matter of a time tradeoff--I'd rather work on things other
than that function, and it's probably faster for me to put it in the
correct place (in Retriever.cc).
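
To make the request concrete, here is a rough sketch of what such a
function might look like. It is not ht://Dig code: the names
splitUrlIntoWords and urlMatchesQuery are made up for illustration, it
uses plain std::string rather than ht://Dig's own classes, and it
assumes that every non-alphanumeric character (including "~") is a word
separator and that the query is already lowercase.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not part of ht://Dig): split a URL into lowercase
// words, treating every non-alphanumeric character (':', '/', '.', '~',
// '-', ...) as a separator that is stripped out.
std::vector<std::string> splitUrlIntoWords(const std::string& url)
{
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : url) {
        if (std::isalnum(c)) {
            current += static_cast<char>(std::tolower(c));
        } else if (!current.empty()) {
            words.push_back(current);   // separator ends the current word
            current.clear();
        }
    }
    if (!current.empty())
        words.push_back(current);       // trailing word, if any
    return words;
}

// Hypothetical helper: true if any word of the URL begins with `query`
// (simple prefix matching). Assumes `query` is already lowercase.
bool urlMatchesQuery(const std::string& url, const std::string& query)
{
    for (const std::string& word : splitUrlIntoWords(url)) {
        // compare() on a word shorter than the query simply fails to match
        if (word.compare(0, query.size(), query) == 0)
            return true;
    }
    return false;
}
```

With these assumptions, "foobar" would match
www.blah.com/~blubb/foobarblah.html through the word "foobarblah". Note
that the split also produces words such as "http", "www", "com" and
"html", which a real version would probably want to filter out.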

-- 
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/


