Geoff Hutchison (ghutchis@wso.williams.edu)
Fri, 30 Apr 1999 09:11:29 -0400
At 7:33 AM -0400 4/30/99, Torsten Neuer wrote:
>it. This feature would not conform to any "standard" for search
>engines (if there is any) and thus could cause trouble to web-
Actually, several major search engines, including AltaVista, seem to do
exactly this already. The feature has been requested a few times, though
never quite this specifically.
>If acceptable at all, I would further suggest the following confi-
>guration directives for this feature:
>
>url_path_as_keywords: [true|false] # self-explaining
>url_path_increment_factor: n # where n is of N
You don't need url_path_as_keywords, since setting the factor to 0 will
effectively disable the feature.
>Should be more or less easy to implement, no? >:-]
If we're happy to limit it to indexing only the "words" separated by the
slashes in the path, it's not very hard. The URL class in ht://Dig already
lets you grab just the path, so you can split it on '/' and add the words
using the Retriever class.
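
As a rough illustration of the splitting step only, here is a minimal standalone sketch. It does not use the real URL or Retriever classes; path_keywords and its factor parameter are hypothetical names made up for this example, with factor standing in for the proposed url_path_increment_factor.

```cpp
#include <map>
#include <string>

// Hypothetical helper, not part of ht://Dig: split a URL path on '/' and
// give each non-empty component a weight of `factor`.
// A factor of 0 (or less) yields no keywords, so no separate on/off
// switch is needed.
std::map<std::string, int> path_keywords(const std::string &path, int factor)
{
    std::map<std::string, int> weights;
    if (factor <= 0)
        return weights;                    // feature effectively disabled

    std::string word;
    for (std::string::size_type i = 0; i <= path.size(); ++i) {
        if (i == path.size() || path[i] == '/') {
            if (!word.empty())
                weights[word] += factor;   // repeated components accumulate
            word.clear();
        } else {
            word += path[i];
        }
    }
    return weights;
}
```

With this sketch, path_keywords("/foo/bar/", 2) gives "foo" and "bar" a weight of 2 each, and any call with a factor of 0 returns an empty map. In the real code the resulting words would be handed to the Retriever rather than collected in a map.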
I do wonder whether we should worry about URLs like these, where a useful
word is only part of a path component or spans two of them:
http://wso.williams.edu/cafewso/ -> cafe ?
http://www.foo.com/foo/bar/ -> foobar ?
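
To make the second case concrete, here is a small sketch of what generating the extra "joined" candidates might look like. joined_neighbours is a hypothetical name invented for this example, not an ht://Dig function, and it assumes the path has already been split into components. The first case (finding "cafe" inside "cafewso") is not covered, since it would need a dictionary or some other way of recognising words.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: for path components such as {"foo", "bar"},
// produce each pair of adjacent components joined together ("foobar").
std::vector<std::string> joined_neighbours(const std::vector<std::string> &parts)
{
    std::vector<std::string> out;
    for (std::size_t i = 0; i + 1 < parts.size(); ++i)
        out.push_back(parts[i] + parts[i + 1]);
    return out;
}
```

Whether such joined words are worth indexing is exactly the open question; each path of n components adds n - 1 extra candidates.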
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/