Subject: RE: [htdig] Indexing/searching for version numbers and dir paths
From: Tim Leggett (tleggett@msi-world.com)
Date: Thu Jul 13 2000 - 14:26:29 PDT
Thank you--I was thinking about extra_word_characters, but didn't complete
the thought.
So, I add a slash and period to extra_word_characters (I already have the
underscore there). Should I remove these chars from valid_punctuation (as
I've done with the period)? So, my params would be:
extra_word_characters: _/.
valid_punctuation: <the default set, less _/. >
Or does htdig process valid_punctuation by ignoring, for example,
<period+space>? In that case, I suppose I could use this combination of
parameters, yes?
extra_word_characters: _/.
valid_punctuation: <the default set>
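
To make the question concrete, here is a toy Python model of the behaviour
reported in this thread for the period. It is only a sketch under stated
assumptions, not ht://Dig's actual parser: the function name toy_words is
hypothetical, it assumes that valid_punctuation characters are dropped so
their neighbours join, that extra_word_characters count as letters, and that
anything else splits words. The overlap case (a character in both lists) is
exactly the open question above, so the sketch refuses to guess at it.

```python
def toy_words(text, valid_punctuation, extra_word_characters):
    """Toy model of word splitting; hypothetical, not ht://Dig code."""
    # A character listed in both sets is the unanswered question in this
    # thread, so it is deliberately not modelled.
    if set(valid_punctuation) & set(extra_word_characters):
        raise ValueError("character in both lists: behaviour not modelled")

    words, current = [], []
    for ch in text:
        if ch in valid_punctuation:
            # Assumption: dropped silently, so "5.0" collapses to "50".
            continue
        if ch.isalnum() or ch in extra_word_characters:
            # Assumption: extra word characters behave like letters.
            current.append(ch)
        elif current:
            # Any other character ends the current word.
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


print(toy_words("5.0", ".", ""))        # period in valid_punctuation
print(toy_words("5.0", "", ""))         # period in neither list
print(toy_words("5.0", "", "_/."))      # period in extra_word_characters
print(toy_words("gw/bin", "", "_/."))   # slash in extra_word_characters
```

The first two calls mirror the '50' and '5 and 0' results described in the
original message; the last two show the outcome hoped for from the first
combination of parameters. Whether htdig and htsearch really behave this way
for every character is not something this sketch establishes.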
Tim
-----Original Message-----
From: Gilles Detillieux [mailto:grdetil@scrc.umanitoba.ca]
Sent: Thursday, July 13, 2000 12:15 PM
According to Tim Leggett:
> We recently began using htdig to index our document archive, which is
> composed of several hundred PDF docs. One problem--popular searches will be
> for version numbers, such as 5.0, 4.0.0.29, and so on. From searching the
> htdig list archive, I see that setting allow_numbers to true should index
> integers. Fine so far. But with default params, searching for 5.0 returns
> '50' as the search string (note the period is missing from the string).
> Removing the period from valid_punctuation gives us '5 and 0' as the search
> string. Not what we wanted, either--we need the search string to be '5.0'.
>
> Another type of search involves finding directory paths, such as gw/bin and
> sql/ddl/delta. Searching for gw/bin returns 'gw and (bin or bins)' as the
> search string. However, we need the string to be exactly 'gw/bin'.
See http://www.htdig.org/attrs.html#extra_word_characters
--
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

------------------------------------
To unsubscribe from the htdig mailing list, send a message to
htdig-unsubscribe@htdig.org
You will receive a message to confirm this.