RE: [htdig] Indexing/searching for version numbers and dir paths


Subject: RE: [htdig] Indexing/searching for version numbers and dir paths
From: Tim Leggett (tleggett@msi-world.com)
Date: Thu Jul 13 2000 - 14:26:29 PDT


Thank you--I was thinking about extra_word_characters, but didn't complete
the thought.

So, I add a slash and period to extra_word_characters (I already have the
underscore there). Should I remove these chars from valid_punctuation (as
I've done with the period)? So, my params would be:

extra_word_characters: _/.
valid_punctuation: <the default set, less _/. >

Or does htdig process valid_punctuation by ignoring, for example,
<period+space>? In that case, I suppose I could have this combination of
parameters, yes?

extra_word_characters: _/.
valid_punctuation: <the default set>
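
To make the difference between the two attributes concrete, here is a minimal
sketch in Python of a toy word splitter. It is not ht://Dig's actual code. It
assumes that characters in valid_punctuation are dropped and their neighbours
joined, that characters in extra_word_characters are kept as part of the word,
and that anything else separates words. The function name split_words and the
SAMPLE_PUNCTUATION set are hypothetical, and the sample set is not the real
default. The sketch refuses a character listed in both attributes, because how
htdig resolves that overlap is exactly the open question here.

```python
# Toy model only: an illustration of the assumed rules, not ht://Dig itself.

# Hypothetical stand-in for a punctuation set (NOT the real htdig default).
SAMPLE_PUNCTUATION = "._/-"


def split_words(text, extra_word_characters="", valid_punctuation=""):
    """Split text into words under the simplified rules described above."""
    overlap = set(extra_word_characters) & set(valid_punctuation)
    if overlap:
        # The overlapping case is deliberately left unmodelled.
        raise ValueError("characters listed in both attributes: %s"
                         % "".join(sorted(overlap)))

    words = []
    current = []
    for ch in text:
        if ch.isalnum() or ch in extra_word_characters:
            # Letters, digits and extra word characters stay in the word.
            current.append(ch)
        elif ch in valid_punctuation:
            # Assumed behaviour: punctuation is dropped, neighbours are joined.
            continue
        else:
            # Any other character ends the current word.
            if current:
                words.append("".join(current))
                current = []
    if current:
        words.append("".join(current))
    return words


# Period treated as valid punctuation: the digits are joined.
print(split_words("5.0", valid_punctuation=SAMPLE_PUNCTUATION))
# Period in neither attribute: the digits become two words.
print(split_words("5.0", valid_punctuation="_/-"))
# Period and slash moved to extra_word_characters: both strings stay whole.
print(split_words("4.0.0.29 gw/bin",
                  extra_word_characters="_/.", valid_punctuation="-"))
```

Under these assumptions the first two calls reproduce the '50' and '5 and 0'
results reported in the quoted message below, and the third keeps '4.0.0.29'
and 'gw/bin' intact. Note that in this toy model a sentence-ending period also
sticks to the preceding word once the period is a word character, which is the
<period+space> concern raised above. Whether htdig itself behaves that way is
not something this sketch can answer.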

Tim

-----Original Message-----
From: Gilles Detillieux [mailto:grdetil@scrc.umanitoba.ca]
Sent: Thursday, July 13, 2000 12:15 PM

According to Tim Leggett:
> We recently began using htdig to index our document archive, which is
> composed of several hundred pdf docs. One problem--popular searches will be
> for version numbers, such as 5.0, 4.0.0.29, and so on. From searching the
> htdig list archive, I see that setting allow_numbers to true should index
> integers. Fine so far. But with default params, searching for 5.0 returns
> '50' as the search string (note the period is missing from the string).
> Removing the period from valid_punctuation gives us '5 and 0' as the search
> string. Not what we wanted, either--we need the search string to be '5.0'.
>
> Another type of search involves finding directory paths, such as gw/bin and
> sql/ddl/delta. Searching for gw/bin returns 'gw and (bin or bins)' as the
> search string. However, we need the string to be exactly 'gw/bin'.

See http://www.htdig.org/attrs.html#extra_word_characters

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

------------------------------------ To unsubscribe from the htdig mailing list, send a message to htdig-unsubscribe@htdig.org You will receive a message to confirm this.




This archive was generated by hypermail 2b28 : Thu Jul 13 2000 - 11:42:46 PDT