Re: htdig: Intelligent Engine...


webmaster@www.nisu.flinders.edu.au
Tue, 17 Mar 1998 10:02:06 +0930 (CST)


On 11 Mar, Martyn Jones wrote:
> Hi,
>
> I am mailing you regarding a question I have been unable to resolve and after
> reading a book called 'Web developer.com guide to search engines' I noticed your
> e-mail address had been placed in the book.
>
> I wondered if it is possible to create a type of thesaurus list that a search
> engine can check every time it runs, and this list can be updated with relevant
> words. e.g.
>
> If I wanted to do a fuzzy search on 'Jazz Night Out In North London'
> The search engine would check the list and think 'ah, it says night, so I
> could also check for anything with evening'.
>
> Is this possible, and if so would you be able to point me in the correct
> direction to implementing this, thanks in advance.
>
> Regards,
> Martyn Jones

You certainly can. HTDig supports a synonym list: you can define
groups of words so that a search for any one of them also matches
the others. For example, the default settings declare the following
(among others) as synonyms:

car auto automobile

To implement additional synonyms, there are two steps.

1) Edit the file synonyms in the htdig/common directory to include
those you require. Each line of the file needs at least two words:
the first is the word to replace and the rest are synonyms for that
word.

2) Run htfuzzy with synonyms as the 'algorithm' argument:

    htfuzzy [-v] synonyms

This will rebuild the synonyms db. That's all there is to do.
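
To make the idea concrete, here is a rough Python sketch of what a
synonym list does to a query. It is only an illustration, not HTDig's
own code: the function names load_synonyms and expand_query are made
up, the "night evening" line is a hypothetical entry based on your
example, and the sketch assumes every word on a line expands to all
the other words on that line.

```python
def load_synonyms(lines):
    """Build a lookup table from synonym-file lines.

    Assumption for this sketch: each word on a line maps to every
    other word on the same line. Lines with fewer than two words
    are skipped, since they cannot define a synonym.
    """
    table = {}
    for line in lines:
        words = line.lower().split()
        if len(words) < 2:
            continue
        for word in words:
            entry = table.setdefault(word, [])
            for other in words:
                if other != word and other not in entry:
                    entry.append(other)
    return table


def expand_query(query, table):
    """Return, for each query word, the word plus any synonyms."""
    return [[word] + table.get(word, []) for word in query.lower().split()]


# "car auto automobile" is from the default list quoted above;
# "night evening" is a hypothetical line added for illustration.
sample_lines = ["car auto automobile", "night evening"]
table = load_synonyms(sample_lines)
result = expand_query("Jazz Night Out", table)
```

With the sample lines above, the word "night" in the query would also
bring in "evening", while "jazz" and "out" stay as they are because
they have no entry in the list.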

Cheers

-- 
David Robley

WEBMASTER                          | Phone +61 8 8374 0970
RESEARCH CENTRE FOR INJURY STUDIES | http://www.nisu.flinders.edu.au/
AusEinet                           | http://auseinet/flinders.edu.au/
Flinders University, ADELAIDE, SOUTH AUSTRALIA



