loic@ceic.com
Sun, 19 Sep 1999 22:40:30 +0200 (MEST)
Tillman, James writes:
> Is someone already working on a perl interface to htsearch? I and a friend
> of mine are interested in doing the work, but don't want to duplicate anyone
> else's effort.
>
> What we really want is an XS module.
>
Hi,
I'm going to do that, in a way. Let me explain.
In the search/index methods there are a few different levels:
Data:
. The word database
. The document database
Functions:
a The word insertion/update/delete (indexing)
b The document parsing
c The search query parsing (building a query syntax tree)
d The query resolution (using the syntax tree to match words)
e The information retrieval (given the top N matches for a query,
retrieve the relevant document information)
f The information display
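To make level 'a' more concrete, here is a minimal sketch of what a set of word database primitives could look like. It is an in-memory stand-in, not the htdig code: the class name WordDb and its methods are hypothetical, and a word is simply mapped to the set of document ids containing it.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <set>
#include <string>

// Hypothetical in-memory stand-in for the word database. None of these
// names come from htdig; they only illustrate insert/update/delete.
class WordDb {
public:
    // Record that `word` occurs in document `doc`.
    void insert(const std::string& word, int doc) {
        assert(!word.empty());  // an empty word is never indexed
        index_[word].insert(doc);
    }

    // Remove one (word, doc) entry; returns false if it was not present.
    bool remove(const std::string& word, int doc) {
        auto it = index_.find(word);
        if (it == index_.end() || it->second.erase(doc) == 0) return false;
        if (it->second.empty()) index_.erase(it);  // no documents left
        return true;
    }

    // Update: replace the whole word list of `doc` after re-indexing it.
    void update(int doc, const std::set<std::string>& words) {
        for (auto it = index_.begin(); it != index_.end();) {
            it->second.erase(doc);
            it = it->second.empty() ? index_.erase(it) : std::next(it);
        }
        for (const auto& w : words) insert(w, doc);
    }

    // Documents containing `word` (empty set if the word is unknown).
    std::set<int> lookup(const std::string& word) const {
        auto it = index_.find(word);
        return it == index_.end() ? std::set<int>{} : it->second;
    }

private:
    std::map<std::string, std::set<int>> index_;
};
```

A Perl XS layer would then only need to expose these few calls; the other levels can be built on top of lookup() without knowing how the words are stored.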
I'm currently working hard on 'a' and will provide a perl XS interface
to it. It will define a set of primitives to access the word database.
I won't do anything (yet) concerning the document database. The next
step is to implement 'd'. This requires defining the syntax tree. At
present 'c' and 'd' are intermixed, which is very confusing. For one thing,
it prevents easy implementation of a new query syntax. Many people would
love to have an AltaVista-like syntax :-)
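As an illustration of why separating 'c' from 'd' helps, here is a minimal sketch in which the resolution step only sees a syntax tree and never the query text. The Node layout, the helper names (make_word, make_op, resolve) and the in-memory Index are all assumptions made for this example; they are neither the Text::Query structure nor the htdig one.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical syntax tree: a leaf holds a word, an inner node holds a
// boolean operator and two children.
struct Node {
    enum Kind { Word, And, Or } kind;
    std::string word;                   // used when kind == Word
    std::unique_ptr<Node> left, right;  // used when kind is And or Or
};

std::unique_ptr<Node> make_word(const std::string& w) {
    std::unique_ptr<Node> n(new Node());
    n->kind = Node::Word;
    n->word = w;
    return n;
}

std::unique_ptr<Node> make_op(Node::Kind k, std::unique_ptr<Node> l,
                              std::unique_ptr<Node> r) {
    std::unique_ptr<Node> n(new Node());
    n->kind = k;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

using Postings = std::set<int>;                  // matching document ids
using Index = std::map<std::string, Postings>;   // stand-in word database

// Level 'd': walk the tree and combine the document sets of each word.
Postings resolve(const Node& n, const Index& index) {
    if (n.kind == Node::Word) {
        auto it = index.find(n.word);
        return it == index.end() ? Postings{} : it->second;
    }
    assert(n.left && n.right);  // operators always have two children
    Postings l = resolve(*n.left, index);
    Postings r = resolve(*n.right, index);
    Postings out;
    if (n.kind == Node::And)
        std::set_intersection(l.begin(), l.end(), r.begin(), r.end(),
                              std::inserter(out, out.end()));
    else
        std::set_union(l.begin(), l.end(), r.begin(), r.end(),
                       std::inserter(out, out.end()));
    return out;
}
```

With this split, a new query syntax only needs a new parser that produces the same kind of tree; resolve() stays untouched.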
I plan to release 'a' by Wednesday (including unit tests). Being a
co-author of the Text::Query CPAN module and author of the Text::Query-SQL
CPAN module, I already have a syntax tree structure in mind. My idea is to
be compatible with it in htdig so that searches through the Perl interface
have the same semantics as the htdig C++ search library.
If you could explain what you have in mind and what you need, we can work
together for the time needed to release the beast :-)
Cheers,
--
Loic Dachary
ECILA
100 av. du Gal Leclerc
93500 Pantin - France
Tel: 33 1 56 96 09 80, Fax: 33 1 56 96 09 61
e-mail: Loic@Dachary.org
URL: http://www.senga.org/