Re: [htdig] We are adding MD5 and reverse indexing

Torsten Neuer
Fri, 17 Sep 1999 17:22:26 +0200

According to geoff s.:
>Be advised that we are modifying htDig as follows:-
>1. MD5 hashing for DB key and duplication detection (esp useful for email)
>2. Conversion from DB2 to mySQL (this is likely but not yet definite)
>3. Proximity searches (ie word x within n words of word y)
>4. File annotations, subject, author, file format etc
>And, most importantly, ability to search from item 4 to find matching files
>and a lot of code tidying up and support for LZW decoding in PDF, plus
>on-the-fly annotations in 4 as and when people find useful stuff. We are
>using it, amongst other things, for litigation support.
>Pity it is C++, but we hope to upload the above in the next 10 days ("It
>was announced today that htDig 2000 might be delayed for new enhanced
>features, blah blah :-)))))"
>I hope our humble contribution finds some fans and the code tree. It is so
>much easier to follow pioneers than to be one, and I'm eternally grateful to
>the folks who kicked htDig off in the first place.
>Does anyone have any ideas on this?

Yes: don't rely on MySQL. Better to put in some support for generic
databases that can be extended to use other SQL engines (such as
PostgreSQL), or have some autoconf scheme that selects between the
supported database backends. MySQL has some limitations with regard
to licensing.
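
To make the "generic database" idea a bit more concrete, here is a rough
sketch of what I mean. It is in Python only to keep it short and is not
htDig code: the names DocumentStore, MemoryStore, SqliteStore, BACKENDS and
open_store are all made up for illustration, and SQLite merely stands in
for whichever SQL engine (MySQL, PostgreSQL, ...) gets plugged in. The point
is that the indexer only ever talks to the small interface, and the backend
is picked by name at configuration time.

```python
import sqlite3
from abc import ABC, abstractmethod


class DocumentStore(ABC):
    """Hypothetical storage interface; the indexer would depend only on this."""

    @abstractmethod
    def put(self, key: str, value: str) -> None:
        """Store value under key, replacing any existing entry."""

    @abstractmethod
    def get(self, key: str):
        """Return the stored value, or None if the key is unknown."""


class MemoryStore(DocumentStore):
    """Trivial in-process backend, useful for tests."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class SqliteStore(DocumentStore):
    """SQL backend; SQLite is an assumption standing in for any SQL engine."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS docs (k TEXT PRIMARY KEY, v TEXT)"
        )

    def put(self, key, value):
        # The connection context manager commits on success.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO docs (k, v) VALUES (?, ?)", (key, value)
            )

    def get(self, key):
        row = self._conn.execute(
            "SELECT v FROM docs WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else None


# Registry of supported backends; a new engine is added by registering a class.
BACKENDS = {"memory": MemoryStore, "sqlite": SqliteStore}


def open_store(name: str) -> DocumentStore:
    """Create the backend selected by name (e.g. from a config file)."""
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unsupported backend: {name!r}") from None
```

With something like this, open_store("sqlite") and open_store("memory")
behave the same from the caller's side, so a licensing problem with one
engine means swapping a backend class rather than touching the indexer.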


InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14                            Tel: +49-4101-403605
D-25474 Ellerbek                            Fax: +49-4101-403606

