Re: [htdig] Search engine


Gabriel Fenteany (gfenteany@rics.bwh.harvard.edu)
Fri, 25 Jun 1999 18:11:21 -0400


>
> Hi,
> I am in the process of setting up a search engine for our (corporate)
> intranet. At present I am looking for a search engine, preferably
> shareware. A friend recommended htDig. My questions to all those who
> have used htDig are:
> Thanks
> M.S.K
> ****

Hello. I looked into this question extensively before picking ht://Dig for
the 300+ servers and 50,000+ pages of the WWW Virtual Library. Some of our
sites are straight HTML, others databases with dynamic script-generated
HTML, many .cgi script-based. It's worked beautifully (no problems with
runaway indexing either, despite the .cgi files), and users are impressed
with the professional quality of the search package.

It's almost commercial grade. In some ways it is better, such as in weighting
configurability, ease of foreign-language support, and metadata support,
which is not great yet but still a hell of a lot better than the very weak
support in commercial packages. I have to qualify this by saying that
ht://Dig is presently lacking phrase searching, real XML support and dynamic
updating of the databases. However, these features will be available in the
next release, I have heard from one of the developers, Geoff Hutchison.

> For those who have used it, how is the performance?

Performance is very good. Searching is pretty fast. Indexing for me takes
a long time with such a big metasite. However, this is slated to be
significantly improved, I believe.

> Are there any customization issues?

Customization is very easy if you are familiar with standard Unix-style
configuration files, like Apache's. One of the great strengths of the package
is that it is extremely configurable, and it can be configured to do almost
anything you could ask of a search engine today, with the caveats on the
present release listed above. There are often two ways of doing the same or
similar things, which may sometimes seem confusing but is actually a strength,
since it really makes the software more robust and adaptable to different
systems and configurations.
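
To give a flavor of the format, here is a minimal sketch of what an
htdig.conf might look like. The host name, paths, and e-mail address are
placeholders, and the weighting values are purely illustrative rather than
recommendations; check the attribute names and defaults against the
documentation for the release you install.

```
# Where the index databases are written (placeholder path)
database_dir:     /opt/htdig/db

# Starting point for the crawl, and the limit that keeps it on that site
start_url:        http://intranet.example.com/
limit_urls_to:    ${start_url}

# Patterns to skip while indexing
exclude_urls:     /cgi-bin/ .cgi

# Contact address sent along with requests to the indexed servers
maintainer:       webmaster@example.com

# Relative weight of words in titles and meta keywords (illustrative values)
title_factor:     100
keywords_factor:  100

# Matching strategies and their weights used at search time
search_algorithm: exact:1 synonyms:0.5 endings:0.1
```

Each line is simply an attribute name, a colon, and a value, with # starting
a comment, which is why anyone who has edited Apache-style files should find
it familiar.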

> Do you know of other similar products?

You really don't have a choice, unless you are willing to pay a lot of money
for a commercial alternative. ht://Dig is simply the *only* actively
developed, powerful, (almost) full-featured and commercial-grade search
package available today. Harvest development is no more. Swish-E is really
lightweight. WebGlimpse suffers from both of these problems. Free
versions of AltaVista or Excite have document limits attached to them (they
stop working after a relatively low number of files indexed, like 1,000), and
so on.

There really are *no* other viable non-commercial alternatives. I will
stress that I have no vested interest in ht://Dig succeeding. I am just
conceding that for our large multiple server site, no other package would
come close to working for us (unless we wanted to spend a lot of money).

Good luck

Gabriel

--
Gabriel Fenteany, Ph.D.
http://vl.bwh.harvard.edu



This archive was generated by hypermail 2.0b3 on Fri Jun 25 1999 - 14:30:30 PDT