[htdig] ANNOUNCE: updated ht://Dig 3.1.2 RPMs for Red Hat

Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Thu, 19 Aug 1999 13:36:29 -0500

I've just uploaded source and binary RPMs for the ht://Dig 3.1.2 web
site search engine to incoming.redhat.com, for eventual inclusion
on contrib.redhat.com. I've also placed them on the htdig.org site,
in http://www.htdig.org/files/binaries/. They can also be downloaded
from the SCRC web site, at


This is the latest stable release and is recommended for all production
servers.

The following RPMs were built on Red Hat Linux 4.2, 5.0* and 6.0 respectively:

htdig-3.1.2-4glibc.i386.rpm * (see note below)
htdig-3.1.2-4glibc.src.rpm * (see note below)
htdig-3.1.2-4glibc21.i386.rpm (for glibc-2.1, Red Hat 6.0)
htdig-3.1.2-4glibc21.src.rpm (for glibc-2.1, Red Hat 6.0)

Run /usr/sbin/rundig after installing, to (re)build all your databases.
(A complete reindexing isn't necessary if you had htdig-3.1.2-0 previously
installed.)


Name        : htdig                        Distribution: (none)
Version     : 3.1.2                        Vendor: (none)
Release     : 4                            Build Date: Wed Aug 18 16:44:59 1999
Install date: Wed Aug 18 16:46:48 1999     Build Host: cliff.scrc.umanitoba.ca
Group       : Networking/Utilities         Source RPM: htdig-3.1.2-4.src.rpm
Size        : 3166490
Packager    : Gilles Detillieux <grdetil@scrc.umanitoba.ca>
URL         : http://www.htdig.org/
Summary     : A web indexing and searching system for a small domain or intranet
Description :
The ht://Dig system is a complete world wide web indexing and searching
system for a small domain or intranet. This system is not meant to replace
the need for powerful internet-wide search systems like Lycos, Infoseek,
Webcrawler and AltaVista. Instead it is meant to cover the search needs for
a single company, campus, or even a particular sub section of a web site.

As opposed to some WAIS-based or web-server based search engines, ht://Dig
can span several web servers at a site. The type of these different web
servers doesn't matter as long as they understand the HTTP 1.0 protocol.


* Note to Red Hat 5.0 & 5.1 users:

There's an obscure bug in vixie-cron on Red Hat 5.0 and 5.1 systems,
in its SIGCHLD signal handling. It causes htmerge to fail consistently
with a "Word sort failed" error, when run from a cron job. It could
potentially cause similar problems with other jobs. I recommend upgrading
to the latest vixie-cron from the 5.2 distribution:


Unfortunately, even though Red Hat discovered and fixed the problem back
in June, they did not mention it in their errata or issue update RPMs.
The fixed vixie-cron RPMs can be obtained from any Red Hat Linux
distribution mirror site, or from my web site above, along with the
htdig RPMs.


    Changes included in release 4, as patches to 3.1.2:
        - allows multiple keyword parameter definitions in search form
        - patched to support Acrobat 4's acroread program (Acrobat 3 still
        - updated version of parse_doc.pl, to work with xpdf 0.90
        - updated FAQ
        - PR#339 fixed - URL encodes all non-ASCII characters in URIs
        - PR#560 fixed - prevent inappropriate suffix stripping in endings fuzzy
        - PR#542 fixed - URL passed to external parser now quoted
        - PR#541 fixed - ANCHOR variable now set properly
        - PR#535 & PR#557 fixed - HTTP header parsing now more robust
        - username/password now blotted out from command arguments
        - adds support for <embed>, <object> and <link> tags
        - PR#554 fixed - locale now affects default date format in htsearch
        - fixes the bug in the handling of modification_time_is_now
        - PR#578 fixed - multiple directives in <meta> robots tag now work
        - now gives an error message for unknown hosts
        - PR#514 fixed - null strings & non-ASCII letters won't crash htfuzzy
        - PDF parser now clears title string properly when done with it
        - PR#543 & PR#585 fixed - names like left_index.html no longer stripped
        - fixes server_alias entries so port defaults to 80 if omitted
        - decodes SGML entities inside tag attributes
        - PR#566 fixed - urls like 'http:/dir/file.ext' resolved properly
        - $(VAR) at end of template string now being expanded properly
        - PR#595 fixed - corrected address for FSF
        - maximum word length now a config attribute, not compile-time option
        - PR#81 & PR#472 fixed - htdig -vvv shouldn't crash in strftime()
        - PR#348 fixed - missing or invalid port number will get set correctly
        - PR#493 fixed - valid URL with ".." within a file name not rejected
        - PR#572 fixed - htsearch won't crash if CONTENT_LENGTH not set
        - PR#545 fixed - configure tests for presence of alloca.h for regex.c
        - documentation updates, including PR#558 & PR#626.
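
To illustrate the PR#339 item above, here is a rough sketch of what
"URL encodes all non-ASCII characters" can look like. It is written in
Python purely for illustration and is not the htdig patch itself. The
function name encode_non_ascii is hypothetical, and the use of UTF-8 for
the encoded bytes is an assumption; the change list doesn't say which
character encoding htdig applies.

```python
from urllib.parse import quote


def encode_non_ascii(url: str) -> str:
    """Percent-encode only the non-ASCII characters of a URL.

    Hypothetical helper for illustration. ASCII characters, including
    reserved ones such as ':' and '/', are passed through untouched.
    Assumption: non-ASCII characters are encoded as UTF-8 bytes.
    """
    return "".join(ch if ord(ch) < 128 else quote(ch) for ch in url)


if __name__ == "__main__":
    print(encode_non_ascii("http://www.example.com/caf\u00e9.html"))
```

A URL that is already pure ASCII comes back unchanged, so applying the
function twice gives the same result as applying it once.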


   Release notes for htdig-3.1.2 21 Apr 1999
   This version fixes a number of bugs in the 3.1.1 release and is the
   latest stable release of ht://Dig. It is highly recommended for
   production servers.
     * Fixed a bug that ignored META description tags when they were also
       added to the meta_keywords attribute.
     * Fixed the HTML comment parsing to be more lenient about
       non-standard comments.
     * Fixed problems in the date-parsing code that made it Y2K
       incompatible. In particular, it forgot that 2000 is a leap year
       and wouldn't correctly parse dates after 29 Feb 2000.
     * Fixed a variety of bugs in the HTML parser.
     * Fixed an old bug that would exclude all URLs if the exclude_urls
       attribute was left empty.
     * Fixed display of META description tags. Now it always shows the
       top of a description. If no description exists, it looks for the
       search terms in the excerpt as usual.
     * Fixed some small memory leaks.
     * Changed the htfuzzy endings algorithm to use a more efficient
       regex system. Non-English languages see notable speedups:
       generation that used to take days now takes minutes!
     * Changed the noindex_start and noindex_end attributes to allow
       case-insensitive matching.
     * Added on-disk versions of the builtin templates to make it more
       obvious how to change the results templates.
     * Added date_format attribute to change the format of dates
       output in search results.
     * Added extra_word_characters attribute that defines extra
       characters that should be considered part of a word, rather than
     * Several other, relatively minor bugs were also fixed. Many thanks
       to those who sent in bug reports and to Gilles Detillieux for
       coordinating this release.
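
The Y2K item in the list above comes down to the Gregorian leap-year
rule: century years are leap years only when divisible by 400, which is
why 2000 is a leap year while 1900 was not. A minimal sketch in Python
follows. It shows the rule only, not the htdig date-parsing code, and
both function names are hypothetical.

```python
def is_leap_year(year: int) -> bool:
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_february(year: int) -> int:
    """Number of days in February for the given year."""
    return 29 if is_leap_year(year) else 28


if __name__ == "__main__":
    for y in (1900, 1999, 2000, 2004):
        print(y, is_leap_year(y), days_in_february(y))
```

A parser that stops at the "century years are not leap years" exception
would treat February 2000 as having 28 days and be one day off for every
date after 29 Feb 2000.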

   The full ChangeLog for this release is available from:

Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

