Subject: [htdig] [ANNOUNCE] ht://Dig 3.2.0b2
From: Geoff Hutchison (email@example.com)
Date: Tue Apr 11 2000 - 16:36:06 PDT
I am very happy to announce the release of htdig version 3.2.0b2. This
is the second beta for the 3.2.0 release and it is the result of quite
a bit of hard work from a number of people. Again, we're looking for
as much feedback as possible, including suggestions, bug reports,
fixes, features, etc.
To download, see <http://www.htdig.org/files/htdig-3.2.0b2.tar.gz>
For documentation and Release Notes, see <http://dev.htdig.org/htdig-3.2/>
For the ChangeLog, see <http://dev.htdig.org/htdig-3.2/ChangeLog>
Feedback on the release should be primarily directed to firstname.lastname@example.org
Williams Students Online
Release notes for htdig-3.2.0b2 11 Apr 2000
This version is still marked beta because it has received only
limited testing. However, it adds more functionality and should fix
all known bugs in the previous 3.2.0b1 release, including the security
hole fixed in production version 3.1.5. As with 3.2.0b1,
if you are upgrading from a previous version, you should read the
upgrade guide first.
* Fixed several bugs in the new HTTP/1.1 implementation that would
cause problems with so-called "Chunked" data.
* Fixed a bug in the new regex-based configuration options that
would ignore the case_sensitive attribute.
* Fixed the robots.txt parsing to more rigorously stick to the
* Fixed a bug where upper-case META robots directives would be
* Fixed a bug that could leave a connection open when it failed.
* Fixed the timeout in the connection code to ensure that hung
connections are killed properly.
* Fixed a bug where duplicates of modified documents could pile up
* Fixed a bug in the SGML entity handling where numeric entities
would be ignored. (e.g. &#162; -> ¢)
* Fixed a bug in the new configuration parser that wouldn't accept
lists including numbers.
* Fixed a potential infinite loop in the phrase searching parser
that came up when fuzzy algorithms were used.
* The HTML parser now ignores anything between <script> tags, much
like it does for <style> tags.
* Fixed some performance problems in the new word database code.
* Removed the attributes translate_quot, translate_lt, translate_gt
and translate_amp since all SGML entities are now encoded and
decoded when displayed.
* Removed the attribute uncoded_db_compatible since the 3.2
databases are no longer compatible with previous versions anyway.
* Removed the attribute word_list because the db.wordlist file is no
longer generated. To get an ASCII version of the database, use the
* Removed the pdf_parser attribute. It is now preferred to use the
external parser or external converter support with xpdf.
* The wordlist_compress attribute is now turned on by default.
* The output from htsearch and the default and included templates
should now be more HTML-4.0 compliant.
* Added support for searching collections of multiple databases. To
use this, supply multiple config fields or config names separated
by "|" characters. Also see the collection_names attribute.
* Added a new accents fuzzy algorithm, which treats accented and
unaccented words the same. You must create an accents_db with
htfuzzy after indexing.
* Added new attributes tcp_max_retries and tcp_wait_time to
control how many times a low-level connection is retried and how
long to wait on a hung connection.
* Added the any_keywords attribute to OR the words in the keywords
field of a search form together instead of AND-ing them.
* Added the attributes search_results_order and url_seed_score
to control result ranking and scoring based on URL patterns.
* Moved the htnotify program into the new httools directory.
* Added the programs htdump, htload, htstat and
* There is the usual variety of other fixes and changes. See the
ChangeLog for more details.
* Once again, a huge thank you to everyone who contributed bug
reports, fixes and patches!
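
As an illustration of the numeric SGML entity fix listed above, here is
a minimal Python sketch of what decoding a numeric character reference
involves. It is not the ht://Dig implementation (which is written in
C++); the function name decode_numeric_entities is hypothetical, and
the hexadecimal form is included only for completeness, since the
release notes mention only the decimal case.

```python
import re

# Matches decimal (&#162;) and hexadecimal (&#xA2;) numeric character
# references. Named entities such as &amp; are deliberately not matched.
_NUMERIC_ENTITY = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def decode_numeric_entities(text: str) -> str:
    """Replace numeric character references with the characters they name.

    Hypothetical helper for illustration; not part of ht://Dig.
    """
    def _replace(match):
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code_point)
        except (ValueError, OverflowError):
            # Code point outside the Unicode range: keep the reference as-is.
            return match.group(0)

    return _NUMERIC_ENTITY.sub(_replace, text)


if __name__ == "__main__":
    print(decode_numeric_entities("Price: 50&#162;"))
```

A parser that skips this step would index the literal text "&#162;"
rather than the cent sign, which matches the behaviour the fix
describes.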