Re: [htdig] First file found: Invalid?

Subject: Re: [htdig] First file found: Invalid?
From: Gilles Detillieux
Date: Fri Jul 28 2000 - 07:44:18 PDT

According to David Gibbs:
> At 11:14 AM 07/27/2000, Gilles Detillieux wrote:
> >Well, it definitely turns up pages that don't contain either word. Try
> >setting description_factor to 0 and reindexing, just to make sure we can
> >rule out inappropriate link description text from links to pages that
> >turn up.
> Yep, did that ... still returns results that aren't valid.
> >If that's not the problem, I'd be inclined to suspect database corruption,
> >although you're not getting any of the other classic symptoms.
> I was at one point ... but I rebuilt the index and the details started
> showing up again.
> Just for reference ... here's my config file (comments removed to save
> space) ...
> database_dir: /home/archive/db
> common_dir: /home/archive/common
> start_url:
> limit_urls_to: ${start_url}
> exclude_urls: /cgi-bin/ .cgi /midrange-l-archive/
> use_meta_description: true
> bad_extensions: .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif \
> .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi
> maintainer:
> max_head_length: 10000
> max_doc_size: 200000
> no_excerpt_show_top: true
> #search_algorithm: exact:1 synonyms:0.5 endings:0.1
> #search_algorithm: exact:1
> local_urls:
> local_urls_only: true
> backlink_factor: 0
> description_factor: 0

Nothing jumps out at me as being a potential problem. What OS version
are you running (if Linux, which distribution)? Did you build htdig from
source, or install from an RPM or pre-compiled binary? If the latter,
which one?
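
If it helps, something like the following should collect that information
in one go. It is only a sketch: it assumes a POSIX shell, and the package
query assumes an RPM-based system where the package is named "htdig"
(report_env is just an illustrative name).

```shell
# Sketch: report the OS and whether htdig came from an RPM.
# Assumption: if installed from a package, the package is called "htdig".
report_env() {
    # Kernel name and release, e.g. "Linux 2.2.x"
    uname -sr
    if command -v rpm >/dev/null 2>&1; then
        # rpm -q prints name-version-release when the package is installed
        rpm -q htdig 2>/dev/null || echo "htdig: not installed from an RPM"
    else
        echo "rpm not available; htdig was built from source or installed another way"
    fi
}

report_env
```

The output only shows whether an RPM is installed, so if htdig was built
from source the version and configure options would still need to be
reported separately.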

Gilles R. Detillieux              E-mail: <>
Spinal Cord Research Centre       WWW:
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
