Re: [htdig] First file found: Invalid?

Subject: Re: [htdig] First file found: Invalid?
From: David Gibbs
Date: Thu Jul 27 2000 - 16:06:43 PDT

At 11:14 AM 07/27/2000, Gilles Detillieux wrote:
>Well, it definitely turns up pages that don't contain either word. Try
>setting description_factor to 0 and reindexing, just to make sure we can
>rule out inappropriate link description text from links to pages that
>turn up.

Yep, did that ... still returns results that aren't valid.

>If that's not the problem, I'd be inclined to suspect database corruption,
>although you're not getting any of the other classic symptoms.

I was at one point ... but I rebuilt the index and the details started
showing up again.

Just for reference ... here's my config file (comments removed to save
space) ...

database_dir: /home/archive/db
common_dir: /home/archive/common
limit_urls_to: ${start_url}
exclude_urls: /cgi-bin/ .cgi /midrange-l-archive/
use_meta_description: true
bad_extensions: .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif \
                 .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov
max_head_length: 10000
max_doc_size: 200000
no_excerpt_show_top: true
#search_algorithm: exact:1 synonyms:0.5 endings:0.1
#search_algorithm: exact:1
local_urls_only: true
backlink_factor: 0
description_factor: 0
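
For anyone who wants to double-check which attributes are actually in effect in a file like the one above (for example, that description_factor really is 0 and that the commented-out search_algorithm lines are ignored), here is a minimal sketch in Python. It is not ht://Dig's own parser: the function name parse_htdig_config and the SAMPLE text are made up for illustration, and it only assumes the "name: value" layout, "#" comment lines, and trailing-backslash continuation visible in the listing. It does not expand ${...} references or handle any other syntax the real parser may accept.

```python
def parse_htdig_config(text):
    """Collect 'name: value' pairs from config text.

    Assumptions (simplified, not ht://Dig's real parser):
    - a line starting with '#' is a comment,
    - a trailing backslash joins the next line onto the current one,
    - the attribute name ends at the first ':' on the joined line.
    """
    settings = {}
    pending = ""  # text carried over from backslash-continued lines
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks and comments, but only when not inside a continuation.
        if not pending and (not line or line.startswith("#")):
            continue
        if line.endswith("\\"):
            pending += line[:-1].rstrip() + " "
            continue
        line = pending + line
        pending = ""
        name, sep, value = line.partition(":")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


# Hypothetical excerpt modelled on the config in this message.
SAMPLE = r"""
bad_extensions: .wav .gz \
                 .jpg .mov
#search_algorithm: exact:1
backlink_factor: 0
description_factor: 0
"""

config = parse_htdig_config(SAMPLE)
for name in sorted(config):
    print(f"{name} = {config[name]}")
```

Values are kept as plain strings, so a check such as config["description_factor"] == "0" compares text, not numbers.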


| This message was written and delivered using 100%
| post-consumer (recycled) data bits.

