Re: [htdig] no http


Subject: Re: [htdig] no http
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Fri Feb 04 2000 - 07:59:18 PST


OK, to recap, you're running htdig 3.1.4 on a 133 MHz Pentium PC running
SuSE Linux, kernel version 2.1.x. The dig seems to progress fine with
an htdig.conf with no customisations, but when you search, the "http://"
is stripped from the result URLs, the document size for matches is shown
as 0, there is no excerpt, and no modification time is shown.

This suggests that htsearch is not correctly reading the information
from db.docdb, or that that information is somehow corrupt. Possibilities
that come to mind are:
- htdig or htmerge is not generating it correctly
- a bug in your compiler is producing incorrect Serialize/Deserialize code
  in the module that writes/reads that database
- htsearch is not looking at the same database that you generated with
  htdig/htmerge.

Questions:
1) are you running htsearch with the same htdig.conf that you used for
htdig and htmerge (or rundig)?
2) does your search.html have any customisations or is it out-of-the-box?
3) does your http server run htsearch under some sort of virtual root?
4) do you get the same incorrect results when manually running the command
   "htsearch words=htdig" from the command line? (Replace the word htdig
   with whichever search word you were using before.)
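
For question (1), one way to check that two copies of the config point at
the same database is to compare their database_dir values. The following is
only a minimal sketch in Python, not htdig's own parser: it assumes a
simplified reading of the config format ("name: value" lines, "#" comments,
backslash line continuation, and one level of ${name} expansion) and ignores
everything else. The function names parse_htdig_conf, expand and
same_database are made up for illustration.

```python
import re


def parse_htdig_conf(text):
    """Parse 'name: value' lines from htdig.conf-style text (simplified)."""
    # Join lines that end with a backslash onto the following line.
    joined = re.sub(r"\\\n\s*", " ", text)
    attrs = {}
    for line in joined.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line
        attrs[name.strip()] = value.strip()
    return attrs


def expand(attrs, name):
    """Return one attribute with ${other} references expanded (one level only)."""
    value = attrs.get(name, "")
    return re.sub(r"\$\{(\w+)\}", lambda m: attrs.get(m.group(1), ""), value)


def same_database(conf_text_a, conf_text_b):
    """True if both config texts name the same, non-empty database_dir."""
    dir_a = expand(parse_htdig_conf(conf_text_a), "database_dir")
    dir_b = expand(parse_htdig_conf(conf_text_b), "database_dir")
    return dir_a != "" and dir_a == dir_b
```

To use it, read the config that htdig/htmerge ran with and the one that
htsearch picks up into two strings and pass them to same_database(). A False
result would point to the third possibility listed above.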

If the test in (4) above gives the same incorrect results, I think the most
likely cause would be a compiler or library bug, in which case you should
see if SuSE makes updates for these available. You may also want to try one
of the binary RPMs under http://www.htdig.org/files/binaries/ on your
system. Pick the one that is appropriate for your C library version (-0 for
libc5, -0glibc for glibc 2.0, and -0glibc21 for glibc 2.1).

According to Search Engine:
> On Tue, 1 Feb 2000, Geoff Hutchison wrote:
> > OK, but what does your config file (htdig.conf maybe?) look like?
>
> It contains only the out-of-the-box settings and doesn't work. I
> initially tried it with my URL under start_url and my email address under
> maintainer. Any help would be appreciated.
>
> database_dir: /opt/www/htdig/db
> start_url: http://www.htdig.org/
> limit_urls_to: ${start_url}
> exclude_urls: /cgi-bin/ .cgi
> bad_extensions: .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif \
> .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi
> maintainer: unconfigured@htdig.searchengine.maintainer
> max_head_length: 10000
> max_doc_size: 200000
> no_excerpt_show_top: true
> search_algorithm: exact:1 synonyms:0.5 endings:0.1
...

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930




This archive was generated by hypermail 2b28 : Fri Feb 04 2000 - 08:01:16 PST