Re: [htdig] no http


Subject: Re: [htdig] no http
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Mon Feb 07 2000 - 12:20:18 PST


According to Search Engine:
> > Questions:
> > 1) are you running htsearch with the same htdig.conf that you used for
> > htdig and htmerge (or rundig)?
>
> I used "rundig".
>
> > 2) does your search.html have any customisations or is it out-of-the-box?
>
> out-of-the-box.
>
> > 3) does your http server run htsearch under some sort of virtual root?
>
> yes, this is on a VirtualHost. Is that a problem with htdig?

Not normally, but some virtual host configurations run each host under
its own virtual root directory, i.e. the web server does a "chroot" to
change the root directory. If this is the case on your system, that
means that when the server runs for your virtual domain, it will see
the filesystem from a different root directory than you see from the
command line, and any CGI programs (like htsearch) will also run under
that different root directory. If so, when it looks for databases in
/opt/www/htdig/db, it may be looking somewhere else altogether, e.g.
/home/servers/www.yashy.com/opt/www/htdig/db, depending on how the
server has been configured. You'll need to figure out where your server
does its "chroot" to, and change your configuration files accordingly.
Your htdig/htmerge/htfuzzy config file must give the full paths to
database_dir and common_dir, where you'll store your files, as seen
from the command line. Your htsearch config file must define
database_dir and common_dir with the path to the virtual root stripped
off, so that the paths are as they will appear in the chroot'ed
environment.
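
As a rough illustration of that split, here is a sketch of the two sets
of definitions. It assumes, purely for the sake of example, that the
server chroots to /home/servers/www.yashy.com and that the databases are
kept in an htdig directory inside that root. Both paths, and the idea of
keeping two separate files, are hypothetical here, so substitute whatever
your server actually uses.

```
# Config read by htdig/htmerge/htfuzzy from the command line.
# Full paths, as seen from outside the chroot (hypothetical locations).
database_dir:   /home/servers/www.yashy.com/htdig/db
common_dir:     /home/servers/www.yashy.com/htdig/common

# Config read by htsearch when run by the chroot'ed web server.
# Same directories on disk, with the virtual root stripped off.
database_dir:   /htdig/db
common_dir:     /htdig/common
```

Both sets of definitions point at the same directories on disk; only the
leading part of the path differs.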

> > 4) do you get the same incorrect results when manually running the command
> > "htsearch words=htdig" from the command line? (Replace the word htdig
> > with whichever search word you were using before.)
>
> ($:/opt/www/cgi-bin): htsearch words=book
> DB2 problem...: /opt/www/htdig/common/synonyms.db: No such file or
> directory
> DB2 problem...: /opt/www/htdig/common/word2root.db: No such file or
> directory

OK, these two errors are probably because you never ran "htfuzzy endings"
and "htfuzzy synonyms". However, as you can see below, htsearch did give
correct results when run from the command line.

> Content-type: text/html
>
> <html><head><title>Search results for 'book'</title></head>
...
> <hr noshade size=1>
> <dl><dt><strong><a href="http://www.htdig.org/meta.html">ht://Dig:
> Recognized META information in HTML documents </a></strong><img
> src="/htdig/star.gif" alt="*"><img src="/htdig/star.gif" alt="*"><img
> src="/htdig/star.gif" alt="*"><img src="/htdig/star.gif" alt="*">
> </dt><dd><b><tt>... </tt></b> synonyms for words in the document. For
> example, if a document is a telephone directory, possible keywords could
> be "telephone phone directory <strong>book</strong> list". Now, regardless
> of what text is actually in the document, it can be found if these
> keywords are used in the search. The weight that words in the<b><tt>
> ...</tt></b><br>
> <i><a
> href="http://www.htdig.org/meta.html">http://www.htdig.org/meta.html</a></i>
> <font size=-1>12/09/99, 7428 bytes</font>
> </dd></dl>
>
>
> <hr noshade size=4>
> <a href="http://www.htdig.org">
> <img src="/htdig/htdig.gif" border=0>ht://Dig 3.1.4</a>
> </body></html>

You'll note that htsearch did put out the document excerpt for the match,
and the date and size are non-zero. This is as it should be. The fact that
you don't get the same results for the same search string when running
under your web server suggests a problem with your web server
configuration, such as a chroot discrepancy.

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



