Re: [htdig] Does htmerge remove URL from database ?


Subject: Re: [htdig] Does htmerge remove URL from database ?
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Sat Nov 25 2000 - 20:07:22 PST


At 2:21 PM +0100 11/23/00, Olivier Korn wrote:
>I tried it and it didn't solve the problem. BTW, I don't think that
>these extra merges are necessary either.

No, they should not be necessary at all unless there's something
truly horribly wrong with the merging code--it only uses the files
output directly by htdig. (My idea was that it would be faster if
you didn't need to run htmerge on the intermediate DBs.)

>Now, I run :
>htmerge -c site#.conf
>then
>htmerge -c site1.conf -m site#.conf (with # > 1)
>
>If I then run
>htsearch -c site5.conf with words="rénovation tourisme", it finds
>the document (in first place.)
>But if I do
>htsearch -c site1.conf with the same words, it returns the "nomatch" document.
>
>Some of the web hosts are case-sensitive and some are not. Could that
>be the source of my problem?

I wouldn't think so. But you have to be pretty careful that the URL
encodings are identical across all of your site.conf files, since the
merged database is read using only one of them. Personally, I make up
a "main.conf," include that in the other files, and set only the
start_url and a minimal number of other things in the individual
site.conf files. It also makes it easy to change something in all
config files at once.
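
A rough sketch of that layout, in case it helps. The file names, paths
and values below are made up for illustration, and I'm assuming the
"include:" directive plus the common_url_parts and url_part_aliases
attributes as documented for 3.1.x; those two are the attributes that
control how URLs are encoded in the database, so they belong in the
shared file.

```
# main.conf (hypothetical name): settings shared by every site
database_dir:      /opt/htdig/db
# URL-encoding attributes: must be the same for every database you merge
common_url_parts:  http:// http://www. .html .htm /
url_part_aliases:

# site1.conf (hypothetical): pull in the shared settings first,
# then set only what is specific to this site
include:           /opt/htdig/conf/main.conf
start_url:         http://www.example.com/
database_base:     ${database_dir}/site1
```

The other site#.conf files would look the same apart from start_url
and database_base, so a change to the encoding attributes is made in
one place and picked up by every site on the next dig.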

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
