[htdig] Does htmerge remove URLs from databases when merging ?

Subject: [htdig] Does htmerge remove URLs from databases when merging ?
From: Olivier Korn (olivier.korn@enseignant.org)
Date: Mon Dec 18 2000 - 09:48:37 PST


I tried everything that was proposed a few weeks ago, but nothing worked (even
with -v -v -v -v -v, my document was not marked as deleted.)

So, here are three config files which are enough to reproduce my
problem. (htmerge-problem.zip was created with WinZip but is readable with
the free unzip for Unix, which can be found on several web sites.)

I'm using locale: fr_FR (look at include.conf). This may be inconvenient
for you, but I don't have any other short example at the moment (I could
try to find one which doesn't use accented words if you like.)

Steps to reproduce:
1. Unzip the three files into the "config_dir" of ht://Dig.
2. htdig -c ${config_dir}/site1.conf
3. htdig -c ${config_dir}/site2.conf
4. htmerge -c ${config_dir}/site1.conf
5. htmerge -c ${config_dir}/site2.conf
6. htmerge -c ${config_dir}/site1.conf -m ${config_dir}/site2.conf
7. htsearch -c ${config_dir}/site2.conf
7.1. words="rénovation tourisme" (without quotes)
7.2. htsearch finds
http://www.ac-orleans-tours.fr/tourisme/renovation.html (in first place)
8. htsearch -c ${config_dir}/site1.conf
8.1. words="rénovation tourisme" (as before)
8.2. htsearch returns the "no match found" page.
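
For convenience, steps 2 to 6 above could be wrapped in a small script along
the following lines. This is only a sketch: the CONFIG_DIR and DRY_RUN
variables, the /etc/htdig default and the run/repro helper names are
assumptions made for illustration and are not part of ht://Dig. By default
it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch of steps 2-6. CONFIG_DIR, DRY_RUN and the /etc/htdig default are
# assumptions for illustration; point CONFIG_DIR at your own "config_dir".
CONFIG_DIR="${CONFIG_DIR:-/etc/htdig}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it and stop on failure.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

repro() {
    # Steps 2-3: dig each site with its own config file.
    run htdig -c "$CONFIG_DIR/site1.conf"
    run htdig -c "$CONFIG_DIR/site2.conf"
    # Steps 4-5: run htmerge on each database separately.
    run htmerge -c "$CONFIG_DIR/site1.conf"
    run htmerge -c "$CONFIG_DIR/site2.conf"
    # Step 6: merge the site2 database into the site1 database.
    run htmerge -c "$CONFIG_DIR/site1.conf" -m "$CONFIG_DIR/site2.conf"
}

repro
```

After it has run with DRY_RUN=0, steps 7 and 8 (htsearch with each config
file) are done by hand as described above.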

It's clear: htmerge lost the
http://www.ac-orleans-tours.fr/tourisme/renovation.html page when merging
the two databases.

Any suggestions?

Sincerely yours,
Olivier Korn,
Strasbourg, France.

