[htdig3-dev] Duplicate entries in docs.index


Alexander Bergolth (leo@strike.wu-wien.ac.at)
Fri, 22 Jan 1999 12:07:33 +0100 (MEZ)


* List: htdig3-dev@sob.htdig.org

Hi!

I just tried two runs with servers that produced duplicate URLs in a
test database, and I didn't remove the database files after the first run.

I dug using the following options:
htdig/htdig -v -i -t -s -c /scratch/leo/htdig/htdig/conf/test.conf
htmerge/htmerge -v -s -c /scratch/leo/htdig/htdig/conf/test.conf

After the second run, I found one document from the first run in the
docs.index file that wasn't removed correctly.

The .docs file that htdig produces is OK, so htmerge must be the problem.

Cheers,
         Leo

P.S.: I did a third run without removing the databases using the first
server again (having a smaller URL count than the second) and 340 of 411
URLs remained from the previous run!

-----------------------------------------------------------------------
Alexander (Leo) Bergolth leo@leo.wu-wien.ac.at
WU-Wien - Zentrum fuer Informatikdienste http://leo.wu-wien.ac.at
Info Center
In a world without walls and fences, who needs windows and gates?

This archive was generated by hypermail 2.0b3 on Thu Feb 04 1999 - 22:24:20 PST