Re: [htdig] Redigging - Don't redig urls in DB


Nicolas.Poizot@alcatel.fr
Thu, 12 Aug 1999 14:53:54 +0200


>
> Nicolas.Poizot@alcatel.fr wrote:
> > I'm merging with the current database. It works for most
> > documents, but in some cases documents are marked invalid and so
> > removed from the database. My question is:
> > What causes a document to be marked "invalid"? I have
> > checked the HTML source of this document and I don't see why...
>
> I don't quite know what you mean when you say it was marked "invalid."
> If you can give us the exact error message that comes up, it would be
> very useful.

..merge operation...
htmerge: Total word count: 44818
Deleted, invalid: 7/http://perso.wanadoo.fr/coredump/indexwarhammer.html
htmerge: 10
..
htmerge: 160
Deleted, invalid: 580/http://webalpha.com/brocante/faq.html
..
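
To collect every affected URL from a long merge run, a small script along
these lines could help. It is only a sketch: the line shape is inferred
from the excerpt above, reading the number before the slash as a document
ID is an assumption, and the names deleted_invalid and SAMPLE are made up
for illustration (they are not part of ht://Dig).

```python
import re

# Assumed line shape, taken from the log excerpt above:
#   Deleted, invalid: <number>/<url>
# Treating <number> as a document ID is a guess, not documented behaviour.
DELETED_RE = re.compile(r"^Deleted, invalid:\s*(\d+)/(\S+)\s*$")


def deleted_invalid(log_text):
    """Return (number, url) pairs for every 'Deleted, invalid' line."""
    pairs = []
    for line in log_text.splitlines():
        match = DELETED_RE.match(line.strip())
        if match:
            pairs.append((int(match.group(1)), match.group(2)))
    return pairs


# Sample input copied from the merge output quoted in this message.
SAMPLE = """htmerge: Total word count: 44818
Deleted, invalid: 7/http://perso.wanadoo.fr/coredump/indexwarhammer.html
htmerge: 10
htmerge: 160
Deleted, invalid: 580/http://webalpha.com/brocante/faq.html
"""

if __name__ == "__main__":
    for number, url in deleted_invalid(SAMPLE):
        print(number, url)
```

The resulting list can then be checked by hand against htsearch to see
which of the removed documents are really missing from the index.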

>
> During merges between databases, duplicate URLs are flagged and the
> older of the two is tossed away. This may be what you mean. However, the
> URL remains in the database--the merge is informing you that it did not
> wish to duplicate data.

That is not the case here, because when I search for these documents,
htsearch doesn't find them.

Nicolas Poizot
>
> --
> -Geoff Hutchison
> Williams Students Online
> http://wso.williams.edu/
>
> ------------------------------------
> To unsubscribe from the htdig mailing list, send a message to
> htdig@htdig.org containing the single word unsubscribe in
> the SUBJECT of the message.
>



This archive was generated by hypermail 2.0b3 on Thu Aug 12 1999 - 05:54:53 PDT