Re: htdig: htdig-3.1b2: Ignoring URLs?


Geoff Hutchison (ghutchis@wso.williams.edu)
Fri, 11 Dec 1998 00:15:17 -0500


At 4:12 AM -0500 12/8/98, Frank Richter wrote:
>I applied your patch to htdig, now I get
>Digging with max_hop_count 8: htdig-3.0.8b2 - ca. 55,000 documents
>                              htdig-3.1.0b2 - ca. 13,000 documents
>                      patched htdig-3.1.0b2 - 92,118 documents
>
>A lot more documents! I detected 6127 lines with "level -1":
>
>4201:588:-1:http://www.tu-chemnitz.de/chemnitz/: ** size = 497
>4207:589:-1:http://www.tu-chemnitz.de/tu/impressum.html: ----*-*--* size = 3385
>         ^^
>What does this mean?

It would seem to indicate I goofed somewhere in the patch. Someone told me
this can happen with redirects, so I'll give it a look.
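
To gauge how widespread the problem is, the "level -1" lines quoted above can be counted with a short script. This is only a sketch: the field layout (two numbers, then the level, then the URL) is an assumption read off the two sample lines, and count_by_level is a hypothetical helper, not part of htdig.

```python
import re

# Assumed layout, inferred only from the two quoted sample lines:
#   <number>:<number>:<level>:<url>: <progress marks> size = <bytes>
LINE_RE = re.compile(r"^(\d+):(\d+):(-?\d+):(\S+?):\s")

def count_by_level(lines):
    """Return a dict mapping each level seen in the log to its line count."""
    counts = {}
    for line in lines:
        # Drop a leading mail-quote marker ('>') and spaces, if present.
        m = LINE_RE.match(line.lstrip("> "))
        if m is None:
            continue  # not a per-document log line
        level = int(m.group(3))
        counts[level] = counts.get(level, 0) + 1
    return counts

sample = [
    "4201:588:-1:http://www.tu-chemnitz.de/chemnitz/: ** size = 497",
    "4207:589:-1:http://www.tu-chemnitz.de/tu/impressum.html: ----*-*--* size = 3385",
]
print(count_by_level(sample))
```

Run over the full log instead of the two sample lines, the count for key -1 should correspond to the 6127 lines mentioned above, provided every affected line follows the same layout.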

>PS: It would be nice to have a possibility to configure a maximum document
>count to dig, i.e.
>max_hop_count: 8 # dig to level 8
>max_doc_count: 60000 # but maximum this number of documents

You know what? It'll be in 3.1.0b3, but not quite in that form. Someone
asked for a server_max_docs option, which was easy to add (about 4 or 5
lines, I can't remember).
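
How a hop limit and a document cap interact can be sketched with a small breadth-first crawl over an in-memory link graph. This is not htdig's code: the crawl function, the demo_links graph, and the reading of server_max_docs as a cap counted per server (rather than one global total, as max_doc_count would be) are assumptions made for illustration.

```python
from collections import deque
from urllib.parse import urlsplit

def crawl(start, links, max_hop_count, server_max_docs):
    """Breadth-first walk of `links`, bounded by hops and by documents per host."""
    seen = {start}
    per_server = {}          # host -> number of documents indexed so far
    indexed = []
    queue = deque([(start, 0)])
    while queue:
        url, hops = queue.popleft()
        host = urlsplit(url).netloc
        if per_server.get(host, 0) >= server_max_docs:
            continue         # this server has reached its cap (assumed semantics)
        per_server[host] = per_server.get(host, 0) + 1
        indexed.append(url)
        if hops >= max_hop_count:
            continue         # do not follow links beyond the hop limit
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return indexed

# Hypothetical link graph: two hosts, "a" and "b".
demo_links = {
    "http://a/": ["http://a/1", "http://a/2", "http://b/"],
    "http://a/1": ["http://a/3"],
    "http://b/": ["http://b/1"],
}
print(crawl("http://a/", demo_links, max_hop_count=8, server_max_docs=2))
```

With a per-server cap the total still grows with the number of servers reached, so it bounds a dig differently from a single overall document count.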

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/



