J. op den Brouw (msql@st.hhs.nl)
Wed, 22 Sep 1999 11:24:09 +0200 (METDST)
Again, me stupid,
After tracing the code, it seems that htdig is all right.
The pages it was indexing are HTML versions of the java
API documentation and these have *a lot* of <A NAME=...>
tags in them.
So htdig needs *a lot* of time to go through lists etc.,
notably around line 346 of htcommon/DocumentRef.cc:
addlist(DOC_ANCHORS, s, docAnchors);
Which brings me to a question:
Is there really a useful function performed by these tags
(for use in a search engine, that is)?
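To get a feel for how many anchors such a page carries, something like
the rough sketch below could be run over a document before indexing it.
It is not htdig code: count_named_anchors() and to_lower() are
hypothetical helpers made up for illustration, and the scan is a plain
string search rather than a real HTML parser, so a quoted attribute
value that happens to contain " name=" would be miscounted.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of htdig): lower-case copy of a string,
// so that <A NAME=...> and <a name=...> are treated alike.
static std::string to_lower(const std::string& s) {
    std::string out(s);
    for (char& c : out) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return out;
}

// Hypothetical helper (not part of htdig): rough count of <A ...> tags
// that carry a NAME attribute. Assumes well-formed tags ending in '>'.
std::size_t count_named_anchors(const std::string& html) {
    const std::string lower = to_lower(html);
    std::size_t count = 0;
    std::size_t pos = 0;
    while ((pos = lower.find("<a", pos)) != std::string::npos) {
        const std::size_t after = pos + 2;
        // Require whitespace after "<a" so <abbr>, <address> etc. are skipped.
        if (after >= lower.size() ||
            !std::isspace(static_cast<unsigned char>(lower[after]))) {
            pos = after;
            continue;
        }
        const std::size_t end = lower.find('>', after);
        if (end == std::string::npos) {
            break;  // unterminated tag: stop scanning
        }
        // attrs starts with the whitespace character that follows "<a".
        const std::string attrs = lower.substr(after, end - after);
        std::size_t n = 0;
        bool has_name = false;
        while (!has_name && (n = attrs.find("name", n)) != std::string::npos) {
            // "name" must start an attribute: preceded by whitespace ...
            const bool starts_attr =
                n > 0 && std::isspace(static_cast<unsigned char>(attrs[n - 1]));
            // ... and followed by optional whitespace and '='.
            std::size_t k = n + 4;
            while (k < attrs.size() &&
                   std::isspace(static_cast<unsigned char>(attrs[k]))) {
                ++k;
            }
            has_name = starts_attr && k < attrs.size() && attrs[k] == '=';
            n += 4;
        }
        if (has_name) {
            ++count;
        }
        pos = end + 1;
    }
    return count;
}
```

Calling count_named_anchors() on the contents of one of the API pages
would show how many entries end up in the anchor list for that document.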
> Hmph. Sounds like there are some bugs to squash in the connection
> code. Can you find the connection for that particular document in the
> server log? Was the server heavily loaded at that point?
>
> Gabriele and I are in the middle of a higher-level rewrite
> (HtHTTP/Transport), but perhaps we want to revisit all the networking
> code. Loic's suggestion on a test suite would help, but I'd be at a
> bit of a loss for the base cases. Would we need to write/copy a TCP
> sniffer, or am I missing something?
>
> Any suggestions? Should we break the networking code out into a
> separate shared library (htnet)?
--jesse
--------------------------------------------------------------------
J. op den Brouw Johanna Westerdijkplein 75
Haagse Hogeschool 2521 EN DEN HAAG
Sector Techniek Netherlands
Afdeling Elektrotechniek +31 70 4458936
-------------------- J.E.J.opdenBrouw@st.hhs.nl --------------------
Linux - because reboots are for hardware changes
This archive was generated by hypermail 2.0b3 on Wed Sep 22 1999 - 02:27:52 PDT