[htdig] indexing robots and weird links?

Sat, 29 May 1999 12:41:16 -0700

I'm writing a fairly technical paper on how search-engine crawler
robots work, and would appreciate any advice anyone cares to give.

For example, do some robots follow links in forms? I've read that
plug-ins and multimedia data can keep robots from spidering a page --
is that still true? Any advice for framed sites beyond using the
<noframes> tag? What about complex JavaScript that expects the user
to pick or type something?

While I'm at it, what are the weirdest sets of links you've ever
encountered? I know about Lotus Notes disclosure layouts that end up
providing hundreds of links to the same page in slightly different
views. I'm sure there are other wild ones out there, so please send
them to me (and let me know whether you want to be identified
and/or quoted).

Thanks in advance,
