htdig: Problem with htdig-3.0b[12] not indexing full page


Jason Haar (Jason.Haar@trimble.co.nz)
Thu, 12 Nov 1998 12:52:05 +1300


Hi there

We have a PHP3-MySQL Apache site here where I want htdig to index a database
of PHP3 auto-generated HTML pages.

The start page is a largish table containing cells that contain HREFs to
other single pages.

htdig-3.0b2 doesn't get past the front page, whereas 3.0b1 does - but it
only sees about 500 of the (currently) 700-odd HREF-URLs on that start page.

Running htdig -v -v -v doesn't give any indication why either b1 or b2 is
having problems (same htdig.conf of course), except that b1 screws up one
link, claiming it reads "...id=574&amp;" when it actually reads
"...id=574&". This is definitely not correct, as I've dumped all the pages
using wget and the string "amp" doesn't show up where htdig claims it sees
it...
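
For anyone wanting to repeat the check on the wget dump, here is a rough
sketch in Python of one way to count the HREFs on the start page and flag
any anchor whose raw source really contains "&amp". It is only an
illustration under assumptions: the class and function names are made up
for this example, and it relies on Python's standard html.parser rather
than on anything htdig does internally.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect href values from <a> tags, plus raw tags containing '&amp'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []          # href values after entity decoding
        self.amp_raw_tags = []   # raw <a ...> text that literally contains "&amp"

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.append(value)
                # get_starttag_text() returns the tag exactly as written
                # in the source, before any entity decoding.
                raw = self.get_starttag_text() or ""
                if "&amp" in raw:
                    self.amp_raw_tags.append(raw)


def summarize_links(html_text):
    """Return link counts for one page of HTML (hypothetical helper)."""
    collector = HrefCollector()
    collector.feed(html_text)
    collector.close()
    return {
        "total": len(collector.hrefs),
        "unique": len(set(collector.hrefs)),
        "amp_raw_tags": collector.amp_raw_tags,
        "hrefs": collector.hrefs,
    }
```

Feeding it the text of the saved start page (whatever name wget gave the
file) and comparing "total" against the number of URLs htdig reports
should show whether the missing links are in the HTML at all, and an empty
"amp_raw_tags" list would back up the claim that "amp" never appears in
the links.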

Any ideas? This is under Red Hat Linux 5.1.

-- 
Cheers

Jason Haar

Unix/Network Specialist, Trimble NZ Phone: +64 3 3391 377 Fax: +64 3 3391 417



