Re: [htdig] Pages still not all indexing.


From: Karen Reardon (karen.reardon@yale.edu)
Date: Tue Sep 05 2000 - 07:59:42 PDT


I upped the max_doc_size to 50000000 and got this response:

./rundig_ejournals[38]: 15102 Segmentation fault(coredump)
htmerge: Total word count: 3102
htmerge: Total documents: 37
htmerge: Total doc db size (in K): 2451

I can't find a reference to 'Segmentation fault' in the documentation on
the web site. Also, I still do not get any hits on journals further down
on the page, so this did not work.
Should I run it with -i to rebuild the database from scratch?

-karen reardon

At 08:47 AM 9/3/00 -0500, Geoff Hutchison wrote:
>At 8:14 AM -0400 9/3/00, Karen Reardon wrote:
>>I cannot get the large pages to index in their entirety. For example,
>>for the letter 'J' (the largest page), only about 1/3 of the page is in
>>the index. I have raised max_doc_size up to 5000000 and it still is not
>>fully indexed. Netscape's page information reports the page as a little
>>over 1MB. Is there a parameter for how long HtDig will wait for a page
>>to load? (I can't find one.)
>
>Looking at the description right now, it doesn't mention that the number
>is in bytes. So if it's really over 1MB in size, then you're only pulling
>in ~500K (actually a bit less). Try upping it again (say to ~5MB by adding
>another zero).
>
>--
>-Geoff Hutchison
>Williams Students Online
>http://wso.williams.edu/
>
>------------------------------------
>To unsubscribe from the htdig mailing list, send a message to
>htdig-unsubscribe@htdig.org
>You will receive a message to confirm this.
>List archives: <http://www.htdig.org/mail/menu.html>
>FAQ: <http://www.htdig.org/FAQ.html>
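
The byte arithmetic in the quoted reply can be sketched roughly as follows.
This is only an illustration in Python, not anything ht://Dig itself runs. It
assumes that retrieval simply stops once max_doc_size bytes have been read,
and it uses a made-up page size of 1,100,000 bytes to stand in for "a little
over 1MB". The function name indexed_fraction is hypothetical.

```python
def indexed_fraction(page_bytes: int, max_doc_size: int) -> float:
    """Share of a page that is read if retrieval stops at max_doc_size bytes.

    Assumption: the limit is a plain byte cutoff applied to the raw document.
    """
    if page_bytes <= 0:
        raise ValueError("page_bytes must be positive")
    if max_doc_size < 0:
        raise ValueError("max_doc_size must not be negative")
    return min(page_bytes, max_doc_size) / page_bytes


# Assumed size for a page that is "a little over 1MB" (illustrative only).
PAGE_BYTES = 1_100_000

# The limits mentioned in this thread, plus one a factor of ten smaller.
for limit in (500_000, 5_000_000, 50_000_000):
    share = indexed_fraction(PAGE_BYTES, limit)
    print(f"max_doc_size={limit:>10}: about {share:.0%} of the page is read")
```

Under these assumptions, a limit of 500000 covers a bit less than half of
such a page, while 5000000 or 50000000 already covers all of it. If the
page is still only partly indexed at those values, the size limit is
probably not the only thing cutting it short.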

This archive was generated by hypermail 2b28 : Tue Sep 05 2000 - 08:00:44 PDT