Re: [htdig] Pages still not all indexing.


Subject: Re: [htdig] Pages still not all indexing.
From: J. op den Brouw (MSQL_User@st.hhs.nl)
Date: Tue Sep 05 2000 - 08:09:07 PDT


A segmentation fault means that the digger (htdig) crashed
because it referenced an illegal memory address (e.g. it followed
a pointer to an address outside the memory it is allowed to use).

This is not good.

I'll bet it happened right in the 'J' page. htmerge
continues, but it cannot merge more than htdig
put in the database, so you'll miss pages.
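
To make that failure visible instead of letting the merge run on a
partial database, a wrapper can check how each step ended before
starting the next one. What follows is only a rough sketch in Python,
not how rundig itself works: describe_exit and run_steps are
hypothetical helper names, and it assumes a POSIX system, where a
child killed by a signal is reported with a negative return code.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exit status %d" % returncode


def run_steps(steps):
    """Run each command in order; stop at the first one that fails."""
    for cmd in steps:
        status = describe_exit(subprocess.run(cmd).returncode)
        print(" ".join(cmd), "->", status)
        if status != "ok":
            # Do not start the next step (e.g. the merge) on a partial result.
            return status
    return "ok"
```

You would call run_steps with the same command lines your rundig
script uses for htdig and htmerge, one list per command. If the
digger segfaults, the status should read "killed by SIGSEGV" and the
merge is never started.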

Are you running 3.1.5?

Is there a start point that we can use to index
your site? Maybe we will get a crash too (not good),
or maybe we can index your site without problems (also
not good, because you can't).

On Tue, 5 Sep 2000, Karen Reardon wrote:

> I upped the max_doc_size to 50000000 and got this response:
>
> ./rundig_ejournals[38]: 15102 Segmentation fault(coredump)
> htmerge: Total word count: 3102
> htmerge: Total documents: 37
> htmerge: Total doc db size (in K): 2451
>
> I can't find a reference to 'Segmentation fault' in the documentation on
> the web site...? Also, I still do not get any hits on Journals further down
> on the page, so this did not work...
> Should I run it with a -i to make the database from scratch?
>
> -karen reardon
>
>
> At 08:47 AM 9/3/00 -0500, Geoff Hutchison wrote:
> >At 8:14 AM -0400 9/3/00, Karen Reardon wrote:
> >>I cannot get the entire pages of the large pages to index.. for example,
> >>in the letter 'J' (the largest page), I have only about 1/3 of the page in
> >>the index. I have changed max_doc_size, up to 5000000 and I still don't
> >>get it. The page is a little over 1MB in page information in Netscape.
> >>Is there a parameter on how long HtDig will wait for a page to load? (I
> >>can't find one.)
> >
> >Looking at the description right now, it doesn't mention that the number
> >is in bytes. So if it's really over 1MB in size, then you're only pulling
> >in ~500K (actually a bit less). Try upping it again (say to ~5MB by adding
> >another zero).
> >
> >--
> >-Geoff Hutchison
> >Williams Students Online
> >http://wso.williams.edu/
> >
> >------------------------------------
> >To unsubscribe from the htdig mailing list, send a message to
> >htdig-unsubscribe@htdig.org
> >You will receive a message to confirm this.
> >List archives: <http://www.htdig.org/mail/menu.html>
> >FAQ: <http://www.htdig.org/FAQ.html>
>
>

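Since max_doc_size is given in bytes, the values mentioned in the
quoted messages are easy to misread by a factor of ten. A small
Python sketch of the arithmetic follows. The function names are made
up for illustration, and indexed_fraction assumes that the digger
simply stops reading a page at the limit, which is how I read Geoff's
remark, not something checked against the htdig source.

```python
def bytes_to_kib(n_bytes):
    """Convert a byte count to KiB (1 KiB = 1024 bytes)."""
    return n_bytes / 1024


def bytes_to_mib(n_bytes):
    """Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


def indexed_fraction(page_bytes, max_doc_size):
    """Fraction of a page that fits under a byte limit (1.0 if it all fits).

    Assumption: the page is cut off at max_doc_size bytes.
    """
    return min(1.0, max_doc_size / page_bytes)


# The limits that come up in this thread, all in bytes.
for limit in (500000, 5000000, 50000000):
    print(limit, "bytes =",
          round(bytes_to_kib(limit), 1), "KiB =",
          round(bytes_to_mib(limit), 2), "MiB")
```

So 500000 is a bit less than 500K, and 5000000 is a bit less than
5MB, which should be plenty for a page of a little over 1MB.
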
--jesse
--------------------------------------------------------------------
J. op den Brouw Johanna Westerdijkplein 75
Haagse Hogeschool 2521 EN DEN HAAG
Faculty of Engineering Netherlands
Electrical Engineering +31 70 4458936
-------------------- J.E.J.opdenBrouw@st.hhs.nl --------------------

Linux - because reboots are for hardware changes
