Subject: [htdig] Indexing stops after 100-200 documents
From: Martin Mielke (firstname.lastname@example.org)
Date: Tue Sep 19 2000 - 00:22:02 PDT
Hello again :-)
We have populated our intranet with a large number of documents in different
formats, such as .doc, .pdf, .txt, .html, etc.
The external parsers seem to work fine, as it's possible to search for words
within a M$ Word document, for example, and the dynamic indexes are
generated by our Apache server so we don't need to code an HTML page for
One of the recent oddities is the following: a large share of the documents are
the RFCs, which we need to keep locally. Although start_url points to
http://intranet/, a rundig -vvv finishes quite fast (after 100 or 200
documents) and there's no trace of the RFCs (mostly in plain text or PDF --
I converted the PostScript files). max_doc_size is set to 4000000.
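For reference, the two settings mentioned above would look like this in
htdig.conf. This is only a sketch of those two attributes; the rest of our
configuration (external parsers and so on) is not reproduced here.

```
# Only the two attributes mentioned in this message; values as stated above.
start_url:      http://intranet/
max_doc_size:   4000000
```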