[htdig] Problems indexing our intranet

Subject: [htdig] Problems indexing our intranet
From: Martin Mielke (martinm@people-com.com)
Date: Mon Sep 04 2000 - 09:25:30 PDT

Dear all,

I recently installed ht://Dig 3.1.5 under RedHat 6.2, including an
external parser for M$ Word and PDF files (conv_doc.pl). I edited the
htdig.conf file accordingly but (there's always an obscure reason...) it
does not index the complete intranet.

Some details:

start_url: http://intranet
limit_urls_to: ${start_url}
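
For context, a minimal sketch of how the conv_doc.pl hookup is usually
written in htdig.conf, based on the usage notes shipped with the script.
The /usr/local/bin path is an assumption for illustration, not
necessarily the path used on this machine:

```
# Crawl starts here; only URLs beginning with start_url are followed.
start_url:        http://intranet
limit_urls_to:    ${start_url}

# Hand Word, PostScript and PDF documents to conv_doc.pl, which converts
# them to HTML for indexing. The script location is an assumed path.
external_parsers: application/msword->text/html /usr/local/bin/conv_doc.pl \
                  application/postscript->text/html /usr/local/bin/conv_doc.pl \
                  application/pdf->text/html /usr/local/bin/conv_doc.pl
```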

Our intranet layout, for now, looks something like this:

                    | doc1 ... doc1.N
                    | doc2 ... doc2.N
                    | docN ... docN.N

Likewise, for now there is only an intranet/index.html, and the rest of the
(sub)directories just contain documents (txt, doc, pdf, ps...).
On a previous ht://Dig installation I did when I was working for another
company, I remember it was enough just to define 'start_url' and execute
rundig once to get the whole index built. Now I can only find what's defined
in the intranet/index.html file mentioned above... Therefore my question is:
must I create a complete index.html file for every directory I want ht://Dig
to index, or is it enough to define 'start_url' as above?
Maybe I'm overlooking something obvious... any help is welcome!

Thanks in advance!

Best regards,



