Re: htdig: htmerge: Unable to open word list file....


Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Fri, 4 Dec 1998 16:00:43 -0600 (CST)


According to J. op den Brouw:
> I got the "htmerge: Unable to open word list file" error,
> and in my opinion it showed up when I hit a server that uses
> a robots.txt file and I could not get any files; then
> htmerge crashes because nothing was indexed.
>
> I only wanted to index that site, no other.

Yup, if htdig doesn't index anything at all, htmerge will give that error.
If you're only indexing that one site, and their robots.txt file blocks
htdig (or all robots), then htdig will get 0 documents, and htmerge will
have nothing to merge.
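
A quick way to confirm this is to test the site's robots.txt against the
"htdig" agent name. The rough sketch below uses Python's standard
urllib.robotparser module rather than htdig itself, so its matching rules
may differ in detail from htdig's. The robots.txt content, the
www.example.com URL, and the is_allowed() helper are all made up for
illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts out every robot, htdig included.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    start_url = "http://www.example.com/index.html"  # assumed start URL
    if is_allowed(ROBOTS_TXT, "htdig", start_url):
        print("htdig may fetch", start_url)
    else:
        print("htdig is blocked here, so a dig would come back with 0 documents")
```

To check a real site, you would feed in the text of its own /robots.txt
instead of the sample above.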

You should contact the administrators of the site you want to index (a
good courtesy to follow in any case), and ask them if they'll allow you
to index their site. If they allow access to the User-agent "htdig"
in their robots.txt file, you'll be able to index the directories to
which they give htdig access.

If they want to allow you in, but not just any old htdig client, you can
agree on some other user agent ID for you to use, which they'll put in
their robots.txt file. You can configure this ID in your htdig.conf file
with the "user_agent" parameter.
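
To illustrate what such an arrangement might look like on the site's side,
here is a small sketch, again using Python's urllib.robotparser as a
stand-in for a robot's robots.txt handling (htdig's own matching may
differ). The ID "example-indexer", the paths, and the may_fetch() helper
are hypothetical; the real ID is whatever you and the site's
administrators agree on.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one record for the agreed-upon ID,
# and a catch-all record that keeps every other robot out.
AGREED_ROBOTS_TXT = """\
User-agent: example-indexer
Disallow: /private/

User-agent: *
Disallow: /
"""


def may_fetch(agent, path, robots_txt=AGREED_ROBOTS_TXT):
    """Return True if `agent` may fetch `path` on the (assumed) example host."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "http://www.example.com" + path)


if __name__ == "__main__":
    # Compare the agreed ID with the stock "htdig" name on two sample paths.
    for agent in ("example-indexer", "htdig"):
        for path in ("/docs/page.html", "/private/notes.html"):
            verdict = "allowed" if may_fetch(agent, path) else "blocked"
            print(f"{agent:16} {path:22} {verdict}")
```

With a robots.txt like this, the matching htdig.conf entry would presumably
be a line such as "user_agent: example-indexer".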

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930


