Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Wed, 7 Jul 1999 10:25:11 -0500 (CDT)
According to S. Hayles:
> > How were you setting exclude_urls?
>
> I tried a variety of approaches. Initially I created a file starting:
>
> exclude_urls: /cgi-bin/ .cgi \
> /ad/gem/gem.html \
> /adultedu/gem/gem.html \
> /ad/ars1/ \
> /adultedu/ars1/ \
> /ad/info/ \
> /adultedu/info/ \
> /ad/rs50/rs50.html \
> /adultedu/rs50/rs50.html \
> /ad/rs50/index.html \
> /adultedu/rs50/index.html \
> /ad/jrs12/jrs12.html \
> /adultedu/jrs12/jrs12.html \
> /ad/adflag \
> /adultedu/adflag \
> /ad/test1.html~ \
> /adultedu/test1.html~ \
> /ad/test.html \
> /adultedu/test.html \
>
> and used
>
> include: file
>
> I also tried embedding the data in the config file, removing the
> backslashes and putting everything on one line, and including the file list
> using
>
> exclude_urls: `file`
>
> I never saw it reject any URL after the first 9, but in most cases it
> didn't seem to match anything beyond the first 2.
>
> If you can see no reason why it shouldn't work, I'll check everything
> and give it one more go.
If you're using the
exclude_urls: `file`
approach, then the URLs in the file should NOT all be on one line.
Each line gets folded every 1000 characters, so you want lines to remain
shorter than that. You should have one URL per line, and leave it to
the getFileContents method to rejoin the lines into one string.
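
As a rough illustration of the layout described above, here is a minimal
Python sketch. It is not ht://Dig's actual getFileContents code; the helper
names join_pattern_lines and long_lines are hypothetical, and the
1000-character limit is simply taken from the figure quoted in this message.
It shows a one-URL-per-line listing being rejoined into a single string, plus
a check for lines long enough to risk being folded.

```python
# Hypothetical helpers for illustration only; not part of ht://Dig.

def join_pattern_lines(text: str) -> str:
    """Rejoin a one-pattern-per-line listing into one space-separated string."""
    patterns = [line.strip() for line in text.splitlines() if line.strip()]
    return " ".join(patterns)


def long_lines(text: str, limit: int = 1000) -> list[int]:
    """Return 1-based numbers of lines whose length reaches the assumed limit."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) >= limit
    ]


# A listing with one URL pattern per line, as recommended above.
listing = """/cgi-bin/
.cgi
/ad/gem/gem.html
/adultedu/gem/gem.html
"""

joined = join_pattern_lines(listing)
print(joined)
print(long_lines(listing))
```

Running long_lines over the file before pointing exclude_urls at it is a
quick way to confirm that no single line comes near the limit.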
-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930