Re: [htdig3-dev] Summary and patch for robots.txt


Subject: Re: [htdig3-dev] Summary and patch for robots.txt
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Wed Feb 09 2000 - 07:36:28 PST


According to Geoff Hutchison:
> At 9:55 PM +0200 2/8/00, Valdas Andrulis wrote:
> >So here is the fix (I think the code was meant to work this way; it's a
> >common error with if/else):
...
> Whoops! This is a good bug-fix. This is probably going to cause a
> number of problems with things like exclude_urls and limit_urls_to as
> well.

Yes, basically anytime case_sensitive was true, the pattern didn't get
compiled.
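
For readers without the patch at hand, the class of bug can be sketched as follows. This is only an illustration under assumptions: `PatternSketch`, `setBuggy` and `setFixed` are hypothetical names, the exact shape of the old if/else is a guess, and std::regex is used purely to keep the example self-contained rather than because HtRegex.cc uses it.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for a pattern wrapper; this is not the real HtRegex class.
class PatternSketch {
public:
    // Buggy shape: compilation only happens on the case-insensitive branch,
    // so case_sensitive == true leaves the pattern uncompiled.
    void setBuggy(const std::string& pattern, bool case_sensitive) {
        compiled_ = false;
        if (!case_sensitive) {
            re_.assign(pattern, std::regex::extended | std::regex::icase);
            compiled_ = true;
        }
        // Nothing is compiled when case_sensitive is true.
    }

    // Fixed shape: the setting only selects flags; compilation always happens.
    void setFixed(const std::string& pattern, bool case_sensitive) {
        std::regex::flag_type flags = std::regex::extended;
        if (!case_sensitive)
            flags |= std::regex::icase;
        re_.assign(pattern, flags);
        compiled_ = true;
    }

    bool isCompiled() const { return compiled_; }

    // An uncompiled pattern never matches, which is how such a bug would
    // silently disable any URL filter built on top of it.
    bool match(const std::string& text) const {
        return compiled_ && std::regex_search(text, re_);
    }

private:
    std::regex re_;
    bool compiled_ = false;
};
```

With a wrapper of this kind, a pattern set through `setBuggy(p, true)` never matches anything, which would be consistent with URL-filtering attributes appearing to be ignored.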

> As for the robots.txt, I think we want to stick to the first matching
> section. Any matching section overrides the *, but I think Gilles's
> code is what we want.

OK, I've committed Valdas's fix to HtRegex.cc, and my fix to Server.cc
to implement stricter enforcement of the first match rule.
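
As a rough illustration of one reading of the first match rule (not the code committed to Server.cc), the sketch below keeps the Disallow lines of the first section whose User-agent names us and falls back to the first `*` section only if no section names us. The function name `disallowedFor`, the case-insensitive substring match on the agent name, and the simplified handling of blank lines and comments are all assumptions.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

static std::string lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    return s.substr(b, s.find_last_not_of(ws) - b + 1);
}

// Hypothetical helper: Disallow paths from the first section naming `agent`,
// or from the first "*" section if no section names it.
std::vector<std::string> disallowedFor(const std::string& robots_txt,
                                       const std::string& agent) {
    enum Kind { NONE, NAMED, STAR };
    std::vector<std::string> named, star;
    bool named_taken = false, star_taken = false;  // first section of each kind claimed
    bool prev_was_agent = false;                   // consecutive User-agent lines share a section
    Kind current = NONE;
    const std::string me = lower(agent);

    std::istringstream in(robots_txt);
    std::string line;
    while (std::getline(in, line)) {
        line = trim(line.substr(0, line.find('#')));  // drop comments
        if (line.empty()) continue;                   // simplification: blanks do not end a section
        std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) continue;
        const std::string field = lower(trim(line.substr(0, colon)));
        const std::string value = trim(line.substr(colon + 1));

        if (field == "user-agent") {
            if (!prev_was_agent) current = NONE;      // a new section starts here
            prev_was_agent = true;
            if (value == "*") {
                if (current == NONE && !star_taken) { current = STAR; star_taken = true; }
            } else if (!value.empty() && me.find(lower(value)) != std::string::npos) {
                if (!named_taken) { current = NAMED; named_taken = true; }
            }
        } else {
            prev_was_agent = false;
            if (field == "disallow" && !value.empty()) {
                if (current == NAMED) named.push_back(value);
                else if (current == STAR) star.push_back(value);
            }
        }
    }
    return named_taken ? named : star;  // any named match overrides "*"
}
```

Under these assumptions, a later section that also names the robot is ignored, so a site's first section for the robot is the only one that applies.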

> I think this is the typical (and expected) format. If Loic's search
> turns up some interesting examples of other formats, we may want to
> consider a more liberal parser. I think we probably want to consider
> an Allow section, but it would be a bit tricky.

Well, I didn't see anything in his e-mail to suggest the more liberal
parsing is warranted, but I'm still open to counter-arguments. It's an
easy change in any case, so if anyone can make a good case for doing
things differently, please do so before 3.2.0b2 goes into pre-release.

> P.S. I'm currently quite swamped, so I will probably not be
> responding to much discussion--I don't want to rush off a response
> and stick my foot in my mouth!

Let's hope the load will be light in the next little while, because I
can't promise to pick up the slack. I've been neglecting other tasks
lately and I really need to get back to them.

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

This archive was generated by hypermail 2b28 : Wed Feb 09 2000 - 07:39:15 PST