Geoff Hutchison (ghutchis@wso.williams.edu)
Sun, 16 May 1999 14:47:27 -0400
OK, I've been beating on the new regex code. The limits code seems to work
correctly, but I can't get excludes to work: nothing gets excluded.
At first, I thought it was a problem with '.cgi' or 'cgi-bin', with
these strings being handled incorrectly by the escaping. But even 'cgi'
or '99' doesn't exclude URLs containing those patterns.
For example:
    limit_urls_to: http://www\.htdig\.org/
        becomes -> 'http://www\.htdig\.org/'
    exclude_urls: 99 =
        becomes -> '99|='
(Correct, yes?)
Then here's the code:
    //
    // If the URL contains any of the patterns in the exclude list,
    // mark it as invalid
    //
    if (excludes.match(url, 0, 0) != 0)
    {
        if (debug >= 2)
            cout << endl << " Rejected: item in exclude list ";
        return(FALSE);
    }
But that statement never becomes true. In comparison, the limit code is:
    //
    // If any of the limits are met, we allow the URL
    //
    if (limits.match(url, 1, 0) != 0) return(TRUE);
All of this looks correct to me. Anyone have sharper eyes?
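One way to narrow this down is to test the combined pattern outside the crawler. The sketch below is only that: it assumes the regex class wraps POSIX regcomp()/regexec(), which is not shown in the code above, and url_matches() is a hypothetical helper, not part of the htdig source. It lets the same pattern be compiled with and without REG_EXTENDED, since in a POSIX basic regex the '|' in '99|=' is an ordinary character rather than alternation. Whether that is what happens here is a guess, but it would be consistent with a single-pattern limit working while a multi-pattern exclude matches nothing.

```cpp
// Sketch only: assumes the regex wrapper is built on POSIX regcomp()/regexec().
#include <regex.h>
#include <cassert>
#include <string>

// Hypothetical helper (not from the htdig source): returns true if
// `pattern` matches anywhere in `url`. `cflags` is handed to regcomp(),
// e.g. 0 for a basic regex or REG_EXTENDED for an extended one.
static bool url_matches(const std::string &pattern,
                        const std::string &url,
                        int cflags)
{
    regex_t re;
    // REG_NOSUB: only a yes/no answer is needed, no sub-match positions.
    if (regcomp(&re, pattern.c_str(), cflags | REG_NOSUB) != 0)
        return false;                       // pattern failed to compile
    bool hit = (regexec(&re, url.c_str(), 0, nullptr, 0) == 0);
    regfree(&re);
    return hit;
}
```

Calling url_matches("99|=", url, 0) and url_matches("99|=", url, REG_EXTENDED) on a URL that contains "99" shows whether the alternation is being honoured. If the two calls disagree, the flags passed when the exclude pattern is compiled would be the next place to look.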
-Geoff