Re: [htdig] line endings in robots.txt?


From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Thu Dec 02 1999 - 11:45:16 PST


At 10:38 AM -0800 12/2/99, nets@searchtools.com wrote:
>I just saw a report that ht://Dig has trouble with non-PC line
>endings in the robots.txt file -- it requires CR/LF.
>
>Is this current? I searched the site and could not tell. If so,
>could you add checks for just CR (Mac line end character) and just
>LF (Unix line end character)?

I'd be interested to know where you heard that report, as it's wrong.

I can point to the code that parses lines of robots.txt files, in
htdig/Server.cc::robotstxt():

...
     for (char *line = strtok(contents, "\r\n"); line; line = strtok(0, "\r\n"))
...

It splits on \r or \n, which are CR and newline (LF) as you mention.
In the case of a system with both characters as a line ending,
strtok() treats the adjacent CR and LF as a single separator, so you
get the line you'd expect and no empty "line" between the two
characters.

In short, it doesn't care whether there's a CR, LF, or both at the
end of a line.
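
If you want to check this yourself, here is a minimal sketch that runs
the same strtok() loop over a buffer. Note that split_lines() is a
hypothetical helper written for illustration, not code from ht://Dig,
and it copies the input into a std::string only because strtok()
modifies the buffer it scans.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not part of ht://Dig): split a buffer into lines
// using the same strtok() loop as the excerpt quoted above.
std::vector<std::string> split_lines(std::string text)
{
    std::vector<std::string> lines;

    // strtok() writes NUL bytes into the buffer, so we work on the
    // by-value copy. Any run of '\r' and/or '\n' counts as one separator.
    for (char *line = std::strtok(&text[0], "\r\n"); line;
         line = std::strtok(nullptr, "\r\n"))
    {
        lines.push_back(line);
    }
    return lines;
}
```

Feeding it "User-agent: *" and "Disallow: /cgi-bin/" separated by CR,
by LF, or by CR/LF should give the same two lines each time. Since
strtok() never returns an empty token, blank lines are skipped as well.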

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
