Re: [htdig] the mysterious "Deleted, no excerpt" problem


Subject: Re: [htdig] the mysterious "Deleted, no excerpt" problem
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Tue May 23 2000 - 07:46:47 PDT


According to Patrick Robinson:
> I don't know why BBEdit might have strewn text files with nulls, but I'm
> also not sure why htdig can't read those files. But I might suspect that
> there's a null-terminated string that contains the document. In my case,
> there was typically a null as the first byte of the file, which might make
> the file look empty. Is that it?

Yes, that's it exactly. If it is indeed valid to have null bytes
within an HTML file, then this is a bug in the HTML parser, and it
should be fixed to either strip them out before treating the string
as a null-terminated string, or to change them to something harmless.
If nulls are invalid within HTML, then it's BBEdit that's at fault.
In either case, I can't imagine what value these nulls could have, so it's
probably still a bug in BBEdit that is causing them to appear.
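
To illustrate the two fixes mentioned above, here is a minimal sketch in C++.
It is not taken from the htdig source; the helper names strip_nulls and
replace_nulls are hypothetical, and it assumes the parser has the document's
byte count available separately from the buffer. The point is only that the
nulls have to be dealt with while the real length is still known, before
anything treats the buffer as a null-terminated string.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from htdig): copy `len` bytes of the document
// and drop every NUL byte, so later C-string handling sees the whole text.
std::string strip_nulls(const char* data, std::size_t len) {
    assert(data != nullptr);
    std::string out(data, len);  // length-aware copy; embedded NULs survive here
    out.erase(std::remove(out.begin(), out.end(), '\0'), out.end());
    return out;
}

// Hypothetical alternative: keep byte offsets stable by turning each NUL
// into something harmless (a space by default) instead of removing it.
std::string replace_nulls(const char* data, std::size_t len,
                          char substitute = ' ') {
    assert(data != nullptr);
    std::string out(data, len);
    std::replace(out.begin(), out.end(), '\0', substitute);
    return out;
}
```

With a buffer whose first byte is a null, as in Patrick's files, strlen()
on the raw data reports a length of 0, which is why the document looks empty.
After either helper, c_str() on the result covers the full content. Stripping
shortens the text; replacing keeps every byte at its original offset, which
may matter if anything else records positions in the file.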

I should check the code to make sure this doesn't affect the external
parser handler, which may have to deal with files which can legitimately
contain nulls.

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
