[htdig] infinite loop in doc2html.pl


Subject: [htdig] infinite loop in doc2html.pl
From: Terry Luedtke (LuedtkT@mail.nlm.nih.gov)
Date: Tue Sep 19 2000 - 06:52:36 PDT


Hello,

I ran into an infinite loop using doc2html.pl. When it parses a PDF document, it tries to reassemble hyphenated words. Unfortunately, I have documents that end with a dash, like "text-", so the loop spins forever looking for the other half of the word. At end of file <CAT> returns undef, but appending that to $_ still leaves a non-empty (true) string, so the "|| last" never fires. Adding a check for eof fixed it.

In sub try_text(), the original code is:

      while (<CAT>) {
        while ( m/[A-Za-z\300-\377]-\s*$/ && $set->{'hyph'}) {
          ($_ .= <CAT>) || last;
          s/([A-Za-z\300-\377])-\s*\n\s*([A-Za-z\300-\377])/$1$2/s;
        }

With the fix (the added line is marked with +):
      while (<CAT>) {
        while ( m/[A-Za-z\300-\377]-\s*$/ && $set->{'hyph'}) {
          ($_ .= <CAT>) || last;
          s/([A-Za-z\300-\377])-\s*\n\s*([A-Za-z\300-\377])/$1$2/s;
+          last if eof;
        }
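
For anyone who wants to reproduce this outside of doc2html.pl, here is a minimal standalone sketch of the same loop. It is not the code from the script: join_hyphens is a hypothetical helper name, the 'hyph' setting is left out, and an in-memory filehandle stands in for the CAT pipe. It tests the result of the read directly instead of calling eof afterwards, which is an alternative way to get the same exit condition.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of doc2html.pl): rejoin words that were
# hyphenated across line breaks, reading from an already-open filehandle.
sub join_hyphens {
    my ($fh) = @_;
    my @out;
    while (my $line = <$fh>) {
        # Keep pulling in lines while the buffer ends in "letter, dash".
        while ($line =~ m/[A-Za-z\300-\377]-\s*$/) {
            my $next = <$fh>;
            # At end of input the read returns undef; stop here instead of
            # appending nothing and testing the same buffer forever.
            last unless defined $next;
            $line .= $next;
            $line =~ s/([A-Za-z\300-\377])-\s*\n\s*([A-Za-z\300-\377])/$1$2/s;
        }
        push @out, $line;
    }
    return join '', @out;
}

# Sample input (made up for illustration) whose last line ends with a dash.
my $text = "hyphen-\nated words and trailing text-\n";
open(my $fh, '<', \$text) or die "cannot open in-memory handle: $!";
my $result = join_hyphens($fh);
close($fh);
print $result;
```

The first line is joined with the second, and the trailing "text-" is left alone once the input runs out.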

Terry Luedtke National Library of Medicine



