[htdig] Re: Beware of 3.1.0 for Sun-sparc


Frank Richter (Frank.Richter@hrz.tu-chemnitz.de)
Thu, 11 Feb 1999 16:52:53 +0100 (MET)


> your patch solves the problem.

Got another segmentation fault (after digging 40,000 docs) - this occurred
while parsing a Word doc
http://www.tu-chemnitz.de/wirtschaft/bwl2/download/portrait.doc
via the external parser parse_word_doc.pl.

I've no idea if this portrait.doc is ok, but our robust digger shouldn't
die on M$ docs... (I knew it, parsing Word docs must be dangerous... :-)

(BTW, contrib/htparsedoc/parse_word_doc.pl has errors - wrong line breaks)

(gdb) bt
#0  0x1b550 in Retriever::got_word (this=0xeffff6d8,
    word=0x10b8c9a "$J.\231\2049>\213:\031N\0162\2264\005\204vv\006\03182hkw",
    location=0, heading=272) at Retriever.cc:876
#1  0x1ee10 in ExternalParser::parse (this=0x435100, retriever=@0xeffff6d8,
    base=@0xca8d68) at ExternalParser.cc:168
#2  0x1a6e0 in Retriever::RetrievedDocument (this=0xeffff6d8, doc=@0x1eaaf0,
    ref=0x83de50) at Retriever.cc:556
#3  0x1a2ac in Retriever::parse_url (this=0xeffff6d8, urlRef=@0x44b788)
    at Retriever.cc:458
#4  0x19cf0 in Retriever::Start (this=0xeffff6d8) at Retriever.cc:288
#5  0x1e188 in main (ac=9, av=0xeffff8ec) at main.cc:245
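
The backtrace shows got_word being handed a "word" full of control bytes and
heading=272, both coming straight from the external parser's output. Whether
either value is what actually triggers the crash at Retriever.cc:876 is not
established here. As a rough sketch only, a guard along these lines could
reject such records before they reach the indexer. The function names and
both limits (kMaxWordLength, kMaxHeading) are hypothetical and are not taken
from the ht://Dig sources.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical limits chosen for illustration; not from the ht://Dig sources.
const std::size_t kMaxWordLength = 64;
const int kMaxHeading = 10;

// Returns true if `word` looks like indexable text rather than binary debris:
// non-empty, not absurdly long, and free of control characters.
bool plausible_word(const std::string &word)
{
    if (word.empty() || word.size() > kMaxWordLength)
        return false;
    for (std::string::size_type i = 0; i < word.size(); ++i) {
        // Cast first: passing a negative char to iscntrl is undefined.
        unsigned char c = static_cast<unsigned char>(word[i]);
        if (std::iscntrl(c))
            return false;
    }
    return true;
}

// Returns true if `heading` lies inside the assumed valid range.
bool plausible_heading(int heading)
{
    return heading >= 0 && heading <= kMaxHeading;
}

// Combined check for one word record emitted by an external parser.
bool accept_word_record(const std::string &word, int location, int heading)
{
    return location >= 0 && plausible_word(word) && plausible_heading(heading);
}
```

With these assumed limits, the record from frame #0 would be dropped twice
over: the word contains control bytes such as \031 and \005, and 272 lies
outside the assumed heading range.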

- Frank

-- 
Email: Frank.Richter@hrz.tu-chemnitz.de  http://www.tu-chemnitz.de/~fri/
Work:  Computing Services,  Chemnitz University of Technology,  Germany




This archive was generated by hypermail 2.0b3 on Wed Feb 17 1999 - 10:10:02 PST