Re: htdig: Re: ht://Dig and MSWord

Pirmin Kalberer
Mon, 25 May 1998 08:02:46 +0000

Richard Jones wrote:
> B. Jung wrote:
> >
> > Hi,
> > I read your posting "htdig: HTDIG: Searching Word files" from the Tue,
> > 15 Jul 1997.
> > I'm interested in the solution of this problem because I need this
> > feature too.
> > I would be glad to hear from you that you have solved the problem with
> > MS Word files.
> I'd be glad to hear that I had a solution too :-)
> Unfortunately, working out exactly how external parsers work
> was beyond my abilities & I gave up. The solution is definitely
> possible, using `catdoc' and a simple shell script. I suggest
> you maybe ask Andrew Scherpbier exactly how the external parsing
> mechanism works, and then you or I can work out how to connect
> up catdoc.

We convert our Winword and Excel files with a Perl script, which works
much better than catdoc. The three modules OLE-Storage, Unicode::Map
and Startup by Martin Schwartz are available on CPAN. There
is a description in the May issue of the German Unix magazine 'iX'.
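
For anyone who wants to try the catdoc route from the quoted message, here is a rough sketch of an external parser wrapper, written in Python rather than Perl or shell. It is not our script. The record layout (tab-separated fields, "t" for the title, "w" for word, location and heading level, location scaled to 0..1000) is an assumption based on my reading of the ht://Dig external_parsers documentation, so check it against your version. The names word_records, extract_text and main are made up for this example.

```python
import re
import subprocess


def word_records(text):
    """Yield one 'w' record per word: w<TAB>word<TAB>location<TAB>heading.

    The record layout and the 0..1000 location range are assumptions
    about the ht://Dig external parser format, not verified here.
    """
    words = re.findall(r"[A-Za-z0-9]+", text)
    total = len(words)
    for index, word in enumerate(words):
        # Scale the word position into the assumed 0..1000 range.
        location = index * 1000 // total
        # Heading level 0 means plain body text in this sketch.
        yield "w\t%s\t%d\t0" % (word.lower(), location)


def extract_text(path, converter=("catdoc",)):
    """Run the converter on the file and return its standard output as text.

    catdoc prints the extracted text to stdout; any other command that
    does the same can be passed in as `converter`.
    """
    result = subprocess.run(
        list(converter) + [path], stdout=subprocess.PIPE, check=True
    )
    # latin-1 never fails to decode; adjust to your converter's charset.
    return result.stdout.decode("latin-1", errors="replace")


def main(argv):
    """Hypothetical entry point: argv[1] is the file handed over by htdig."""
    if len(argv) < 2:
        return 2
    path = argv[1]
    # Use the file name as a stand-in title (assumed 't' record).
    print("t\t%s" % path)
    for record in word_records(extract_text(path)):
        print(record)
    return 0
```

To use it, call main(sys.argv) from a small script and point the external parser setting for the Word content type at that script. The conversion command is kept as a parameter so that catdoc can be swapped for a better converter without touching the record output.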


Pirmin Kalberer
Mueller Martini Logistik-Systeme AG, CH-8031 Zuerich
Phone: +41 1 279 13 90  Fax: +41 1 279 12 63

This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:26:18 PST