Re: [htdig] parsing PDF with NT


Subject: Re: [htdig] parsing PDF with NT
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Tue Mar 07 2000 - 08:20:19 PST


According to Stéphane Baudet:
> Well, to compile it, I just did it from the cygwin bash shell the classic
> unix way, that is: "sh configure" to use autoconf, and then "make". I didn't
> use "make install", but I copied the binaries and the htdig.conf file to the
> places where they belong. I compiled it on a Pentium Pro 200 server with
> NT Server 4.0. You must first install cygwin B20.1, which you can find at
> http://sourceware.cygnus.com/cygwin/ . Actually, only cygwin1.dll may be
> necessary.
> I zipped my files and also included my own htdig.conf, which other users may
> modify to set their own path for the external parser. In my search, I found
> a slightly modified parse_doc.pl named 01-parse_doc.pl, which works well
> with NT and xPDF 9.0. I also included pdfinfo.exe and pdftotext.exe from xPDF
> 9.0. I put my binaries in c:\opt\www\cgi-bin and c:\opt\www\htdig\bin . The
> zip file contains c:\opt\www\cgi-bin and c:\opt\www\htdig . NT users should
> change the database path in htdig.conf. I put it in c:\opt\www\htdig\db, but
> written in cygwin syntax. If the DB must be, for instance, in d:\www\mydb,
> just change the database dir line in htdig.conf to one of the following paths:
> d:/www/mydb or //d/www/mydb .
> I think that's all, but how can I upload my zip file to
> ftp://ftp.htdig.org/contrib ?
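
For readers adapting the database path as described in the quoted
message, here is a minimal Python sketch of the two path spellings
mentioned there. The function name to_cygwin_forms is hypothetical and
is not part of ht://Dig or Cygwin; it only rewrites the string and does
not check that the directory exists.

```python
import re


def to_cygwin_forms(win_path):
    """Return (forward-slash form, //drive form) for a Windows drive path.

    Hypothetical helper for illustration; it mirrors the two spellings
    given in the quoted message (d:/www/mydb and //d/www/mydb).
    """
    match = re.fullmatch(r"([A-Za-z]):\\(.*)", win_path)
    if match is None:
        raise ValueError("not an absolute drive path: %r" % win_path)
    drive = match.group(1).lower()
    # Swap backslashes for forward slashes and drop any trailing slash.
    rest = match.group(2).replace("\\", "/").rstrip("/")
    return "%s:/%s" % (drive, rest), "//%s/%s" % (drive, rest)


if __name__ == "__main__":
    for form in to_cygwin_forms(r"d:\www\mydb"):
        print(form)
```

Either spelling would then go on the database dir line of htdig.conf,
as the quoted message describes.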

You should be able to upload them to ftp://ftp.htdig.org/incoming/
and then e-mail Geoff or me to move them over to files/binaries/.

As for the 01-parse_doc.pl script, if I recall correctly, it's just the
same parse_doc.pl script with some paths changed, a hook added for dealing
with "nul" vs. /dev/null, suppression of really big words, and some
changed comments. At the risk of repeating myself, I feel I must again
strongly recommend conv_doc.pl over parse_doc.pl, because it just does
a better job. I'll probably add the /dev/null patch to conv_doc.pl, but
if you want to add that and fix the paths, and include the modified
conv_doc.pl script in your zip file, it would be a big help. In any
case, thanks for your efforts.
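
As an illustration only, the two behavioural changes described above
(the "nul" vs. /dev/null hook and the suppression of very long words)
might look like this in Python. The real scripts are Perl; the function
names and the 30-character cutoff are assumptions made for this sketch,
not values taken from 01-parse_doc.pl.

```python
import sys


def null_device(platform=sys.platform):
    """Pick the discard device: "nul" on native Windows, else "/dev/null".

    Under Cygwin, sys.platform is "cygwin" and /dev/null is available,
    so only native Windows builds get "nul".
    """
    return "nul" if platform.startswith("win") else "/dev/null"


def drop_big_words(text, max_len=30):
    """Remove whitespace-separated words longer than max_len characters.

    The cutoff of 30 is an assumed value chosen for the example.
    """
    return " ".join(word for word in text.split() if len(word) <= max_len)


if __name__ == "__main__":
    print(null_device())
    print(drop_big_words("short " + "x" * 40 + " words"))
```

Passing the platform as a parameter keeps the choice easy to check on
any machine, rather than hard-coding one device name in the script.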

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



