htdig: Searching PDF or Word Files


Shyam B S (shyambs@hotmail.com)
Mon, 11 Jan 1999 02:30:24 PST


Hi Everybody,

I am trying to index and search MS Word and PDF files. I am using catdoc
and acrobat as the external parsers for these documents. htsearch reports
the number of matches for the search word in the database, but doesn't
show the URLs of the documents in which they are found. For example, if I
search for the word "new", which is present only in a Word document, the
search returns the following page:

**********
[INLINE] Search results for 'new'
     
_____________________________________________________________________________________________________________

   Documents 1 - 1 of 1 matches. More * 's indicate a better match.
     
_____________________________________________________________________________________________________________
     
_____________________________________________________________________________________________________________

   Pages:
***********

The following code in DocumentDB.cc is returning NOTOK, and hence the
URLs are missing.

 if (dbf->Get(url, data) == NOTOK)
        return 0;
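
To see which key the lookup fails on, the same check can report the URL
before giving up. The following is only a minimal sketch under stated
assumptions, not the actual ht://Dig code: FakeDocDB, lookup() and the
numeric values of OK and NOTOK are hypothetical stand-ins that mirror the
shape of the dbf->Get(url, data) call above.

```cpp
#include <cassert>
#include <iostream>
#include <map>
#include <string>

// Assumed status codes; the real values in ht://Dig may differ.
enum Status { OK = 0, NOTOK = -1 };

// Hypothetical in-memory stand-in for the document database.
class FakeDocDB {
public:
    // Store a record under the given key (the document URL).
    void Put(const std::string &key, const std::string &value) {
        assert(!key.empty());
        records_[key] = value;
    }

    // Fetch a record; returns NOTOK when the key is not present.
    int Get(const std::string &key, std::string &value) const {
        std::map<std::string, std::string>::const_iterator it = records_.find(key);
        if (it == records_.end())
            return NOTOK;
        value = it->second;
        return OK;
    }

private:
    std::map<std::string, std::string> records_;
};

// Same shape as the check in the post, but it reports the failing key
// on stderr instead of returning silently.
bool lookup(const FakeDocDB &db, const std::string &url, std::string &data) {
    if (db.Get(url, data) == NOTOK) {
        std::cerr << "no document record for key: '" << url << "'\n";
        return false;
    }
    return true;
}
```

Quoting the key in the message makes stray whitespace or an unexpected
form of the URL visible, which is one possible reason for a lookup like
this to miss.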

Any idea why?

Thanks

Shyam




This archive was generated by hypermail 2.0b3 on Mon Jan 11 1999 - 07:23:14 PST