Re: [htdig] PDF indexing problem: Deleted, no excerpt

Subject: Re: [htdig] PDF indexing problem: Deleted, no excerpt
From: Geoff Hutchison
Date: Wed Aug 09 2000 - 13:30:57 PDT

On Wed, 9 Aug 2000, Mike Gardner wrote:

> HTDIG -v lists the PDF files & their size OK (ie looks as though
> indexing) however I don't see the '+--+--**' that you get for HTML
> files - is this a problem?

No. The +/-/* marks are indications of links in HTML files.

> So I assume that there's no indexable text as the PDF parsing failed
> (even though there were no error messages).

Some PDF files look like they contain text, but were created by a program
that only draws graphics. I'd certainly check the PS output that you mentioned for

> Or should I just install xpdf and try that?

This is the recommended way to index PDF files, though certainly if the
PDF is graphics and doesn't store text, there's not much you can do with
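
One quick way to tell whether a given PDF stores any text at all is to run
xpdf's pdftotext on it by hand and look at what comes out. Below is a rough
Python sketch of that check, not anything ht://Dig itself does. It assumes
pdftotext is installed and on the PATH; the function names and the
20-character threshold are made up for illustration.

```python
import shutil
import subprocess
import sys


def has_indexable_text(text: str, min_chars: int = 20) -> bool:
    """Return True if the text has at least min_chars letters or digits.

    The threshold is an arbitrary assumption; a graphics-only PDF
    typically yields little more than form feeds and whitespace.
    """
    return sum(ch.isalnum() for ch in text) >= min_chars


def extract_pdf_text(pdf_path: str) -> str:
    """Run xpdf's pdftotext on pdf_path and return the extracted text."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH (install xpdf)")
    # A "-" as the output file tells pdftotext to write to stdout.
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for path in sys.argv[1:]:
        verdict = "text found" if has_indexable_text(extract_pdf_text(path)) else "no text"
        print(f"{path}: {verdict}")
```

If this reports "no text" for a file, the PDF is most likely graphics only,
and switching parsers will not produce an excerpt for it.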

-Geoff Hutchison
Williams Students Online
