Subject: Re: [htdig] Getting URL names to show up in index.
From: Gilles Detillieux (firstname.lastname@example.org)
Date: Mon Jun 05 2000 - 11:29:58 PDT
According to Geoff Hutchison:
> On Mon, 5 Jun 2000 email@example.com wrote:
> > I hope this is a simple one. I am trying to have URL names show up in
> > search results. I have thousands of files that are in the following
> > format:
> > 123456_latest.pdf
> > I would like to get hits on 123456. The following is the way I have the
> > htdig.conf setup
> I think what you meant to say is that you want to *search* on parts of a
> filename. (You can already get URL names to show up in search
> results--this is part of the $(URL) variable).
> This has been requested a few times, but no one has offered anything in
> terms of implementation. It probably needs something in Retriever.cc after
> it gets through parsing a file to "parse" the URL.
> Personally, I'd put the string in your files somewhere (doesn't PDF have a
> "comments" or "keywords" portion?). This will also make it easier for other
> search engines or browsers to get the information.
Since PDFs must be converted or parsed with an external converter or
parser, it's a very easy matter to modify the external program to spit
out the file name in addition to the body text and/or title. htdig passes
the full URL to the external converter or parser as its 3rd argument.
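As a rough illustration of that idea, here is a minimal sketch of a wrapper converter in Python. Several things in it are assumptions, not something htdig prescribes: that the first argument is the path to the fetched file, that a pdftotext command is installed and writes to standard output when given "-" as the output name, and the helper names filename_terms and add_filename, which are made up for this example. It puts the file name and its pieces (123456, latest, pdf) on a line ahead of the converted text so they get indexed along with the body.

```python
import re
import subprocess
import sys
from urllib.parse import unquote, urlparse


def filename_terms(url):
    """Return the file name from a URL followed by its alphanumeric pieces."""
    path = unquote(urlparse(url).path)
    name = path.rsplit("/", 1)[-1]
    if not name:
        return []
    # Split on anything that is not a letter or digit, e.g. "_" and ".".
    parts = [p for p in re.split(r"[^0-9A-Za-z]+", name) if p]
    return [name] + parts


def add_filename(text, url):
    """Prepend the file-name terms to the converted text."""
    terms = filename_terms(url)
    if not terms:
        return text
    return " ".join(terms) + "\n" + text


if __name__ == "__main__":
    # Assumption: argv[1] is the local file to convert.
    # Per the message above, the 3rd argument is the full URL.
    infile, url = sys.argv[1], sys.argv[3]
    # Assumption: pdftotext is available; any converter that writes
    # plain text to stdout could be substituted here.
    result = subprocess.run(
        ["pdftotext", infile, "-"],
        capture_output=True, text=True, check=True,
    )
    sys.stdout.write(add_filename(result.stdout, url))
```

With a URL ending in 123456_latest.pdf, the extra line contains the whole file name plus "123456", "latest" and "pdf", so a search on 123456 should be able to match the document.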
-- 
Gilles R. Detillieux              E-mail: <firstname.lastname@example.org>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930