Re: htdig: External parser again


Geoff Hutchison (Geoffrey.R.Hutchison@williams.edu)
Wed, 07 Oct 1998 13:24:07 -0400


>In fact i do not understand why this seems to be so complicated.
>Htdig will be more customizable if it parses text files only, all other
>files being handled via external parsers. With something like that in
>htdig.conf:

Well, it would be more customizable if it handled external parsers well. But
converting every file straight to plain text may not be the best solution.

Many formats include graphics (which we may wish to keep track of), and
some formats now include hyperlinks and/or URLs. And what about metadata?
If I were to parse LaTeX documents, I'd want the title counted like the
title of an HTML document, etc.
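
To make that concrete, here is a rough Python sketch of what a parser might
hand back instead of bare text. Everything in it is hypothetical: the
ParsedDocument record, its field names, and the parse_latex helper are my own
illustration, not anything htdig implements or reads today, and the regexes
only cover the simplest LaTeX cases.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ParsedDocument:
    """Hypothetical record an external parser could return to the indexer."""
    title: str = ""
    text: str = ""
    links: list = field(default_factory=list)
    images: list = field(default_factory=list)


def parse_latex(source: str) -> ParsedDocument:
    """Toy LaTeX parser: extracts title, URLs and graphics, keeps the rest as text."""
    doc = ParsedDocument()
    title = re.search(r"\\title\{([^}]*)\}", source)
    if title:
        doc.title = title.group(1).strip()
    doc.links = re.findall(r"\\url\{([^}]*)\}", source)
    doc.images = re.findall(r"\\includegraphics(?:\[[^\]]*\])?\{([^}]*)\}", source)
    # Drop commands together with one optional [..] and one {..} argument.
    # This also discards argument text such as \emph{...}; fine for a sketch.
    body = re.sub(r"\\[a-zA-Z]+\*?(?:\[[^\]]*\])?(?:\{[^}]*\})?", " ", source)
    doc.text = " ".join(body.split())
    return doc
```

The point is only that title, links and images travel alongside the text, so
the indexer could weight a LaTeX title the same way it weights an HTML one.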

I'm not going to address these issues in htdig3 maintenance. However, I
think this is a great topic for htdig4 development. Feedback is always
welcome. :-)

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
