Re: [htdig] ok, ok I mean non textual data mining !!


Geoff Hutchison (ghutchis@wso.williams.edu)
Sat, 15 May 1999 16:38:52 -0400


>This obviously requires changes in the database as to what is associated
>with a particular file.

Not really. After all, right now the database stores a whole variety of
information with each URL. So if you want to perform data mining on TIFF
files, you just have URLs to each TIFF with the relevant meta information
attached to each.

>That is what I am on about and looking for collaborators so that HTDig can
>cover the whole field of all files on a given system for data mining, not
>just those that lend themselves to text indexing and retrieval by the
>same.

Fair enough--one previous feature suggestion was to store information on
binary files as well as text files. However, you'll need to figure out how
to supply the meta information you want to ht://Dig. One possibility is to
set up an external parser for those data types that supplies the relevant
fields. For example, you could set up one for TIFF files that includes
titles looked up from another database.
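
As a rough illustration of the TIFF case, here is a minimal sketch of such an
external parser in Python. Several things in it are assumptions rather than
anything confirmed in this thread: that the parser is called with the file
path, content type, URL and config file as arguments (in that order), and that
it writes tab-separated records to standard output, with "t" for the title and
"w" for a word followed by a location (0-1000) and a heading level. Check the
external_parsers documentation for your ht://Dig version before relying on
this. The TITLES table, the example URL and build_records are hypothetical
stand-ins for "another database".

```python
import sys

# Hypothetical stand-in for the "other database" of titles, keyed by URL.
TITLES = {
    "http://example.com/scans/map.tif": "Survey map, sheet 12",
}


def build_records(url, titles):
    """Return tab-separated parser records for one binary file.

    Assumed record format (verify against your ht://Dig version):
      t <TAB> title
      w <TAB> word <TAB> location (0-1000) <TAB> heading level
    """
    # Fall back to the URL itself when no title is on record.
    title = titles.get(url, url)
    records = ["t\t" + title]

    # Index each word of the title so the file can be found by searching it.
    words = [w.strip(".,;:()").lower() for w in title.split()]
    words = [w for w in words if w]
    for i, word in enumerate(words):
        location = (i * 1000) // len(words)
        records.append("w\t%s\t%d\t0" % (word, location))
    return records


def main(argv):
    # Assumed argument order: file path, content type, URL, config file.
    url = argv[3]
    sys.stdout.write("\n".join(build_records(url, TITLES)) + "\n")


# Only act when called with the four arguments described above.
if __name__ == "__main__" and len(sys.argv) == 5:
    main(sys.argv)
```

The parser never opens the TIFF itself; everything it reports comes from the
lookup table, which is the point of the suggestion above. It would be hooked
up through the external_parsers attribute in the config file, pairing the
image/tiff content type with the path to wherever the script is installed.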

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/



