Re: [htdig] Data mining on Htdig DB


Subject: Re: [htdig] Data mining on Htdig DB
From: Geoff Hutchison (ghutchis@wso.williams.edu)
Date: Tue Dec 19 2000 - 13:46:11 PST


On Tue, 19 Dec 2000, Laurent wrote:

> I'm looking for a way to do some data analysis on htdig db. Unfortunately I
> have no good idea on how to do that. My point is to track changes over time
> in the language used on a series of specialized web sites.
[snip]
> some information like URL, document title etc. etc.

In the 3.1.x code, the db.wordlist file is already text. The -t flag to
htdig will dump an ASCII version of the document DB in a specified format:

<http://www.htdig.org/htdig.html>

In the 3.2 code, you can use the htdump program to generate these files.
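
Once you have a text word list from each crawl, comparing language use over time could look roughly like the sketch below. This is only an illustration, not part of ht://Dig: it assumes one entry per line with the word as the first whitespace-separated field, so check that against your own dump. The function names count_words and frequency_shift are made up for this example.

```python
from collections import Counter
from typing import Dict, Iterable


def count_words(lines: Iterable[str]) -> Counter:
    """Count occurrences of the first field on each line.

    Assumption: each line of the text word list starts with the word,
    followed by other whitespace-separated fields that are ignored here.
    """
    counts: Counter = Counter()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        counts[fields[0].lower()] += 1
    return counts


def frequency_shift(old: Counter, new: Counter) -> Dict[str, float]:
    """Return the change in relative frequency (new minus old) per word."""
    old_total = sum(old.values()) or 1  # avoid dividing by zero on empty input
    new_total = sum(new.values()) or 1
    return {
        word: new[word] / new_total - old[word] / old_total
        for word in set(old) | set(new)
    }
```

You would call count_words on an open file for each snapshot (for example, one saved word list per month), then sort the result of frequency_shift by value to see which words gained or lost ground between two crawls.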

Is this sufficient for your needs?

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/



