Re: [htdig] Using htdig to "tidy" HTML


Subject: Re: [htdig] Using htdig to "tidy" HTML
From: Rzepa, Henry (h.rzepa@ic.ac.uk)
Date: Wed Jun 07 2000 - 03:18:42 PDT


>Geoff Hutchison wrote:
>
>
>I only can think of a two-step process here, which has Ht://Dig produce
>a URL logfile which is piped through sort | uniq and fed to the tidy
>program afterwards. A simple shell script which serves as an extension to
>the rundig script should do.

Thanks Geoff. We can report that generating a URL list and processing it as
described above works well. It's not even that slow, although one might baulk
at doing more than about 20,000 documents.
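
For anyone wanting to try the same thing, the two steps might be sketched roughly as below. This is only an illustration, not the script we actually ran: the function names, the file url_list.txt and the directory tidy_out are hypothetical, the URL log is assumed to hold one URL per line, and curl is assumed to be available for fetching each page before it is handed to tidy.

```shell
#!/bin/sh
# Sketch only: unique_urls, tidy_urls, url_list.txt and tidy_out are
# hypothetical names, not part of ht://Dig or tidy.

# Print each URL from the log file $1 exactly once (same effect as sort | uniq).
unique_urls() {
    sort -u "$1"
}

# Read URLs from stdin, fetch each one, and write a tidied copy plus a
# log of tidy's warnings into directory $1.
tidy_urls() {
    outdir=$1
    mkdir -p "$outdir" || return 1
    n=0
    while IFS= read -r url; do
        [ -n "$url" ] || continue          # skip blank lines
        n=$((n + 1))
        # curl -s: silent, -f: fail on HTTP errors.
        # tidy -q: quiet; with no file argument it reads HTML from stdin.
        # tidy exits non-zero on warnings, so its status is not treated as fatal.
        curl -sf "$url" < /dev/null | tidy -q > "$outdir/$n.html" 2> "$outdir/$n.log"
    done
}
```

It could then be run after the dig as `unique_urls url_list.txt | tidy_urls tidy_out`, leaving one numbered .html and .log file per unique URL.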

-- 

Henry Rzepa.
+44 (0)20 7594 5774 (Office)
+44 (0)20 7594 5804 (Fax)
Dept. Chemistry, Imperial College, London, SW7 2AY, UK.
http://www.ch.ic.ac.uk/rzepa/



