htdig: Htdig and wwwoffle


Andrew M. Bishop (amb@gedanken.demon.co.uk)
Thu, 27 Aug 1998 20:00:22 +0100


I am the author of WWWOFFLE, the World Wide Web OFFLine Explorer.

I would like to know if it is possible for you to modify the way that
htdig works so that it can be used with wwwoffle.

wwwoffle is a caching web proxy, distributed under the GPL, that is
designed for people with dial-up internet access. It caches every page
requested while online so that it can be retrieved again under its
original URL when offline. It also allows requests to be made while
offline; these are fetched the next time it goes online. There are
many other features, but they are not relevant to this question (see
http://www.gedanken.demon.co.uk/wwwoffle/).

I would like to offer users of wwwoffle the ability to search the
cached files. Rather than writing a search engine myself (which I
don't want to do), I would like to use an existing one. It seems to
me that htdig could fulfil this requirement, but some changes would
need to be made to both htdig and wwwoffle for this to work.

1) wwwoffle will need to provide a list of all URLs in the cache.

2) htdig would need to index only the URLs provided, and not follow
   links.

3) htdig would need to ignore robots.txt, because these files will
   not have been cached.

4) wwwoffle will need to provide the CGI interface to htdig.
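
To make points 1 to 3 concrete, here is a rough Python sketch of the
behaviour being asked for. It is not htdig or wwwoffle code: the names
index_cached_urls, search and fetch are hypothetical, and fetch stands
in for whatever mechanism would return a cached page for a given URL.
The indexer visits only the URLs it is handed, never follows the links
it finds in a page, and never requests robots.txt.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect text nodes only; href attributes are never read (point 2)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def index_cached_urls(urls, fetch):
    """Build a word -> set-of-URLs index from an explicit URL list.

    urls  : the list the cache would supply (point 1).
    fetch : hypothetical callable returning the cached HTML for a URL.
    Only the listed URLs are fetched, so no links are followed and
    robots.txt is never requested (points 2 and 3).
    """
    index = defaultdict(set)
    for url in urls:
        parser = _TextOnly()
        parser.feed(fetch(url))
        parser.close()
        text = " ".join(parser.chunks).lower()
        for word in re.findall(r"[a-z0-9]+", text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs whose pages contain every query word."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)
```

A real integration would presumably feed the URL list to htdig itself
rather than reimplement the indexing; the sketch only shows that the
three restrictions fit together in one simple loop.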

Is htdig still in development, and would these changes be possible?

-- 
Andrew.
----------------------------------------------------------------------
Andrew M. Bishop                             amb@gedanken.demon.co.uk
                                      http://www.gedanken.demon.co.uk/
----------------------------------------------------------------------


