htdig: Indexing docs cached by proxy

Peter Stamfest
Mon, 4 Jan 1999 16:32:20 +0100 (CET)

Hi all!

I wonder whether htdig can easily index all the pages held in the cache
of a proxy.

This is not to be confused with using a proxy with htdig - I know this

I ask because I keep around 10000 URLs on my home machine via the
wwwoffle caching HTTP proxy. The URLs and their corresponding data are
stored in simple files. One solution would be to hand URL/<path to
content-file> pairs to an indexer (with a hint not to follow links). Is
something like this possible with htdig?
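
To make the idea concrete, here is a rough Python sketch of how such
URL/<path to content-file> pairs could be produced. It is only an
illustration under assumptions: it supposes that each cached page is
stored as two files in the same directory, "U<key>" holding the URL and
"D<key>" holding the content, which should be checked against the real
cache layout. The names cache_pairs and write_pairs are made up for
this example, and it says nothing about how htdig would consume the
resulting list.

```python
import os
from typing import Iterator, Tuple


def cache_pairs(spool_dir: str) -> Iterator[Tuple[str, str]]:
    """Yield (url, content_path) pairs from a proxy cache directory.

    Assumed layout (an assumption, verify against the real cache):
    for every cached page there is a file "U<key>" holding the URL
    and a file "D<key>" in the same directory holding the content.
    """
    for dirpath, _dirnames, filenames in os.walk(spool_dir):
        for name in sorted(filenames):
            if not name.startswith("U"):
                continue
            data_path = os.path.join(dirpath, "D" + name[1:])
            if not os.path.isfile(data_path):
                continue  # URL file without a matching content file
            url_path = os.path.join(dirpath, name)
            with open(url_path, encoding="latin-1") as fh:
                url = fh.read().strip()
            if url:
                yield url, data_path


def write_pairs(spool_dir: str, out_path: str) -> int:
    """Write one "URL<TAB>path" line per cached page; return the count."""
    count = 0
    with open(out_path, "w", encoding="latin-1") as out:
        for url, path in cache_pairs(spool_dir):
            out.write(f"{url}\t{path}\n")
            count += 1
    return count
```

The resulting tab-separated list is the kind of input I have in mind
for an indexer that reads the content from the file but records the
URL, without following any links found in the page.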

I am currently in the process of finding an indexing/search engine for
this kind of application, and a quick survey of the htdig documentation
revealed nothing obvious.

If nothing like this exists, I am willing to implement something like
this, as it would be handy for me to find that-special-page-naming-the-

If at all possible, please cc me on your replies.


*  peter stamfest                  +-- i do believe what i say --+
** waun des wirrwarr weniga irr wa waun i wieda weniga wirr wa    
**                                                     (attwenger)

