[htdig] Recursive Digging

Michael Reutlinger (mulchi@arago.de)
Mon, 21 Jun 1999 22:47:50 +0200 (METDST)


  I have another problem ;)

  Our website currently consists of
  3,500 HTML files.
  The files are cross-linked, so that you can
  reach the mailing list archives from many
  locations on the web server.

  When I try to index this server it takes a lot
  of time, and after about 1 h the page count
  from the -v debug output is at 35000.

  But we don't actually have that many files ...
  The problem is that htDig doesn't seem to realize
  that it has already seen a page.
  Wouldn't it be useful to exclude pages the engine
  has already seen in the same indexing run?
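
  To illustrate what I mean by skipping pages that were already
  seen, here is a minimal sketch in Python. It is not htDig's
  actual code; the names crawl and get_links are made up for the
  example, and get_links stands in for whatever fetches a page and
  returns its links. The idea is only that every link is reduced to
  one absolute URL and checked against a set before it is queued.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin


def crawl(start, get_links):
    """Breadth-first walk that visits each distinct URL only once.

    get_links is a hypothetical callback: given a URL, it returns
    the href values found on that page.
    """
    seen = {start}            # every URL that was ever queued
    queue = deque([start])
    order = []                # URLs in the order they were visited
    while queue:
        url = queue.popleft()
        order.append(url)
        for href in get_links(url):
            # Resolve relative links and drop any "#fragment" so that
            # different spellings of the same page compare equal.
            target, _fragment = urldefrag(urljoin(url, href))
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order
```

  With a check like this, a heavily cross-linked site of 3,500
  files should yield about 3,500 visits, no matter how many links
  point at each page. If the count still explodes, the same page is
  probably reachable under URLs that really differ (for example
  through query strings or alternative paths), and a plain set
  lookup would not catch that.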

  Besides that, does anyone have an idea for
  adding and deleting documents in the database
  "on the fly"?

  Thanks a lot for any help on these topics!

  Michael R. Reutlinger

      ! arago,                          Michael R. Reutlinger !
      ! Institut fuer komplexes         Project Management    !
      ! Datenmanagement GmbH            eMail: mulchi@arago.de!
      ! Fichtestr. 12                                         !
      ! 60316 Frankfurt am Main         http://www.arago.de   !
      ! Tel: +49-69-40568-0             Fax: +49-69-40568-111 !


This archive was generated by hypermail 2.0b3 on Mon Jun 21 1999 - 13:03:13 PDT