Re: Fw: [htdig] mutiple search results


Torsten Neuer (tneuer@inwise.de)
Wed, 27 Oct 1999 12:50:27 +0200


Kaspars wrote:
>
> This is the question I am interested in too:))
> -----------------------------------------------------------
> Kaspars
>
> > ----- Original Message -----
> > From: McCallum, Doug <DMcCallu@vtrlmel1.telstra.com.au>
> > To: <htdig@htdig.org>
> > Sent: Wednesday, October 27, 1999 5:41 AM
> > Subject: [htdig] mutiple search results
> >
> >
> > >
> > > Hi all,
> > > I was wondering if someone could provide us with some direction
> > > as to why, when htdig results are returned, they are multiple
> > > duplicates of the same file?
> > >
> > > thanks douglas mccallum.
> > >
> > >
> > > ------------------------------------
> > > To unsubscribe from the htdig mailing list, send a message to
> > > htdig@htdig.org containing the single word unsubscribe in
> > > the SUBJECT of the message.
> >

Possible reasons (which are all HTTP server related) include:

- The server is not case-sensitive with regard to URLs, and some
  hyperlinks to the same document differ only in case.
  See http://www.htdig.org/attrs.html#case_sensitive

- The server has multiple names (which are not different virtual
  hosts), causing documents to appear once for every server name.
  See http://www.htdig.org/attrs.html#server_aliases

- The documents are retrieved using GET with a session id as a
  URL parameter. In order to fix this, you will have to postprocess
  the result of the htsearch query with a wrapper script.

- You use symbolic links, causing the same document to be served
  under different names. In order to get around this problem, you
  will probably need to exclude the duplicate URLs from the dig.
  See http://www.htdig.org/attrs.html#exclude_urls
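
For the session id case, the wrapper's postprocessing step could
look roughly like the Python sketch below. It is only an
illustration, not part of ht://Dig: the parameter names in
SESSION_PARAMS and both function names are made up, and it assumes
the wrapper has already pulled the result URLs out of the htsearch
output into a list.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; replace them with whatever your
# server actually appends to its URLs.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_id(url):
    """Return url with any session id query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def dedupe_results(urls):
    """Keep the first hit for each URL once session ids are ignored."""
    seen = set()
    unique = []
    for url in urls:
        key = strip_session_id(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

Calling dedupe_results() on the list of result URLs returns each
document once, with the session id removed and in the original
ranking order.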

hth,
  Torsten

-- 
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14                            Tel: +49-4101-403605
D-25474 Ellerbek                            Fax: +49-4101-403606
E-Mail: info@inwise.de            Internet: http://www.inwise.de




This archive was generated by hypermail 2.0b3 on Wed Oct 27 1999 - 04:00:16 PDT