Re: [htdig] Digging problem, probably css-related?


Subject: Re: [htdig] Digging problem, probably css-related?
From: J. op den Brouw (msql@st.hhs.nl)
Date: Mon Sep 04 2000 - 23:38:38 PDT


I'm not sure, but parsing normally starts only after the
extension has been associated with a parser. As htdig
doesn't recognize .css, the parsing step should not be
started at all.

Correct me if I'm wrong.
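
To make the behaviour under discussion concrete, here is a minimal
Python sketch. It is not htdig's actual code: the function names, the
parser table and the substring test for exclude_urls are assumptions
made for illustration. It shows a URL being checked against the
exclusion patterns first, and an unknown content type such as text/css
falling back to plain text with the warning quoted below.

```python
# Illustrative sketch only; names and logic are assumptions, not htdig source.

# Hypothetical table of content types that have a parser associated.
PARSERS = {
    "text/html": "html",
    "text/plain": "text",
}


def is_excluded(url, exclude_patterns):
    """Assumed rule: skip the URL if any pattern occurs in it as a substring."""
    return any(pattern in url for pattern in exclude_patterns)


def choose_parser(content_type, parsers=PARSERS):
    """Return (parser_name, warning); unknown types fall back to plain text."""
    if content_type in parsers:
        return parsers[content_type], None
    warning = '"%s" not a recognized type. Assuming text' % content_type
    return "text", warning


if __name__ == "__main__":
    patterns = ["/cgi-bin/", ".cgi", ".css", "/css/"]
    url = "http://www.example.org/css/gaia.css"  # hypothetical URL
    if is_excluded(url, patterns):
        print("skipped:", url)
    else:
        print(choose_parser("text/css"))
```

Under these assumptions the stylesheet URL would never reach the parser
selection, because the ".css" pattern already matches it. That is why
the "Assuming text" message in the log is surprising.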

Thomas Rother wrote:
>
> "J. op den Brouw" wrote:
>
> > > "text/css" not a recognized type. Assuming text
> > > size = 943
> >
> > Htdig doesn't know what type text/css is. That is, it has no
> > parser associated with it.
> > By default it assumes text. If this is bad behaviour, it
> > should be fixed.
>
> What really makes me crazy is the fact that I have
>
> exclude_urls: /cgi-bin/ .cgi .css /css/ .htaccess suchdb/ Msgs mh.rsc
> suck htdig mhonarc
>
> but still the css is parsed! Does that mean I need to add an extra
> "external_parsers" statement, just for css? Hm, doesn't make sense,
> really...
> And what parser would then be appropriate? /usr/bin/less ;-)?
>
> The external parsers line has:
>
> external_parsers: application/rtf->text/html /usr/local/scripts/doc2html.pl \
>                   text/rtf->text/html /usr/local/scripts/doc2html.pl \
>                   application/pdf->text/html /usr/local/scripts/doc2html.pl \
>                   application/postscript->text/html /usr/local/scripts/doc2html.pl \
>                   application/msword->text/html /usr/local/scripts/doc2html.pl \
>                   application/Wordperfect5.1->text/html /usr/local/scripts/doc2html.pl
>
> The html has the following code:
>
> <html>
> <head>
> <base target="main_win">
> <link href="css/gaia.css" rel="stylesheet" type="text/css">
> <title>GAIA WEB INTERN Titelseite</title>
>
> Does a simple css link really kill the htdig?
>
> Thomas
> --
> -----------------------------------------------------------
> Thomas M. ROTHER -- 73728 Esslingen -- EU/Germany
> mailto:t.rother@netzwissen.de - http://www.netzwissen.de
> Public PGP Key auf http://www.keyserver.net
> -----------------------------------------------------------
>
> ------------------------------------
> To unsubscribe from the htdig mailing list, send a message to
> htdig-unsubscribe@htdig.org
> You will receive a message to confirm this.
> List archives: <http://www.htdig.org/mail/menu.html>
> FAQ: <http://www.htdig.org/FAQ.html>




This archive was generated by hypermail 2b28 : Mon Sep 04 2000 - 14:39:48 PDT