U.O. Telematica Municipale - Comune di Prato (tlm@po-net.prato.it)
Mon, 01 Feb 1999 13:22:31 +0100
At 23:35 29/01/99 -0400, you wrote:
>
>>It indexes the right documents, but then it keeps the old files in the
>>database too. Is there a way to erase from the db all the documents
>>matching the pattern specified in limit_urls_to or similar, so that a
>>real update becomes possible?
>
>What version are you using? This could be the bug we just fixed that leaves
>old files in the document database. Could you try the latest snapshot?
>
>-Geoff
>
Hi Geoff. I'm using version htdig-3.1.0b4.
What do you mean by "latest snapshot"?
Thanks
Gabriele
----------------------------------------------------------
U.O. Rete Civica - Comune di Prato
Via Ricasoli, 4 - 59100 Prato PO Italia
Tel. +39 0574616342 Fax +39 0574616003
http://www.comune.prato.it
E-Mail: tlm@mbox.comune.prato.it
----------------------------------------------------------
------------------------------------
To unsubscribe from the htdig3-dev mailing list, send a message to
htdig3-dev@htdig.org containing the single word "unsubscribe" in
the SUBJECT of the message.
From - Thu Feb 4 22:12:22 1999
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B5AE32.BeroList-2.5.9@sob.htdig.org>
In-Reply-To: <36B59C38.BeroList-2.5.9@sob.htdig.org>
References: <36B28F15.BeroList-2.5.9@sob.htdig.org>
<36B1A24C.BeroList-2.5.9@sob.htdig.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Mon, 1 Feb 1999 08:27:48 -0400
Subject: Re: [htdig3-dev] Updating only a part of the database
>What do you mean by "latest snapshot"?
Go to http://www.htdig.org/files/snapshots/ and download
htdig-3.1.0dev-013199.tar.gz. This is a snapshot of the latest development
code from the CVS tree. It was current as of 12:00AM EST Sunday morning.
-Geoff
From - Thu Feb 4 22:12:22 1999
From: "J. op den Brouw" <MSQL_User@st.hhs.nl>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B5A8B9.BeroList-2.5.9@sob.htdig.org>
Date: Mon, 1 Feb 1999 14:06:23 +0100 (MET)
X-Sender: msql@pluto
In-Reply-To: <36B28F14.BeroList-2.5.9@sob.htdig.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: Re: [htdig3-dev] Parsing Ms Word
On Fri, 29 Jan 1999, Geoff Hutchison wrote:
Well, the web server sends back the mime-type that
is configured for the extension .doc. The server doesn't
know what the contents are. WP docs should have
extensions like .wp or .wp5 or .wp<whatever>
catdoc should complain if the file is not a Word file.
In fact it does, but not always.
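One way to approach the detection question quoted below is to look at the first bytes of the file instead of trusting the extension or mime-type. The following is only a sketch under stated assumptions: it assumes the documents of interest are Word files stored in the OLE2 compound-file container, which begins with the 8-byte signature D0 CF 11 E0 A1 B1 1A E1. The function name looks_like_ole2 is hypothetical and not part of htdig or catdoc. Other OLE2 documents share the same signature, and older Word formats would not match, so a positive result only means "plausibly Word".

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of htdig or catdoc): returns true if the
// buffer starts with the 8-byte OLE2 compound-file signature.
bool looks_like_ole2(const unsigned char *data, std::size_t length)
{
    static const unsigned char kOle2Magic[8] =
        {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    // Reject null or too-short buffers before comparing.
    return data != nullptr &&
           length >= sizeof(kOle2Magic) &&
           std::memcmp(data, kOle2Magic, sizeof(kOle2Magic)) == 0;
}
```

A caller could run such a check on the retrieved document and skip the external parser when it fails, rather than handing an arbitrary file to catdoc.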
>
> >Fourth, catdoc sometimes fails dramatically when a non-Word
> >file ends with .doc and gets parsed by catdoc. It crashed
> >htdig at my place...
>
> Hmm. So the file was sent with the incorrect mime-type? Is there a way we
> can detect this easily?
>
> -Geoff
>
--jesse
--------------------------------------------------------------------
J. op den Brouw Johanna Westerdijkplein 75
Haagse Hogeschool 2521 EN DEN HAAG
Sector Techniek Netherlands
Afdeling Elektrotechniek +31 70 4458936
-------------------- J.E.J.opdenBrouw@st.hhs.nl --------------------
Linux - because reboots are for hardware changes
From - Thu Feb 4 22:12:22 1999
From: "J. op den Brouw" <MSQL_User@st.hhs.nl>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B5AFEE.BeroList-2.5.9@sob.htdig.org>
Date: Mon, 1 Feb 1999 14:37:06 +0100 (MET)
X-Sender: msql@pluto
In-Reply-To: <36B218B5.BeroList-2.5.9@sob.htdig.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: Re: [htdig3-dev] Odd comment...
Hmm, that was a long time ago. Which file is it in (bad English?).
I can't find it in my htdig source. By the way, Dutch is a beautiful
language.
On Fri, 29 Jan 1999, Andrew Scherpbier wrote:
>
> Gilles Detillieux wrote:
> >
> > According to Geoff Hutchison:
> > >
> > >
> > > >Nothing odd about it - just plain Dutch ;-)
> > > >
> > > >It translates roughly to: " something should be inserted here but I didn't
> > > >do that" (no reason given but maybe the context can give you a clue)
> > >
> > > OK, I feel ashamed. I *thought* I had picked up some Dutch from last summer.
> > >
> > > The context didn't tell me much. Now I'll need to figure out what
> > > "something" was. :-)
> >
> > I was wondering what that comment meant! It seems it first appeared in
> > 3.1.0b1. A lot of the revisions then were put in by "turtle", including
> > this one:
> >
> > // Revision 1.4 1998/06/21 23:20:09 turtle
> > // patches by Esa and Jesse to add BerkeleyDB and Prefix searching
> >
> > So maybe turtle, Esa or Jesse can explain?
> >
>
> turtle == Andrew Scherpbier == me...
>
> I remember putting in the patches that Esa and Jesse sent. Since Jesse is
> Dutch, I'd "blame" him! :-)
--jesse
--------------------------------------------------------------------
J. op den Brouw Johanna Westerdijkplein 75
Haagse Hogeschool 2521 EN DEN HAAG
Sector Techniek Netherlands
Afdeling Elektrotechniek +31 70 4458936
-------------------- J.E.J.opdenBrouw@st.hhs.nl --------------------
Linux - because reboots are for hardware changes
From - Thu Feb 4 22:12:22 1999
From: "J. op den Brouw" <MSQL_User@st.hhs.nl>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B5B54F.BeroList-2.5.9@sob.htdig.org>
Date: Mon, 1 Feb 1999 14:59:57 +0100 (MET)
X-Sender: msql@pluto
In-Reply-To: <36B218B5.BeroList-2.5.9@sob.htdig.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: Re: [htdig3-dev] Odd comment...
Please remove the comment. I don't know why it's there or
what it does. Apparently it does nothing.
Sorry. It dates from my old days as a student...
--jesse
--------------------------------------------------------------------
J. op den Brouw Johanna Westerdijkplein 75
Haagse Hogeschool 2521 EN DEN HAAG
Sector Techniek Netherlands
Afdeling Elektrotechniek +31 70 4458936
-------------------- J.E.J.opdenBrouw@st.hhs.nl --------------------
Linux - because reboots are for hardware changes
From - Thu Feb 4 22:12:22 1999
From: Andrew Scherpbier <andrews@contigo.com>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B5C2AF.BeroList-2.5.9@sob.htdig.org>
Sender: turtle@contigo.com
Date: Mon, 01 Feb 1999 07:04:23 -0800
Organization: Contigo Software
X-Mailer: Mozilla 4.5 [en] (X11; I; Linux 2.2.1 i686)
X-Accept-Language: en
MIME-Version: 1.0
References: <36B5AFEE.BeroList-2.5.9@sob.htdig.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Subject: Re: [htdig3-dev] Odd comment...
"J. op den Brouw" wrote:
>
> Hmm, it was a long time ago. Which file is it in (bad English?).
> I can't find it in my htdig src. By the way, Dutch is a beautiful
> language.
Hey! Nobody is denying that!
It is beautiful until you have to write it... :-) (Or maybe it was the
bad school I went to...)
--
Andrew Scherpbier <andrews@contigo.com>
Contigo Software <http://www.contigo.com/>
From - Thu Feb 4 22:12:22 1999
From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B5DE94.BeroList-2.5.9@sob.htdig.org>
Date: Mon, 1 Feb 1999 11:03:18 -0600 (CST)
In-Reply-To: <36B5A8B9.BeroList-2.5.9@sob.htdig.org> from "J. op den Brouw" at Feb 1, 99 02:06:23 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [htdig3-dev] Parsing Ms Word
According to J. op den Brouw:
> Well, the web server sends back the mime-type that
> is configured for the extension .doc. The server doesn't
> know what the contents are. WP docs should have
> extensions like .wp or .wp5 or .wp<whatever>
>
> catdoc should complain if the file is not a Word file.
> In fact it does, but not always.
>
> On Fri, 29 Jan 1999, Geoff Hutchison wrote:
> > >Fourth, catdoc sometimes fails dramatically when a non-Word
> > >file ends with .doc and gets parsed by catdoc. It crashed
> > >htdig at my place...
> >
> > Hmm. So the file was sent with the incorrect mime-type? Is there a way we
> > can detect this easily?
Improving the error checking in catdoc may be a solution, but the question in my mind was "why is any external parser able to take htdig down with it?" I took a look at htdig/ExternalParser.cc, and found some of its error checking to be less than bullet-proof. Some of the ands looked funny - I guess single &'s would work, but the right operator in this context is &&. I added lots of error checking for strtok's results. First of all, I didn't assume it can be called repeatedly after it returns a NULL, as that may be implementation dependent. I also made sure the return value was always checked for NULL before using it.
I don't use external parsers at my site, so could someone who uses them give this patch a try, please? I'd especially like to know if this solves the crashing problem reported by the person who started this thread. (Sorry, I don't have the original message, so I don't recall who this was.)
This was applied to the 012799 snapshot. If you want to apply it to 3.1.0b4, the last two bits will fail because they include the meta stuff that was added after the 011499 snapshot.
--- ./htdig/ExternalParser.cc.nullchk	Wed Jan 27 18:57:07 1999
+++ ./htdig/ExternalParser.cc	Mon Feb  1 10:30:09 1999
@@ -151,13 +151,19 @@
     while (readLine(input, line))
     {
         token1 = strtok(line, "\t");
+        if (token1 == NULL)
+            token1 = "";
+        token2 = NULL;
+        token3 = NULL;
         switch (*token1)
         {
             case 'w':   // word
                 token1 = strtok(0, "\t");
-                token2 = strtok(0, "\t");
-                token3 = strtok(0, "\t");
-                if ( token1!=NULL & token2!=NULL & token3!=NULL )
+                if (token1 != NULL)
+                    token2 = strtok(0, "\t");
+                if (token2 != NULL)
+                    token3 = strtok(0, "\t");
+                if (token1 != NULL && token2 != NULL && token3 != NULL)
                     retriever.got_word(token1, atoi(token2), atoi(token3));
                 else
                     cerr<< "External parser error in line:"<<line<<"\n";
@@ -165,17 +171,20 @@
 
             case 'u':   // href
                 token1 = strtok(0, "\t");
-                token2 = strtok(0, "\t");
-                url.parse(token1);
-                if (token1 != NULL & token2 != NULL )
+                if (token1 != NULL)
+                    token2 = strtok(0, "\t");
+                if (token1 != NULL && token2 != NULL)
+                {
+                    url.parse(token1);
                     retriever.got_href(url, token2);
+                }
                 else
                     cerr<< "External parser error in line:"<<line<<"\n";
                 break;
 
             case 't':   // title
                 token1 = strtok(0, "\t");
-                if (token1 != NULL )
+                if (token1 != NULL)
                     retriever.got_title(token1);
                 else
                     cerr<< "External parser error in line:"<<line<<"\n";
@@ -183,7 +192,7 @@
 
             case 'h':   // head
                 token1 = strtok(0, "\t");
-                if (token1 != NULL )
+                if (token1 != NULL)
                     retriever.got_head(token1);
                 else
                     cerr<< "External parser error in line:"<<line<<"\n";
@@ -204,7 +213,9 @@
                 else
                     cerr<< "External parser error in line:"<<line<<"\n";
                 break;
+
             case 'm':   // meta
+            {
                 // Using good_strtok means we can accept empty
                 // fields.
                 char *httpEquiv = good_strtok(token1+2, '\t');
@@ -315,6 +326,11 @@
                 }
                 else
                     cerr<< "External parser error in line:"<<line<<"\n";
+                break;
+            }
+
+            default:
+                cerr<< "External parser error in line:"<<line<<"\n";
                 break;
         }
     }
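The two ideas behind the patch (stop calling strtok once it has returned NULL, and combine the checks with && rather than &) can be sketched in isolation. This is a minimal standalone illustration, not the ht://Dig code: WordRecord, parse_word_line and the field names first and second are hypothetical, and the record layout (a "w" tag followed by a word and two numbers, tab-separated) is assumed from the calls in the patch.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical result type for one "w" record; not an ht://Dig class.
struct WordRecord {
    bool ok;
    std::string word;
    int first;
    int second;
};

// Parse "w<TAB>word<TAB>number<TAB>number" defensively.
// The line is taken by value because strtok() modifies its input.
WordRecord parse_word_line(std::string line)
{
    WordRecord rec{false, "", 0, 0};
    if (line.empty())
        return rec;

    char *type = std::strtok(&line[0], "\t");
    if (type == nullptr || *type != 'w')
        return rec;

    // Only ask for the next token while the previous one was found,
    // so strtok() is never called again after it has returned NULL.
    char *word = std::strtok(nullptr, "\t");
    char *num1 = word ? std::strtok(nullptr, "\t") : nullptr;
    char *num2 = num1 ? std::strtok(nullptr, "\t") : nullptr;

    // Logical && short-circuits; bitwise & would evaluate every operand.
    if (word != nullptr && num1 != nullptr && num2 != nullptr)
        rec = WordRecord{true, word, std::atoi(num1), std::atoi(num2)};
    return rec;
}
```

A malformed line then yields ok == false, and the caller can report it instead of dereferencing a null token.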
--
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
From - Thu Feb 4 22:12:22 1999
From: Marjolein Katsma <webmaster@javawoman.com>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B5EFF1.BeroList-2.5.9@sob.htdig.org>
X-Sender: javawoma@pop.javawoman.com
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.1
Date: Mon, 01 Feb 1999 18:50:39 +0100
In-Reply-To: <36B5C2AF.BeroList-2.5.9@sob.htdig.org>
References: <36B5AFEE.BeroList-2.5.9@sob.htdig.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Subject: Re: [htdig3-dev] Odd comment...
Andrew,
At 07:04 1999-02-01 -0800, you wrote:
>
>"J. op den Brouw" wrote:
>>
>> Hmm, it was a long time ago. Which file is it in (bad English?).
>> I can't find it in my htdig src. By the way, Dutch is a beautiful
>> language.
>
>Hey! Nobody is denying that!
>It is beautiful until you have to write it... :-) (Or maybe it was the
>bad school I went to...)
Yes, that must have been the school! ;-)
>
>--
>Andrew Scherpbier <andrews@contigo.com>
>Contigo Software <http://www.contigo.com/>
Marjolein Katsma
webmaster@javawoman.com
Java Woman - http://javawoman.com/
From - Thu Feb 4 22:12:22 1999
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B60DD4.BeroList-2.5.9@sob.htdig.org>
In-Reply-To: <36B60A12.BeroList-2.5.9@sob.htdig.org>
References: <36B6067C.BeroList-2.5.9@sob.htdig.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Mon, 1 Feb 1999 15:24:50 -0400
Subject: Re: [htdig3-dev] Final Push
>Installing individual programs...
>make[1]: Entering directory `/usr/local/htdig3/htfuzzy'
>transform=s,x,x,
>htfuzzy /usr/local/bin/`echo htfuzzy | sed ''`
>htfuzzy: '/usr/local/bin/htfuzzy' is not a supported algorithm
Thanks. I just checked in a fix to the Makefiles.
-Geoff
From - Thu Feb 4 22:12:22 1999
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B6067C.BeroList-2.5.9@sob.htdig.org>
Date: Mon, 1 Feb 1999 14:53:37 -0500 (EST)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: [htdig3-dev] Final Push
I'm hoping to put in a final push to iron out any remaining bugs or issues before Wednesday. If it seems all set for release, I'll roll a snapshot as a pre-release and we can run it through its paces to make sure we didn't miss something silly.
I'd like to thank everyone for a great effort in getting together a fantastic release. I think we've ironed out a lot of bugs, some never even reported. :-)
-Geoff
P.S. I get a sense that post-release people want to work on the databases. I'll include a starting question here... If we store a DocID field for every document in db.docdb, why do we have a separate db.docs.index file? Why don't we store documents in db.docdb by DocID rather than URL? ;-)
From - Thu Feb 4 22:12:22 1999
From: Egon Schmid <eschmid@stuttgart.netsurf.de>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B60A12.BeroList-2.5.9@sob.htdig.org>
Date: Mon, 1 Feb 1999 21:09:03 +0100 (MET)
In-Reply-To: <36B6067C.BeroList-2.5.9@sob.htdig.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: Re: [htdig3-dev] Final Push
Oh, I have pulled the recent CVS tree and get the following with 'make install':
Installing ht://Dig
Creating directories (if needed)...
Installing individual programs...
make[1]: Entering directory `/usr/local/htdig3/htfuzzy'
transform=s,x,x,
htfuzzy /usr/local/bin/`echo htfuzzy | sed ''`
htfuzzy: '/usr/local/bin/htfuzzy' is not a supported algorithm
make[1]: *** [install] Error 1
make[1]: Leaving directory `/usr/local/htdig3/htfuzzy'
make[1]: Entering directory `/usr/local/htdig3/htdig'
-Egon
On Mon, 1 Feb 1999, Geoff Hutchison wrote:
>
> I'm hoping to put in a final push to iron out any remaining bugs or issues
> before Wednesday. If it seems all set for release, I'll roll a snapshot as
> a pre-release and we can run it through its paces to make sure we didn't
> miss something silly.
>
> I'd like to thank everyone for a great effort in getting together a
> fantastic release. I think we've ironed out a lot of bugs, some never even
> reported. :-)
>
> -Geoff
>
> P.S. I get a sense that post-release people want to work on the databases.
> I'll include a starting question here... If we store a DocID field for
> every document in db.docdb, why do we have a separate db.docs.index file?
> Why don't we store documents in db.docdb by DocID rather than URL? ;-)
From - Thu Feb 4 22:12:22 1999
From: Randy Winch <gumby@cafes.net>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B62950.BeroList-2.5.9@sob.htdig.org>
Sender: randy@mail.cafes.net
Date: Mon, 01 Feb 1999 16:25:25 -0600
X-Mailer: Mozilla 4.08 [en] (X11; I; Linux 2.0.35 i686)
MIME-Version: 1.0
References: <36B6067C.BeroList-2.5.9@sob.htdig.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Subject: Re: [htdig3-dev] Final Push
Geoff Hutchison wrote:
> If we store a DocID field for
> every document in db.docdb, why do we have a separate db.docs.index file?
> Why don't we store documents in db.docdb by DocID rather than URL? ;-)
Seems to me that we need access by url to see if the document has already been indexed.
Randy
From - Thu Feb 4 22:12:22 1999
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B64372.BeroList-2.5.9@sob.htdig.org>
In-Reply-To: <36B62950.BeroList-2.5.9@sob.htdig.org>
References: <36B6067C.BeroList-2.5.9@sob.htdig.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Mon, 1 Feb 1999 18:28:16 -0400
Subject: [htdig3-dev] Databases (was Re: Final Push)
>Seems to me that we need access by url to see if the document has
>already been indexed.
Right. But let's say we keep a (temporary) url -> docID list/database while indexing...
Then a search request would make one lookup per document returned. As things stand now, a search request makes two lookups per document returned...
Hmm. If given a choice between slowing down the indexing (if at all) to speed up the search, or vice-versa, I'll choose the faster searching every time. Besides, with a URL -> docID list, it's only needed when indexing so you can delete it if pressed for space.
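The layout being proposed can be sketched with two in-memory maps. This is only an illustration under assumptions: DocStore, DocRecord and their members are hypothetical names, and std::map stands in for the real on-disk databases. The main store is keyed by DocID, so a search hit costs one lookup, while the URL -> DocID map is consulted only during indexing and can be dropped afterwards.

```cpp
#include <map>
#include <string>

// Hypothetical record; the real document database stores more fields.
struct DocRecord {
    std::string url;
    std::string title;
};

// Hypothetical store, with std::map standing in for the on-disk databases.
struct DocStore {
    std::map<int, DocRecord> by_id;        // main database, keyed by DocID
    std::map<std::string, int> url_to_id;  // only needed while indexing
    int next_id = 0;

    // Indexing: reuse the DocID if the URL was already seen, else assign one.
    int add_or_get(const std::string &url, const std::string &title)
    {
        auto it = url_to_id.find(url);
        if (it != url_to_id.end())
            return it->second;
        int id = next_id++;
        by_id[id] = DocRecord{url, title};
        url_to_id[url] = id;
        return id;
    }

    // Searching: a single lookup per returned document.
    const DocRecord *find(int id) const
    {
        auto it = by_id.find(id);
        return it == by_id.end() ? nullptr : &it->second;
    }

    // The URL map can be discarded once indexing is finished.
    void drop_url_index() { url_to_id.clear(); }
};
```

The trade-off is the one described above: indexing pays for maintaining the extra map, and searching no longer needs a URL step.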
-Geoff
From - Thu Feb 4 22:12:22 1999
From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B63ADC.BeroList-2.5.9@sob.htdig.org>
Date: Mon, 1 Feb 1999 17:36:58 -0600 (CST)
In-Reply-To: <36B5EFF1.BeroList-2.5.9@sob.htdig.org> from "Marjolein Katsma" at Feb 1, 99 06:50:39 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: [htdig3-dev] tiny typo...
Just a little typo I came across.
--- ./htdoc/attrs.html.docfix3	Fri Jan 29 15:16:41 1999
+++ ./htdoc/attrs.html	Mon Feb  1 17:35:05 1999
@@ -2793,7 +2793,7 @@
 	weight of words in any META description tags in a document.
 	The number may be a floating point number. See also the <a
 	href="#title_factor">title_factor</a> and <a href=
-	"#text_factor">text_factor</a>attributes.
+	"#text_factor">text_factor</a> attributes.
       </dd>
       <dt>
 	<em>example:</em>
--
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
From - Thu Feb 4 22:12:22 1999
From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Reply-To: htdig3-dev@htdig.org
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B63CA0.BeroList-2.5.9@sob.htdig.org>
Date: Mon, 1 Feb 1999 17:44:36 -0600 (CST)
In-Reply-To: <36B5EFF1.BeroList-2.5.9@sob.htdig.org> from "Marjolein Katsma" at Feb 1, 99 06:50:39 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: [htdig3-dev] patch to PDF.cc, to handle some strange PDFs I have here
I don't know where on the bug-fix/new-feature spectrum this fits, but I patched the PDF processing code to handle some PDFs we have here. They were generated by Acrobat PDFwriter from some Corel Draw files, and the PostScript files that acroread made from them did some strange stuff with character spacing - essentially it would commonly crank up the character spacing before putting out the last letter in a word, rather than using a space character or a positioning command. The result was that all the words were getting stuck together when indexing. This patch fixes this.
I'd really appreciate it if others could test this out with their PDF files to see if it breaks anything for them.
--- ./htdig/PDF.h.spacebug	Thu Jul 23 11:18:54 1998
+++ ./htdig/PDF.h	Mon Feb  1 14:48:15 1999
@@ -64,6 +64,13 @@
     // appended to _parsedString instead of parsing it.
     int _continueString;
 
+    // Sometimes the character spacing, as set by the Tc command, is set
+    // to a very high value, and is used to treat the characters in the next
+    // Tj as separate words. When this variable is true, text is appended
+    // to _parsedString with a space after each character, instead of as
+    // a single word.
+    int _bigSpacing;
+
     // String beeing read
     String _parsedString;
 
--- ./htdig/PDF.cc.spacebug	Tue Jan 26 18:27:52 1999
+++ ./htdig/PDF.cc	Mon Feb  1 17:15:13 1999
@@ -14,6 +14,7 @@
 #include "htdig.h"
 #include <htString.h>
 #include <StringList.h>
+#include <stdlib.h>
 #include <ctype.h>
 
 
@@ -24,6 +25,7 @@
 {
     _data = 0;
     _dataLength = 0;
+    _bigSpacing = 0;
     initParser();
 }
 
@@ -361,10 +363,17 @@
     else if (!strcmp(cmd, "Td") || !strcmp(cmd, "TD") ||
              !strcmp(cmd, "Tm") || !strcmp(cmd, "T*"))
     {
-        // Text positionning commands Td, TD, Tm and T* are condidered
+        // Text positioning commands Td, TD, Tm and T* are considered
         // as a word break (see PDF 1.2 spec, chapter 8.7.3)
         parseString();
     }
+    else if (!strcmp(cmd, "Tc"))
+    {
+        // Text positioning command Tc, with operand of 3 or more, seems
+        // sometimes to act as a word break between or after characters in
+        // the following Tj command. (E.g. PDFs generated from .cdr files.)
+        _bigSpacing = (atof(position) >= 3.0);
+    }
     else
     {
         // Other commands are not considered as a word break
@@ -415,6 +424,8 @@
             default:
                 _parsedString << (char)val;
             }
+            if (_bigSpacing)
+                _parsedString << ' ';
             // To do : handle more special characters
         }
 
@@ -436,6 +447,8 @@
                 default :
                     // Add the escaped character
                     _parsedString << *pos;
+                    if (_bigSpacing)
+                        _parsedString << ' ';
                     pos++;
                 }
             }
@@ -444,6 +457,8 @@
         {
             // Add character to the string
             _parsedString << *pos;
+            if (_bigSpacing)
+                _parsedString << ' ';
             pos++;
         }
     }
@@ -507,7 +522,7 @@
         //
         // Characters that are not part of a word
         //
-        if (!*position && isspace(*position))
+        if (*position && isspace(*position))
         {
             //
             // Reduce all multiple whitespace to a single space
@@ -555,5 +570,6 @@
 
     // Flush parsed string
     _parsedString = 0;
+    _bigSpacing = 0;
 
 }
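The effect of the _bigSpacing flag can be shown with a small standalone model. This is a sketch, not the PDF.cc code: TextCollector and its members are hypothetical, and only the threshold of 3.0 and the "space after each character" behaviour are taken from the patch. It models the case described above, where the spacing is cranked up just before the last letter of a word.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical model of the patched text collection; not the ht://Dig class.
struct TextCollector {
    bool big_spacing = false;
    std::string parsed;

    // Models the Tc operator: a character spacing of 3 or more is treated
    // as "every following character ends a word".
    void set_char_spacing(const char *operand)
    {
        big_spacing = std::atof(operand) >= 3.0;
    }

    // Models the Tj operator: append the text, adding a space after each
    // character while big spacing is in effect.
    void show_text(const std::string &text)
    {
        for (char c : text) {
            parsed += c;
            if (big_spacing)
                parsed += ' ';
        }
    }
};
```

Feeding it "som", then a spacing of 12.5, then "e", then a spacing of 0, then "word" collects "some word" instead of the run-together "someword".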
-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
------------------------------------
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Date: Mon, 1 Feb 1999 19:35:01 -0500 (EST)
Subject: Re: [htdig3-dev] patch to PDF.cc, to handle some strange PDFs I have here

On Mon, 1 Feb 1999, Gilles Detillieux wrote:
> I don't know where on the bug-fix/new-feature spectrum this fits, but I
> patched the PDF processing code to handle some PDFs we have here. They
> were generated by Acrobat PDFwriter from some Corel Draw files, and the
> PostScript files that acroread made from them did some strange stuff with
> character spacing - essentially it would commonly crank up the character
I would generally categorize this as a bug, right? I'm assuming the patch doesn't affect other PDFs?
-Geoff
------------------------------------
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Date: Tue, 2 Feb 1999 10:37:05 -0500 (EST)
Subject: Re: [htdig3-dev] Re: Buildroot + solaris 2.6 patch

> I'll definitely have a look at them. I prefer using RPM's whenever
> possible. It already struck me 'oddish' that no .spec files were in the
> htdig distrib (or even src.rpm's) but that may be silly ol' me ;)
I would be glad to put a .spec file in (contrib?) if people think it would be a good idea. I hesitate slightly, since there are lots of package formats and I don't intend to hold up the release so we can put in better support for everyone's favorite packaging scheme. :-P
-Geoff
------------------------------------
From: Ric Klaren <klaren@www.telin.nl>
Date: Tue, 2 Feb 1999 15:43:22 +0000
Subject: Re: [htdig3-dev] Buildroot + solaris 2.6 patch

Hi,
On Fri, Jan 29, 1999 at 09:18:46AM -0400, Geoff Hutchison wrote:
GF> If you grab the latest CVS snapshot or the CVS tree, you can see that I've
GF> fixed this using a pretty clean autoconf construct. Basically I check to
GF> see what type we can use as that parameter and set GETPEERNAME_LENGTH_T
GF> accordingly.
Knowing my knowledge about autoconf I trust you 110% on this =)
GF> I haven't tried development versions of egcs for stability reasons. I have
GF> enough with hunting down the bugs in ht://Dig :-). I think someone else is
GF> looking at that.
Grabbing one of the latest snapshots fixed practically all the troubles I had with the stable release and egcs and autoconf too =)
Ric
------------------------------------
From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Date: Tue, 2 Feb 1999 09:43:52 -0600 (CST)
Subject: Re: [htdig3-dev] patch to PDF.cc, to handle some strange PDFs I have here

According to Geoff Hutchison:
> On Mon, 1 Feb 1999, Gilles Detillieux wrote:
>
> > I don't know where on the bug-fix/new-feature spectrum this fits, but I
> > patched the PDF processing code to handle some PDFs we have here. They
> > were generated by Acrobat PDFwriter from some Corel Draw files, and the
> > PostScript files that acroread made from them did some strange stuff with
> > character spacing - essentially it would commonly crank up the character
>
> I would generally categorize this as a bug, right? I'm assuming the patch
> doesn't affect other PDFs?
The only other PDF on our web site is a table from a WordPerfect document. It was indexed fine before, and still is. However, as I have so few PDFs, I was hoping other users could test out this patch, to make sure it doesn't cause problems with other PDFs. As long as the units used for the Tc command in PDFs are consistent, it should not pose a problem, but I'd like some independent confirmation (i.e. testing) of this.
Maybe if the patch is included in the next snapshot, we can post a message to the whole htdig mailing list, asking for testers for this and a lot of other new changes/fixes, before going to final release.
By the way, another problem with these weird PDFs from CorelDraw files is that occasionally they'd insert a TD positioning command right in the middle of a word, leading htdig to break it into two words. That is not nearly as easy to fix, and as it doesn't do this very frequently, I'm not going to bother with trying to fix it.
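For anyone who wants to try the idea outside of ht://Dig, the effect of the Tc handling can be sketched in isolation. This is only an illustration under stated assumptions, not the code in PDF.cc: the TextOp struct and the extractWords function are hypothetical names, and the operators are assumed to be already tokenized. Only the operator names and the threshold of 3 come from the patch.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one PDF text operator and its operand.
struct TextOp {
    std::string cmd;      // e.g. "Tc", "Tj", "Td"
    std::string operand;  // numeric text for Tc, string contents for Tj
};

// Threshold taken from the patch: a Tc operand of 3 or more counts as "big".
const double kBigSpacingThreshold = 3.0;

// Walk the operators and build the text that would be handed to the indexer.
std::string extractWords(const std::vector<TextOp>& ops) {
    std::string out;
    bool bigSpacing = false;
    for (const TextOp& op : ops) {
        if (op.cmd == "Tc") {
            // Character spacing: remember whether it is large enough that the
            // characters of the following Tj should be treated as separate words.
            bigSpacing = std::atof(op.operand.c_str()) >= kBigSpacingThreshold;
        } else if (op.cmd == "Tj") {
            // Show text: append a space after each character when spacing is big.
            for (char c : op.operand) {
                out += c;
                if (bigSpacing) out += ' ';
            }
        } else if (op.cmd == "Td" || op.cmd == "TD" ||
                   op.cmd == "Tm" || op.cmd == "T*") {
            // Text positioning commands are treated as a word break.
            out += ' ';
        }
        // Any other command is not considered a word break.
    }
    return out;
}
```

With a Tc of 5 before a Tj of "ab", the characters come out separated; once Tc drops back to 0, a following Tj is appended as a single word again.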
-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
------------------------------------
From: Ric Klaren <klaren@telin.nl>
Date: Tue, 2 Feb 1999 15:52:32 +0000
Subject: [htdig3-dev] Re: Buildroot + solaris 2.6 patch

Hi,
On Fri, Jan 29, 1999 at 02:42:44PM -0600, Gilles Detillieux wrote:
GD> Neat idea. It seems, though, that you should also change the call to
GD> mkinstalldirs in Makefile.in, as well as all the lines that install
GD> stuff in COMMON_DIR and SEARCH_DIR, to use INSTALL_ROOT there too.
I missed those? doh!
GD> For the RPMs I put together, I used a different approach, which I
GD> borrowed from Mihai Ibanescu, who put together an RPM for ht://Dig
GD> 3.0.8b2. This involves patching CONFIG.in to use all the directories
GD> I want, and prefixing them with $(ROOT). Then, in the spec file,
GD> I can configure things like so:
Yup. I used that too before stumbling on this approach, but it got kind of messy when things (scripts) wound up with wrong paths in them.
GD> If you want to see how I've set up my rpms, you can see them, and the
GD> individual spec and source files, on my web site at:
GD>
GD> http://www.scrc.umanitoba.ca/htdig/rpms/
I'll definitely have a look at them. I prefer using RPM's whenever possible. It already struck me 'oddish' that no .spec files were in the htdig distrib (or even src.rpm's) but that may be silly ol' me ;)
Ric
------------------------------------
From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Date: Tue, 2 Feb 1999 10:51:50 -0600 (CST)
Subject: Re: [htdig3-dev] Re: Buildroot + solaris 2.6 patch

According to Ric Klaren:
> I'll definitely have a look at them. I prefer using RPM's whenever
> possible. It already struck me 'oddish' that no .spec files were in the
> htdig distrib (or even src.rpm's) but that may be silly ol' me ;)
I've been putting together RPMs for htdig since 3.1.0b1, and after suggesting it to Geoff, he's been putting them up on http://www.htdig.org/files/binaries/ since 3.1.0b2. The src.rpm for 3.1.0b4 is up there now, and when 3.1.0 is released, I'll be putting together the RPMs for it too.
I suppose including the .spec file right in the source tarball might not be a bad idea, but I think making the source RPMs available is more useful.
-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
------------------------------------
From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Date: Tue, 2 Feb 1999 11:14:38 -0600 (CST)
Subject: Re: [htdig3-dev] Parsing Ms Word

According to J. op den Brouw:
> Gilles Detillieux wrote:
> > According to J. op den Brouw:
> > > Well, the web server sends you a mime-type back that
> > > is configured for the extension .doc. The server doesn't
> > > know what the contents are. WP docs should have
> > > extensions like .wp or .wp5 or .wp<whatever>
> >
> > (Snip a lot...)
>
> Here is a WP 6 file that has a .doc extension. Try to index it
> and you'll see (I hope) that htdig crashes because catdoc
> sends back 8-bit characters...
>
> http://www.st.hhs.nl/htdig/cec3wp6.doc
OK, I grabbed the file, but I haven't set up catdoc on my system yet. That's why I was hoping you'd test out my patched version of ExternalParsers.cc for me. :) Your message doesn't make it clear if htdig still crashes after the patch is applied. If it does, I'd gladly look into it further. I don't spot anything in the code that would blow up on 8-bit characters, but that doesn't mean testing won't reveal something.
Just so I know I'm testing the same thing you are, which version of catdoc & htparsedoc are you running, and where can I get it? All I have is the stuff in contrib/htparsedoc, from Sept. 7.
Also, if you can get a backtrace from a core dump when htdig crashes, I'd like to see where it's happening. I can try to reproduce the problem here, but I'd like to know if what I try to find and fix is the same problem you're running into - these things are sometimes system dependent.
-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
------------------------------------
From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Date: Tue, 2 Feb 1999 13:58:57 -0600 (CST)
Subject: Re: [htdig3-dev] TODO update

According to Geoff Hutchison:
> OK, here's an update to the TODO file. I expect others may have additional
> suggestions. I'll briefly mention things I'd like to see in 3.2. First off,
> I'd like to see multiple transport protocols and this will most likely
> require the mime.types stuff Gilles was talking about (and not just for
> local files!).
Another thing I thought should be added to the TODO file would be a feature to allow chaining of parsers. E.g. if an external parser converts a file to HTML or PostScript, then the external parser handler should pass this on to the appropriate parser. This would require being able to identify file types by their content, which should be fairly easy for HTML and PostScript, but if you want to involve more file types that can be chained, you'd need support for something like an /etc/magic file. I guess another way this could be handled would be to come up with external parsers for HTML & PostScript that produce the codes used by ExternalParser.cc. Then the chaining could be done within shell or perl scripts.
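To make the "identify file types by their content" step concrete, here is a rough sketch of what such a check might look like for just HTML and PostScript. It is an assumption-laden illustration, not ht://Dig code: the ContentKind enum and sniffContent function are hypothetical names, the 512-byte window is arbitrary, and a real implementation would more likely consult a magic-number table as suggested above.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch.
enum class ContentKind { PostScript, Html, Unknown };

// Guess the type of a parser's output by looking at its first bytes.
ContentKind sniffContent(const std::string& data) {
    // Skip leading whitespace, which HTML generators often emit.
    std::size_t start = 0;
    while (start < data.size() &&
           std::isspace(static_cast<unsigned char>(data[start]))) {
        ++start;
    }

    // Only look at a small window (arbitrary size for this sketch).
    std::string head = data.substr(start, 512);

    // PostScript files conventionally begin with "%!".
    if (head.compare(0, 2, "%!") == 0) {
        return ContentKind::PostScript;
    }

    // HTML: compare case-insensitively for a doctype or an <html tag.
    std::transform(head.begin(), head.end(), head.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (head.compare(0, 14, "<!doctype html") == 0 ||
        head.find("<html") != std::string::npos) {
        return ContentKind::Html;
    }

    return ContentKind::Unknown;
}
```

A handler could call something like this on the output of an external parser and dispatch to the HTML or PostScript parser accordingly, falling back to the existing behaviour for Unknown.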
-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
------------------------------------
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Date: Tue, 2 Feb 1999 17:01:09 -0400
Subject: Re: [htdig3-dev] TODO update

>Another thing I thought should be added to the TODO file would be a
>feature to allow chaining of parsers. E.g. if an external parser
Several people have mentioned this in different language. Here's how I would say it:
* Allow "external decoders," programs to perform some action on files
  before parsing.
	* Compress, gzip, bzip2, zlib decoders
	* DVI, TeX -> PS decoders
-Geoff
------------------------------------
From: "U.O. Telematica Municipale - Comune di Prato" <tlm@po-net.prato.it>
Date: Wed, 03 Feb 1999 11:04:48 +0100
Subject: Re: [htdig3-dev] Updating only a part of the database

At 08.27 01/02/99 -0400, you wrote:
>Go to http://www.htdig.org/files/snapshots/ Download
>htdig-3.1.0dev-013199.tar.gz. This is a snapshot of the latest development
>code from the CVS tree. It was current as of 12:00AM EST Sunday morning.
>
>-Geoff
>
I did this, but it still doesn't work the way I would like.
In fact, if I re-run htdig without options to update only a part of the whole database, it works very well: it recognizes both the changed files and the ones that are no longer found. But I would like to run it with the "-a" option, which would let me keep the database searchable while the update is in progress.
Here it is:
HTDIG RUN WITHOUT OPTIONS
0:1:1:http://search.comune.prato.it:81/prova/2/2.htm: (changed) * size = 202
1:3:2:http://search.comune.prato.it:81/prova/2/4/4.htm: not changed
2:4:2:http://search.comune.prato.it:81/prova/2/5/5.htm: not found
3:2:1:http://search.comune.prato.it:81/prova/3/3.htm: not changed
4:0:-1:http://search.comune.prato.it:81/prova/home.htm: not changed
htdig: Run complete
htdig: 1 server seen:
htdig: search.comune.prato.it:81 5 documents
htdig: Errors to take note of:
Not found: http://search.comune.prato.it:81/prova/2/5/5.htm Ref:
htmerge: Sorting...
htmerge: doc #1 has been superceeded.
htmerge: Removing doc #4
htmerge: Merging...
As you can see, it removes the file that it doesn't find.
HTDIG RUN WITH "-a" OPTION
0:1:1:http://search.comune.prato.it:81/prova/2/2.htm: (changed) * size = 202
1:3:2:http://search.comune.prato.it:81/prova/2/4/4.htm: not changed
2:4:2:http://search.comune.prato.it:81/prova/2/5/5.htm: not found
3:2:1:http://search.comune.prato.it:81/prova/3/3.htm: not changed
4:0:-1:http://search.comune.prato.it:81/prova/home.htm: not changed
htdig: Run complete
htdig: 1 server seen:
htdig: search.comune.prato.it:81 5 documents
htdig: Errors to take note of:
Not found: http://search.comune.prato.it:81/prova/2/5/5.htm Ref:
htmerge: Sorting...
htmerge: Merging...
As you can see, it doesn't remove the file http://search.comune.prato.it:81/prova/2/5/5.htm
I hope this is useful to you.
Thanks and Ciao
Gabriele
----------------------------------------------------------
U.O. Rete Civica - Comune di Prato
Via Ricasoli, 4 - 59100 Prato PO Italia
Tel. +39 0574616342 Fax +39 0574616003
E-Mail: tlm@mbox.comune.prato.it
----------------------------------------------------------
------------------------------------
From: "U.O. Telematica Municipale - Comune di Prato" <tlm@po-net.prato.it>
Date: Wed, 03 Feb 1999 11:35:14 +0100
Subject: [htdig3-dev] How should I set the date for htnotify

Hi folks.
I once used htnotify by inserting into the documents the htdig-notification-date as follows:
<META NAME="htdig-notification-date" CONTENT="2/10/1999">
And it all worked.
But since I set the iso_8601 option to true, it doesn't work anymore.
I tried the yyyy/mm/dd format for the date, but nothing ...
Can you help me?
Ciao and thanx Gabriele
----------------------------------------------------------
U.O. Rete Civica - Comune di Prato
Via Ricasoli, 4 - 59100 Prato PO Italia
Tel. +39 0574616342 Fax +39 0574616003
http://www.comune.prato.it
E-Mail: tlm@mbox.comune.prato.it
----------------------------------------------------------
------------------------------------
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Date: Wed, 3 Feb 1999 10:29:47 -0400
Subject: [htdig3-dev] Release Status

Aside from the problem Gabriele reported, I think we're pretty set. I wanted to put in code to call sort -m when merging databases for a speed boost, but it's not worth holding up the release.
If people can send me documentation updates, I'd be psyched. Should I roll a pre-release with the docs slightly unfinished?
-Geoff
REPORTED SHOWSTOPPERS:
* Problems between running update digs with or without -a.

OTHER BUGS:
(none outstanding)

ISSUES:
* htdig usage message does not specify -l option for start and resume
  feature. (if -l is specified, on receiving a signal, htdig will write the
  urls it must process to the file config["url_log"] and resume from them if
  called with -l when restarted.)

DOCUMENTATION:
* htmerge for -m option to merge databases
* htdig for -l option for stop and restart feature
* attrs.html, cf_byname, cf_byprog for:
	url_log
	compression_level
	noindex_start
	noindex_end
	allow_in_form
	bad_querystr
	no_title_text
------------------------------------
From: Geoff Hutchison <ghutchis@wso.williams.edu>
To: Andrew Scherpbier <andrews@contigo.com>
Cc: "J. op den Brouw" <MSQL_User@st.hhs.nl>, Geoff Hutchison <Geoffrey.R.Hutchison@williams.edu>
Date: Wed, 3 Feb 1999 12:17:28 -0500 (EST)
Subject: Re: [htdig3-dev] Odd comment...

On Wed, 3 Feb 1999, Andrew Scherpbier wrote:
> > When is the first htdig conference?
> Ask Geoff, I guess. :-)
Good question. I think from the discussion on the list, the location would have to be Amsterdam or Den Haag, no? :-)
I guess it depends on travel arrangements. (Can I earn enough money to buy the ticket and spend a week or so?) Maybe this summer.
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
------------------------------------
From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Date: Wed, 3 Feb 1999 12:05:11 -0600 (CST)
Subject: Re: [htdig3-dev] How should I set the date for htnotify

According to U.O. Telematica Municipale - Comune di Prato:
> I once used htnotify by inserting into the documents the
> htdig-notification-date as follows:
>
> <META NAME="htdig-notification-date" CONTENT="2/10/1999">
>
> And it all went right.
>
> But since I have set the iso_8601 option to true, it doesn't work anymore.
>
> I tried with the yyyy/mm/dd format for the day, but nothing ...
>
> Can you help me?
Use yyyy-mm-dd instead. I wonder if this couldn't be made more flexible, regardless of the setting of iso_8601.
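As a rough idea of what "more flexible" could mean, a parser might accept either separator and work out the field order from the values. The sketch below is hypothetical and is not htnotify's actual code: the SimpleDate struct and parseFlexibleDate function are invented for illustration, it assumes the slash form with a small first number is month/day/year, and it only range-checks the fields rather than validating real calendar dates.

```cpp
#include <cstdio>
#include <string>

// Hypothetical plain date holder for this sketch.
struct SimpleDate {
    int year;
    int month;
    int day;
};

// Accept "yyyy-mm-dd", "yyyy/mm/dd", "m/d/yyyy" or "m-d-yyyy".
// Returns false if the text does not look like any of these.
bool parseFlexibleDate(const std::string& text, SimpleDate& out) {
    int a = 0, b = 0, c = 0;
    char sep1 = 0, sep2 = 0, extra = 0;

    // Exactly three numbers and two separators; a sixth conversion means
    // trailing characters, which this sketch rejects.
    int n = std::sscanf(text.c_str(), "%d%c%d%c%d%c",
                        &a, &sep1, &b, &sep2, &c, &extra);
    if (n != 5) return false;
    if (sep1 != sep2 || (sep1 != '-' && sep1 != '/')) return false;

    SimpleDate d;
    if (a > 31) {
        // First field cannot be a day or month, so treat it as the year.
        d.year = a; d.month = b; d.day = c;
    } else {
        // Assumed month/day/year order for the non-ISO form.
        d.month = a; d.day = b; d.year = c;
    }

    // Coarse range checks only (no per-month day limits, no leap years).
    if (d.year < 1 || d.month < 1 || d.month > 12 || d.day < 1 || d.day > 31) {
        return false;
    }
    out = d;
    return true;
}
```

With this approach both CONTENT="2/10/1999" and CONTENT="1999-02-10" would yield the same date regardless of the iso_8601 setting, at the cost of guessing the order for the non-ISO form.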
-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
------------------------------------
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Date: Wed, 3 Feb 1999 15:25:25 -0500 (EST)
Subject: [htdig3-dev] Re: Vunerability in ht://Dig: htnotify/VU#8258 (fwd)

I thought you guys should know about this. (Yes, I'm aware that this message will go to the publicly available archive.)
-Geoff
---------- Forwarded message ----------
Date: Tue, 02 Feb 99 22:32:43 EST
From: "CERT(R) Coordination Center" <cert@cert.org>
To: Geoff Hutchison <ghutchis@wso.williams.edu>
Cc: "CERT(R) Coordination Center" <cert@cert.org>
Subject: Re: Vunerability in ht://Dig: htnotify/VU#8258
-----BEGIN PGP SIGNED MESSAGE-----
Geoff Hutchison <ghutchis@wso.williams.edu> writes:
>That's pretty close. To be completely precise, you can run arbitrary
>commands with the account running ht://Dig. (OK, I'm splitting hairs.)
Splitting hairs is fine by us. :-)
>See http://www.htdig.org/uses.html for some sites that use it. I would guess
>a lot of sites use it beyond those we know. It's in all the Linux
>distributions and available on sunsite, etc. The sites that deploy it are a
>rather heterogenous bunch. They range from corporate users to univerities to
>the FSF itself!
Okay; thanks for the pointer. I'll go check it out.
>I have posted public notices to the ht://Dig mailing list at
>htdig@htdig.org, in part because we released a version (3.1.0b4) that fixed
>the bug. The publically-available release notes mention the bug, though it
>does not give specific details. I have heard of no exploits based on this
>vulnerability.
We haven't gotten any incident reports that seem to involve it in any way.
>In the next week, we expect to release the final 3.1.0 release, which is
>also free from this bug. If the CERT advisory comes out after the 3.1.0
>release, we'd naturally advise all users to upgrade to that release.
Okay; when I'm back in the office (I'm at home now) I'll check our calendar, and send you a possible schedule. My guess is that Tuesday the 15th or Wednesday the 16th would be good. How does that sound?
Generally, we like to contact the OS vendors who may distribute the vulnerable software as well as the maintainers and distributors of the package itself. To your knowledge, what vendors distribute ht://Dig? Based on your comments above, I'll assume most of the Linux vendors do. Do you know if any of the commercial vendors do? Is there a version for NT, MacOS, OS/2, Novell, or other operating systems?
Finally, do you have a PGP key we can use to encrypt our communications with?
Thanks,
Shawn

- -- 
Shawn V. Hernan
CERT (R) Coordination Center
Software Engineering Institute
Carnegie Mellon University
Pittsburgh, PA USA 15213-3890
-----BEGIN PGP SIGNATURE----- Version: 2.6.2
iQCVAwUBNrfFenVP+x0t4w7BAQFHKwP/bocYm/A7VvaVJtHzm2HsUoa96B4/OTqK aDea+6lGhmbDh/pNcAos0fSOsvsetXgRPO+wvnXvzkx/dkZdVEnJimzv7bhYd3bf 7Bz4IIS3+5JV+za0T6KNDhaBrrFOL6ATNaBvc0dIICy+MF+3Gjd0wOfctG9UFRjD 0s53fV+s/zg= =MWiZ -----END PGP SIGNATURE-----
From - Thu Feb 4 22:12:24 1999
Return-Path: <grdetil@scrc.umanitoba.ca>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.8.8/Debian/GNU) with ESMTP id NAA16862
for <andrew@contigo.com>; Wed, 3 Feb 1999 13:21:23 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id NAA01734;
Wed, 3 Feb 1999 13:21:40 -0800 (PST)
From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B8BDE6.BeroList-2.5.9@sob.htdig.org>
Date: Wed, 3 Feb 1999 15:20:31 -0600 (CST)
Cc: htdig3-dev@htdig.org
In-Reply-To: <36B86C03.BeroList-2.5.9@sob.htdig.org> from "Geoff Hutchison" at Feb 3, 99 10:29:47 am
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [htdig3-dev] Release Status
According to Geoff Hutchison:
> Aside from the problem Gabriele reported, I think we're pretty set. I
> wanted to put in code to call sort -m when merging databases for a speed
> boost, but it's not worth holding up the release.
>
> If people can send me documentation updates, I'd be psyched. Should I roll
> a pre-release with the docs slightly unfinished?
I for one would like to see a pre-release in the next day or two, to test all the latest patches (or am I the only one still sending them in? :) Give us at least a day, though, to get in any last bug fixes, and documentation updates in the works right now.
> REPORTED SHOWSTOPPERS:
> * Problems between running update digs with or without -a.
The only explanation I could find for this problem, after looking through the code, is that with the -a, the db.wordlist.work file doesn't have any records in it beginning with "-" or "!", though I can't see anything in htdig that would prevent those records from appearing. Can Gabriele provide a dump of db.wordlist.work, after htdig but before htmerge, to confirm this? This would tell us whether to focus on htdig or htmerge as the cause of the problem.
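The check being asked for could be sketched as follows: a minimal tally of how many records in a dump start with "-" or "!". It assumes the dump is plain text with one record per line, which this thread does not confirm, and `PrefixCounts` and `count_prefixes` are names made up for the illustration, not part of ht://Dig.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Tally of records by their leading character (hypothetical helper type).
struct PrefixCounts {
    std::size_t minus = 0;  // lines starting with '-'
    std::size_t bang = 0;   // lines starting with '!'
    std::size_t other = 0;  // every other non-empty line
};

// Assumption: one record per line of plain text; empty lines are skipped.
PrefixCounts count_prefixes(std::istream& in) {
    PrefixCounts counts;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        if (line[0] == '-') ++counts.minus;
        else if (line[0] == '!') ++counts.bang;
        else ++counts.other;
    }
    return counts;
}
```

Running this over the dumps from a dig with -a and a dig without it, and comparing the two tallies, would show whether the "-" and "!" records are really missing in one case.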
> OTHER BUGS:
> (none outstanding)
Doesn't stop me from finding 'em! :) See below.
> ISSUES:
> * htdig usage message does not specify -l option for start and resume feature.
> (if -l is specified, on receiving a signal, htdig will write the urls it
> must process to the file config["url_log"] and resume from them if called
> with -l when restarted.)
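For readers unfamiliar with the stop and resume idea quoted above, here is a rough sketch of the save and restore half of it. It is not the htdig implementation: the one-URL-per-line log format and the names `save_pending` and `load_pending` are assumptions made for the example, and the signal handling itself is left out.

```cpp
#include <deque>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Write every URL still waiting to be processed, one per line (assumed format).
void save_pending(std::ostream& log, const std::deque<std::string>& pending) {
    for (const std::string& url : pending) {
        log << url << '\n';
    }
}

// Read the queue back in the same order, ignoring empty lines.
std::deque<std::string> load_pending(std::istream& log) {
    std::deque<std::string> pending;
    std::string url;
    while (std::getline(log, url)) {
        if (!url.empty()) pending.push_back(url);
    }
    return pending;
}
```

In a real dig the stream would be a file named by the url_log attribute; string streams are enough to try the round trip.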
Another issue is that the new --with-cgi-bin-dir stuff in configure has broken my current RPM support. I'm going to try to figure out a technique like Ric was suggesting earlier, which will require patches to the Makefile.in's. I hope to get this done today or tomorrow, and I'd like it in the final release if possible, but it's not a show-stopper. If need be, I'll just include my own patches in my source RPMs.
> DOCUMENTATION:
> * htmerge for -m option to merge databases
> * htdig for -l option for stop and restart feature
> * attrs.html, cf_byname, cf_byprog for:
>     url_log
>     compression_level
>     noindex_start
>     noindex_end
>     allow_in_form
>     bad_querystr
>     no_title_text
I was just looking through the no_title_text handling in htsearch/Display.cc, and noticed something that got dropped when this code was put in. It no longer handles the case where the URL has no "/", and leaves str uninitialized before it's used. This should fix it:
--- ./htsearch/Display.cc.noslashbug Fri Jan 29 12:55:08 1999
+++ ./htsearch/Display.cc Wed Feb 3 11:41:51 1999
@@ -277,8 +277,10 @@
                 {
                     title++; // Skip slash
                     str = new String(form("[%s]", title));
-                    // URL without '/' ??
                 }
+                else
+                    // URL without '/' ??
+                    str = new String("[No title]");
             }
             else
                 // use configure 'no title' text
An unlikely event to occur, sure, but I don't like to see bits of defensive programming being taken out, especially in light of all the problems we've been dealing with that cause segmentation faults.
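The same defensive idea, shown standalone: a sketch only, using std::string rather than the ht://Dig String class. `fallback_title` is a hypothetical name, and taking the text after the last "/" is an assumption about what the surrounding code in Display.cc does.

```cpp
#include <string>

// Build a bracketed display title from the last path component of a URL.
// When the URL contains no '/', return a fixed placeholder instead of
// leaving the result unset.
std::string fallback_title(const std::string& url) {
    std::string::size_type slash = url.rfind('/');
    if (slash != std::string::npos) {
        return "[" + url.substr(slash + 1) + "]";  // skip the slash itself
    }
    return "[No title]";  // URL without '/'
}
```

Every path through the function returns a value, which is the property the patch restores.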
-- 
Gilles R. Detillieux E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930
From - Thu Feb 4 22:12:24 1999
Return-Path: <ghutchis@wso.williams.edu>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.8.8/Debian/GNU) with ESMTP id OAA19863
for <andrew@contigo.com>; Wed, 3 Feb 1999 14:22:38 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id OAA01855;
Wed, 3 Feb 1999 14:22:32 -0800 (PST)
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B8CC36.BeroList-2.5.9@sob.htdig.org>
Date: Wed, 3 Feb 1999 17:21:24 -0500 (EST)
cc: htdig3-dev@htdig.org
In-Reply-To: <199902032120.PAA30316@cliff.scrc.umanitoba.ca>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: Re: [htdig3-dev] Release Status

> in? :) Give us at least a day, though, to get in any last bug fixes,
> and documentation updates in the works right now.
That was the plan. It may take that long to try it out on a variety of platforms. I just tried it under gcc-2.8/Solaris-2.6 and it won't compile!
> Doesn't stop me from finding 'em! :) See below.
Or me. Over the weekend I fixed a rather subtle bug in htfuzzy support--basically some fuzzy matching was previously case-sensitive!
> to the Makefile.in's. I hope to get this done today or tomorrow, and I'd
> like it in the final release if possible, but it's not a show-stopper.
> If need be, I'll just include my own patches in my source RPMs.
Oh well. I was hoping that stuff would *help* RPM and other package maintainers. I wasn't completely sure of the changes needed in the Makefile.in as suggested, so I figured I'd hold off for someone who knew better.
> An unlikely event to occur, sure, but I don't like to see bits of
> defensive programming being taken out, especially in light of all the
> problems we've been dealing with that cause segmentation faults.
My mistake. Marjolein's patches needed to be put in by hand, and this was left out inadvertently.
-Geoff
From - Thu Feb 4 22:12:24 1999
Return-Path: <grdetil@scrc.umanitoba.ca>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.8.8/Debian/GNU) with ESMTP id OAA21244
for <andrew@contigo.com>; Wed, 3 Feb 1999 14:51:30 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id OAA01919;
Wed, 3 Feb 1999 14:51:53 -0800 (PST)
From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B8D30A.BeroList-2.5.9@sob.htdig.org>
Date: Wed, 3 Feb 1999 16:50:41 -0600 (CST)
Cc: htdig3-dev@htdig.org
In-Reply-To: <36B8CC36.BeroList-2.5.9@sob.htdig.org> from "Geoff Hutchison" at Feb 3, 99 05:21:24 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [htdig3-dev] Release Status
According to Geoff Hutchison:
> > Doesn't stop me from finding 'em! :) See below.
>
> Or me. Over the weekend I fixed a rather subtle bug in htfuzzy
> support--basically some fuzzy matching was previously case-sensitive!
D'oh! That reminds me, I keep forgetting to mention this, but I noticed that even though the URLs used as keys for the db.docdb database are no longer mapped to lower case, the URLs used for tracking visited documents internally in htdig still are. I just thought I should mention that, as there may still be a problem with upper vs lower case conflicts in URLs, if anyone actually uses documents with the same name, but different case. I wouldn't call that a show-stopper, though.
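To make the conflict concrete, here is a small sketch (not htdig code) of a visited set whose keys are optionally folded to lower case. `to_lower` and `mark_visited` are hypothetical names, and the std::set stands in for whatever structure htdig actually uses.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Return a lower-cased copy of s.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Record url as visited; returns true only if it had not been seen before.
// With fold_case set, URLs differing only in case share one key.
bool mark_visited(std::set<std::string>& visited, const std::string& url, bool fold_case) {
    return visited.insert(fold_case ? to_lower(url) : url).second;
}
```

With folding on, Readme.html and README.html on the same host count as one document, so the second is never fetched even though the document database would now store them under separate keys.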
> > to the Makefile.in's. I hope to get this done today or tomorrow, and I'd
> > like it in the final release if possible, but it's not a show-stopper.
> > If need be, I'll just include my own patches in my source RPMs.
>
> Oh well. I was hoping that stuff would *help* RPM and other package
> maintainers. I wasn't completely sure of the changes needed in the
> Makefile.in as suggested, so I figured I'd hold off for someone who knew
> better.
Ultimately, it should help. I just have to adapt my stuff to use it. Right now, my spec file patches CONFIG.in to use the paths I want. That stopped working when you took some of the paths out of CONFIG.in, so now I have to pass them to configure. I think that will mean that I have to move the BuildRoot support from CONFIG to Makefile.in, as Ric was suggesting. I just have to find the time to figure it out.
-- 
Gilles R. Detillieux E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930
From - Thu Feb 4 22:12:24 1999
Return-Path: <ghutchis@wso.williams.edu>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.8.8/Debian/GNU) with ESMTP id PAA23052
for <andrew@contigo.com>; Wed, 3 Feb 1999 15:09:06 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id PAA01971;
Wed, 3 Feb 1999 15:09:24 -0800 (PST)
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B8D725.BeroList-2.5.9@sob.htdig.org>
Date: Wed, 3 Feb 1999 18:08:13 -0500 (EST)
cc: htdig3-dev@htdig.org
In-Reply-To: <36B8D30A.BeroList-2.5.9@sob.htdig.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: Re: [htdig3-dev] Release Status

On Wed, 3 Feb 1999, Gilles Detillieux wrote:
> internally in htdig still are. I just thought I should mention that, as
> there may still be a problem with upper vs lower case conflicts in URLs,
Hans-Peter checked in a patch for this. It was after the last snapshot, so you don't have it yet.
Cheers,
-Geoff
From - Thu Feb 4 22:12:24 1999
Return-Path: <ghutchis@wso.williams.edu>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.8.8/Debian/GNU) with ESMTP id QAA27544
for <andrew@contigo.com>; Wed, 3 Feb 1999 16:38:26 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id QAA02402;
Wed, 3 Feb 1999 16:38:39 -0800 (PST)
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B8EC1E.BeroList-2.5.9@sob.htdig.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Wed, 3 Feb 1999 19:35:55 -0400
Subject: [htdig3-dev] Prerelease Snapshot
Hi,
I think there are still some remaining problems to be worked out. I couldn't get the source to compile under gcc-2.8 on Solaris 2.6 this afternoon. :-( I also received a message from someone reporting a bus error on the latest (1/31) snapshot. Plus there's still the -a oddness...
Nevertheless, I made a "prerelease" snapshot. Heck, I even went so far as to change the version number. ;-)
Let's stomp some bugs...
-Geoff
From - Thu Feb 4 22:12:24 1999
Return-Path: <ghutchis@wso.williams.edu>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.8.8/Debian/GNU) with ESMTP id TAA01437
for <andrew@contigo.com>; Wed, 3 Feb 1999 19:31:19 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id TAA02983;
Wed, 3 Feb 1999 19:31:10 -0800 (PST)
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B91484.BeroList-2.5.9@sob.htdig.org>
Date: Wed, 3 Feb 1999 22:30:02 -0500 (EST)
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="-559023410-360206568-918097638=:22855"
Content-ID: <Pine.LNX.3.96.990203222452.11508C@wso.williams.edu>
Subject: [htdig3-dev] Re: [htdig] segmentation fault in htsearch (fwd)
This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info.
---559023410-360206568-918097638=:22855
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
Content-ID: <Pine.LNX.3.96.990203222452.11508D@wso.williams.edu>
Well I've been spending some time working out bugs upon bugs with Andrew here. First off, it seems we get segfaults on Solaris 2.6 in getdate, when we call the system function strftime! Fortunately I have an ultra running 2.6 I can test.
But when we fixed that, it died in the HTML parser, as seen here. This is clearly a showstopper. The HTML file is attached, though I'm betting on a database problem...
-Geoff
---------- Forwarded message ----------
Date: Wed, 3 Feb 1999 19:07:18 -0800 (PST)
From: Andrew Storrs <astorrs@cs.sfu.ca>
To: Geoff Hutchison <ghutchis@wso.williams.edu>
Subject: Re: [htdig] segmentation fault in htsearch
On Wed, 3 Feb 1999, Geoff Hutchison wrote:
> Well, I didn't mean that *specific* locale. See /usr/lib/locale for
> Solaris boxes. It varies considerably depending on US v. Canada. :-)
Okay thanks I'll figure it out.
> Wonderful! BTW, you're one of the most helpful bug-hunters yet. If you
> could send me the HTML file and a backtrace, I'd appreciate it. This
> signals a bug in the HTML parser. :-(
I have attached the HTML file. It's from one of the many sites hosted on the machine, so I'm not sure what part of the file is causing the problem; I only chose to search it because the site is very small and I could get error messages faster.
As for the backtrace:
#0 0x3959c in DocumentRef::Deserialize (this=0x174550, stream=@0x176cb5) at DocumentRef.cc:468
#1 0x3788c in DocumentDB::operator[] (this=0x174550, u=0xeffff370 "") at DocumentDB.cc:149
#2 0x323a8 in Retriever::got_href (this=0xeffff7c0, url=@0x163738, description=0x1703e8 "") at Retriever.cc:1083
#3 0x2e050 in HTML::do_tag (this=0x160540, retriever=@0xeffff7c0, tag=@0x164b89) at HTML.cc:929
#4 0x2cabc in HTML::parse (this=0x160540, retriever=@0xeffff7c0, baseURL=@0x103a) at HTML.cc:248
#5 0x311a0 in Retriever::RetrievedDocument (this=0xeffff7c0, doc=@0x11b0e8, ref=0x163b08) at Retriever.cc:674
#6 0x30d6c in Retriever::parse_url (this=0xeffff7c0, urlRef=@0x170a48) at Retriever.cc:576
#7 0x307b0 in Retriever::Start (this=0xeffff7c0) at Retriever.cc:406
#8 0x34c78 in main (ac=3, av=0xeffff9d4) at main.cc:275
---559023410-360206568-918097638=:22855
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII; NAME="errata.html"
Content-ID: <Pine.SO4.4.02.9902031907180.22855@geranium.css.sfu.ca>
Content-Description:
Content-Transfer-Encoding: BASE64
PEhUTUw+DQo8VElUTEU+IEdyZWF0IENhbmFkaWFuIFNjaWVudGlzdHMgQm9v ayBFcnJhdGEgPC9USVRMRT4NCjxCT0RZIEJHQ09MT1I9IiNmZmZmZmYiIEFM SU5LPSIjZmZmZmZmIiBWTElOSz0iI2ZmMDAwMCI+IA0KDQo8Q0VOVEVSPg0K PEJSPg0KPElNRyBTUkM9ImdyYXBoaWNzL2Jhbm5lci5naWYiPjxCUj4NCjwv Q0VOVEVSPg0KDQoNCjxIUiBOT1NIQURFPg0KDQo8aDE+DQpCb29rIEVycmF0 YSAmIEFkZGVuZGE8L2gxPg0KDQoNCkFsdGhvdWdoIGV2ZXJ5IGVmZm9ydCB3 YXMgbWFkZSB0byBiZSBhY2N1cmF0ZSwgZXJyb3JzIGFuZA0Kb21pc3Npb25z IGRvIG9jY3VyIGluIHRoZSBib29rLiBXZSBhcG9sb2dpemUgZm9yIGFueQ0K aW5jb252ZW5pZW5jZSBvciBlbWJhcnJhc3NtZW50IHRoaXMgbWlnaHQgaGF2 ZSBjYXVzZWQgYW5kIHdlIGFzayB5b3UgDQp0byB1c2Ugb3VyIDxBIEhSRUY9 ImZlZWRiYWNrLmh0bWwiPmZlZWRiYWNrIHBhZ2U8L0E+IHRvIHJlcG9ydCBh bnkgb3RoZXIgZXJyb3JzIHNvIHRoZXkgY2FuIGJlIGNvcnJlY3RlZCBpbiB0 aGUgbmV4dCBwcmludGluZyBvZiB0aGUgYm9vay48cD4NCg0KPGgyPg0KPEEg TkFNRT0iSGVhZGluZzMiPkVycmF0YTwvQT48L2gyPg0KDQpwLiAyOiAgVGhl IGZvbGxvd2luZyB2ZXJ5IGltcG9ydGFudCB0ZXh0IHdhcyBvbWl0dGVkIGZy b20gdGhlIGJhY2sgb2YgdGhlIHRpdGxlIHBhZ2UuICJBbGwgcGhvdG9zIGFy ZSBlaXRoZXIgYnkgQmFycnkgU2hlbGwgb3Igd2VyZSBwcm92aWRlZCBieSB0 aGUgcmVzcGVjdGl2ZSBzY2llbnRpc3RzIGZvciB0aGVpciBjaGFwdGVycyBl eGNlcHQgZm9yIHRoZSBwaG90byBvZiBKb2huIFBvbGFueWkgKHAuIDg4KSBi eSBCcmlhbiBXaWxsZXIsIE1hY2xlYW5zOyB0aGUgcGhvdG8gb2YgSm9kaSBL YWN6dXIgYW5kIEFybm9sZCBTY2h3YXJ6ZW5uZWdlciAocC4gMTE5KSBieSBM b3JyYWluZSBLYWN6dXI7IGFuZCB0aGUgaW1hZ2VzIG9mIE5HQyA0MjYxIGZy b20gdGhlIEh1YmJsZSBzcGFjZSB0ZWxlc2NvcGUgKHAuIDU5KSBieSBIb2xs YW5kIEZvcmQgYW5kIExhdXJhIEZlcnJlcmVzZSAoSm9obnMgSG9wa2lucyBV bml2ZXJzaXR5KSwgU3BhY2UgVGVsZXNjb3BlIFNjaWVuY2UgSW5zdGl0dXRl LCBhbmQgTkFTQS4iDQogPFA+DQpwLiAyOiAgTGF1cmEgV2FsbGFjZSBkaWQg bWFueSBvZiB0aGUgcGVuIGFuZCBpbmsgZnJlZWhhbmQgZHJhd2luZ3MgdGhh dCBhcHBlYXIgb24gdGhlIGNvdmVyIGFuZCB0aHJvdWdob3V0IHRoZSBib29r Lg0KIDxQPg0KcC4gMjk6ICBUaGVyZSBpcyBubyBuZWVkIGZvciB0aGUgZXhw ZXJpbWVudGVyIHRvIHJlY29yZCB0aGUgbnVtYmVyIG4uIFRoZSBudW1iZXIg biBpcyBqdXN0IGEgbmFtZSB0YWcgYW5kIGlzIG5vdCBwYXJ0IG9mIHRoZSBj YWxjdWxhdGlvbi4gVGhlIG51bWJlcnMgaDAsIGgxLCBoMiwgZXRjLiBhcmUg 
bGlrZSBzYXlpbmcgImhlaWdodCB6ZXJvIiwgImhlaWdodCBvbmUiLCAiaGVp Z2h0IHR3byIsIGV0Yy4gVGhlICJuIiBqdXN0IHRha2VzIHRoZSBwbGFjZSBv ZiB0aGUgMCwgMSwgMiwgLi4uIGV0Yy4gaW4gdGhlIGhuIG5vdGF0aW9uLiBU aGUgbnVtYmVyIG4gaXMgbm90IHVzZWQgaW4gdGhlIGNhbGN1bGF0aW9uLCBo ZW5jZSB0aGVyZSBpcyBubyBuZWVkIHRvIHJlY29yZCBpdC4gV2UgYWxyZWFk eSBnaXZlIHNwYWNlcyB0byB3cml0ZSBkb3duIGgwLCBoMSwgaDIsIGFuZCBo MyBhbmQgdGhlc2UgYXJlIHRoZSBvbmx5IG51bWJlcnMgeW91IG5lZWQuDQoN CiA8UD4NCnAuIDYzOiBJbiB0aGUgaWxsdXN0cmF0aW9uICgxLiBNZW50YWwg Um90YXRpb24gVGVzdCkgeW91IGNhbiBmaW5kIHR3byBtYXRjaGVzDQpmb3Ig dGhlIG9iamVjdCBvbiB0aGUgbGVmdCwgbm90IGp1c3Qgb25lIGFzIGl0IHNh eXMgaW4gdGhlIHRleHQuDQo8UD4NCg0KcC4gMTAyOiB1bmRlciBudW1iZXIg NSwgc29tZWhvdyB0aGUgbGFzdCBsaW5lIGdvdCBjdXQgb2ZmLiBUaGUgbWlz c2luZyB3b3JkcyBhcmUgInRha2UgYSBsb25nIHRpbWUgdG8gZ2V0IGV4YWN0 bHkgdGhlIHJpZ2h0IG11dGF0aW9uLiINCiA8UD4NCnAuIDEyMDogVGhlcmUg aXMgYSBudW1iZXIgKDcpIGluIHRoZSBpbWFnZSB0aGF0IHNob3VsZCBub3Qg YmUgdGhlcmUuDQo8UD4NCnAuIDEyNDogVGhlIGljb25zIGZvciBtYWxlIGFu ZCBmZW1hbGUgd2VyZSBhY2NpZGVudGFsbHkgdHJhbnNwb3NlZC4gVGhlDQpn cmFwaGljIGZvciBtYWxlIHNob3VsZCBiZSB0aGUgZ3JhcGhpYyBmb3IgZmVt YWxlIGFuZCB2aWNlIHZlcnNhLiBUaGV5IGFyZQ0KY29ycmVjdCB0aHJvdWdo b3V0IHRoZSBzaG9ydCBiaW9ncmFwaHkgc2VjdGlvbiwgaG93ZXZlci4NCg0K cC4gMTI4OiBBbGV4YW5kZXIgR3JhaGFtIEJlbGwgd2FzIGJvcm4gaW4gMTg0 Nywgbm90IDE4NzQuIA0KIDxQPg0KcC4gMTU3LTg6IExhaWRsZXIgcmVjZWl2 ZWQgaGlzIEIuQSBpbiAxOTM3IGF0IE94Zm9yZCwgd2hlcmUgaGUgbGF0ZXIg Z290IGFuIE1BICgxOTM4KSBhbmQgRFNjICgxOTQwKS4gRHVyaW5nIFdvcmxk IFdhciBJSSBMYWlkbGVyIHdvcmtlZCBpbiBDYW5hZGEgKHdpdGggYSBwZXJp b2QgaW4gRW5nbGFuZCksIGFuZCBlbmRlZCB1cCBhcyBhIENoaWVmIFNjaWVu dGlmaWMgT2ZmaWNlciBhdCB0aGUgQmFsbGlzdGljcyBMYWJvcmF0b3JpZXMg aW4gVmFsY2FydGllciwgUXVlYmVjLg0KIDxQPg0KcC4gMTY1OiBNZW50ZW4n cyBmaXJzdCBuYW1lcyB3ZXJlIE1hdWQgTGVvbm9yYSAobm90IE1hdWRlKS4g DQogPFA+DQpwLiAxNzA6ICBIb3dhcmQgUmFwc29uIGRpZWQgaW4gTWFyY2gs IDE5OTcuIA0KIDxQPg0KcC4gMTg2OiBILiBHLiBXZWxscyBzaG91bGQgYmUg dW5kZXIgVyBub3QgSCBpbiB0aGUgZ2xvc3NhcnkuDQogPFA+DQoNCjxoMj5B 
ZGRlbmRhPC9oMj4NCk1hbnkgb3RoZXIgQ2FuYWRpYW4gc2NpZW50aXN0cyBo YXZlIGNvbWUgdG8gbXkgYXR0ZW50aW9uIHNpbmNlIGJlZ2lubmluZyB0aGlz IHByb2plY3QuIEknbSBzb3JyeSB0aGV5IHdlcmUgb3Zlcmxvb2tlZCBpbiB0 aGUgYm9vay4gU29tZSB0aGF0IHdpbGwgYmUgaW5jbHVkZWQgaGVyZSBvbiB0 aGUgd2Vic2l0ZSBhbmQgaW4gc3Vic2VxdWVudCBlZGl0aW9ucyBhcmU6DQo8 VUw+DQo8TEk+Ui4gRi4gVy4gQmFkZXIgIChRdWFudHVtIHRoZW9yeSBvZiBh dG9tcyBpbiBtb2xlY3VsZXMpDQo8TEk+Tm9ybWFuIExldmkgQm93ZW4gKFBl dHJvbG9neSwgQm93ZW4gcmVhY3Rpb24gc2VyaWVzKQ0KPExJPlJlZ2luYWxk IEJ1bGxlciAoUGh5c2ljYWwgTXljb2xvZ2lzdC0tdGhlIEJ1bGxlciBkcm9w KQ0KPExJPkUuIEYuIEJ1cnRvbg0KPExJPkEuIEUuIERvdWdsYXMNCjxMST5G LiBFLiBKLiBGcnkgKEZpc2hlcmllcyBiaW9sb2d5KQ0KPExJPlJpY2hhcmQg R3JpZXZlIChNZXRlb3IgaW1wYWN0IHN0dWRpZXMpDQo8TEk+Q2FsdmluIEIu IEhhcmxleSAoQ2VsbHVsYXIgY2xvY2spDQo8TEk+Si5TLiBIYXJ0IChmaXJz dCB3b3JsZCBleHBlcnQgb24gaG93IGZpc2ggc3dpbSkNCjxMST5Kb2huIEwu IEhvbG1lcw0KPExJPkQuIEcuIEh1cnN0DQo8TEk+S2VpdGggVS4gSW5nb2xk DQo8TEk+SC4gRS4gSm9obnMNCjxMST5TdXNhbiBLaWVmZmVyICh2b2xjYW5v IHNwZWNpYWxpc3QpDQo8TEk+IEouIEtsZWluDQo8TEk+Vy4gQmVubmV0dCBM ZXdpcyAodGhlIGZhdGhlciBvZiB0aGUgQ0FORFUgcmVhY3RvcikNCjxMST5B bGJlcnQgRS4gTGl0aGVybGFuZA0KPExJPk90dG8gTWFhc3MNCjxMST5TdGV2 ZW4gUGlua2VyIChMaW5ndWlzdGljcyBhbmQgYXJ0aWZpY2lhbCBpbnRlbGxp Z2VuY2UpDQo8TEk+Sm9obiBQbGFza2V0dCAoQ2FuYWRhJ3MgZ3JlYXRlc3Qg YXN0cm9ub21lcikNCjxMST5Eb25hbGQgQS4gUmFtc2F5DQo8TEk+Vy4gRy4g U2NobmVpZGVyDQo8TEk+SnVhbiBTY2lhbmlvDQo8TEk+R29yZG9uIFNocnVt IChQaHlzaWNzKQ0KPExJPlBoaWxpcCBTZWVtYW4gKERvcGFtaW5lIHJlY2Vw dG9ycykNCjxMST5XaWxsZW0gU2llYnJhbmQNCjxMST5IYXJyeSBHLiBUaG9k ZQ0KPExJPkhlbnJ5IE1hcnNoYWwgVG9yeQ0KPExJPkhhcnJ5IEwuIFdlbHNo DQo8TEk+TWF4IFd5bWFuDQoNCjwvVUw+DQo8UD4NCklmIHlvdSBjYW4gc3Vn Z2VzdCBhbnkgbW9yZSwgcGxlYXNlIGRvLg0KPFA+DQoNCjxIUiBOT1NIQURF Pg0KDQo8Q0VOVEVSPg0KDQo8IS0tIEdDUyBCdXR0b24gQmFyIC0tPg0KPFA+ DQo8QSBIUkVGPSJob21lLmh0bWwiPjxJTUcgQk9SREVSPSIwIiBTUkM9Imdy YXBoaWNzL2J1dHRvbmJhci9ob21lcGFnZS5naWYiIEFMVD0iSG9tZXBhZ2Ui PjwvQT4NCjxBIEhSRUY9InNlYXJjaC5odG1sIj48SU1HIEJPUkRFUj0iMCIg 
U1JDPSJncmFwaGljcy9idXR0b25iYXIvc2VhcmNoLmdpZiIgQUxUPSJTZWFy Y2giPjwvQT4NCjxBIEhSRUY9ImZlZWRiYWNrLmh0bWwiPjxJTUcgQk9SREVS PSIwIiBTUkM9ImdyYXBoaWNzL2J1dHRvbmJhci9mZWVkYmFjay5naWYiIEFM VD0iRmVlZGJhY2siPjwvQT4NCjxBIEhSRUY9InByb2ZpbGVzLmh0bWwiPjxJ TUcgQk9SREVSPSIwIiBTUkM9ImdyYXBoaWNzL2J1dHRvbmJhci9wcm9maWxl cy5naWYiIEFMVD0iUHJvZmlsZXMiPjwvQT4NCjxBIEhSRUY9InJlZmVyZW5j ZS5odG1sIj48SU1HIEJPUkRFUj0iMCIgU1JDPSJncmFwaGljcy9idXR0b25i YXIvcmVmZXJlbmNlLmdpZiIgQUxUPSJSZWZlcmVuY2UiPjwvQT4NCjxBIEhS RUY9ImFzay8iPjxJTUcgQk9SREVSPSIwIiBTUkM9ImdyYXBoaWNzL2J1dHRv bmJhci9hc2suZ2lmIiBBTFQ9IkFzayBhIFNjaWVudGlzdCI+PC9BPg0KPEEg SFJFRj0iZ2FtZXMuaHRtbCI+PElNRyBCT1JERVI9IjAiIFNSQz0iZ3JhcGhp Y3MvYnV0dG9uYmFyL2dhbWVzLmdpZiIgQUxUPSJHYW1lcyI+PC9BPg0KPEEg SFJFRj0icXVpei8iPjxJTUcgQk9SREVSPSIwIiBTUkM9ImdyYXBoaWNzL2J1 dHRvbmJhci9xdWl6LmdpZiIgQUxUPSJRdWl6Ij48L0E+DQo8UD4NCjwhLS0g R0NTIEJ1dHRvbiBCYXIgLS0+DQoNCjxUQUJMRT4NCjxUUj48VEQgV0lEVEg9 IjI1MCIgQUxJR049ImNlbnRlciI+DQo8SDU+V2ViIGZhY2lsaXRpZXMgcHJv dmlkZWQgY291cnRlc3kgb2YgdGhlPEJSPg0KPEEgSFJFRj0iaHR0cDovL2Zh cy5zZnUuY2EvY3NzIj5DZW50cmUgZm9yIFN5c3RlbXMgU2NpZW5jZTwvQT4g DQphdCA8QSBIUkVGPSJodHRwOi8vd3d3LnNmdS5jYSI+U2ltb24gRnJhc2Vy IFVuaXZlcnNpdHk8L0E+PC9INT48UD4NCjwvVEQ+PFREIFdJRFRIPSIyNTAi IEFMSUdOPSJjZW50ZXIiPiANCjxINT5DcmVhdGl2ZSB3ZWIganVpY2VzIHBy b3ZpZGVkIGNvdXJ0ZXN5IG9mOjxCUj4NClRoZSA8QSBIUkVGPSJjcmVkaXRz Lmh0bWwiPkdDUyBUZWFtPC9BPjwvSDU+DQo8L1REPjwvVFI+DQo8VFI+DQo8 VEQgQ09MU1BBTj0iMiIgQUxJR049ImNlbnRlciI+DQo8SDU+JmNvcHk7IDE5 OTQsIDE5OTUsIDE5OTYsIDE5OTcgR0NTIFJlc2VhcmNoIFNvY2lldHk8L0g1 Pg0KPC9URD4NCjwvVFI+DQo8L1RBQkxFPg0KDQoNCg0KPC9ib2R5PjwvaHRt bD4NCg== ---559023410-360206568-918097638=:22855-- ------------------------------------ To unsubscribe from the htdig3-dev mailing list, send a message to htdig3-dev@htdig.org containing the single word "unsubscribe" in the SUBJECT of the message. 
From - Thu Feb 4 22:12:24 1999
Return-Path: <tlm@po-net.prato.it>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.8.8/Debian/GNU) with ESMTP id CAA12601
for <andrew@contigo.com>; Thu, 4 Feb 1999 02:15:14 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id CAA04447;
Thu, 4 Feb 1999 02:15:14 -0800 (PST)
From: "U.O. Telematica Municipale - Comune di Prato" <tlm@po-net.prato.it>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B9733E.BeroList-2.5.9@sob.htdig.org>
X-Sender: c.giorge@mbox.comune.prato.it
X-Mailer: Windows Eudora Pro Version 3.0.1 (32) [I]
Date: Thu, 04 Feb 1999 11:16:15 +0100
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Subject: Re: [htdig3-dev] How should I set the date for htnotify
At 09.54 03/02/99 -0400, you wrote:
>>But since I have set the iso_8601 option to true, it doesn't work anymore.
>
>ISO 8601 format for short dates is seen as follows:
>    if (config.Boolean("iso_8601"))
>    {
>        sscanf(date, "%d-%d-%d", &year, &month, &day);
>    }
>
>Cheers,
>-Geoff
>
I'll keep trying, but so far every attempt has failed.
I set the iso_8601 option and changed the document's htdig-notification-date to "1999-02-10" (next Wednesday). I ran htdig, but the notification message was sent again anyway.
So I disabled the iso_8601 option and went back to the old date format.
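One way to narrow this down is to check whether the date string parses at all. The sketch below wraps the same sscanf format Geoff quoted and looks at the return value; the helper name `parse_iso_date` and the range checks are additions for illustration, not what htnotify does.

```cpp
#include <cstdio>

// Parse "YYYY-MM-DD" with the format string quoted above.
// Returns false unless all three fields convert and look like a calendar date.
bool parse_iso_date(const char* date, int& year, int& month, int& day) {
    int y = 0, m = 0, d = 0;
    if (std::sscanf(date, "%d-%d-%d", &y, &m, &d) != 3) return false;
    if (m < 1 || m > 12 || d < 1 || d > 31) return false;
    year = y;
    month = m;
    day = d;
    return true;
}
```

If "1999-02-10" parses here but the message is still sent, the problem more likely lies in how the parsed date is compared with today's date than in the parsing itself.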
Ciao and thanx Gabriele
----------------------------------------------------------
U.O. Rete Civica - Comune di Prato
Via Ricasoli, 4 - 59100 Prato PO Italia
Tel. +39 0574616342 Fax +39 0574616003
http://www.comune.prato.it
E-Mail: tlm@mbox.comune.prato.it
----------------------------------------------------------
From - Thu Feb 4 22:12:24 1999
Return-Path: <s.budd@ic.ac.uk>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.8.8/Debian/GNU) with ESMTP id GAA19997
for <andrew@contigo.com>; Thu, 4 Feb 1999 06:11:18 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id GAA05309;
Thu, 4 Feb 1999 06:11:36 -0800 (PST)
From: "Budd, S." <s.budd@ic.ac.uk>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B9AA9F.BeroList-2.5.9@sob.htdig.org>
Date: Thu, 4 Feb 1999 14:09:58 -0000
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2232.9)
Content-Type: text/plain
Subject: [htdig3-dev] excessive memory or not?
I am using an unpatched htdig3.1.0b4.
My system is SunOS mozart 5.6 Generic_105181-06 sun4c sparc SUNW,Sun_4_50
Below is the ps line I have for my dig. Do you think it reflects a memory leak in htdig? I have searched the mail archive for a report of memory leak in htdig3.1.0b4 but found none. The dig is still grinding along.
USER PID %CPU %MEM SZ RSS TT S START TIME COMMAND
www-data 193 6.7 71.048120044932 pts/0 S Jan 22 6976:54 /home1/www-data/htdig-3.1.0b4/bin/htdig -i -v -s -t -c /home1/www-data/htdig-3.1.0b4/conf/htdig.b4a.config
a count of url's
date ; wc e14.urls
Thu Feb 4 11:16:33 gmt 1999
4634031 4633994 324829184 e14.urls
From - Thu Feb 4 22:12:24 1999
Return-Path: <ghutchis@wso.williams.edu>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.8.8/Debian/GNU) with ESMTP id HAA23173
for <andrew@contigo.com>; Thu, 4 Feb 1999 07:32:06 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id HAA05911;
Thu, 4 Feb 1999 07:32:13 -0800 (PST)
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B9BD92.BeroList-2.5.9@sob.htdig.org>
Date: Thu, 4 Feb 1999 10:30:56 -0500 (EST)
cc: htdig3-dev@htdig.org
In-Reply-To: <36B9AA9F.BeroList-2.5.9@sob.htdig.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: Re: [htdig3-dev] excessive memory or not?
On Thu, 4 Feb 1999, Budd, S. wrote:
> memory leak in htdig3.1.0b4 but found none.
We've fixed some smallish memory leaks since 3.1.0b4... But we're talking a few bytes here and there.
> 4634031 4633994 324829184 e14.urls
That is to say you have 4.6 million URLs? If so, I wouldn't think about small memory leaks. Instead I'd realize that I'm trying to index about an order of magnitude more URLs than ht://Dig has before. I hope you're just grabbing one page per URL... :-)
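As a rough sanity check on the scale, the wc figures quoted above (4634031 lines, 324829184 bytes) can be turned into an average line length. This is plain arithmetic on the quoted numbers, with `average_line_bytes` a name chosen for the example; it says nothing about how much memory htdig needs per URL.

```cpp
// Average bytes per line, as reported by wc (newline included).
// Returns 0.0 for an empty file to avoid dividing by zero.
double average_line_bytes(unsigned long long bytes, unsigned long long lines) {
    return lines == 0 ? 0.0
                      : static_cast<double>(bytes) / static_cast<double>(lines);
}
```

For the numbers above this comes to roughly 70 bytes per line, so the URL list alone is a little over 300 MB on disk before any per-document data is counted.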
-Geoff
From - Thu Feb 4 22:12:24 1999
Return-Path: <klaren@telin.nl>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.8.8/Debian/GNU) with ESMTP id HAA23186
for <andrew@contigo.com>; Thu, 4 Feb 1999 07:32:13 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id HAA05923;
Thu, 4 Feb 1999 07:32:31 -0800 (PST)
From: Ric Klaren <klaren@telin.nl>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B9BD90.BeroList-2.5.9@sob.htdig.org>
(Netscape Messaging Server 3.5) with SMTP id 289
for <htdig3-dev@htdig.org>; Thu, 4 Feb 1999 16:33:34 +0100
Date: Thu, 4 Feb 1999 16:34:21 +0000
Mail-Followup-To: htdig3-dev@htdig.org
References: <36B710AE.BeroList-2.5.9@sob.htdig.org>
<36B71BDA.BeroList-2.5.9@sob.htdig.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.95.1i
In-Reply-To: <36B71BDA.BeroList-2.5.9@sob.htdig.org>; from Geoff Hutchison on Tue, Feb 02, 1999 at 10:37:05AM -0500
Organization: Telematica Instituut
Subject: [htdig3-dev] Re: Buildroot + solaris 2.6 patch
Hi,
On Tue, Feb 02, 1999 at 10:37:05AM -0500, Geoff Hutchison wrote:
GH> I would be glad to put a .spec in (contrib ?) if people think it would be
GH> a good idea. I hesitate slightly since there are lots of package formats
GH> and I don't intend on holding up the release so we can put in better
GH> support for everyone's favorite packaging scheme. :-P
Good idea IMHO. Put them in contrib/unsupported ;)
Ric
From - Thu Feb 4 22:12:24 1999
Return-Path: <ghutchis@wso.williams.edu>
Received: from wso.williams.edu (wso.williams.edu [137.165.37.207])
by rodan.contigo.com (8.9.2/8.8.8/Debian/GNU) with ESMTP id IAA26082
for <andrews@contigo.com>; Thu, 4 Feb 1999 08:36:54 -0800 (PST)
Received: from localhost (ghutchis@localhost)
by wso.williams.edu (8.9.2/8.9.2/Debian/GNU) with SMTP id LAA31868;
Thu, 4 Feb 1999 11:36:32 -0500 (EST)
Date: Thu, 4 Feb 1999 11:36:32 -0500 (EST)
From: Geoff Hutchison <ghutchis@wso.williams.edu>
To: "J. op den Brouw" <MSQL_User@st.hhs.nl>
cc: Andrew Scherpbier <andrews@contigo.com>
Subject: Re: [htdig3-dev] Odd comment...
In-Reply-To: <36B974F5.70DDFC90@st.hhs.nl>
Message-ID: <Pine.LNX.3.96.990204113129.31509B-100000@wso.williams.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
On Thu, 4 Feb 1999, J. op den Brouw wrote:
> Right for me.....
> It sounds stupid, but our school has facilities to accommodate
> a medium sized group (550 people), and it's about an
> hour with one train from Amsterdam airport.
If we had a conference, I doubt we'd need space for more than 550 people. The mailing list doesn't even have 150 unique addresses. (It is growing though.)
> We have hotels here, and I can have 4 guests in total........
> (hope the lady of the house will like it)
I would assume some of us can stay nearby with friends. :-) Some of those friends went so far as to say "don't bother learning Dutch..."
> But what to do next....
I think it depends on a few things. The end of the summer would be a nice time, but does Andrew want to buy another ticket to Holland for August? When will I decide what I'm going to do this summer? :-)
-Geoff
From - Thu Feb 4 22:12:24 1999
Return-Path: <klaren@telin.nl>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.8.8/Debian/GNU) with ESMTP id HAA23817
for <andrew@contigo.com>; Thu, 4 Feb 1999 07:45:38 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id HAA06110;
Thu, 4 Feb 1999 07:46:07 -0800 (PST)
From: Ric Klaren <klaren@telin.nl>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36B9C0C0.BeroList-2.5.9@sob.htdig.org>
(Netscape Messaging Server 3.5) with SMTP id 292
for <htdig3-dev@htdig.org>; Thu, 4 Feb 1999 16:46:56 +0100
Date: Thu, 4 Feb 1999 16:47:44 +0000
Mail-Followup-To: Htdig3 developers <htdig3-dev@htdig.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary=VbJkn9YxBvnuCH5J
X-Mailer: Mutt 0.95.1i
Organization: Telematica Instituut
Subject: [htdig3-dev] improved buildroot patch + egcs patch + RPM spec file
--VbJkn9YxBvnuCH5J
Content-Type: text/plain; charset=us-ascii
Hi,
Hope I haven't been doing something already fixed... but here it is anyway: patches against the 3.1.0-020399 snapshot.
Patches for all the (relevant) Makefile.in's and the CONFIG.in, for the buildroot construct discussed earlier.
A teeny patch to htsearch/Display.cc to shut egcs up over a discarding const to blah blah etc...
An adapted spec file (original from Gilles), to go with the previous patches.
Some changes with respect to install.. I prefer installing the html stuff and the cgi to a /home/httpd/htdig/ directory.. (which is then again a virtual root for apache).
See the spec file for details... (sowwy :( for that)
Cheers Ric
PS the snapshot runs like a charm =) pdf's work etc. At least I didn't notice any irregularities yet.
--VbJkn9YxBvnuCH5J
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="htdig.patch"
--- htdig-3.1.0-020399/htdig/Makefile.in.rkorig Thu Feb 4 12:16:35 1999 +++ htdig-3.1.0-020399/htdig/Makefile.in Thu Feb 4 12:16:58 1999 @@ -21,7 +21,7 @@ install: $(TARGET) transform=@program_transform_name@ - $(INSTALL_PROGRAM) $(TARGET) $(BIN_DIR)/`echo $(TARGET) | sed '$(transform)'` + $(INSTALL_PROGRAM) $(TARGET) $(INSTALL_ROOT)$(BIN_DIR)/`echo $(TARGET) | sed '$(transform)'` clean: rm -f $(TARGET) $(OBJS) *~ *.bak *% a.out *.orig core --- htdig-3.1.0-020399/htfuzzy/Makefile.in.rkorig Thu Feb 4 12:17:13 1999 +++ htdig-3.1.0-020399/htfuzzy/Makefile.in Thu Feb 4 12:18:19 1999 @@ -30,7 +30,7 @@ install: $(TARGET) transform=@program_transform_name@ - $(INSTALL_PROGRAM) $(TARGET) $(BIN_DIR)/`echo $(TARGET) | sed '$(transform)'` + $(INSTALL_PROGRAM) $(TARGET) $(INSTALL_ROOT)$(BIN_DIR)/`echo $(TARGET) | sed '$(transform)'` clean: rm -f $(TARGET) $(LIBTARGET) $(OBJS) *~ *.bak *% a.out *.orig core --- htdig-3.1.0-020399/htmerge/Makefile.in.rkorig Thu Feb 4 12:18:29 1999 +++ htdig-3.1.0-020399/htmerge/Makefile.in Thu Feb 4 12:18:45 1999 @@ -19,7 +19,7 @@ install: $(TARGET) transform=@program_transform_name@ - $(INSTALL_PROGRAM) $(TARGET) $(BIN_DIR)/`echo $(TARGET) | sed '$(transform)'` + $(INSTALL_PROGRAM) $(TARGET) $(INSTALL_ROOT)$(BIN_DIR)/`echo $(TARGET) | sed '$(transform)'` clean: rm -f $(OBJS) $(TARGET) *~ *% *.bak core a.out *.orig --- htdig-3.1.0-020399/htnotify/Makefile.in.rkorig Thu Feb 4 12:18:56 1999 +++ htdig-3.1.0-020399/htnotify/Makefile.in Thu Feb 4 12:19:09 1999 @@ -20,7 +20,7 @@ install: $(TARGET) transform=@program_transform_name@ - $(INSTALL_PROGRAM) $(TARGET) $(BIN_DIR)/`echo $(TARGET) | sed '$(transform)'` + $(INSTALL_PROGRAM) $(TARGET) $(INSTALL_ROOT)$(BIN_DIR)/`echo $(TARGET) | sed '$(transform)'` clean: rm -f $(TARGET) $(OBJS) *~ *.bak *% a.out *.orig core --- htdig-3.1.0-020399/htsearch/Makefile.in.rkorig Thu Feb 4 12:19:24 1999 +++ htdig-3.1.0-020399/htsearch/Makefile.in Thu Feb 4 12:19:39 1999 @@ -22,7 +22,7 @@ install: all 
transform=@program_transform_name@ - $(INSTALL_PROGRAM) $(TARGET) $(CGIBIN_DIR)/`echo $(TARGET) | sed '$(transform)'` + $(INSTALL_PROGRAM) $(TARGET) $(INSTALL_ROOT)$(CGIBIN_DIR)/`echo $(TARGET) | sed '$(transform)'` clean: rm -f $(OBJS) $(TARGET) *~ *.bak *% core *.orig a.out --- htdig-3.1.0-020399/CONFIG.in.rkorig Thu Feb 4 11:38:51 1999 +++ htdig-3.1.0-020399/CONFIG.in Thu Feb 4 11:56:59 1999 @@ -23,19 +23,19 @@ # # This specifies the root of the directory tree to be used by ht://Dig # -DEST= $(prefix) +DEST= @localstatedir@ # # BIN_DIR # Set this macro to where you want the binaries to be installed. # -BIN_DIR= $(exec_prefix)/bin +BIN_DIR= @bindir@ # # CONFIG_DIR # This is the directory that contains ht://Dig configuration files # -CONFIG_DIR= $(DEST)/conf +CONFIG_DIR= @sysconfdir@ # # COMMON_DIR --- htdig-3.1.0-020399/Makefile.in.rkorig Thu Feb 4 12:04:52 1999 +++ htdig-3.1.0-020399/Makefile.in Thu Feb 4 12:57:30 1999 @@ -82,7 +82,7 @@ @echo "" @echo "Creating directories (if needed)..." -@for i in $(CREATEDIRS); do \ - $(top_srcdir)/mkinstalldirs $$i; \ + $(top_srcdir)/mkinstalldirs $(INSTALL_ROOT)$$i; \ done && test -z "$$fail" @echo "" @echo "Installing individual programs..." @@ -91,24 +91,24 @@ done && test -z "$$fail" @echo "" @echo "Installing default configuration files..." - @if [ ! -f $(CONFIG_DIR)/htdig.conf ]; then sed -e s%@DATABASE_DIR@%$(DATABASE_DIR)% -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/htdig.conf >$(CONFIG_DIR)/htdig.conf; echo $(CONFIG_DIR)/htdig.conf;fi - @if [ ! -f $(COMMON_DIR)/bad_words ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/bad_words $(COMMON_DIR); echo $(COMMON_DIR)/bad_words; fi - @if [ ! -f $(SEARCH_DIR)/$(SEARCH_FORM) ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/search.html >$(SEARCH_DIR)/$(SEARCH_FORM); echo $(SEARCH_DIR)/$(SEARCH_FORM);fi - @if [ ! 
-f $(COMMON_DIR)/footer.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/footer.html >$(COMMON_DIR)/footer.html; echo $(COMMON_DIR)/footer.html;fi - @if [ ! -f $(COMMON_DIR)/header.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/header.html >$(COMMON_DIR)/header.html; echo $(COMMON_DIR)/header.html;fi - @if [ ! -f $(COMMON_DIR)/wrapper.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/wrapper.html >$(COMMON_DIR)/wrapper.html; echo $(COMMON_DIR)/wrapper.html;fi - @if [ ! -f $(COMMON_DIR)/nomatch.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/nomatch.html >$(COMMON_DIR)/nomatch.html; echo $(COMMON_DIR)/nomatch.html;fi - @if [ ! -f $(COMMON_DIR)/syntax.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/syntax.html >$(COMMON_DIR)/syntax.html; echo $(COMMON_DIR)/syntax.html;fi - @if [ ! -f $(COMMON_DIR)/english.0 ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/english.0 $(COMMON_DIR); echo $(COMMON_DIR)/english.0;fi - @if [ ! -f $(COMMON_DIR)/english.aff ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/english.aff $(COMMON_DIR); echo $(COMMON_DIR)/english.aff;fi - @if [ ! -f $(COMMON_DIR)/synonyms ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/synonyms $(COMMON_DIR); echo $(COMMON_DIR)/synonyms;fi + @if [ ! -f $(INSTALL_ROOT)$(CONFIG_DIR)/htdig.conf ]; then sed -e s%@DATABASE_DIR@%$(DATABASE_DIR)% -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/htdig.conf >$(INSTALL_ROOT)$(CONFIG_DIR)/htdig.conf; echo $(CONFIG_DIR)/htdig.conf;fi + @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/bad_words ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/bad_words $(INSTALL_ROOT)$(COMMON_DIR); echo $(COMMON_DIR)/bad_words; fi + @if [ ! 
-f $(INSTALL_ROOT)$(SEARCH_DIR)/$(SEARCH_FORM) ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/search.html >$(INSTALL_ROOT)$(SEARCH_DIR)/$(SEARCH_FORM); echo $(SEARCH_DIR)/$(SEARCH_FORM);fi + @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/footer.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/footer.html >$(INSTALL_ROOT)$(COMMON_DIR)/footer.html; echo $(COMMON_DIR)/footer.html;fi + @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/header.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/header.html >$(INSTALL_ROOT)$(COMMON_DIR)/header.html; echo $(COMMON_DIR)/header.html;fi + @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/wrapper.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/wrapper.html >$(INSTALL_ROOT)$(COMMON_DIR)/wrapper.html; echo $(COMMON_DIR)/wrapper.html;fi + @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/nomatch.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/nomatch.html >$(INSTALL_ROOT)$(COMMON_DIR)/nomatch.html; echo $(COMMON_DIR)/nomatch.html;fi + @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/syntax.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/syntax.html >$(INSTALL_ROOT)$(COMMON_DIR)/syntax.html; echo $(COMMON_DIR)/syntax.html;fi + @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/english.0 ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/english.0 $(INSTALL_ROOT)$(COMMON_DIR); echo $(COMMON_DIR)/english.0;fi + @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/english.aff ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/english.aff $(INSTALL_ROOT)$(COMMON_DIR); echo $(COMMON_DIR)/english.aff;fi + @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/synonyms ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/synonyms $(INSTALL_ROOT)$(COMMON_DIR); echo $(COMMON_DIR)/synonyms;fi @echo "Installing images..." @for i in $(IMAGES); do \ - if [ ! 
-f $(IMAGE_DIR)/$$i ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/$$i $(IMAGE_DIR)/$$i; echo $(IMAGE_DIR)/$$i;fi; \ + if [ ! -f $(INSTALL_ROOT)$(IMAGE_DIR)/$$i ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/$$i $(INSTALL_ROOT)$(IMAGE_DIR)/$$i; echo $(IMAGE_DIR)/$$i;fi; \ done && test -z "$$fail" @echo "Creating rundig script..." - @if [ ! -f $(BIN_DIR)/rundig ]; then \ - sed -e s%@BIN_DIR@%$(BIN_DIR)% -e s%@COMMON_DIR@%$(COMMON_DIR)% -e s%@DATABASE_DIR@%$(DATABASE_DIR)% $(top_srcdir)/installdir/rundig >$(BIN_DIR)/rundig; \ + @if [ ! -f $(INSTALL_ROOT)$(BIN_DIR)/rundig ]; then \ + sed -e s%@BIN_DIR@%$(BIN_DIR)% -e s%@COMMON_DIR@%$(COMMON_DIR)% -e s%@DATABASE_DIR@%$(DATABASE_DIR)% $(top_srcdir)/installdir/rundig >$(INSTALL_ROOT)$(BIN_DIR)/rundig; \ chmod 755 $(BIN_DIR)/rundig; \ fi @echo "Installation done."
--VbJkn9YxBvnuCH5J
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="htdig-egcs.patch"
--- htdig-3.1.0-020399/htsearch/Display.cc.rkorig Thu Feb 4 12:38:27 1999
+++ htdig-3.1.0-020399/htsearch/Display.cc Thu Feb 4 12:39:22 1999
@@ -370,7 +370,7 @@ Display::setVariables(int pageNumber, Li
     else if (mystrcasecmp(config["match_method"], "or") == 0)
         vars.Add("MATCH_MESSAGE", new String("some"));
     vars.Add("MATCHES", new String(form("%d", nMatches)));
-    vars.Add("PLURAL_MATCHES", new String(nMatches == 0 ? "" : "s"));
+    vars.Add("PLURAL_MATCHES", new String(nMatches == 0 ? (char *)"" : (char *)"s"));
     vars.Add("PAGE", new String(form("%d", pageNumber)));
     vars.Add("PAGES", new String(form("%d", nPages)));
     vars.Add("FIRSTDISPLAYED",
@@ -1116,8 +1116,8 @@ Display::compareTitle(const void *a1, co
 {
     ResultMatch *m1 = *((ResultMatch **) a1);
     ResultMatch *m2 = *((ResultMatch **) a2);
-    char *t1 = (m1->getRef()) ? m1->getRef()->DocTitle() : "";
-    char *t2 = (m2->getRef()) ? m2->getRef()->DocTitle() : "";
+    char *t1 = (m1->getRef()) ? m1->getRef()->DocTitle() : (char *)"";
+    char *t2 = (m2->getRef()) ? m2->getRef()->DocTitle() : (char *)"";
 
     if (!t1) t1 = "";
     if (!t2) t2 = "";
--VbJkn9YxBvnuCH5J
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="htdig.spec"
Summary: A web indexing and searching system for a small domain or intranet
Name: htdig
Version: 3.1.0-020399
Release: 0
Copyright: GPL
Group: Networking/Utilities
BuildRoot: /var/tmp/htdig-root
Source0: http://www.htdig.org/files/htdig-%{PACKAGE_VERSION}.tar.gz
#Source1: htdig-%{PACKAGE_VERSION}-htdig.conf
#Source2: htdig-%{PACKAGE_VERSION}-rundig
Patch0: htdig.patch
Patch1: htdig-egcs.patch
URL: http://www.htdig.org/
Packager: Gilles Detillieux <grdetil@scrc.umanitoba.ca>

%description
The ht://Dig system is a complete world wide web indexing and searching
system for a small domain or intranet. This system is not meant to replace
the need for powerful internet-wide search systems like Lycos, Infoseek,
Webcrawler and AltaVista. Instead it is meant to cover the search needs for
a single company, campus, or even a particular sub section of a web site.

As opposed to some WAIS-based or web-server based search engines, ht://Dig
can span several web servers at a site. The type of these different web
servers doesn't matter as long as they understand the HTTP 1.0 protocol.

%prep
%setup -q -n htdig-%{PACKAGE_VERSION}
%patch0 -p1
%patch1 -p1

%build
CFLAGS="$RPM_OPT_FLAGS" ./configure --prefix=/usr \
	--bindir=/usr/sbin --libexec=/usr/lib --libdir=/usr/lib \
	--mandir=/usr/man --sysconfdir=/etc/htdig --localstatedir=/var/lib/htdig \
	--with-image-dir=/home/httpd/htdig \
	--with-cgi-bin-dir=/home/httpd/htdig/cgi-bin \
	--with-search-dir=/home/httpd/htdig
make
%install
test -n "$RPM_BUILD_ROOT" -a x"/" != x"$RPM_BUILD_ROOT" && rm -rf $RPM_BUILD_ROOT
mkdir -p $RPM_BUILD_ROOT/etc/htdig
mkdir -p $RPM_BUILD_ROOT/etc/cron.daily
mkdir -p $RPM_BUILD_ROOT/home/httpd/htdig/cgi-bin
#mkdir -p $RPM_BUILD_ROOT/home/httpd/htdig/images
mkdir -p $RPM_BUILD_ROOT/usr/sbin
mkdir -p $RPM_BUILD_ROOT/var/lib/htdig/common
mkdir -p $RPM_BUILD_ROOT/var/lib/htdig/db

make INSTALL_ROOT=$RPM_BUILD_ROOT install
chmod -R go-w $RPM_BUILD_ROOT/home/httpd/htdig
chmod a-x $RPM_BUILD_ROOT/var/lib/htdig/common/*
strip $RPM_BUILD_ROOT/usr/sbin/ht* $RPM_BUILD_ROOT/home/httpd/htdig/cgi-bin/htsearch
#install -m644 $RPM_SOURCE_DIR/htdig-%{PACKAGE_VERSION}-htdig.conf \
#	$RPM_BUILD_ROOT/etc/htdig/htdig.conf
#install -m755 $RPM_SOURCE_DIR/htdig-%{PACKAGE_VERSION}-rundig \
#	$RPM_BUILD_ROOT/usr/sbin/rundig
chmod 755 $RPM_BUILD_ROOT/usr/sbin/rundig
ln -s ../../usr/sbin/rundig $RPM_BUILD_ROOT/etc/cron.daily/htdig-dbgen
ln -s ../../../../usr/doc/htdig-%{PACKAGE_VERSION} \
	$RPM_BUILD_ROOT/home/httpd/htdig/htdoc

%clean
test -n "$RPM_BUILD_ROOT" -a x"/" != x"$RPM_BUILD_ROOT" && rm -rf $RPM_BUILD_ROOT

%post
# Only run this if installing for the first time
if [ "$1" = 1 ]; then
	SERVERNAME="`grep '^ServerName' /etc/httpd/conf/httpd.conf | awk '{print $2}'`"
	[ -z "$SERVERNAME" ] && SERVERNAME="`hostname -f`"
	[ -z "$SERVERNAME" ] && SERVERNAME="localhost"
	echo "start_url: http://$SERVERNAME/
local_urls: http://$SERVERNAME/=/home/httpd/html/
local_user_urls: http://$SERVERNAME/=/home/,/public_html/" >> /etc/htdig/htdig.conf
fi
%files
%defattr(-,root,root)
%config /etc/htdig/htdig.conf
%config /usr/sbin/rundig
/etc/cron.daily/htdig-dbgen
/usr/sbin/htdig
/usr/sbin/htfuzzy
/usr/sbin/htmerge
/usr/sbin/htnotify
/var/lib/htdig
/home/httpd/htdig
%doc CONFIG README htdoc/*
%changelog
* Thu Feb 4 1999 Ric Klaren <klaren@telin.nl>
- updated to 3.1.0-020399
- changed buildroot stuff
- minor spec file fixes
- install web stuff in /home/httpd/htdig
- made rundig config file

* Mon Jan 4 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
- updated to 3.1.0b4

* Tue Dec 15 1998 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
- updated to 3.1.0b3, changed version number & rundig script accordingly

* Thu Nov 5 1998 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
- patched htdoc/where.html to reflect latest version

* Tue Nov 3 1998 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
- updated to 3.1.0b2, changed patches accordingly

* Tue Sep 22 1998 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
- Changed config patches to remove -ggdb compile option (for 3.1.0b1)
- Added local_urls stuff to generated htdig.conf file

* Fri Sep 18 1998 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
- Built the rpm from latest htdig source (3.1.0b1), using earlier
  versions of rpms by Mihai Ibanescu <misa@dntis.ro> and Elliot Lee
  <sopwith@cuc.edu> as a model, incorporating ideas from both.
  I've made the install locations as FSSTND compliant as I can think of.
--VbJkn9YxBvnuCH5J--
From - Thu Feb 4 22:12:24 1999
Return-Path: <grdetil@scrc.umanitoba.ca>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.9.2/Debian/GNU) with ESMTP id QAA22307
for <andrew@contigo.com>; Thu, 4 Feb 1999 16:13:57 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id QAA00886;
Thu, 4 Feb 1999 16:14:13 -0800 (PST)
From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36BA37E3.BeroList-2.5.9@sob.htdig.org>
Date: Thu, 4 Feb 1999 18:12:55 -0600 (CST)
Cc: htdig3-dev@htdig.org
In-Reply-To: <36B91484.BeroList-2.5.9@sob.htdig.org> from "Geoff Hutchison" at Feb 3, 99 10:30:02 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [htdig3-dev] Re: [htdig] segmentation fault in htsearch (fwd)
According to Geoff Hutchison:
> Well I've been spending some time working out bugs upon bugs with Andrew
> here. First off, it seems we get segfaults on Solaris 2.6 in getdate, when
> we call the system function strftime! Fortunately I have an ultra running
> 2.6 I can test.
Before or after my getdate patch? Does strftime care that the weekday is not set in the tm structure, even if we don't ask it to output it? Maybe it would help to zero it out.
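One way to try the "zero it out" idea is sketched below. It assumes a hypothetical helper named format_date (this is not the getdate code from ht://Dig): the whole tm structure is zeroed first, then mktime() is left to fill in the weekday before strftime() is called. Whether an unset weekday is really what upsets strftime on Solaris 2.6 is not confirmed anywhere in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <string>

// Hypothetical helper, not ht://Dig's getdate(): format a calendar date
// without leaving any field of struct tm uninitialized.
std::string format_date(int year, int month, int day)
{
    assert(month >= 1 && month <= 12 && day >= 1 && day <= 31);

    std::tm tm = {};            // zero every field, including tm_wday and tm_yday
    tm.tm_year = year - 1900;   // years since 1900
    tm.tm_mon = month - 1;      // months are 0-based
    tm.tm_mday = day;
    tm.tm_isdst = -1;           // let mktime() decide about daylight saving time

    // mktime() normalizes the structure and fills in tm_wday and tm_yday.
    if (std::mktime(&tm) == static_cast<std::time_t>(-1))
        return std::string();

    char buf[32];
    std::size_t n = std::strftime(buf, sizeof buf, "%Y-%m-%d", &tm);
    return std::string(buf, n);
}
```

Calling format_date(1999, 2, 4) should give back the same date in ISO form; an empty string means mktime() could not represent the date.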
> But when we fixed that, it died in the HTML parser, as seen here. This is
> clearly a showstopper. The HTML file is attached, though I'm betting on a
> database problem...
Given that it's crashing in Deserialize(), I'd say it's a safe bet that it's a database problem. I don't see anything wrong with the HTML file, nor do I see how a bad HTML file could cause a problem in Deserialize(), unless the HTML parser went haywire and started stomping on memory all over the place.

I did notice that Deserialize() only checks to see that s is less than end once per deserialized object, but not while deserializing each object. There really should be tests in all the macros in there to make sure that you NEVER look beyond the end of the input string, otherwise a corrupt database could easily lead it astray!
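The kind of per-read test suggested above could look roughly like this. It is only a sketch: the Reader type and its members are hypothetical names, not the macros actually used in Deserialize(), and it assumes the record is available as a pointer plus a length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical cursor over one serialized record. Every read is checked
// against end, so a truncated or corrupt record cannot run past the buffer.
struct Reader {
    const char *s;    // next unread byte
    const char *end;  // one past the last valid byte
    bool ok;          // cleared on the first out-of-bounds read

    Reader(const char *data, std::size_t len)
        : s(data), end(data + len), ok(true)
    {
        assert(data != nullptr);
    }

    // Copy n bytes into out, but only if all of them lie before end.
    bool read(void *out, std::size_t n)
    {
        if (!ok || static_cast<std::size_t>(end - s) < n) {
            ok = false;   // remember the failure; later reads fail too
            return false;
        }
        std::memcpy(out, s, n);
        s += n;
        return true;
    }
};
```

With something like this, the caller checks ok once at the end of a record and discards the record instead of trusting whatever happened to follow the buffer in memory.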
--
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB R3E 3J7 (Canada)     Fax:    (204)789-3930
From - Thu Feb 4 22:12:24 1999
Return-Path: <grdetil@scrc.umanitoba.ca>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.9.2/Debian/GNU) with ESMTP id QAA22681
for <andrew@contigo.com>; Thu, 4 Feb 1999 16:21:21 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id QAA00922;
Thu, 4 Feb 1999 16:21:41 -0800 (PST)
From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36BA3996.BeroList-2.5.9@sob.htdig.org>
Date: Thu, 4 Feb 1999 18:20:06 -0600 (CST)
Cc: htdig3-dev@htdig.org
In-Reply-To: <36B9733E.BeroList-2.5.9@sob.htdig.org> from "U.O. Telematica Municipale - Comune di Prato" at Feb 4, 99 11:16:15 am
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [htdig3-dev] How should I set the date for htnotify

According to Geoff:
> >>But since I have set the iso_8601 option to true, it doesn't work anymore.
> >
> >ISO 8601 format for short dates is seen as follows:
> >    if (config.Boolean("iso_8601"))
> >    {
> >        sscanf(date, "%d-%d-%d", &year, &month, &day);
> >    }
According to U.O. Telematica Municipale - Comune di Prato:
> I'm gonna do further tries, but so far they went all bad.
>
> I set the ISO_8601 option and changed the document htdig-notification-date
> to "1999-02-10" (next Wednesday). I ran htdig, but the notification message
> come launched again.
>
> And so I disabled the option ISO_8601 and continue using the old one.
Strange. It's pretty straightforward code, so I don't see why it's failing. Could it be that some implementations of sscanf don't like the "-" characters in format strings, because they think the "-" is part of a number? I must admit I haven't tested this out on my system yet.
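For anyone who wants to test this on their own system, here is a small check. It is a sketch built around a hypothetical helper, parse_iso_date, that wraps the same format string as the quoted snippet. In standard C a "-" in a scanf format is an ordinary character that must match itself, and %d stops at the first character that cannot continue the number, so a conforming sscanf should return 3 here. The quoted code ignores the return value, which would hide a failed conversion.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: parse "YYYY-MM-DD" with the same format string as
// the quoted htnotify snippet. Returns true only if all three fields were
// converted, so the caller can tell a bad date from a good one.
bool parse_iso_date(const char *date, int &year, int &month, int &day)
{
    assert(date != nullptr);
    return std::sscanf(date, "%d-%d-%d", &year, &month, &day) == 3;
}
```

If parse_iso_date("1999-02-10", y, m, d) comes back false on some platform, that would point at the sscanf implementation; if it comes back true, the problem is more likely somewhere else, for instance in how the date string reaches this code.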
--
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB R3E 3J7 (Canada)     Fax:    (204)789-3930
From - Thu Feb 4 22:12:24 1999
Return-Path: <grdetil@scrc.umanitoba.ca>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.9.2/Debian/GNU) with ESMTP id RAA25440
for <andrew@contigo.com>; Thu, 4 Feb 1999 17:18:08 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id RAA01046;
Thu, 4 Feb 1999 17:18:37 -0800 (PST)
From: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36BA46EF.BeroList-2.5.9@sob.htdig.org>
Date: Thu, 4 Feb 1999 19:17:10 -0600 (CST)
Cc: htdig3-dev@htdig.org
In-Reply-To: <36B9C0C0.BeroList-2.5.9@sob.htdig.org> from "Ric Klaren" at Feb 4, 99 04:47:44 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [htdig3-dev] improved buildroot patch + egcs patch + RPM spec file

According to Ric Klaren:
> Hope I haven't been doing something already fixed... but here is it anyway..
> patches against the 3.1.0-020399 snapshot.
I think the two of us were working in parallel, but you finished before me. No problem. I hadn't gotten very far anyway, so I started over with the latest snapshot and your patches. I've now got my new RPMs built (on Red Hat 4.2), and I hope to test them tomorrow.
> Patches for all the (relevant) Makefile.in's and the CONFIG.in. For the
> buildroot construct discussed earlier.
>
> A teeny patch to htsearch/Display.cc to shut egcs up over a discarding
> const to blah blah etc...
>
> An adapted spec file (original from Gilles). To go with the previous patches.
>
> Some changes with respect to install.. I prefer installing the html stuff
> and the cgi to a /home/httpd/htdig/ directory.. (which is then again a
> virtual root for apache).
Hmmm. That requires a customised setup of Apache for it to work. I had set up my RPMs so you could drop them onto a standard Red Hat installation, with their uncustomized Apache RPM, run rundig, and you're off. That's why I used the paths that I did. I guess this is one of those situations that Geoff alluded to about "support for everyone's favorite packaging scheme." Anyway, it doesn't matter to me which spec file winds up in contrib, mine or yours. I'll stick with my spec file (with some of your changes) for my Red Hat RPMs.
Your patch to CONFIG.in is just the thing to take all the paths out of the source and put them all on the configure command line in the spec file. I wonder, though, whether other users will want those changes as a permanent part of CONFIG.in in the source tree. If so, great, because then the RPMs won't need to include any patches, but if not, then this one patch will still have to remain in the RPMs.
I used your patch to the Makefile.in files, but then I made a few more changes to clean things up, and echo the actual locations where files are installed. I also put the HTML & dictionary file installations in loops, to make things tidier. Here's my patch to your patched Makefile.in:
--- ./Makefile.in.klaren Thu Feb 4 18:07:42 1999 +++ ./Makefile.in Thu Feb 4 18:24:43 1999 @@ -45,6 +45,8 @@ button1.png button2.png button3.png button4.png button5.png \ button6.png button7.png button8.png button9.png buttonl.png \ buttonr.png button10.png htdig.png star.png star_blank.png +COMMONHTML= header.html footer.html wrapper.html nomatch.html syntax.html +COMMONDICT= bad_words english.0 english.aff synonyms all: @for i in $(DIRS); do \ @@ -91,31 +93,28 @@ done && test -z "$$fail" @echo "" @echo "Installing default configuration files..." - @if [ ! -f $(INSTALL_ROOT)$(CONFIG_DIR)/htdig.conf ]; then sed -e s%@DATABASE_DIR@%$(DATABASE_DIR)% -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/htdig.conf >$(INSTALL_ROOT)$(CONFIG_DIR)/htdig.conf; echo $(CONFIG_DIR)/htdig.conf;fi - @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/bad_words ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/bad_words $(INSTALL_ROOT)$(COMMON_DIR); echo $(COMMON_DIR)/bad_words; fi - @if [ ! -f $(INSTALL_ROOT)$(SEARCH_DIR)/$(SEARCH_FORM) ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/search.html >$(INSTALL_ROOT)$(SEARCH_DIR)/$(SEARCH_FORM); echo $(SEARCH_DIR)/$(SEARCH_FORM);fi - @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/footer.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/footer.html >$(INSTALL_ROOT)$(COMMON_DIR)/footer.html; echo $(COMMON_DIR)/footer.html;fi - @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/header.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/header.html >$(INSTALL_ROOT)$(COMMON_DIR)/header.html; echo $(COMMON_DIR)/header.html;fi - @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/wrapper.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/wrapper.html >$(INSTALL_ROOT)$(COMMON_DIR)/wrapper.html; echo $(COMMON_DIR)/wrapper.html;fi - @if [ ! 
-f $(INSTALL_ROOT)$(COMMON_DIR)/nomatch.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/nomatch.html >$(INSTALL_ROOT)$(COMMON_DIR)/nomatch.html; echo $(COMMON_DIR)/nomatch.html;fi - @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/syntax.html ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/syntax.html >$(INSTALL_ROOT)$(COMMON_DIR)/syntax.html; echo $(COMMON_DIR)/syntax.html;fi - @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/english.0 ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/english.0 $(INSTALL_ROOT)$(COMMON_DIR); echo $(COMMON_DIR)/english.0;fi - @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/english.aff ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/english.aff $(INSTALL_ROOT)$(COMMON_DIR); echo $(COMMON_DIR)/english.aff;fi - @if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/synonyms ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/synonyms $(INSTALL_ROOT)$(COMMON_DIR); echo $(COMMON_DIR)/synonyms;fi + @if [ ! -f $(INSTALL_ROOT)$(CONFIG_DIR)/htdig.conf ]; then sed -e s%@DATABASE_DIR@%$(DATABASE_DIR)% -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/htdig.conf >$(INSTALL_ROOT)$(CONFIG_DIR)/htdig.conf; echo $(INSTALL_ROOT)$(CONFIG_DIR)/htdig.conf;fi + @if [ ! -f $(INSTALL_ROOT)$(SEARCH_DIR)/$(SEARCH_FORM) ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/search.html >$(INSTALL_ROOT)$(SEARCH_DIR)/$(SEARCH_FORM); echo $(INSTALL_ROOT)$(SEARCH_DIR)/$(SEARCH_FORM);fi + @for i in $(COMMONHTML); do \ + if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/$$i ]; then sed -e s%@IMAGEDIR@%$(IMAGE_URL_PREFIX)% $(top_srcdir)/installdir/$$i >$(INSTALL_ROOT)$(COMMON_DIR)/$$i; echo $(INSTALL_ROOT)$(COMMON_DIR)/$$i;fi \ + done && test -z "$$fail" + @for i in $(COMMONDICT); do \ + if [ ! -f $(INSTALL_ROOT)$(COMMON_DIR)/$$i ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/$$i $(INSTALL_ROOT)$(COMMON_DIR)/$$i; echo $(INSTALL_ROOT)$(COMMON_DIR)/$$i; fi \ + done && test -z "$$fail" @echo "Installing images..." 
@for i in $(IMAGES); do \ - if [ ! -f $(INSTALL_ROOT)$(IMAGE_DIR)/$$i ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/$$i $(INSTALL_ROOT)$(IMAGE_DIR)/$$i; echo $(IMAGE_DIR)/$$i;fi; \ + if [ ! -f $(INSTALL_ROOT)$(IMAGE_DIR)/$$i ]; then $(INSTALL_DATA) $(top_srcdir)/installdir/$$i $(INSTALL_ROOT)$(IMAGE_DIR)/$$i; echo $(INSTALL_ROOT)$(IMAGE_DIR)/$$i;fi; \ done && test -z "$$fail" @echo "Creating rundig script..." @if [ ! -f $(INSTALL_ROOT)$(BIN_DIR)/rundig ]; then \ sed -e s%@BIN_DIR@%$(BIN_DIR)% -e s%@COMMON_DIR@%$(COMMON_DIR)% -e s%@DATABASE_DIR@%$(DATABASE_DIR)% $(top_srcdir)/installdir/rundig >$(INSTALL_ROOT)$(BIN_DIR)/rundig; \ - chmod 755 $(BIN_DIR)/rundig; \ + chmod 755 $(INSTALL_ROOT)$(BIN_DIR)/rundig; \ fi @echo "Installation done." @echo "" @echo "Before you can start searching, you will need to create a" @echo "search database. A sample script to do this has been" - @echo "installed as " $(BIN_DIR)/rundig + @echo "installed as " $(INSTALL_ROOT)$(BIN_DIR)/rundig install-strip: $(MAKE) INSTALL_PROGRAM='$(INSTALL_PROGRAM) -s' \
> --- htdig-3.1.0-020399/htsearch/Display.cc.rkorig Thu Feb 4 12:38:27 1999
> +++ htdig-3.1.0-020399/htsearch/Display.cc Thu Feb 4 12:39:22 1999
> @@ -370,7 +370,7 @@ Display::setVariables(int pageNumber, Li
>      else if (mystrcasecmp(config["match_method"], "or") == 0)
>          vars.Add("MATCH_MESSAGE", new String("some"));
>      vars.Add("MATCHES", new String(form("%d", nMatches)));
> -    vars.Add("PLURAL_MATCHES", new String(nMatches == 0 ? "" : "s"));
> +    vars.Add("PLURAL_MATCHES", new String(nMatches == 0 ? (char *)"" : (char *)"s"));
>      vars.Add("PAGE", new String(form("%d", pageNumber)));
>      vars.Add("PAGES", new String(form("%d", nPages)));
>      vars.Add("FIRSTDISPLAYED",
> @@ -1116,8 +1116,8 @@ Display::compareTitle(const void *a1, co
>  {
>      ResultMatch *m1 = *((ResultMatch **) a1);
>      ResultMatch *m2 = *((ResultMatch **) a2);
> -    char *t1 = (m1->getRef()) ? m1->getRef()->DocTitle() : "";
> -    char *t2 = (m2->getRef()) ? m2->getRef()->DocTitle() : "";
> +    char *t1 = (m1->getRef()) ? m1->getRef()->DocTitle() : (char *)"";
> +    char *t2 = (m2->getRef()) ? m2->getRef()->DocTitle() : (char *)"";
>
>      if (!t1) t1 = "";
>      if (!t2) t2 = "";
I had a couple of observations about this patch:

1) It seems harmless enough, so if it helps with egcs, then great! It's quite odd, though, that egcs would need those type casts. I thought any character string in double quotes was a char *, in C and C++. Why does egcs seem to think otherwise?

2) I just noticed that the test for PLURAL_MATCHES is wrong. It should be nMatches == 1, not nMatches == 0! I'd send a patch, but it hardly seems worth it for one character. Geoff, can you put this fix in manually, please?
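Both observations can be illustrated in a few lines. The sketch below uses a hypothetical function, plural_suffix, standing in for the PLURAL_MATCHES expression; it is not the code in Display.cc. On the first point, a likely explanation (an assumption, not something confirmed in this thread) is that in C++ a string literal is an array of const char, and the result of a ?: over two literals is a plain const char * rather than a literal, so passing it to a parameter declared char * drops the const and egcs complains. Keeping the result const avoids the cast altogether. On the second point, the comparison is against 1.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the PLURAL_MATCHES expression.
// Returning const char * keeps the constness of the string literals, so no
// (char *) cast is needed; the suffix is empty only for exactly one match.
const char *plural_suffix(int nMatches)
{
    assert(nMatches >= 0);
    return nMatches == 1 ? "" : "s";
}

// Small usage helper: builds text such as "0 documents" or "1 document".
std::string describe_matches(int nMatches)
{
    return std::to_string(nMatches) + " document" + plural_suffix(nMatches);
}
```

Note that zero matches take the plural form in English ("0 documents"), which is why the original test against 0 gave the wrong suffix in both the zero and the one-match case.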
In case anyone is interested, here is my current spec file, from which I'll eventually develop the final release. The changes to the Makefiles in the past month have allowed me to really clean up the installation. The patches will go, eventually. --------------- (snip) --------------- Summary: A web indexing and searching system for a small domain or intranet Name: htdig Version: 3.1.0-020399 Release: 0 Copyright: GPL Group: Networking/Utilities BuildRoot: /var/tmp/htdig-root Source0: http://www.htdig.org/files/htdig-%{PACKAGE_VERSION}.tar.gz Patch0: htdig-%{PACKAGE_VERSION}-buildroot.patch Patch1: htdig-%{PACKAGE_VERSION}-buildroot.patch2 Patch2: htdig-%{PACKAGE_VERSION}-nmatch.patch URL: http://www.htdig.org/ Packager: Gilles Detillieux <grdetil@scrc.umanitoba.ca>
%description
The ht://Dig system is a complete world wide web indexing and searching system for a small domain or intranet. This system is not meant to replace the need for powerful internet-wide search systems like Lycos, Infoseek, Webcrawler and AltaVista. Instead it is meant to cover the search needs for a single company, campus, or even a particular sub section of a web site.

As opposed to some WAIS-based or web-server based search engines, ht://Dig can span several web servers at a site. The type of these different web servers doesn't matter as long as they understand the HTTP 1.0 protocol.

%prep
%setup -q -n htdig-%{PACKAGE_VERSION}
%patch0 -p1
%patch1 -p1 -b .klaren
%patch2 -p1
%build
CFLAGS="$RPM_OPT_FLAGS" ./configure --prefix=/usr \
    --bindir=/usr/sbin --libexec=/usr/lib --libdir=/usr/lib \
    --mandir=/usr/man --sysconfdir=/etc/htdig \
    --localstatedir=/var/lib/htdig \
    --with-image-dir=/home/httpd/html/htdig \
    --with-cgi-bin-dir=/home/httpd/cgi-bin \
    --with-search-dir=/home/httpd/html
make

%install
rm -rf $RPM_BUILD_ROOT
make INSTALL_ROOT=$RPM_BUILD_ROOT install-strip
mkdir -p $RPM_BUILD_ROOT/etc/cron.daily
ln -s ../../usr/sbin/rundig $RPM_BUILD_ROOT/etc/cron.daily/htdig-dbgen
ln -s ../../../../usr/doc/htdig-%{PACKAGE_VERSION} \
    $RPM_BUILD_ROOT/home/httpd/html/htdig/htdoc

%clean
rm -rf $RPM_BUILD_ROOT
%post
# Only run this if installing for the first time
if [ "$1" = 1 ]; then
    SERVERNAME="`grep '^ServerName' /etc/httpd/conf/httpd.conf | awk '{print $2}'`"
    [ -z "$SERVERNAME" ] && SERVERNAME="`hostname -f`"
    [ -z "$SERVERNAME" ] && SERVERNAME="localhost"
    echo "start_url: http://$SERVERNAME/
local_urls: http://$SERVERNAME/=/home/httpd/html/
local_user_urls: http://$SERVERNAME/=/home/,/public_html/" >> /etc/htdig/htdig.conf

fi
%files
%defattr(-,root,root)
%config /etc/htdig/htdig.conf
%config /usr/sbin/rundig
%config /home/httpd/html/search.html
/etc/cron.daily/htdig-dbgen
/usr/sbin/htdig
/usr/sbin/htfuzzy
/usr/sbin/htmerge
/usr/sbin/htnotify
/var/lib/htdig
/home/httpd/cgi-bin/htsearch
/home/httpd/html/htdig

%doc CONFIG README htdoc/*
%changelog
* Thu Feb 4 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
  - put web stuff back in /home/httpd/html & /home/httpd/cgi-bin, so it can
    go over a standard Apache installation on Red Hat
  - cleaned up %install to make use of new features

* Thu Feb 4 1999 Ric Klaren <klaren@telin.nl>
  - updated to 3.1.0-020399
  - changed buildroot stuff
  - minor spec file fixes
  - install web stuff in /home/httpd/htdig
  - made rundig config file

* Mon Jan 4 1999 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
  - updated to 3.1.0b4

* Tue Dec 15 1998 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
  - updated to 3.1.0b3, changed version number & rundig script accordingly

* Thu Nov 5 1998 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
  - patched htdoc/where.html to reflect latest version

* Tue Nov 3 1998 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
  - updated to 3.1.0b2, changed patches accordingly

* Tue Sep 22 1998 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
  - Changed config patches to remove -ggdb compile option (for 3.1.0b1)
  - Added local_urls stuff to generated htdig.conf file

* Fri Sep 18 1998 Gilles Detillieux <grdetil@scrc.umanitoba.ca>
  - Built the rpm from latest htdig source (3.1.0b1), using earlier
    versions of rpms by Mihai Ibanescu <misa@dntis.ro> and Elliot Lee
    <sopwith@cuc.edu> as a model, incorporating ideas from both. I've
    made the install locations as FSSTND compliant as I can think of.
--------------- (snip) ---------------
--
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB R3E 3J7 (Canada)     Fax:    (204)789-3930
------------------------------------
To unsubscribe from the htdig3-dev mailing list, send a message to
htdig3-dev@htdig.org containing the single word "unsubscribe" in
the SUBJECT of the message.
From - Thu Feb 4 22:12:24 1999
Return-Path: <ghutchis@wso.williams.edu>
Received: from sob.htdig.org (htdig.org [209.75.193.22])
by rodan.contigo.com (8.9.2/8.9.2/Debian/GNU) with ESMTP id TAA30802
for <andrew@contigo.com>; Thu, 4 Feb 1999 19:48:35 -0800 (PST)
Received: from sob.htdig.org (localhost [127.0.0.1])
by sob.htdig.org (8.9.2/8.9.1/Debian/GNU) with SMTP id TAA01634;
Thu, 4 Feb 1999 19:49:01 -0800 (PST)
From: Geoff Hutchison <ghutchis@wso.williams.edu>
Errors-To: htdig3-dev@htdig.org
To: htdig3-dev@htdig.org
Message-ID: <36BA6A35.BeroList-2.5.9@sob.htdig.org>
In-Reply-To: <36BA46EF.BeroList-2.5.9@sob.htdig.org>
References: <36B9C0C0.BeroList-2.5.9@sob.htdig.org>
from "Ric Klaren" at Feb 4, 99 04:47:44 pm
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Thu, 4 Feb 1999 22:46:06 -0400
file
Cc: htdig3-dev@htdig.org
Subject: Re: [htdig3-dev] improved buildroot patch + egcs patch + RPM spec

>Your patch to CONFIG.in is just the thing to take all the paths out of
>the source and put them all on the configure command line in the spec
>file. I wonder, though, whether other users will want those changes
>as a permanent part of CONFIG.in in the source tree. If so, great,
>because then the RPMs won't need to include any patches, but if not,
>then this one patch will still have to remain in the RPMs.
I think for now, I'm going to leave this out. I think I'd rather see an effort after release to do something a la Apache where you can pick your layout easily using configure. Until then, let's stick to what people generally expect.
>2) I just noticed that the test for PLURAL_MATCHES is wrong. It should be
>nMatches == 1, not nMatches == 0! I'd send a patch, but it hardly seems
>worth it for one character. Geoff, can you put this fix in manually, please?
Got it. Good call Gilles!
>In case anyone is interested, here is my current spec file, from which
>I'll eventually develop the final release. The changes to the Makefiles
>in the past month have allowed me to really clean up the installation.
>The patches will go, eventually.
I'll hold off for now and put this in contrib/ shortly before release. It's not something that requires much testing, and it's more like documentation than code. :-P
-Geoff
This archive was generated by hypermail 2.0b3 on Thu Feb 04 1999 - 22:14:20 PST