Re: [htdig] Searches on Date Ranges

Gilles Detillieux
Tue, 30 Mar 1999 13:00:33 -0600 (CST)

According to mike grommet:
> I feel tho that the sorting on date capability is needed, in many cases.
> (not to mention, in my current project <g>)

It's there right now, and works fine. This feature is somewhat
independent of any date range selection code that gets added to htsearch,
though it uses the same time_t field for modification date that your
code will have to use.

> I believe that many of the expensive commercial search engine alternatives
> allow this capability... I know that Ultraseek does, but egads they want a
> lot of money for that software.
> htdig is a great package and most of the framework already exists from what
> I can see in the code, we just have to hammer out the implementation details.
> It seems to me that date+time is not so much useful as just date.
> No one is indexing on the fly anyway right? So I assume we could just zero
> out the hrs, mins, second fields of the date+time?
> I would expect the entering of dates
> using month, day, year select boxes (1 each). Then, htsearch can piece them
> together into one long string for the comparison?

I agree that from a user-interface perspective, treating the year, month
& day as separate fields is best. That way, they can be input boxes,
select boxes, radio buttons, whatever you want. It also lets you avoid
the date format question, and allows the search form to dictate that.

The trick will be for follow-up searches, because you don't want to
make htsearch generate ALL of these different options for use in the
results template. One or two will have to do (e.g. you could generate
FROM_MONTH and SELECTED_FROM_MONTH, much like the other select options
are generated). Either that or you could just pass the raw numbers to
the results form, which would either keep these as hidden fields (so all
follow ups use the same range, just like the restrict and exclude fields
are dealt with), or would use them in input boxes only.

As for piecing them together, remember that you're not doing a string
comparison, but a time_t comparison with the DocTime field. That means
plugging the year, month and day into a tm structure, and converting it
to a time_t, as htdig/ does.

You're probably right that the time won't be relevant. If someone
decides later they need it, it should be fairly easy to add that in as
separate fields. For now, set the "from" time to 0:00:00 and the "to"
time to 23:59:59 before making the time_t values, so that you'll match
the whole date range inclusively.
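
A minimal sketch of that conversion, to make the inclusive range concrete.
It is not htsearch code: make_time, range_start, range_end and in_date_range
are hypothetical names, and it assumes the dates are meant in the local time
zone (so it uses the standard mktime rather than a timegm-style function).

```c
#include <time.h>

/* Hypothetical helper: build a time_t from separate year/month/day
 * fields plus a time of day, interpreted in the local time zone. */
static time_t make_time(int year, int month, int day,
                        int hour, int min, int sec)
{
    struct tm tm = {0};
    tm.tm_year  = year - 1900;  /* tm_year counts years since 1900 */
    tm.tm_mon   = month - 1;    /* tm_mon is 0-based */
    tm.tm_mday  = day;
    tm.tm_hour  = hour;
    tm.tm_min   = min;
    tm.tm_sec   = sec;
    tm.tm_isdst = -1;           /* let mktime work out DST itself */
    return mktime(&tm);
}

/* Start of the "from" day: 0:00:00. */
time_t range_start(int year, int month, int day)
{
    return make_time(year, month, day, 0, 0, 0);
}

/* End of the "to" day: 23:59:59, so the whole day is included. */
time_t range_end(int year, int month, int day)
{
    return make_time(year, month, day, 23, 59, 59);
}

/* Nonzero if a document's modification time falls inside the range. */
int in_date_range(time_t doc_time, time_t from, time_t to)
{
    return doc_time >= from && doc_time <= to;
}
```

A document would then match when in_date_range(doc_time, from, to) is
nonzero, where doc_time stands in for whatever the DocTime field holds.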

> Also, we definitely need to consider the possibility of the user
> entering in an invalid date, such as Feb 30, and such.

Yeah, but it's not a serious problem. htnotify allows days 1 to 31 for
any month, so Feb 30 is taken as > Feb 29 and < Mar 1. If you passed
the date 1999-02-30 to mytimegm or glibc's mktime (in a tm structure,
of course) it would take it as 1999-03-02, without any further checking.
Other C library mktime or timegm functions may behave differently, but in
any case you should get some sort of time_t value out of it. Some sort
of validity checking would be a good idea, though, to make sure things
don't blow up (e.g. if a month > 12 is used as an array index). Also note
that a signed 32-bit time_t will overflow after 2038-01-19 03:14:07 UTC.
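
As a rough illustration of both points, here is a sketch of a validity
check plus a helper that shows how mktime normalizes an out-of-range day.
The names is_valid_date and normalize_date are made up for this example;
nothing here is taken from htnotify or htsearch.

```c
#include <time.h>

/* Days per month in a non-leap year; index 0 is January. */
static const int days_in_month[12] =
    { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

static int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Returns 1 if year-month-day names a real calendar date, else 0.
 * The month is range-checked before it is used as an array index. */
int is_valid_date(int year, int month, int day)
{
    int max_day;

    if (month < 1 || month > 12 || day < 1)
        return 0;
    max_day = days_in_month[month - 1];
    if (month == 2 && is_leap_year(year))
        max_day = 29;
    return day <= max_day;
}

/* Lets mktime normalize an out-of-range day and reports the resulting
 * month (1-12) and day.  Returns 0 on success, -1 if mktime fails. */
int normalize_date(int year, int month, int day,
                   int *out_month, int *out_day)
{
    struct tm tm = {0};

    tm.tm_year  = year - 1900;
    tm.tm_mon   = month - 1;
    tm.tm_mday  = day;
    tm.tm_hour  = 12;           /* midday, away from any DST switch */
    tm.tm_isdst = -1;
    if (mktime(&tm) == (time_t)-1)
        return -1;
    *out_month = tm.tm_mon + 1;
    *out_day   = tm.tm_mday;
    return 0;
}
```

With these, is_valid_date(1999, 2, 30) returns 0, while
normalize_date(1999, 2, 30, &m, &d) leaves m == 3 and d == 2.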

> From: Torsten Neuer
> On Tue, 30 Mar 1999, Geoff Hutchison wrote:
> >On Mon, 29 Mar 1999 wrote:
> >
> >> Doesn't the HTTP header give you the file modification date? And I
> >> sure hope it's in the format YYYY so it will work next year, too ;-)
> >
> >No, that's not the problem. We already get the date of the documents from
> >the HTTP header. The problem is how you get the date from the *search
> >form* since the most obvious solution (a text field for users to input a
> >date) requires working out the date format.
> >
> >So my question is this: do we just limit the format? If so, what do we
> >use? If not, how do we figure out the format used?
> I think it should be up to the creator of the form how (s)he handles
> input and formatting of time/date. Just provide a couple of variables
> where to store the stuff. ISO time/date format will probably work
> best because of its sortability.

Sortability isn't a big issue here, because we use time_t internally,
and already can sort on that.

> That way one could use ISO time/date formatted values or ranges to
> be directly passed to htsearch in some easy form whereas in a more
> complex form time/date input could be done with some nifty form
> fields, passed on to a wrapper that checks the entry, ISOifies it
> and passes it on to htsearch.
> IMHO htsearch should not be overloaded with features, but stay with
> good ol' *nix philosophy of being a small tool that handles a part
> of a problem. Other tools can do the rest. That would make htsearch
> much handier in general and not restrict its use to a specific case.

Yes, but as separate year, month and day fields give the most flexibility,
I'd recommend putting that in the htsearch tool, and anyone who prefers
a unified date field can put that in a wrapper.
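
For what it's worth, the splitting step in such a wrapper could be as small
as the sketch below. It assumes the unified field arrives as an ISO-style
"YYYY-MM-DD" string; split_iso_date is a hypothetical name, and the function
only checks the shape of the string, not whether the values are in range.

```c
#include <stdio.h>

/* Hypothetical wrapper helper: split "YYYY-MM-DD" into separate fields.
 * Returns 1 on success, 0 if the string is not in that form.
 * Range checking of the values is left to the caller. */
int split_iso_date(const char *s, int *year, int *month, int *day)
{
    int consumed = 0;

    /* %n records how many characters were read, so trailing junk
     * after the day can be rejected. */
    if (sscanf(s, "%4d-%2d-%2d%n", year, month, day, &consumed) != 3)
        return 0;
    return s[consumed] == '\0';
}
```

The wrapper would then hand the three numbers on as separate year, month
and day values.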

Gilles R. Detillieux
Spinal Cord Research Centre
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
