Re: htdig: Sorting results on date (2)


Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Mon, 14 Dec 1998 18:46:22 -0600 (CST)


According to htDig user:
> On Tue, 8 Dec 1998, Gilles Detillieux wrote:
> > According to htDig user:
> > > Is it possible to sort the results of htdig-requests by date?
>
> > Well, I'm just speculating here, because I haven't tried it. I'd start by
> > adding a new compare() method to the Display class, such as compareDate().
> > It would be like compare(), except the value it returns would be something
> > like:
> > return m2->getRef()->DocTime() - m1->getRef()->DocTime();
>
> I could only wish it was so easy :-)
>
> I am currently writing my own routines for this sorting-stuff...
>
> Somehow, I'm unable to access the DocTime, DocTitle and such values.
> e.g. In Display.cc, I've tried something like this:
>
> printf("%s",array[counter]->getRef()->DocTitle());
>
> It _IS_ a string, according to ../htcommon/DocumentRef.cc ....
>
> Am I missing something here?

Yup! Something I missed too. The problem is that setRef() isn't called
early enough. It's done after the sort, not before, so getRef() still
returns NULL at sort time.
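
To make the ordering problem concrete, here is a minimal standalone sketch.
It is not htsearch code: ToyRef, ToyMatch and sortByTime() are hypothetical
stand-ins for DocumentRef, ResultMatch and Display::sort(). The only point
is that the comparison function dereferences the ref, so every match needs
its ref attached before qsort() runs.

```cpp
#include <cassert>
#include <cstdlib>
#include <ctime>

// Hypothetical stand-ins for DocumentRef and ResultMatch (not the real classes).
struct ToyRef { time_t docTime; };

struct ToyMatch {
    ToyRef *ref;                        // stays NULL until setRef() is called
    ToyMatch() : ref(0) {}
    void    setRef(ToyRef *r) { ref = r; }
    ToyRef *getRef() const    { return ref; }
};

// Newest first. Dereferences getRef(), so it is only safe once refs are set.
static int compareTime(const void *a1, const void *a2)
{
    const ToyMatch *m1 = *(const ToyMatch * const *) a1;
    const ToyMatch *m2 = *(const ToyMatch * const *) a2;
    assert(m1->getRef() && m2->getRef());   // trips if setRef() came too late
    time_t t1 = m1->getRef()->docTime;
    time_t t2 = m2->getRef()->docTime;
    return (t1 < t2) - (t1 > t2);           // avoids overflow from subtracting
}

// Sort an array of match pointers by date; refs must already be attached.
static void sortByTime(ToyMatch **array, int n)
{
    qsort(array, n, sizeof(ToyMatch *), compareTime);
}
```

Calling sortByTime() on matches whose refs were never set would trip the
assert, which is roughly the situation described above.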

> printf("%s",array[counter]->getURL()) works fine.. Think I need to write
> my own getTime() when I want to sort my results by date...... :-)
>
> > Then, in Display::sort(), you'd have to test the value of a new input
> > parameter, e.g. input["sortby"], to see if it selects a sort by date.
> > If so, the final argument it passes to qsort() would be
> > Display::compareDate instead of Display::compare.
>
> I skipped that extra-parameter part. It's ok with me to work with separate
> htsearch executables...
>
>
> Is there someone out there who can help me out?

OK, here's a barely tested, fairly quick and dirty patch to add a sort
input option to htsearch. It fixes the maxScore calculation to make it
independent of the sort type, and does the setRef() before it's needed.
It adds an input option, "sort", which can be "score", "date" or "title";
if it's missing or unrecognized, results are sorted by score as before.
You'll need to change your search.html to set the option, and the template
files common/*.html to pass it along in subsequent searches using something
like:

<input type="hidden" name="sort" value="$(SORT)">

Making the sort option a config file parameter, as well as an input
parameter, would take a few other changes, but could be patterned after
the handling of the "restrict" parameter.
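
A rough sketch of the fallback idea, under stated assumptions: Settings and
effectiveSort() are hypothetical names, and the two maps merely stand in for
the CGI input and the config file. This does not show how "restrict" is
actually wired up in htsearch, only the precedence one would probably want
(form input first, then the config file, then "score").

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for both the CGI input and the config dictionary.
typedef std::map<std::string, std::string> Settings;

// Returns the sort type: form input wins, then the config file, then "score".
static std::string effectiveSort(const Settings &input, const Settings &config)
{
    Settings::const_iterator it = input.find("sort");
    if (it != input.end() && !it->second.empty())
        return it->second;

    it = config.find("sort");
    if (it != config.end() && !it->second.empty())
        return it->second;

    return "score";
}
```

With that in place, a site could set a default sort order in its config
file and still let individual search forms override it.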

My patch was made against the 3.1.0b3 pre-release code from

http://www.htdig.org/files/snapshots/htdig-3.1.0b3-121398.tar.gz

If you use that, you'll also need the second patch below for another fix.
If you apply it to the 3.1.0b2 source, I can't promise it'll work, as
I haven't tested it, but the patch program should be able to apply it.

Here's the sort patch:
---------------------------------------------
--- htsearch/Display.h.sort Fri Dec 11 19:45:29 1998
+++ htsearch/Display.h Mon Dec 14 17:49:28 1998
@@ -158,6 +158,8 @@
     List *buildMatchList();
     void sort(List *);
     static int compare(const void *, const void *);
+    static int compareTime(const void *, const void *);
+    static int compareTitle(const void *, const void *);
     int includeURL(char *);
     String *readFile(char *);
     void expandVariables(char *);
--- htsearch/Display.cc.sort Fri Dec 11 19:45:29 1998
+++ htsearch/Display.cc Mon Dec 14 17:59:49 1998
@@ -189,7 +189,7 @@
         displayNomatch();
         return;
     }
-    maxScore = match->getScore();
+    // maxScore = match->getScore(); // now done in buildMatchList()
             
     //
     // Display the window of matches requested.
@@ -355,6 +355,7 @@
     vars.Add("VERSION", new String(config["version"]));
     vars.Add("RESTRICT", new String(config["restrict"]));
     vars.Add("EXCLUDE", new String(config["exclude"]));
+    vars.Add("SORT", new String(input->get("sort")));
     if (mystrcasecmp(config["match_method"], "and") == 0)
         vars.Add("MATCH_MESSAGE", new String("all"));
     else if (mystrcasecmp(config["match_method"], "or") == 0)
@@ -492,6 +493,8 @@
         s << "method=" << input->get("method") << '&';
     if (input->exists("format"))
         s << "format=" << input->get("format") << '&';
+    if (input->exists("sort"))
+        s << "sort=" << input->get("sort") << '&';
     if (input->exists("matchesperpage"))
         s << "matchesperpage=" << input->get("matchesperpage") << '&';
     if (input->exists("words"))
@@ -848,6 +851,7 @@
 
         thisMatch->setIncompleteScore(score);
         thisMatch->setAnchor(dm->anchor);
+        thisMatch->setRef(thisRef);
                 
         //
         // Append this match to our list of matches.
@@ -974,6 +978,7 @@
 {
     int numberOfMatches = matches->Count();
     int i;
+    static char *sorttypes[] = { "score", "date", "title" };
 
     ResultMatch **array = new ResultMatch*[numberOfMatches];
     for (i = 0; i < numberOfMatches; i++)
@@ -982,12 +987,26 @@
     }
     matches->Release();
 
+    if (input->exists("sort")) {
+        char *st = input->get("sort");
+        for (i = sizeof(sorttypes)/sizeof(sorttypes[0]); --i > 0; )
+        {
+            if (mystrcasecmp(sorttypes[i], st) == 0)
+                break;
+        }
+    }
+    else
+        i = 0;
     qsort((char *) array, numberOfMatches, sizeof(ResultMatch *),
+          (i == 2) ? Display::compareTitle :
+          (i == 1) ? Display::compareTime :
           Display::compare);
 
     for (i = 0; i < numberOfMatches; i++)
     {
         matches->Add(array[i]);
+        if (i == 0 || maxScore < array[i]->getScore())
+            maxScore = array[i]->getScore();
     }
     delete [] array;
 }
@@ -1000,6 +1019,30 @@
     ResultMatch *m2 = *((ResultMatch **) a2);
 
     return m2->getScore() - m1->getScore();
+}
+
+//*****************************************************************************
+int
+Display::compareTime(const void *a1, const void *a2)
+{
+    ResultMatch *m1 = *((ResultMatch **) a1);
+    ResultMatch *m2 = *((ResultMatch **) a2);
+
+    return (int) (m2->getRef()->DocTime() - m1->getRef()->DocTime());
+}
+
+//*****************************************************************************
+int
+Display::compareTitle(const void *a1, const void *a2)
+{
+    ResultMatch *m1 = *((ResultMatch **) a1);
+    ResultMatch *m2 = *((ResultMatch **) a2);
+    char *t1 = m1->getRef()->DocTitle();
+    char *t2 = m2->getRef()->DocTitle();
+
+    if (!t1) t1 = "";
+    if (!t2) t2 = "";
+    return mystrcasecmp(t1, t2);
 }
 
 
---------------------------------------------
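
For anyone who wants to experiment with the selection logic outside of
htsearch, here is a small self-contained sketch of the same idea: pick a
qsort() comparison function from the "sort" value, falling back to score.
Hit, icmp(), pickComparator() and sortHits() are hypothetical names made up
for this example, and icmp() only stands in for mystrcasecmp().

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <ctime>

// Hypothetical, simplified stand-in for a search result.
struct Hit { int score; time_t time; const char *title; };

typedef int (*Comparator)(const void *, const void *);

// Case-insensitive string compare, standing in for mystrcasecmp().
static int icmp(const char *a, const char *b)
{
    for (; *a && *b; ++a, ++b) {
        int d = tolower((unsigned char) *a) - tolower((unsigned char) *b);
        if (d) return d;
    }
    return tolower((unsigned char) *a) - tolower((unsigned char) *b);
}

static const Hit *hit(const void *p) { return *(const Hit * const *) p; }

// Highest score first.
static int byScore(const void *a, const void *b)
{
    return (hit(a)->score < hit(b)->score) - (hit(a)->score > hit(b)->score);
}

// Newest first.
static int byTime(const void *a, const void *b)
{
    return (hit(a)->time < hit(b)->time) - (hit(a)->time > hit(b)->time);
}

// Alphabetical by title, treating a missing title as empty.
static int byTitle(const void *a, const void *b)
{
    const char *t1 = hit(a)->title ? hit(a)->title : "";
    const char *t2 = hit(b)->title ? hit(b)->title : "";
    return icmp(t1, t2);
}

// Map the "sort" value to a comparator; unknown or missing means score.
static Comparator pickComparator(const char *sortby)
{
    if (sortby) {
        if (icmp(sortby, "date") == 0)  return byTime;
        if (icmp(sortby, "title") == 0) return byTitle;
    }
    return byScore;
}

static void sortHits(Hit **array, int n, const char *sortby)
{
    assert(array != 0 || n == 0);
    qsort(array, n, sizeof(Hit *), pickComparator(sortby));
}
```

The comparators here compare rather than subtract, which sidesteps any
overflow when the difference of two values doesn't fit in an int.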

Here's the fix for a small omission in the 121398 pre-release:
---------------------------------------------
--- htcommon/WordList.cc.orig Sat Dec 12 18:15:35 1998
+++ htcommon/WordList.cc Mon Dec 14 16:18:13 1998
@@ -235,6 +235,7 @@
     char *word;
     String new_word;
     char *valid_punctuation = config["valid_punctuation"];
+    int minimumWordLength = config.Value("minimum_word_length", 3);
 
     while (fl && fgets(buffer, sizeof(buffer), fl))
     {
---------------------------------------------

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930