Re: htdig: 3.1.b2 -> 3.1.b3 performance degradation +


Dan Dexter (ddexter@lincom-asg.com)
Mon, 18 Jan 1999 01:55:21 -0600


Mr. Gilles Detillieux

In reference to: http://www.htdig.org/mail/1999-01/0045.html

I am running htDig 3.1.0b4 on a DEC Alpha using Digital UNIX 4.0D.

I have an htDig database that holds around 111,000 documents and
is about 600 MB in size.

With htDig built from the 3.1.0b4 source, searches were taking
several minutes to complete. With the patch you supplied (quoted
below), searches now complete in seconds when date_factor and
backlink_factor are set to zero.

Please make sure this patch makes it into the 3.1.0 release.

Later,
Dan

---------------

According to Geoff Hutchison:
>
>
> On Thu, 24 Dec 1998, Maren S. Leizaola wrote:
>
> > Just for my information if I set the back_link factor to 0 will
> > the disk I/O speed up?
>
> No. We're working on a patch that will do this.

Here's a patch to 3.1.0b4 to do this. You have to set backlink_factor
and date_factor to 0. By default, date_factor is already 0, but
backlink_factor isn't.
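
For reference, a minimal sketch of the matching entries in the htdig
configuration file, assuming the usual "attribute: value" syntax (the
file's name and location depend on the installation):

```
backlink_factor: 0
date_factor: 0
```

Only backlink_factor strictly needs to be set, since date_factor already
defaults to 0; listing both just makes the intent explicit.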

--- htsearch/Display.cc.backlink Tue Dec 22 20:15:39 1998
+++ htsearch/Display.cc Tue Jan 5 17:18:44 1999
@@ -800,6 +800,8 @@
     String url;
     ResultMatch *thisMatch;
     List *matches = new List();
+    double backlink_factor = config.Double("backlink_factor");
+    double date_factor = config.Double("date_factor");
          
     results->Start_Get();
     while ((id = results->Get_Next()))
@@ -833,7 +835,6 @@
         //
         DocMatch *dm = results->find(id);
         double score = dm->score;
-        DocumentRef *thisRef = docDB[thisMatch->getURL()];
  
         // We need to scale based on date relevance and backlinks
         // Other changes to the score can happen now
@@ -843,19 +844,23 @@
         // We want older docs to have smaller values and the
         // ultimate values to be a reasonable size (max about 100)
  
-        if (thisRef) // We better hope it's not null!
+        if (date_factor != 0.0 || backlink_factor != 0.0)
+          {
+            DocumentRef *thisRef = docDB[thisMatch->getURL()];
+            if (thisRef) // We better hope it's not null!
           {
-            score += config.Double("date_factor") *
+            score += date_factor *
               ((thisRef->DocTime() * 1000 / (double)time(0)) - 900);
             int links = thisRef->DocLinks();
             if (links == 0)
               links = 1; // It's a hack, but it helps...
-            score += config.Double("backlink_factor")
+            score += backlink_factor
               * (thisRef->DocBackLinks() / (double)links);
           }
  
-        // Get rid of it to free the memory!
-        delete thisRef;
+            // Get rid of it to free the memory!
+            delete thisRef;
+          }
  
  
         thisMatch->setIncompleteScore(score);
         thisMatch->setAnchor(dm->anchor);
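
The idea behind the patch can be shown in isolation: read the two factors
once, outside the loop, and skip the per-document database fetch entirely
when neither factor can change the score. The following is only a
self-contained sketch of that pattern, not the htsearch code itself.
DocRecord, DocStore, Match and adjustScores are hypothetical stand-ins
invented for illustration, and time_score stands in for the date term
computed from DocTime() in the real code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a document record (not the real DocumentRef).
struct DocRecord {
    double time_score;  // stand-in for the date-relevance term
    int links;          // outgoing links
    int backlinks;      // incoming links
};

// Hypothetical stand-in for the document database. It counts lookups so
// the saving from skipping them is observable.
struct DocStore {
    std::map<std::string, DocRecord> records;
    int lookups = 0;

    const DocRecord *find(const std::string &url) {
        ++lookups;  // in a real system this is where the disk I/O would be
        auto it = records.find(url);
        return it == records.end() ? nullptr : &it->second;
    }
};

struct Match {
    std::string url;
    double score;
};

// Adjust each match's score. The factors are passed in already looked up,
// and the store is consulted only if at least one factor is non-zero.
void adjustScores(std::vector<Match> &matches, DocStore &store,
                  double date_factor, double backlink_factor)
{
    for (Match &m : matches) {
        if (date_factor != 0.0 || backlink_factor != 0.0) {
            const DocRecord *rec = store.find(m.url);
            if (rec) {
                m.score += date_factor * rec->time_score;
                // Same guard as the original code: treat 0 links as 1.
                int links = rec->links == 0 ? 1 : rec->links;
                assert(links != 0);
                m.score += backlink_factor
                    * (rec->backlinks / static_cast<double>(links));
            }
        }
    }
}
```

With both factors at zero, adjustScores never calls find(), which mirrors
why the patched htsearch avoids the docDB read for every match.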

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930