Re: [htdig] Request for new htdig META property: htdig-description


Subject: Re: [htdig] Request for new htdig META property: htdig-description
From: Patrick Jennings (synaptic@synaptic.bc.ca)
Date: Fri Apr 07 2000 - 15:16:19 PDT


> >
> > description_meta_tag_names: htdig-description description
> >
> > which generates the behaviour:
> > if no htdig-description then look for description,
> > if no description then automatically generated
> > excerpt
> >
> > description_meta_tag_names: htdig-description
> >
> > if no htdig-description, then automatically generated
> > excerpt.
> >
> > This is a more elegant and configurable implementation. I like it.
> > It'll just take a bit more code to implement.
>
> The precedence aspect would be more difficult to implement. The HTML.cc
> code parses HTML in one pass, and deals with tags as they occur, so it
> would require storing the description tags until you find the highest
> precedence one, then ignoring subsequent tags, or something like that,
> and only indexing the words in the tag after you know you have the final
> one. I guess that makes sense in any case, as you probably don't want
> to index more than one of these, which is what your quick fix does.

Yep. But much of the code could live in the meta_dsc object, with
very little structural change resulting. The HTML.do_tag method would
contain lines like:

...
if (conf["name"] && conf["content"])
{
    // if the name matches one of description_meta_tag_names
    if (meta_dsc.listed(cache))
    {
        // write only if new data, or higher priority
        meta_dsc.save_data(cache, content);
    }
}

case ("/head" pattern):   // "/head" added to tags.Pattern, I imagine
    // if we called meta_dsc.save_data at least once
    if (meta_dsc.found)
    {
        meta_dsc.done;    // let the object know this is the final answer
        add description.text to the index
    }
...

Or something like that.
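
To make the idea a bit more concrete, here is a rough, self-contained
sketch of what such an object might look like. Everything in it is an
assumption for illustration: the class name MetaDescription, the
case-insensitive matching, and the rule that the first tag of a given
precedence wins are all made up here, and it uses std::string rather
than the String and Configuration classes in the htdig source.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the meta_dsc object; not actual ht://Dig code.
class MetaDescription {
public:
    // names: whitespace-separated tag names, highest precedence first,
    // in the style of the proposed description_meta_tag_names attribute.
    explicit MetaDescription(const std::string& names) {
        std::istringstream in(names);
        std::string name;
        while (in >> name) names_.push_back(lower(name));
    }

    // True if name matches one of the configured tag names.
    bool listed(const std::string& name) const {
        return rank(name) < names_.size();
    }

    // Store content only if it is the first match or outranks the stored one.
    void save_data(const std::string& name, const std::string& content) {
        const std::size_t r = rank(name);
        if (done_ || r >= names_.size() || r >= best_) return;
        best_ = r;
        text_ = content;
    }

    bool found() const { return best_ < names_.size(); }
    void done() { done_ = true; }  // e.g. on </head>: later tags are ignored
    const std::string& text() const { return text_; }

private:
    // Lower-case copy of s, so that tag names compare case-insensitively.
    static std::string lower(std::string s) {
        std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
            return static_cast<char>(std::tolower(c));
        });
        return s;
    }

    // Position of name in the precedence list, or names_.size() if absent.
    std::size_t rank(const std::string& name) const {
        const std::string key = lower(name);
        for (std::size_t i = 0; i < names_.size(); ++i)
            if (names_[i] == key) return i;
        return names_.size();
    }

    std::vector<std::string> names_;
    std::size_t best_ = static_cast<std::size_t>(-1);  // no match stored yet
    std::string text_;
    bool done_ = false;
};
```

Constructed with "htdig-description description", the object keeps the
htdig-description content whichever order the two tags appear in, and
only text() would be handed to the indexer once done() has been called
at the end of the head section.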

Fortunately, in my case I want the pages with htdig-descriptions to
show up first in a search. So the quick-fix side-effect of adding
both sets of descriptions to the index increases keyword repetition.
Not a bug, it's a feature!

Cheers,

Patrick.



