Re: [htdig] excluding page section? sorting output?

Subject: Re: [htdig] excluding page section? sorting output?
From: Torsten Neuer
Date: Wed Jan 17 2001 - 04:04:25 PST

Bernhard Krickl wrote:
> > > Is there a way to sort the output by category?
> > Basically, you can sort the output by score, time and title.
> > If you structure your Web-Site in a way that you can automagically
> > use the document titles for categories, that's the way it goes...
> > For more information, please see:
> >
> This does not help. I'm thinking about self-defined categories,
> maybe defined by some Meta-tag or meta-keywords.
> Doc-titles might be out of the question, but I'll check it.
> Any more ideas?

Categories could also be implemented via URL structures. In this case
you could either patch the CGI program to add a sort-by-url method or
run the (complete) search output through an additional wrapper script.
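
The wrapper-script variant could look roughly like the following. This is
only a sketch, not part of Ht://Dig: it assumes the search results have
already been parsed into (url, title) pairs, and the names category_of and
sort_by_category as well as the rule "first path segment = category" are
made up for illustration.

```python
from urllib.parse import urlsplit


def category_of(url):
    """Hypothetical rule: the first directory in the URL path is the category.

    Documents that sit directly in the web root get the empty category "".
    """
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return segments[0] if len(segments) > 1 else ""


def sort_by_category(results):
    """Sort (url, title) pairs by category.

    sorted() is stable, so the original order (e.g. by score) is kept
    within each category.
    """
    return sorted(results, key=lambda item: category_of(item[0]))


if __name__ == "__main__":
    hits = [
        ("http://example.com/products/a.html", "Product A"),
        ("http://example.com/about/team.html", "Team"),
        ("http://example.com/products/b.html", "Product B"),
    ]
    for url, title in sort_by_category(hits):
        print(category_of(url) or "(root)", title, url)
```

Because the sort is stable, the ranking produced by the search is preserved
inside each category, which is usually what one wants.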

If you have categories based upon META tags, you'll need to change the
database in order to support this special information.
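
Just to illustrate the extraction side (storing the value in the database
is the part that needs the actual changes): a small sketch that pulls a
category out of a META tag. The tag name "category" and the comma-separated
content are assumptions of mine, not something Ht://Dig defines.

```python
from html.parser import HTMLParser


class CategoryMetaParser(HTMLParser):
    """Collects values from <meta name="category" content="..."> tags.

    The meta name is a hypothetical convention chosen for this example.
    """

    def __init__(self, meta_name="category"):
        super().__init__()
        self.meta_name = meta_name
        self.categories = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = attributes.get("content") or ""
        if name == self.meta_name:
            # Assume several categories may be given, separated by commas.
            self.categories.extend(
                part.strip() for part in content.split(",") if part.strip()
            )


def extract_categories(html, meta_name="category"):
    parser = CategoryMetaParser(meta_name)
    parser.feed(html)
    return parser.categories
```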

> > > Is there a possibility to index Shockwave Flash files?
> > This is a bit harder. I searched the web for an existing parser but only
> > found some more-or-less useful docs and one generic parser.
> >
> > This generic parser (see attachment) can easily be used within a wrapper
> > script to at least extract links from a flash menu, which in my opinion is
> > the most requested feature.
> Thanks for this one, but I'll need a bit more time to check it.
> Anyway, extracting links is not enough, I think. Keywords or a full-text
> index are needed.

Well, a full-text index should also be possible, but it requires some more
work on the parser. The attached one is just a very generic parser which
dumps all the different record entries of a flash file. It is not designed
to be an external parser for Ht://Dig, but it works well with the shell
wrapper to extract links from flash menus. With some additional work it
should be possible to turn it into a fully fledged external parser (so far
I have had neither the time nor a project that depends on it).
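
To give an idea of what "dumping record entries and extracting links" means,
here is a rough sketch. It is not the attached parser. It assumes an
uncompressed flash file (signature "FWS"), walks the tag records that follow
the header, and simply searches each record body for absolute http/https
URLs. The function names are made up, and relative URLs are not handled.

```python
import re
import struct

# Printable ASCII following an absolute http:// or https:// prefix.
URL_RE = re.compile(rb"https?://[\x21-\x7e]+")


def iter_swf_tags(data):
    """Yield (tag_code, body) for every record of an uncompressed SWF file."""
    if data[:3] != b"FWS":
        raise ValueError("only uncompressed SWF (signature FWS) is handled")
    # Bytes 3..7 hold the version and file length. The frame-size RECT
    # follows: 5 bits giving the field width, then four fields of that width.
    nbits = data[8] >> 3
    rect_len = (5 + 4 * nbits + 7) // 8
    pos = 8 + rect_len + 4  # also skip frame rate and frame count (2 bytes each)
    while pos + 2 <= len(data):
        (code_and_len,) = struct.unpack_from("<H", data, pos)
        pos += 2
        code, length = code_and_len >> 6, code_and_len & 0x3F
        if length == 0x3F:  # long record header: real length is in the next 4 bytes
            (length,) = struct.unpack_from("<I", data, pos)
            pos += 4
        yield code, data[pos:pos + length]
        pos += length
        if code == 0:  # End tag
            break


def extract_links(data):
    """Return the absolute URLs found in the record bodies, in file order."""
    links = []
    for _code, body in iter_swf_tags(data):
        for match in URL_RE.finditer(body):
            url = match.group().decode("ascii")
            if url not in links:
                links.append(url)
    return links
```

A full-text external parser would have to go further and decode the text
records instead of only scanning for URL strings.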



InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14                            Tel: +49-4101-403605
D-25474 Ellerbek                            Fax: +49-4101-403606


This archive was generated by hypermail 2b28 : Wed Jan 17 2001 - 04:19:50 PST