Re: [htdig] excluding page section? sorting output?


Subject: Re: [htdig] excluding page section? sorting output?
From: Torsten Neuer (tneuer@inwise.de)
Date: Wed Jan 17 2001 - 03:04:12 PST


Bernhard Krickl wrote:
>
> Hi!
>
> My boss keeps asking me about features he wants with htdig.
> Recently he came up with the following:
>
> Is there a way to exclude a section on an HTML-page from
> indexing? That's because navigational elements often produce hits
> when the content doesn't match much. (Frames are not an option!)

There is a way. Please see the Ht://Dig documentation:
        http://www.htdig.org/attrs.html#noindex_start
        http://www.htdig.org/attrs.html#noindex_end
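
To picture the effect: you set the two attributes in htdig.conf to a pair
of marker strings, wrap the navigation in those markers, and everything
between them is left out of the index. The little Python sketch below only
imitates that effect on a sample page; it is not how htdig does it
internally, and the marker strings and the helper name strip_noindex are
assumptions chosen for illustration (use whatever you configured).

```python
import re

# Assumed marker strings; they must match what you set in htdig.conf, e.g.
#   noindex_start: <!--htdig_noindex-->
#   noindex_end:   <!--/htdig_noindex-->
NOINDEX_START = "<!--htdig_noindex-->"
NOINDEX_END = "<!--/htdig_noindex-->"


def strip_noindex(html, start=NOINDEX_START, end=NOINDEX_END):
    """Return html with every start...end section removed (illustration only)."""
    pattern = re.escape(start) + r".*?" + re.escape(end)
    # DOTALL lets a marked section span several lines.
    return re.sub(pattern, "", html, flags=re.DOTALL)


page = (
    "<body>"
    "<!--htdig_noindex--><a href='/home.html'>Home</a><!--/htdig_noindex-->"
    "<p>Actual article text.</p>"
    "</body>"
)

# Only the article text is left for indexing; the navigation link is gone.
print(strip_noindex(page))
```

Since the markers are plain HTML comments, browsers ignore them and the
page looks the same as before.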

>
> Is there a way to sort the output by category?

Yes and no ;)

This highly depends upon how you define "category".

Basically, you can sort the output by score, time, or title.
If you structure your web site so that the document titles can
double as categories, sorting by title is the way to go.
For more information, please see:
        http://www.htdig.org/attrs.html#sort
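
As a rough illustration of the title trick: if every page title starts
with its category, a plain sort by title puts the hits of one category
next to each other. The sketch below uses made-up hits and an assumed
"Category: Page name" title convention; neither comes from htdig itself.

```python
# Hypothetical search hits as (title, score) pairs. The "Category: Name"
# title convention is an assumption made for this example.
hits = [
    ("Products: Widget", 80),
    ("Support: FAQ", 95),
    ("Products: Gadget", 60),
    ("Support: Contact", 40),
]


def sort_by_title(results):
    """Order hits alphabetically by title, ignoring case."""
    return sorted(results, key=lambda hit: hit[0].lower())


# Hits that share a title prefix end up grouped together.
for title, score in sort_by_title(hits):
    print(title, score)
```

With the sort attribute set to title, the results should come out grouped
in much the same way, provided the titles really follow one convention
across the whole site.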

>
> And here's another one:
> Is there a possibility to index Shockwave Flash files?
> Let me guess: Yes, if I have an external parser.
Yep ;)
> In this case: Where do I find one?

This is a bit harder. I searched the web for an existing parser but only
found some more-or-less useful docs and one generic parser.

This generic parser (see attachment) can easily be used within a wrapper
script to at least extract links from a flash menu, which in my opinion is
the most requested feature.
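
A wrapper along these lines might look like the Python sketch below. Please
treat it as a starting point only. It assumes that the parser can be called
as "swfparser <file>" and writes a text dump containing the URLs to stdout
(check the attachment for the real invocation), and that an external parser
reports a link to htdig as a tab-separated "u", URL, description line (that
is how I read the external_parsers documentation; please verify it there).
The names links_to_records and main are mine.

```python
import re
import subprocess
import sys

# Absolute http/https URLs, stopping at whitespace, quotes and brackets.
URL_RE = re.compile(r"""https?://[^\s"'<>()]+""")


def links_to_records(dump_text):
    """Turn a text dump into 'u<TAB>url<TAB>description' lines, without duplicates."""
    records = []
    seen = set()
    for url in URL_RE.findall(dump_text):
        if url not in seen:
            seen.add(url)
            # No link text is available here, so the URL doubles as description.
            records.append("u\t%s\t%s" % (url, url))
    return records


def main(argv):
    # Assumed calling convention: the file to parse is the first argument.
    swf_path = argv[1]
    # ASSUMPTION: "swfparser" is on the PATH and dumps readable text to stdout.
    dump = subprocess.run(
        ["swfparser", swf_path],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    for record in links_to_records(dump):
        print(record)

# To use this as a script, call main(sys.argv) at the very end of the file.
```

The wrapper is then registered for the Flash content type through the
external_parsers attribute, so that htdig hands every .swf file to it.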

hth,

  Torsten

-- 
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14                            Tel: +49-4101-403605
D-25474 Ellerbek                            Fax: +49-4101-403606
E-Mail: info@inwise.de            Internet: http://www.inwise.de


swfparser.tar.gz



