[htdig] htdig parsing <object>


Subject: [htdig] htdig parsing
From: Rzepa, Henry (h.rzepa@ic.ac.uk)
Date: Fri Oct 06 2000 - 05:21:44 PDT


We are extensively converting our "legacy" html to xhtml,
using a combination of htdig, JTidy and locally written
JChemTidy.

One element we have focused on heavily is <object>. We
replace all instances of <embed> with <object>, because

a) <embed> is not well formed (i.e. it should be <embed />).
b) it is not validatable. This is because the attributes of
<embed> are not defined by a DTD, but are instead implicit
in whatever attributes the plugin that <embed> resolves to
supports. Thus two users with different plugins may well
be running implicitly different DTDs for their document.
This is not good.

<object> solves both these problems.
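
To illustrate the kind of replacement meant here, the following is a
minimal Python sketch. It is not the JChemTidy code. The function name
embed_to_object, the OBJECT_ATTRS set, and the choice to map src to data
and push plugin-specific attributes into <param> children are all
assumptions made for the example.

```python
from html import escape

# Assumption: attributes that are copied onto <object> unchanged.
# Anything else (plugin-specific) is moved into a <param> child.
OBJECT_ATTRS = {"type", "width", "height", "title", "id", "class"}


def embed_to_object(attrs):
    """Build an <object> element from a dict of <embed> attributes."""
    obj_attrs = []
    params = []
    for name, value in attrs.items():
        name = name.lower()
        if name == "src":
            # <embed src="..."> corresponds to <object data="...">
            obj_attrs.append(("data", value))
        elif name in OBJECT_ATTRS:
            obj_attrs.append((name, value))
        else:
            params.append((name, value))
    attr_text = "".join(
        f' {n}="{escape(v, quote=True)}"' for n, v in obj_attrs
    )
    param_text = "".join(
        f'<param name="{escape(n, quote=True)}" '
        f'value="{escape(v, quote=True)}" />'
        for n, v in params
    )
    return f"<object{attr_text}>{param_text}</object>"


if __name__ == "__main__":
    print(embed_to_object({"src": "mol.pdb", "type": "chemical/x-pdb",
                           "width": "200", "spin": "on"}))
```

Because the plugin-specific attributes end up as <param> elements, the
<object> itself carries only attributes that a DTD can declare.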

Our only problem is that htdig 3.2 does not parse <object>.

A long time ago, we hacked htdig 3.1 to parse <embed> and
<object>, but these mods do not appear to have been incorporated
into htdig 3.2.

If someone could rescue them, we would be very grateful.
On this point, if htdig could also be persuaded to index the
title attribute of elements such as <object> it would be a great
help. As part of the xhtml conversion process, we build a title
if none exists, and it would be nice to have htdig pick it up!
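
As a rough picture of what picking up the title would involve, here is a
small Python sketch using the standard html.parser module. It is not
htdig code and says nothing about how htdig's own parser is organised.
The class name TitleCollector and the sample markup are made up for the
example.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collect the title attribute of every <object> start tag."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "object":
            return
        for name, value in attrs:
            if name == "title" and value:
                self.titles.append(value)


if __name__ == "__main__":
    collector = TitleCollector()
    collector.feed(
        '<p>Structure:</p>'
        '<object data="a.pdb" title="Caffeine model">'
        '<param name="spin" value="on" /></object>'
        '<object data="b.pdb"></object>'
    )
    print(collector.titles)
```

An indexer could then treat each collected title as ordinary text
belonging to the document.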

Thanks.

-- 

Henry Rzepa.
+44 (0)20 7594 5774 (Office)
+44 (0)20 7594 5804 (Fax)
Dept. Chemistry, Imperial College, London, SW7 2AY, UK.
http://www.ch.ic.ac.uk/rzepa/




This archive was generated by hypermail 2b28 : Fri Oct 06 2000 - 05:26:35 PDT