[wp-hackers] WordPress, web standards, and (X)HTML

Benjamin Hawkes-Lewis bhawkeslewis at googlemail.com
Sat Dec 2 11:00:07 GMT 2006

On Sat, 2006-12-02 at 00:17 -0500, Joey B wrote:

> But Internet Explorer won't be forgotten in the next five years, and
> unfortunately for us, probably not the next 10. Any project looking at
> the long term shouldn't be thinking "100 years," but a more realistic
> number, say, "10."

Like most producers of (would-be) ephemera, I think you're vastly
underestimating the ultimate cultural importance of collections like the
Internet Archive. In the latest web framework for producing throw-away
intranet applications, this might be an understandable attitude. In a
project for producing cultural content, it's extremely short-sighted.
Obscure digital formats are difficult enough to deal with. Documents
that claim to comply with obscure digital formats but actually don't are
/really/ difficult to deal with.

> With that said, we have to cater to the browser. If we want to use 
> XHTML and advocate its use over HTML, what are we to do when IE
> destroys a page using it? 

Well-formed, valid, conformant modularized XHTML has some potential
advantages over HTML 4.01 Strict. Non-well-formed, non-valid,
non-conformant, non-modularized "XHTML 1.0" served as text/html
(henceforth "faux XHTML" for short) has only disadvantages. By serving
such markup, you're /neither/ "using" nor "advocating" XHTML, you're
just transforming it from a document format to an empty marketing term.

> Many standards lovers (myself included) consider using text/html
> perfectly acceptable.

Somebody who loves standards might accept that well-formed, valid,
conformant XHTML 1.0 Strict could be served as text/html to browsers
that do not support application/xhtml+xml, so long as browsers that do
support application/xhtml+xml are served content with the correct media
type. I happen to think (as do many other "standards lovers") that the
HTML compatibility guidelines were not only technically inadequate but a
strategic error. But to say that "standards lovers" might endorse
sending only faux XHTML to all browsers regardless of ability is to rob
the phrase "standards lovers" of all essential meaning.

> It's no different than CSS hacks.

First, especially where conditional comments could have served instead,
the release of IE7 has shown CSS hacks to have been a terrible idea. So
that's hardly a recommendation.

Second, faux XHTML is demonstrably worse than CSS hacks:

1) CSS hacks /attempt/ to target non-compliant browsers only. WordPress
setups usually serve faux XHTML indiscriminately to all user agents.

2) Most (though not all) CSS hacks involve validating CSS that targets
non-compliant browsers through their CSS parsing failures. By contrast,
faux XHTML is non-validating tag soup.

3) If the user finds your hacky CSS breaks the rendering in her browser,
she can turn it off or use a text browser. No such luck if you send
broken markup.

4) Getting rid of CSS hacks is easy: just change your stylesheet. Fixing
broken markup stored in hundreds of posts and comments in a database is
much trickier.
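
To make point 4 a little more concrete, here is a minimal sketch of how
stored fragments might be audited for well-formedness. It is an
illustration only, not anything WordPress ships: the function name
is_well_formed_fragment is hypothetical, and the check covers XML
well-formedness, not validity against an XHTML DTD. Because no DTD is
loaded, named entities other than the five built into XML (for example
&nbsp;) are reported as errors too.

```python
import xml.etree.ElementTree as ET


def is_well_formed_fragment(fragment):
    """Return True if a stored markup fragment parses as XML.

    The fragment is wrapped in a single root element because post and
    comment bodies usually contain several sibling elements.
    """
    wrapped = (
        "<div xmlns='http://www.w3.org/1999/xhtml'>" + fragment + "</div>"
    )
    try:
        ET.fromstring(wrapped)
    except ET.ParseError:
        # Unclosed tags, misnested tags, undefined entities and so on.
        return False
    return True
```

Running something like this over every post and comment would at least
show how much tag soup is sitting in the database before anyone tries
to repair it.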

> I don't wanna argue much here (it's late and I'm tired :D )
> but I wanted to put my 2 cents in.

Thanks for your 2 cents but you seem to have completely misunderstood
what I'm proposing. I am /not/ suggesting for one second WordPress
should serve application/xhtml+xml to Internet Explorer and laugh as
that browser fails miserably (and your audience goes to another site).

On the contrary, I'm suggesting one of three alternatives:

1) If WordPress is going to use XHTML at all, it should serve
application/xhtml+xml to supporting engines like Gecko and recent WebKit
builds. For browsers that do not support application/xhtml+xml, serving
XHTML 1.0 as text/html is a poor solution IMHO. It's "acceptable",
however, so long as it's valid and conformant. That's a minimal
solution. To expect anything less is to undercut WordPress's claim to be
a standards-based system.

2) A better solution would be to serve XHTML 1.1 or modularized XHTML to
application/xhtml+xml-supporting browsers and transform the content to
HTML 4.01 Strict for HTML-only user agents. Storing an object-model in
the database and serializing to either XHTML or HTML would put WordPress
in an excellent position for adopting Web Applications 1.0 and/or XHTML
2. At present, it would of course be hopeless for either.

3) A radically simpler solution would be to just serve HTML 4.01 Strict
to all comers. This will probably continue to work, and removes any need
for content negotiation or transformation. If we start seeing mainstream
browsers that no longer handle text/html, we can always push HTML 4.01
Strict through a transformation to convert it to XHTML 1.1 or
modularized XHTML.

None of these solutions will allow Internet Explorer LTE 7 to "destroy a
page". If, like me, you think XHTML has potential, you really need to
back solution 2).
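
For what it's worth, the content negotiation that solutions 1) and 2)
depend on can be sketched in a few lines. This is a rough illustration
under stated assumptions, not a proposed patch: the function names
parse_accept and choose_media_type are hypothetical, and the rule used
(serve application/xhtml+xml only when the Accept header names it
explicitly with a q-value at least as high as text/html) is one
reasonable policy among several. Wildcards such as */* are deliberately
not treated as XHTML support, since Internet Explorer sends them.

```python
def parse_accept(header):
    """Parse an HTTP Accept header into a {media_type: q} dictionary."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when the parameter is absent
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not accepted"
        prefs[media_type] = q
    return prefs


def choose_media_type(accept_header):
    """Pick the media type to serve for a given Accept header value."""
    prefs = parse_accept(accept_header or "")
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", 0.0)
    # Only an explicit, non-zero preference counts as XHTML support.
    if xhtml_q > 0.0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"
```

A response chosen this way should also carry a "Vary: Accept" header so
that caches do not hand the XHTML variant to a browser that asked for
text/html.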

Benjamin Hawkes-Lewis
