[wp-hackers] WP issues

Thu May 31 18:55:33 GMT 2007

As I promised I'd write ages ago, here's a list of issues with
WordPress (though I accidentally sent this from my main email address
and it got caught in the moderation queue yesterday, so here it is
from the proper address (which, as it is subscribed, should reach all
of you)):

1. People have been asking for years for an XML serialiser to be used
for all the XML WordPress produces. None exists, which allows invalid
bytes to get into XML data. Try parsing
<http://photomatt.net/comments/feed/atom/> with a compliant XML parser.
You'll get a fatal error. This is the exact sort of issue that an XML
serialiser would avoid.
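
To illustrate point 1, here is a minimal sketch of what routing output through a serialiser could look like. It uses Python's standard library rather than anything in WordPress; the function names and the `title`/`content` element names are made up for the example. The serialiser escapes markup characters, and a small filter replaces characters that XML 1.0 does not allow at all.

```python
import re
import xml.etree.ElementTree as ET

# Characters XML 1.0 forbids even when escaped: C0 controls other than
# tab, LF and CR, lone surrogates, and U+FFFE/U+FFFF.
_INVALID_XML_CHARS = re.compile(
    r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]"
)


def clean_text(value: str) -> str:
    """Replace characters that can never appear in an XML 1.0 document."""
    return _INVALID_XML_CHARS.sub("\ufffd", value)


def serialise_comment(title: str, body: str) -> str:
    """Build the element tree first, then let the serialiser do the escaping.

    Hypothetical helper: the element names are illustrative only.
    """
    entry = ET.Element("entry")
    ET.SubElement(entry, "title").text = clean_text(title)
    ET.SubElement(entry, "content").text = clean_text(body)
    return ET.tostring(entry, encoding="unicode")
```

Because nothing is concatenated by hand, a stray `&`, `<` or control character in a comment cannot make the output ill-formed.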

2. Having waited years for an Atom 1.0 feed to be offered:
	a) We eventually get one that allows malformed XML to be inserted  
(see 1. above).
	b) Uses RFC 822 dates (what part of section 3.3 of RFC 4287[RFC4287]  
is unclear?).
	c) Uses @content for <link> (where did that come from? There's no  
@content in the entire spec! Please see @href, section 4.2.7.1 
[RFC4287].).
	d) Claims that the blog title is a MIME type, and when the feed is  
meant to link to itself it links to the RSS 2.0 feed (what's unclear  
in section 4.2.7.2 of RFC 4287[RFC4287]?).

Those four issues are just from a quick glance at the above-mentioned
feed. Seeing as so many of these things are CLEARLY wrong, it looks as
if the person who implemented it had NEVER read the Atom 1.0 spec,
RFC 4287[RFC4287].

3. Why was no Atom 1.0 feed offered?
	a) "Aggregators I tried with the existing patch didn't  
work."[TICKET1526] — We already offer multiple feeds to make sure  
the UA can parse one (though in the above case, all of Photo Matt's  
comment feeds are currently broken, and any parser that parses it is  
broken).
	b) "There is no satisfactory patch available."[TICKET1526] —  
Waiting doesn't seem to have made this much better than patches that  
were around several years ago, see 2.
	c) "We have no way currently to ensure XHTML validity."[TICKET1526]  
— See 1.

4. When will the above (see 2) issues be resolved?
	a) There was a bug I reported around six months ago in the Atom 0.3  
feed[TICKET3377], which still hasn't been fixed in WordPress 2.0.x,  
despite the fact it makes the feed totally unusable in some  
aggregators (Firefox 2.0 included).
	b) This bug had been in WordPress for several years, yet remained  
unfixed, despite the fact that a visit to the feed validator 
[FEEDVALIDATOR] would have clearly pointed it out.
	c) Matt failed to read the report: "Since our feeds currently work  
everywhere, I'm not inclined to change this."[TICKET3377] (the bug  
report explicitly cites Firefox 2.0 as breaking). Is it really too  
much to ask to actually read a bug report before closing as wontfix?

5. WordPress has no guarantee that the XHTML it outputs is well-formed
OR valid.
	a) We, as implementers, are allowed to parse XHTML (even when served
as text/html) as XML, so WordPress will fail horribly.
	b) There is no method to switch WordPress to output HTML (so we  
don't expose users to fatal errors), and this can't be done at a  
theme level as in places empty tags are closed within WordPress's  
source. We can't change this without using an XML parser on  
WordPress's output, but as there is no guarantee it's well-formed,  
this doesn't help.
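
As a small illustration of 5(a), this is what happens when markup is handed to an XML parser. The sketch uses Python's standard library, and `is_well_formed` is a made-up helper name. One unclosed empty element is enough to cause a fatal error.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False on a fatal error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# HTML-style <br> is never closed, so an XML parser must stop with an error.
html_style = "<p>line one<br>line two</p>"
# XHTML-style <br /> is an empty element and parses fine.
xhtml_style = "<p>line one<br />line two</p>"
```

Here `is_well_formed(html_style)` is false and `is_well_formed(xhtml_style)` is true, which is why a check like this is only useful if the output is guaranteed well-formed in the first place.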

6. WordPress claims that it has "a focus on aesthetics, web  
standards, and usability."[WORDPRESS]
	a) If you focus on web standards, why is that very page served as
text/html while having an XHTML 1.1 DOCTYPE? This SHOULD NOT[RFC2119]
be done according to advice from the old HTML WG[XHTMLMIME]. Can you
please cite your reasons why this "particular behavior is acceptable or
even useful"[RFC2119] (from the normative definition of SHOULD NOT)?
	b) WordPress by default uses a DOCTYPE that exists for transitioning
to standards. If you have a focus on web standards, why is this
transition still going on after many years?
	c) WordPress uses XHTML served as text/html, which MAY be done. Why  
does this (the blogosphere) "particular marketplace requires it or  
because the vendor feels that it enhances the product"[RFC2119] (from  
the normative definition of MAY)?

7. WordPress has on several occasions avoided changing libraries
(sometimes to the extent of just not updating them) on grounds of
backwards compatibility.
	a) We're moving from script.aculo.us to jQuery, but keeping
script.aculo.us around for plugins/themes that need it. Why can we not
have multiple versions of other libraries (or different libraries that
do the same thing)?
	b) Better yet, why don't we just have an abstraction layer, so the  
library is never called directly (and therefore can be swapped  
without breaking anything)?
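
The abstraction layer suggested in 7(b) could be sketched roughly like this. It is a Python illustration with hypothetical names (`EffectsBackend`, `set_backend`, `fade_out`), not an existing WordPress API. Callers ask the layer for an effect and never name the library, so the backend can be swapped without touching them.

```python
from typing import Protocol


class EffectsBackend(Protocol):
    """What every library adapter has to provide (hypothetical interface)."""

    def fade_out(self, element_id: str) -> str: ...


class ScriptaculousBackend:
    def fade_out(self, element_id: str) -> str:
        return f"new Effect.Fade('{element_id}');"


class JQueryBackend:
    def fade_out(self, element_id: str) -> str:
        return f"jQuery('#{element_id}').fadeOut();"


_backend: EffectsBackend = ScriptaculousBackend()


def set_backend(backend: EffectsBackend) -> None:
    """Swap the underlying library in one place."""
    global _backend
    _backend = backend


def fade_out(element_id: str) -> str:
    """The only function plugins and themes would call."""
    return _backend.fade_out(element_id)
```

A plugin that calls `fade_out("sidebar")` keeps working after `set_backend(JQueryBackend())`; only the emitted JavaScript changes.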

8. There are other bugs that do in many places cause large issues,  
including but not limited to:
	a) IRIs as comment links get stripped (e.g., James Holderness,  
<http://www.詹姆斯.com/>, posted a comment on my blog. The IRI was  
rewritten to <http://www..com/>.).
	b) <blockquote> elements cannot be nested within one another
[TICKET1170]. This is marked as closed, although I just reproduced it
on both 2.2 and 2.3. This bug has been open for over _TWO YEARS
WITH PATCHES AVAILABLE_!
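
On 8(a): rather than stripping non-ASCII characters, an IRI can be converted to an equivalent URI. The following is a rough Python sketch under some assumptions: the function name is made up, userinfo is dropped for simplicity, and Python's built-in `idna` codec (IDNA 2003) stands in for whatever a real implementation would use.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri: str) -> str:
    """Convert an IRI to an ASCII-only URI instead of deleting characters.

    The host is Punycode-encoded; the path, query and fragment are
    percent-encoded as UTF-8. Existing percent-escapes are left alone.
    """
    parts = urlsplit(iri)
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    netloc = f"{host}:{parts.port}" if parts.port else host
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))
```

With this, `iri_to_uri("http://www.詹姆斯.com/")` yields an ASCII URI whose host label starts with `xn--`, so the link still resolves to the commenter's site.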

9. There have been several occasions on which Matt has overruled  
consensus, sometimes giving a reason like "I misunderstood" (does  
that mean _EVERYONE_ misunderstood, or just you, Matt?).

10. Matt has claimed that the development process isn't broken  
because everything gets replied to. I'll continue waiting, then 
[HACKERS12601]. Oh, and see if anyone replies to this.

11. WordPress tries to improve the quality of releases by moving to
120-day releases[HACKERS8907], yet proceeds to break it with the
first release on the cycle[CHANGESET5110] (or does tagging not count
as "crazy wild fun development"?).

In summary, I'd advise those working on WordPress to read the following
specifications (including all references):
	a) XML 1.0[XML]
	b) RFC 4287 (The Atom Syndication Format)[RFC4287]
	c) XHTML 1.0[XHTML1]
	d) XHTML 1.1[XHTML11]
	e) RFC 2119 (Key words for use in RFCs to Indicate Requirement  
Levels)[RFC2119]

Hixie has a post[HIXIE1140242962] that explains how to read  
specifications. For those who think this post is all far too serious,  
here's something (slightly) less serious:
	"This specification should be read like all other specifications.  
First, it should be read cover-to-cover, multiple times. Then, it  
should be read backwards at least once. Then it should be read by  
picking random sections from the contents list and following all the  
cross-references."[HTML5]

All the best,

Geoffrey Sneddon

[CHANGESET5110]: http://trac.wordpress.org/changeset/5110
[FEEDVALIDATOR]: http://feedvalidator.org/
[HACKERS12601]: http://comox.textdrive.com/pipermail/wp-hackers/2007-May/012601.html
[HACKERS8907]: http://comox.textdrive.com/pipermail/wp-hackers/2006-October/008907.html
[HIXIE1140242962]: http://ln.hixie.ch/?start=1140242962&count=1
[HTML5]: http://www.whatwg.org/specs/web-apps/current-work/
[RFC2119]: http://www.ietf.org/rfc/rfc2119.txt
[RFC4287]: http://www.ietf.org/rfc/rfc4287
[TICKET1170]: http://trac.wordpress.org/ticket/1170
[TICKET1526]: http://trac.wordpress.org/ticket/1526
[TICKET3377]: http://trac.wordpress.org/ticket/3377
[WORDPRESS]: http://wordpress.org/
[XHTML1]: http://w3.org/TR/xhtml1/
[XHTML11]: http://w3.org/TR/xhtml11/
[XHTMLMIME]: http://www.w3.org/TR/xhtml-media-types/
[XML]: http://w3.org/TR/xml/