[wp-hackers] WP issues

Peter Westwood peter.westwood at ftwr.co.uk
Sat Jun 2 21:16:17 GMT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Geoffrey Sneddon wrote:
> On 2 Jun 2007, at 14:46, Sam Angove wrote:
>> The term "serialiser" is vague (what are you serialising from?), but I
>> assume you meant that the output should be built as, say, a DOM
>> object, then serialised from it to a text|application/xml document. If
>> so, then I disagree. It's not a magic bullet.
> 
> Any XML structure, whether it be SAX, DOM, or something else.
> 

SAX is not relevant in the context of generating XML - SAX is all about
parsing XML in a simple way without having to build an in-memory model
of the document to navigate, as you do with a DOM parser.
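
To make the distinction concrete, here is a minimal sketch in Python
(not WordPress code; the FEED string and the function names are made up
for illustration). The SAX version reacts to events and keeps no tree,
while the DOM version builds the whole document before it can be
queried.

```python
import xml.sax
from xml.dom import minidom

# Illustrative input only; not taken from any real feed.
FEED = "<feed><entry>one</entry><entry>two</entry></feed>"


class EntryCounter(xml.sax.ContentHandler):
    """SAX: react to parse events as they arrive; no tree is kept."""

    def __init__(self):
        super().__init__()
        self.entries = 0

    def startElement(self, name, attrs):
        if name == "entry":
            self.entries += 1


def count_with_sax(text):
    handler = EntryCounter()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.entries


def count_with_dom(text):
    # DOM: the whole document is built in memory first, then navigated.
    document = minidom.parseString(text)
    return len(document.getElementsByTagName("entry"))
```

Both functions read XML; neither of them writes any, which is why SAX
has little to say about output.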

The biggest problem with any XML output serialiser that wants to ensure
the document is well formed before providing it is that it just doesn't
scale well.  Especially not in a web context.

The memory usage of a DOM model of the page, and the delay introduced
before sending any content, just don't seem worth it for the user when
you consider that all it can do is stop you sending the invalid XML; it
can't actually fix the problem.
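
A rough sketch of the trade-off, again in Python with hypothetical
function names: the buffered version cannot hand over a single byte
until the whole tree exists, while the streaming version can send the
first chunk immediately but leaves correctness entirely to the
generator.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def render_buffered(titles):
    # Build the whole tree first; nothing can be sent until it is complete,
    # and the tree stays in memory for the full length of the document.
    feed = ET.Element("feed")
    for title in titles:
        ET.SubElement(feed, "entry").text = title
    return ET.tostring(feed, encoding="unicode")


def render_streaming(titles):
    # Yield each chunk as soon as it is ready. Memory use stays flat, but
    # well-formedness is now the generator's own responsibility.
    yield "<feed>"
    for title in titles:
        yield "<entry>" + escape(title) + "</entry>"
    yield "</feed>"
```

For the same input the two produce the same document; the difference is
when the first chunk is available and how much has to be held in memory
to get there.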

>> Most errors occur when users save posts and comments full of malformed
>> markup and bad character data. Building output as an XML DOM won't
>> help with that at all, because the broken input comes in as a string
>> and will need to be corrected beforehand. If that problem can be
>> solved, the class of errors that a serialiser would catch are
>> comparatively easy to handle.
> 
> The serialiser will ensure that it is well-formed, so would
> therefore strip invalid characters.
> 

The problem here is that if the output of your generator is invalid XML
then you need to fix the generator - wrapping it in a box and hiding the
fact that it doesn't work doesn't help anyone - the user still has to
fix what is being generated in order to get the output they want!
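
As one possible reading of "fix the generator", here is a sketch of
cleaning character data at the point where it is written out. It is
Python rather than WordPress code, and clean_text is a hypothetical
helper, not an existing function. It only deals with bad characters and
unescaped text; unbalanced tags in user markup would need separate
handling.

```python
import re
from xml.sax.saxutils import escape

# Characters outside the XML 1.0 Char production (most control characters,
# surrogates, U+FFFE and U+FFFF) may not appear in a document at all.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def clean_text(raw):
    """Hypothetical helper: make a user-supplied string safe as XML text."""
    # Drop characters XML forbids, then escape &, < and >.
    return escape(_INVALID_XML_CHARS.sub("", raw))
```

The generator then emits clean_text(value) wherever it writes text
content, so the output is corrected at the source rather than rejected
afterwards.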

> Using SAX would allow us to behave in similar ways as we already do.
> Tag-balancing issues would never arise with a serialiser. You're never
> going to have test suites to test everything. Something explicitly
> designed to avoid these errors would avoid them happening. There are
> literally thousands of places in WP where I can insert content that'll
> cause a fatal error.

As I have said above, SAX doesn't help. Serialisers only ensure valid
HTML in a purely technical sense.

westi
- --
Peter Westwood
http://blog.ftwr.co.uk
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGYd4hVPRdzag0AcURAuw9AKDDYtU29s+VDZDyo+CCUl4UbUyn3ACeMIcZ
amT3pKbOMecJP+DgsqdpWuk=
=sDC3
-----END PGP SIGNATURE-----

