[wp-hackers] HTML Purifier

Peter Westwood peter.westwood at ftwr.co.uk
Wed Feb 14 09:26:19 GMT 2007


On Tue, February 13, 2007 9:55 pm, Edward Z. Yang wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Matt Mullenweg wrote:
>> Andy Skelton wrote:
>>> I would love to replace KSES.
>>
>> Why? We've never found a single vulnerability in the code, which is
>> several years old.
>
> Kses is fairly resilient against XSS attacks, I'll give it that. It
> doesn't understand the HTML spec though, so that always keeps it open to
> attacks in the future.
>
> In terms of standards-compliance, kses doesn't come even close.
> Besides the bare minimum needed to prevent XSS, kses performs no
> attribute validation (<col span="foobar"> is legal), no inline CSS
> validation (WordPress does not allow inline CSS in its attribute set),
> and no nesting validation (<td>asdf</td> is legal even outside of tables).
>

kses isn't meant to perform any of these tasks. It has one job and one
job only: to filter down the HTML allowed so as to stop XSS and other
attacks.

> Kses won't check if tags are balanced: WordPress had to implement custom
> code to overcome this problem. Kses does not properly escape quotes
> outside of tags, so it's totally unusable for XML (WordPress strips tags
> and then htmlentity-izes for that use).
>
> I think these are all very compelling reasons to drop that ancient piece
> of code.
>

If we want to improve the tag balancing, then let's do that.

But as Matt said, kses has served us well and is doing the job we want it
to do just fine.

For me tag balancing (balance_tags) and tag filtering (kses) are two
separate processes - and you don't always want both.
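
To illustrate the separation, here is a minimal sketch in Python. It is
not the actual kses or WordPress tag-balancing code: the function names
filter_tags and balance_tags, the regex, and the list of void elements are
all assumptions made up for the example. The filter here also leaves
attributes untouched, so it is not a safe XSS filter on its own. The point
is only that each step can run without the other.

```python
import re

# Hypothetical, simplified tag matcher: (slash)(name)(rest of tag).
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)([^<>]*)>")

# Assumed subset of elements that never take a closing tag.
VOID = {"br", "hr", "img", "input"}


def filter_tags(html, allowed):
    """Tag filtering only: drop tags not in `allowed`, keep their text."""
    def keep_or_drop(match):
        return match.group(0) if match.group(2).lower() in allowed else ""
    return TAG_RE.sub(keep_or_drop, html)


def balance_tags(html):
    """Tag balancing only: close open elements, drop stray closing tags."""
    out, stack, pos = [], [], 0
    for match in TAG_RE.finditer(html):
        out.append(html[pos:match.start()])
        pos = match.end()
        closing, name = match.group(1), match.group(2).lower()
        self_closed = match.group(3).rstrip().endswith("/")
        if not closing:
            if name not in VOID and not self_closed:
                stack.append(name)
            out.append(match.group(0))
        elif name in stack:
            # Close everything opened after the matching start tag.
            while True:
                top = stack.pop()
                out.append(f"</{top}>")
                if top == name:
                    break
        # A closing tag with no matching start tag is silently dropped.
    out.append(html[pos:])
    out.extend(f"</{name}>" for name in reversed(stack))
    return "".join(out)


# Either step can be used alone, or the two can be chained.
filtered = filter_tags("<b>hi</b><script>x</script>", {"b"})
balanced = balance_tags("<b><i>hi</b>")
both = balance_tags(filter_tags("<p><blink>text", {"p"}))
```

A caller that only needs filtering (say, for a trusted template that is
already well formed) calls the first function, and one that only needs
balancing calls the second.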

I don't think we need super-correct (X)HTML purification in the core
either. To me it is the perfect job for a plugin: if people want it, they
can install it.

westi
-- 
Peter Westwood <peter.westwood at ftwr.co.uk>
http://blog.ftwr.co.uk

