[wp-hackers] Text Flow Again (Was: Let's whip WYSIWYG)

Michel Fortin michel.fortin at michelf.com
Thu Jun 16 03:50:04 GMT 2005


On June 15, 2005, at 18:51, Matthew Mullenweg wrote:

> I read through your article and I think there are some incorrect 
> assumptions.
>
> 1. Removing filters is a bad thing. It's not, that's why we have the 
> function.

No, it's not a bad thing. *Moving* a filter elsewhere, however, can
break other plugins. A previous version of PHP Markdown (1.0.1a), for
example, moved `wp_rel_nofollow` to `get_comment_text`. That broke any
plugin trying to remove it, since the filter was no longer in the same
place.
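
A minimal sketch of that failure mode, written as a toy Python model
rather than actual WordPress code. The registry, the hook where
`wp_rel_nofollow` is assumed to live by default, and the priority 15
are all illustrative assumptions:

```python
from collections import defaultdict

# Toy stand-in for a hook registry: hook name -> list of (priority, filter name).
hooks = defaultdict(list)

def add_filter(hook, name, priority=10):
    hooks[hook].append((priority, name))

def remove_filter(hook, name, priority=10):
    """Unregister a filter; return True only if it was found on that hook."""
    entry = (priority, name)
    if entry in hooks[hook]:
        hooks[hook].remove(entry)
        return True
    return False

# Assumed default registration (hook and priority are illustrative).
add_filter("pre_comment_content", "wp_rel_nofollow", 15)

# Plugin A moves the filter to another hook.
remove_filter("pre_comment_content", "wp_rel_nofollow", 15)
add_filter("get_comment_text", "wp_rel_nofollow", 15)

# Plugin B, loaded later, tries to remove it from where it expects it.
removed_by_b = remove_filter("pre_comment_content", "wp_rel_nofollow", 15)

print(removed_by_b)               # False
print(hooks["get_comment_text"])  # [(15, 'wp_rel_nofollow')]
```

Plugin B's removal fails silently, and the filter keeps running on the
hook Plugin A moved it to.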

> 2. Balancetags is always enabled

The filter function is always called. But it does its work only if the 
"WordPress should correct invalidly nested XHTML automatically" 
checkbox is set.

> 3. We could make the_excerpt processing be filters (more hooks!)

Adding hooks may help make things more logical and symmetrical (see the 
dashed "holes" in the diagram), but it won't solve the issue by itself.

* * *

To illustrate the problem, let's say we write guidelines for those who 
want to implement an alternative syntax. The guideline could be 
summarized like this:

For Posts:

1.	Remove `wpautop` from `the_content` and `the_excerpt`.

2.	Add your custom Text-to-HTML filter in `the_content` and
	`get_the_excerpt` with priority 6 (so that it is the first filter).

3.	Remove `balance_tags` from `content_save_pre` and `excerpt_save_pre`,
	since it may tamper with your custom syntax.

4.	If still useful after conversion from your custom syntax: add
	`balance_tags` to `the_content` and `get_the_excerpt` with priority 8
	(so that it runs after your custom filter but before
	`wp_trim_excerpt`).

5.	Add a filter that will wrap text inside a `<p>` tag in `the_excerpt`,
	but only when said tags are missing. This is because `the_excerpt`
	must return text wrapped in paragraphs, but when the text comes from
	the auto-generated trimmed post, paragraph tags are not present.

6.	Add a filter that will remove any `<p>` tag in `the_excerpt_rss`.
	This is because `the_excerpt_rss` must return text not wrapped in
	paragraphs, while text coming from `get_the_excerpt` contains `<p>`
	tags once it has passed through the custom filter added at step 2.
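
Steps 1, 2 and 6 above can be sketched with a toy priority-ordered
filter chain in Python. This is not WordPress code: the registry,
`custom_syntax`, `strip_paragraphs`, and the simplified `wpautop` are
hypothetical stand-ins, and the default registration is an assumption:

```python
import re
from collections import defaultdict

hooks = defaultdict(list)  # hook name -> list of (priority, function)

def add_filter(hook, func, priority=10):
    hooks[hook].append((priority, func))

def remove_filter(hook, func, priority=10):
    if (priority, func) in hooks[hook]:
        hooks[hook].remove((priority, func))

def apply_filters(hook, text):
    # Lower priority runs first; sorted() is stable for equal priorities.
    for _, func in sorted(hooks[hook], key=lambda entry: entry[0]):
        text = func(text)
    return text

def wrap_paragraphs(text):
    return "".join(f"<p>{p}</p>" for p in text.split("\n\n"))

def wpautop(text):  # simplified stand-in for the real filter
    return wrap_paragraphs(text)

def custom_syntax(text):  # hypothetical converter: *word* -> <em>word</em>
    return wrap_paragraphs(re.sub(r"\*(.+?)\*", r"<em>\1</em>", text))

def strip_paragraphs(text):
    return re.sub(r"</?p>", "", text)

add_filter("the_content", wpautop)                 # assumed default setup

remove_filter("the_content", wpautop)              # step 1
add_filter("the_content", custom_syntax, 6)        # step 2
add_filter("get_the_excerpt", custom_syntax, 6)    # step 2
add_filter("the_excerpt_rss", strip_paragraphs)    # step 6

def the_content(raw):
    return apply_filters("the_content", raw)

def the_excerpt_rss(raw):
    return apply_filters("the_excerpt_rss", apply_filters("get_the_excerpt", raw))

print(the_content("Hello *world*"))      # <p>Hello <em>world</em></p>
print(the_excerpt_rss("Hello *world*"))  # Hello <em>world</em>
```

The RSS excerpt needs its own clean-up filter only because the custom
converter has already added paragraph tags earlier in the chain.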

For Comments:

1.	Remove `wpautop` from `comment_text`.

2.	Add your custom Text-to-HTML filter to `pre_comment_content` at
	priority 6 (so that it runs before all other filters that expect
	HTML).

3.	Find a way to make `wp_filter_kses` not strip `<p>` tags (and maybe
	others, if needed). [...]

4.	If there are already comments in the database that were saved in
	your custom format (because the filter was previously applied at
	display time), these comments still need to be filtered at display
	time. [...]


These guidelines reflect exactly what I have done for the Markdown
plugin. (I left out some implementation details in the last two points,
however.)

In my weblog entry, when I say in the first paragraph that "[the text 
flow] is very powerful but badly adapted to a writing syntax different 
from HTML", it relates to my experience with other applications. I 
haven't seen anything near this level of complexity in any other weblog 
or CMS software. In fact, many of them only need the plugin to tell
them what function to use to do the filtering; they handle everything
from there.

WordPress does not need to be "like the others", but it should be 
better than it is currently.

> Then you talk for several paragraphs about things that were fixed in 
> 1.5. 1.5 has been downloaded over 300,000 times, I think it's a safe 
> baseline and many other plugins require it.

It doesn't have much to do with WP 1.5. Previous versions of Markdown
filtered the content at display time, so older comments in the database
are saved not in HTML but in Markdown format, and still need to be
filtered. (I'm talking about comments posted prior to installing PHP
Markdown 1.0.1b, not WordPress 1.5.)

> All comment processing should happen before things are saved to the DB.

Yet `wpautop` is still done at display time. ;-)

> I would be happy to incorporate changes for the text flow to be more 
> accommodating to alternate syntaxes, however besides changing how
> the_excerpt processes it's not clear to me what are some specific 
> things we could do to make things easier.

Changing `the_excerpt` could help, but what kind of change? I'm
beginning to think that what would help the most is to put `wpautop`
first in the chain everywhere: content, excerpts, comments. `wpautop`
does something Markdown does: it adds paragraphs. But Markdown must
also run before anything else. After such a change, it would be
possible to simply replace `wpautop` with another filter to handle
custom syntax conversion to HTML without hassle. (I'm not arguing that
it should run prior to `balanceTags`, however. Removing or moving that
filter isn't really hard or complicated.)

(Explanation: Moving `wpautop` first in `get_the_excerpt` will bring
the same issues Markdown had for excerpts and comments... which means
WordPress will now need to address those issues itself, instead of
leaving them to plugins.)
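
To make the idea concrete, here is a toy Python sketch of a chain where
the paragraph filter runs first and a plugin simply swaps it out. It is
an illustration under assumptions, not WordPress code: `markdown_like`,
`texturize`, `replace_filter`, and the simplified `wpautop` are
hypothetical stand-ins:

```python
import re

def wrap_paragraphs(text):
    return "".join(f"<p>{p}</p>" for p in text.split("\n\n"))

def wpautop(text):  # simplified stand-in: only adds paragraphs
    return wrap_paragraphs(text)

def markdown_like(text):  # hypothetical converter: emphasis plus paragraphs
    return wrap_paragraphs(re.sub(r"\*(.+?)\*", r"<em>\1</em>", text))

def texturize(text):  # stand-in for a later filter that expects HTML
    return text.replace("...", "&#8230;")

# The paragraph filter comes first; everything after it sees HTML.
default_chain = [wpautop, texturize]

def replace_filter(chain, old, new):
    """Return a copy of the chain with `old` swapped for `new` in place."""
    return [new if f is old else f for f in chain]

def run(chain, text):
    for f in chain:
        text = f(text)
    return text

custom_chain = replace_filter(default_chain, wpautop, markdown_like)

print(run(default_chain, "Wait... *really*"))  # <p>Wait&#8230; *really*</p>
print(run(custom_chain, "Wait... *really*"))   # <p>Wait&#8230; <em>really</em></p>
```

Because the replacement sits in the same first slot, the later filters
need no reordering and no priority juggling.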

Anyway, any change that simplifies the guidelines I wrote above is
a step in the right direction. The ultimate improvement would be to
provide a settable custom-format filter somewhere, with WordPress
handling all the details by itself... but I think I'm dreaming. :-)


Michel Fortin
michel.fortin at michelf.com
http://www.michelf.com/

