[wp-hackers] Shortcodes

Fri Jul 30 03:08:25 UTC 2010

> Message: 7
> Date: Fri, 30 Jul 2010 00:42:28 +0300
> From: scribu <scribu at gmail.com>
> Subject: Re: [wp-hackers] Shortcodes
> To: "wp-hackers at lists.automattic.com"
>        <wp-hackers at lists.automattic.com>
> Message-ID:
>        <AANLkTinVU_8Z_-IOuJO7UPMmlBHgzC9DcdGmCxmFf4KH at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> This looks like a big step forward.
>
> I think you should open a ticket on trac with your implementation as a
> patch, so we can discuss specifics.

Will try to get to that tomorrow.

> Message: 8
> Date: Thu, 29 Jul 2010 15:13:33 -0700
> From: "Aaron D. Campbell" <aaron at xavisys.com>
> Subject: Re: [wp-hackers] Shortcodes
> To: wp-hackers at lists.automattic.com
> Message-ID: <4C51FD0D.6030406 at xavisys.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>  Honestly it sounds great.  Can we see the code?  How does it perform
> with long posts or sites with lots of registered shortcodes (I run
> against #8553 <http://core.trac.wordpress.org/ticket/8553> fairly often)?

To be honest, I haven't tried it against longer posts or in a
production environment. However, I plan to use shortcodes in an
upcoming large project at work and I wasn't satisfied with the current
implementation. I expect the regex being quite a bit longer to matter
more than backtracking does.

For sites with lots of registered shortcodes, it uses the same kind of
alternation trick over the shortcode names as the current
implementation does. If that is currently hurting the efficiency of
the regexp, it can easily be modified to match a generic name pattern
instead and then, after parsing, swallow or spit out any shortcodes
that weren't registered.
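
To make the two options concrete, here is a minimal sketch in Python
rather than PHP. It is not the code from my implementation: the
registry contents, the function names and the simplified tag syntax
are all made up for illustration.

```python
import re

# Hypothetical registry of shortcode names (illustrative only).
REGISTERED = {"gallery", "caption", "embed"}


def alternation_pattern(names):
    # Option 1: one alternation branch per registered name, so the
    # pattern grows with the number of registered shortcodes.
    branches = "|".join(re.escape(n) for n in sorted(names))
    return re.compile(r"\[(" + branches + r")(?![\w-])([^\]]*)\]")


# Option 2: a generic name pattern whose size does not depend on the
# registry; it matches any tag-like name, registered or not.
GENERIC = re.compile(r"\[([\w-]+)(?![\w-])([^\]]*)\]")


def find_with_alternation(text, names=REGISTERED):
    return [m.group(1) for m in alternation_pattern(names).finditer(text)]


def find_with_generic(text, names=REGISTERED):
    # Match everything first, then drop names that were never registered.
    return [m.group(1) for m in GENERIC.finditer(text) if m.group(1) in names]
```

Both functions return the same names for the same input. The second
one trades a longer pattern for a set lookup on every match.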

One thing to note is that this is not context free: the patterns for
the deeper levels of context are included in the upper-level regexps.
Once an upper-level expression has matched, they are evaluated again
to parse the subexpressions (and so on), so there's a bit of
repetition going on there. An easy way out would be to minimize
context depth. I don't know how much the repetition will impact
performance, but otherwise regular expressions won't do much good for
context-based parsing.
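
A rough illustration of that repetition, again as a Python sketch
under my own assumptions (a toy name="value" attribute syntax and
hypothetical names, not the actual patterns): the outer tag pattern
embeds the attribute pattern so that it only matches well-formed tags,
and the same attribute grammar is then applied a second time to the
captured text.

```python
import re

# Hypothetical attribute grammar: name="value".
ATTR = r'(\w+)="([^"]*)"'    # capturing form, used in the second pass
ATTR_NC = r'\w+="[^"]*"'     # same grammar without groups, embedded in TAG

# The upper-level pattern includes the lower-level one.
TAG = re.compile(r'\[(\w+)((?:\s+' + ATTR_NC + r')*)\s*\]')


def parse(text):
    result = []
    for m in TAG.finditer(text):
        # Second pass: the attribute text already matched by TAG is
        # scanned again to extract the individual name/value pairs.
        attrs = dict(re.findall(ATTR, m.group(2)))
        result.append((m.group(1), attrs))
    return result
```

Every extra level of nesting adds another pass of this kind, which is
why keeping the context depth small limits the repeated work.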

As for the source, there are many smaller functions rather than a few
longer ones, which I expect will make it easier to work with.