[wp-hackers] Howto exclude capital words (within URL address) from being searched and replaced?

Red Foot Web Design Wordpress wordpress at redfootwebdesign.com
Thu Jan 22 14:45:00 GMT 2009


The problem stems from your use of \b (word boundaries). The regex treats
"GANDHI" in "http://somesite.com/2007/04/i-worship-GANDHI/" as a whole word
because it sits between two boundaries: the "-" before it and the "/" after
it are both non-word characters, while "G" and "I" are word characters.
Probably the only way to get away from this is to not use word boundaries.
With word boundaries set up like this, the pattern never considers whether
or not the text is wrapped in an <a>, so it is difficult to prevent this
from happening.
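
To make the boundary behaviour concrete, here is a small sketch that runs the
plugin's original pattern against the example URL on its own. The variable
names ($pattern, $url, $found, $broken) are only for illustration and are not
part of the plugin.

```php
<?php
// The plugin's original pattern: an all-caps token between two word boundaries.
$pattern = '/\b([A-Z][A-Z0-9]{2,})\b/';
$url = 'http://somesite.com/2007/04/i-worship-GANDHI/';

// "-" and "/" are non-word characters, so \b matches on both sides of GANDHI.
preg_match_all($pattern, $url, $matches);
$found = $matches[1];

// The same replacement the plugin performs, applied to the bare URL.
$broken = preg_replace($pattern, '<abbr>$1</abbr>', $url);
```

Here $found holds the single match "GANDHI", and $broken is the damaged URL
from the original report.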

You may consider using this regex:

([A-Z][A-Z0-9]{2,})(\s|$)

This will match a capitalized word that is followed by a whitespace
character or by the end of the text (or the end of a line, if the m modifier
is added). It will not match the URL above, but it should still find your
capitalized words wherever they are followed by whitespace.
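
As a rough sketch, the pattern could be dropped into the plugin like this.
The function name smallcaps_without_boundaries is hypothetical, and putting
$2 back in the replacement is my assumption: the second group captures the
trailing whitespace, which would otherwise be swallowed.

```php
<?php
// Hypothetical helper name; the pattern is the one suggested above.
function smallcaps_without_boundaries($text) {
    // Group 1: the all-caps word. Group 2: the whitespace (or end of text) after it.
    $search = '/([A-Z][A-Z0-9]{2,})(\s|$)/';
    // Assumption: re-emit $2 so the space after the word is preserved.
    $replace = '<abbr>$1</abbr>$2';
    return preg_replace($search, $replace, $text);
}

// An ordinary sentence: NASA is followed by a space, so it is wrapped.
$plain = smallcaps_without_boundaries('I read about NASA today');

// The example URL: GANDHI is followed by "/", so nothing is replaced.
$link = smallcaps_without_boundaries('http://somesite.com/2007/04/i-worship-GANDHI/');
```

Two trade-offs to keep in mind: a capitalized word followed directly by
punctuation (for example "NASA.") is no longer matched, and because there is
no leading boundary, the capitalized tail of a mixed-case word can match.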

Lew

On Thu, Jan 22, 2009 at 7:40 AM, Chetan Kunte <ckunte at gmail.com> wrote:

> Hi -
>
> I was wondering if any of you could help me with a unique problem I
> have with regard to a plugin I wrote some time ago:
> http://wordpress.org/extend/plugins/small-caps/
>
> This plugin takes capitalized words, and turns them into small caps
> using the <abbr> html tag. ( More here:
> http://ckunte.com/archives/small-caps ). Please see the basic plugin
> code below.
>
> -- start of code --
>
> function ckunte_smallcaps($text) {
>        $search = "/\b([A-Z][A-Z0-9]{2,})\b/";
>        $replace = "<abbr>$1</abbr>";
>        $text = preg_replace($search,$replace,$text);
>        return $text;
> }
> //
> if (function_exists('add_filter')) {
>        add_filter('the_content', 'ckunte_smallcaps');
> }
>
> -- end of code --
>
> This code works well, with one exception and an unintended consequence.
> For example, if a URL contains a capitalized word, as in the example
> below,
>
> http://somesite.com/2007/04/i-worship-GANDHI/
>
> then this plugin does, yes you guessed it:
>
> http://somesite.com/2007/04/i-worship-<abbr>GANDHI</abbr>/
>
> This potentially breaks the link.
>
> Could anyone suggest how I could escape the words inside an <a> tag, please?
>
> Grateful for your help,
> --
> Chetan, ckunte.com
> _______________________________________________
> wp-hackers mailing list
> wp-hackers at lists.automattic.com
> http://lists.automattic.com/mailman/listinfo/wp-hackers
>

