[wp-hackers] Howto exclude capital words (within URL address) from being searched and replaced?

Ozh ozh at planetozh.com
Thu Jan 22 15:31:11 GMT 2009

> Could anyone suggest how I could escape the words inside <a> tag please?

Most likely, you want to exclude anything inside tags, not just <a> (so a 
tricky regexp that only finds URLs won't be enough). For instance, you don't 
want to touch <em class="BOLD"> or anything else inside a tag.

The following is untested but should be a decent start:

--- code ---

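add_filter('the_content', 'ckunte_smallcaps_parse');

function ckunte_smallcaps_parse($content) {
  // Split on HTML tags; DELIM_CAPTURE keeps the tags themselves in the result
  $split_on_html_tags = '/(<[^>]+>)/';
  $content_split = preg_split($split_on_html_tags, $content, -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

  foreach ($content_split as $num=>$chunk) {
    // Only text chunks (not starting with "<") go through the smallcaps function
    if (substr($chunk, 0, 1) != '<') {
      $content_split[$num] = ckunte_smallcaps($chunk);
    }
  }

  // Glue tags and processed text back together
  $content = join('', $content_split);

  return $content;
}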

--- /code ---

with your original ckunte_smallcaps function.


We split everything on HTML tags (the (<[^>]+>) regexp), so
<anytag stuff="this">some text inside tags</anytag>
becomes
array('<anytag stuff="this">', 'some text inside tags', '</anytag>');
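
To see the split on its own, here is a small standalone sketch. It assumes 
the PREG_SPLIT_DELIM_CAPTURE flag (keeps the tags in the result) and the 
PREG_SPLIT_NO_EMPTY flag (drops the empty strings before the first tag and 
after the last one). The helper name split_on_tags is made up for 
illustration.

```php
<?php
// Hypothetical helper, for illustration only: split HTML into tags and text.
function split_on_tags($html) {
    // The parentheses in the pattern capture each tag, and
    // PREG_SPLIT_DELIM_CAPTURE returns those captures as array items.
    // PREG_SPLIT_NO_EMPTY removes empty strings from the result.
    return preg_split('/(<[^>]+>)/', $html, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
}

$parts = split_on_tags('<anytag stuff="this">some text inside tags</anytag>');
print_r($parts);
```

Every item that starts with "<" is a tag, and everything else is plain text 
that is safe to hand to the smallcaps function.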

Then, for each item in this array, if it's not a tag (i.e. its first char is 
not "<"), we pass it through your smallcaps function.

At the end, glue everything back together and return it.

This is still not perfect, because you won't want to touch what's between 
<script> and </script>, or <style> and </style>, for instance.
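
One possible way to handle that, also untested against real posts: remember 
when a <script> or <style> tag has been opened, and leave text chunks alone 
until the matching closing tag shows up. In the sketch below, 
demo_smallcaps is only a stand-in for your ckunte_smallcaps (which isn't 
shown in this thread), and smallcaps_parse_skipping is a made-up name.

```php
<?php
// Stand-in for the original ckunte_smallcaps(), which is not shown here.
// Assumption: it wraps runs of two or more capital letters in a span.
function demo_smallcaps($text) {
    return preg_replace('/\b[A-Z]{2,}\b/', '<span class="caps">$0</span>', $text);
}

// Hypothetical variant of the filter that also skips <script> and <style>.
function smallcaps_parse_skipping($content) {
    $chunks = preg_split('/(<[^>]+>)/', $content, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    $skip_until = ''; // element we are currently inside, '' when not skipping

    foreach ($chunks as $num => $chunk) {
        if (substr($chunk, 0, 1) == '<') {
            if ($skip_until == ''
                && preg_match('/^<(script|style)(?=[\s>])/i', $chunk, $m)) {
                // Opening <script> or <style>: start skipping text chunks
                $skip_until = strtolower($m[1]);
            } elseif ($skip_until != ''
                && preg_match('/^<\/' . $skip_until . '\s*>/i', $chunk)) {
                // Matching closing tag: resume normal processing
                $skip_until = '';
            }
            continue; // tags themselves are never modified
        }
        if ($skip_until == '') {
            $chunks[$num] = demo_smallcaps($chunk);
        }
    }

    return join('', $chunks);
}

$html = '<p>NASA <a href="http://EXAMPLE.com/FOO">link</a></p>'
      . '<script>var API = 1;</script>';
echo smallcaps_parse_skipping($html), "\n";
```

Here only "NASA" gets wrapped. The URL inside the <a> tag and the "API" 
inside the script are left alone. Things like a self-closing <script /> 
would still confuse this, so treat it as a starting point.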

Hope that helps.

