[wp-hackers] Best functions in WP for bulk MySQL processing?

Mike Schinkel mikeschinkel at newclarity.net
Sun Mar 14 11:13:42 UTC 2010


On Mar 13, 2010, at 11:33 PM, Mike Schinkel wrote:
> I'm writing a tool to help automatically move from one domain to another.  To do this I need to process through all post_content and post_excerpt fields and change any image locations that reference the old domain with the new domain (there are other things I need to do but I'm taking baby steps first.)
> 
> As far as I can tell WordPress' database functions like $wpdb->query() load all posts into memory which won't work for a really large database; it would exhaust memory.  While I can easily write SQL code using the mysql_* functions I'd prefer to do it using WordPress functions.  Are there functions available to use for bulk processing, one I've just not found yet?

Since I need this ASAP I went ahead and decided to implement something and the following is what I came up with.  I decided to use SQL to directly load all post IDs. I figure that is unlikely to ever breaks (post records in MySQL will probably always have IDs, right?) and it's very lightweight to load even 10K IDs (if I ever have to use for a larger database, well I'll cross that bridge then.)  Then I load each post and look for URLs with the old domain and replace with the new domain (fyi, my old and new domains are like "example.dev" which is local on my machine and "example.com" which is on the live web.) What I don't like is having to load every post but it's not really that bad so I'll live with it.

I'm posting this 1.) to see if anyone has any better ways to do this and/or pitfalls I've not recognized, and 2.) if there is no better way and no pitfalls then I post to help others needing to do the same.

Thanks in advance for looking at it.

function switch_domain_in_posts($old,$new) {
	global $wpdb;
	$posts = $wpdb->get_results('SELECT ID from wp_posts');
	foreach($posts as $post) {
		$post = get_post($post->ID);
		$updated = false;
		if (find_domains_in_content($post->post_content,$old)) {
			$post->post_content = replace_domains_in_content($post->post_content,$old,$new);
			$updated = true;
		}
		if (find_domains_in_content($post->post_excerpt,$old)) {
			$post->post_excerpt = replace_domains_in_content($post->post_excerpt,$old,$new);
			$updated = true;
		}
		if (find_domains_in_content($post->guid,$old)) {
			$post->guid = replace_domains_in_content($post->guid,$old,$new);
			$updated = true;
		}
		if ($updated)
			wp_update_post($post);
	}
}
function find_domains_in_content($content,$old) {
	return preg_match('#(https?://)' . str_replace('.','\.',$old) . '#',$content,$m);
}
function replace_domains_in_content($content,$old,$new) {
	$content = preg_replace('#(https?://)' . str_replace('.','\.',$old) . '#',"$1$new",$content);
	return $content;
}

-Mike



More information about the wp-hackers mailing list