[wp-hackers] Pull Latest Post from Each of 6 Categories

Austin Matzko if.website at gmail.com
Tue May 5 06:26:37 GMT 2009


On Mon, May 4, 2009 at 11:57 PM, Mike Schinkel
<mikeschinkel at newclarity.net> wrote:
> "Austin Matzko" <if.website at gmail.com> wrote:
> I would do something like this, where 1-6 are the category ids:
>>
>> $post_ids = array(0);
>> foreach( array(1, 2, 3, 4, 5, 6) as $cat_id ) {
>>         if ( $posts = get_posts(array('cat' => $cat_id, 'showposts' => 1)) ) {
>>                 $first = array_shift($posts);
>>                 $post_ids[] = $first->ID;
>>         }
>> }
>> query_posts(array('post__in' => $post_ids));
>
> How is that better?  It's 14 SQL queries instead of Nathan's 12 or instead of Paul's 2 (2x queries because SQL_CALC_FOUND_ROWS is called first for each query.)

First, what I suggested has no advantage over what Nathan's already
doing, except a trivial one: it lets you use the regular Loop.  I am
arguing against using MySQL directly, not suggesting that my way is
better than what he's already doing.

Second, if Nathan weren't using WP-Super-Cache I might agree with you.
 But since he is, I think using the WordPress API is going to be less
system-intensive.  Most WordPress performance problems occur at the
database level.  By using the WordPress API and caching, he can
dramatically reduce the total number of database queries and therefore
overall system load.

> Sometimes the medicine can be worse than the ailment, no? If his site is too slow it's better to use direct SQL to resolve the issues.

Reducing the role of the database is usually the best medicine for WP
performance ills.

>> and take advantage of the built-in caching.
>
> Can you clarify this?  I might be missing something.

When you query posts using the WordPress API, your caching system
(such as WP-Super-Cache) gets each post's data in a standard way that
it can use beyond the immediate query.

>> I would recommend against approaches like this one.  For one
>> thing, this subquery is more or less the same query as done
>> by the main WordPress query that would run if you used the
>> WordPress API instead
>
> How is that really a problem?

My point in those lines is that the example direct query is not an
*advantage.*  I mentioned the problems after that.

> Besides, if for a plugin many plugins require v5.0, what's really wrong with a plugin requiring v4.1? He could code in a slow way for MySQL v4.0 and do it the fast way for v4.1; version issue solved.

That assumes two things I doubt: that subqueries in general perform
significantly better and that any such queries are better than cached
WordPress API calls.

>> But unlike using the WordPress API, a direct MySQL query like
>> this has several disadvantages.
>> * It might not be cached in a manner consistent with the rest of
>> WordPress.
>
> Can you clarify this with examples?  I might be missing something.

WP_Query caches each queried post's data, so it's available when
requested elsewhere.  In contrast, caching a direct query's results
with an arbitrary key (as in Paul's example) will not cache the
individual posts in a manner that WP can use. So it means that WP will
have to make more queries to get that data.

>> * It's vulnerable to future database changes.
>
> Nathan can write code to anticipate this and fall back in case of an issue.

I don't know how that's possible using the direct database queries in
this thread.  For example, in WordPress 2.2 we could use query_posts,
WP_Query, and get_posts, but there were no terms or taxonomies tables,
so direct queries like this would have failed completely.  Even more
recently things like the post_status and post_type field values have
changed.  The API changes less.

But to be clear, I'm not saying never directly query the database;
just don't do it when you don't need to.

>>  * It's more likely to contain inadvertent missteps.  For
>> example, see this line:
>
> I don't see this as valid criticism; you can just as easily call get_posts() incorrectly; both the WPDB scheme and the get_posts() calling interface are filled with idiosyncrasies.

The difference is that get_posts() takes care of the defaults.  In my
example, I passed two arguments to get_posts(): the category id and
the number of results I wanted.  To make a similar MySQL query
directly, in addition to those I would have to specify post date,
order, type, status, term taxonomy, and probably something else I'm
not thinking of.  That means more opportunities to mess up something.


More information about the wp-hackers mailing list