[buddypress-trac] [BuddyPress Trac] #5349: Activity query refactor for improved performance
buddypress-trac
noreply at wordpress.org
Sun Jan 26 15:51:56 UTC 2014
#5349: Activity query refactor for improved performance
-------------------------------------+------------------
Reporter: boonebgorges | Owner:
Type: task | Status: new
Priority: normal | Milestone: 2.0
Component: Activity | Version:
Severity: normal | Resolution:
Keywords: has-patch needs-testing |
-------------------------------------+------------------
Changes (by boonebgorges):
* keywords: => has-patch needs-testing
Comment:
It turned out that this strategy was easier to implement than I'd
anticipated, and had a bigger impact on performance than I'd anticipated.
See 5349.patch.
First: implementation details.
1. 'bp_use_legacy_activity_query' filter allows you to continue to use the
old query format
2. To fetch user data, I'm using `BP_User_Query` instead of a direct query
to the users table. The performance differences are minimal. My technique
will allow us to implement caching in BP_User_Query and have it trickle
down.
3. Contrary to what I say in the ticket description above, I didn't have
to break backward compatibility with any filters. Hooray!
4. Unit tests are passing.
Second: performance implications. I've got a test database with 153 user
accounts and 105558 activity items.
Legacy query:
{{{
SELECT DISTINCT a.*, u.user_email, u.user_nicename, u.user_login,
u.display_name FROM
wp_bp_activity a LEFT JOIN wp_users u ON a.user_id = u.ID WHERE a.is_spam
= 0 AND
a.hide_sitewide = 0 AND a.type != 'activity_comment' ORDER BY
a.date_recorded DESC LIMIT 0, 20
(1,096.2ms)
}}}
New, split query:
{{{
SELECT DISTINCT a.id FROM wp_bp_activity a WHERE a.is_spam = 0 AND
a.hide_sitewide = 0 AND
a.type != 'activity_comment' ORDER BY a.date_recorded DESC LIMIT 0, 20
(0.3ms)
SELECT * FROM wp_bp_activity WHERE id IN
(105616,105615,105614,105613,105612,105611,105610,105609,105608,105607,105606,105605,
105604,105603,105602,105601,105600,105599,105598,105597) ORDER BY FIELD(
id,
105616,105615,105614,105613,105612,105611,105610,105609,105608,105607,105606,105605,
105604,105603,105602,105601,105600,105599,105598,105597 )
(0.4ms)
SELECT
wp_users.ID,wp_users.user_registered,wp_users.user_login,wp_users.user_nicename,
wp_users.display_name,wp_users.user_email FROM wp_users WHERE 1=1 AND
wp_users.ID IN
(119,40,29,23,95,80,22,66,69,44,42,148,52,26,97,63,37,125,58) ORDER BY
user_login ASC
(0.3ms)
}}}
As you can see, we've gone from 1 query that takes over a second to 3
queries that take just one millisecond. A 1000x improvement. (And note
that this does not take into account any to-be-built caching mechanisms.
With proper caching, the second and third queries could be cut down or
avoided altogether in many cases.)
Feedback welcome. Since this doesn't break any filters, I don't think
there are any backward compatibility concerns, and I think we should
commit right away to get some testing done.
--
Ticket URL: <https://buddypress.trac.wordpress.org/ticket/5349#comment:3>
BuddyPress Trac <http://buddypress.org/>
BuddyPress Trac
More information about the buddypress-trac
mailing list