[buddypress-trac] [BuddyPress Trac] #5349: Activity query refactor for improved performance

buddypress-trac noreply at wordpress.org
Sun Jan 26 15:51:56 UTC 2014


#5349: Activity query refactor for improved performance
-------------------------------------+------------------
 Reporter:  boonebgorges             |       Owner:
     Type:  task                     |      Status:  new
 Priority:  normal                   |   Milestone:  2.0
Component:  Activity                 |     Version:
 Severity:  normal                   |  Resolution:
 Keywords:  has-patch needs-testing  |
-------------------------------------+------------------
Changes (by boonebgorges):

 * keywords:   => has-patch needs-testing


Comment:

 It turned out that this strategy was easier to implement than I'd
 anticipated, and had a bigger impact on performance than I'd anticipated.
 See 5349.patch.

 First: implementation details.

 1. 'bp_use_legacy_activity_query' filter allows you to continue to use the
 old query format
 2. To fetch user data, I'm using `BP_User_Query` instead of a direct query
 to the users table. The performance differences are minimal. My technique
 will allow us to implement caching in BP_User_Query and have it trickle
 down.
 3. Contrary to what I say in the ticket description above, I didn't have
 to break backward compatibility with any filters. Hooray!
 4. Unit tests are passing.

 Second: performance implications. I've got a test database with 153 user
 accounts and 105558 activity items.

 Legacy query:

 {{{
 SELECT DISTINCT a.*, u.user_email, u.user_nicename, u.user_login,
 u.display_name FROM
 wp_bp_activity a LEFT JOIN wp_users u ON a.user_id = u.ID WHERE a.is_spam
 = 0 AND
 a.hide_sitewide = 0 AND a.type != 'activity_comment' ORDER BY
 a.date_recorded DESC LIMIT 0, 20
 (1,096.2ms)
 }}}

 New, split query:

 {{{
 SELECT DISTINCT a.id FROM wp_bp_activity a WHERE a.is_spam = 0 AND
 a.hide_sitewide = 0 AND
 a.type != 'activity_comment' ORDER BY a.date_recorded DESC LIMIT 0, 20
 (0.3ms)

 SELECT * FROM wp_bp_activity WHERE id IN
 (105616,105615,105614,105613,105612,105611,105610,105609,105608,105607,105606,105605,
 105604,105603,105602,105601,105600,105599,105598,105597) ORDER BY FIELD(
 id,
 105616,105615,105614,105613,105612,105611,105610,105609,105608,105607,105606,105605,
 105604,105603,105602,105601,105600,105599,105598,105597 )
 (0.4ms)

 SELECT
 wp_users.ID,wp_users.user_registered,wp_users.user_login,wp_users.user_nicename,
 wp_users.display_name,wp_users.user_email FROM wp_users WHERE 1=1 AND
 wp_users.ID IN
 (119,40,29,23,95,80,22,66,69,44,42,148,52,26,97,63,37,125,58) ORDER BY
 user_login ASC
  (0.3ms)
 }}}

 As you can see, we've gone from 1 query that takes over a second to 3
 queries that take just one millisecond. A 1000x improvement. (And note
 that this does not take into account any to-be-built caching mechanisms.
 With proper caching, the second and third queries could be cut down or
 avoided altogether in many cases.)

 Feedback welcome. Since this doesn't break any filters, I don't think
 there are any backward compatibility concerns, and I think we should
 commit right away to get some testing done.

--
Ticket URL: <https://buddypress.trac.wordpress.org/ticket/5349#comment:3>
BuddyPress Trac <http://buddypress.org/>
BuddyPress Trac


More information about the buddypress-trac mailing list