[buddypress-trac] [BuddyPress Trac] #6327: Improved caching for group membership
noreply at wordpress.org
Tue Mar 31 18:55:29 UTC 2015
#6327: Improved caching for group membership
Reporter: boonebgorges | Owner:
Type: defect (bug) | Status: new
Priority: normal | Milestone: 2.3
Component: Component - Groups | Version:
Severity: normal | Resolution:
Keywords: has-patch |
Comment (by boonebgorges):
Thank you very much for the feedback, DJPaul.
> The new actions in class-bp-groups-member.php ought to be committed
> separately (as I'm sure you know/were going to do).

> I agree caching IDs and the objects individually is probably better, and
> it feels like it's worth the time investment to get it right.
On reflection, I agree. [attachment:6327.2.patch] makes the necessary
changes. I've made further changes to the caching schema to match BP/WP's
general caching strategies even better. See the new global cache groups:
'bp_groups_memberships_for_user' (caches arrays of membership IDs on a
per-user basis) and 'bp_groups_memberships' (caches membership objects on
a per-membership-ID basis).
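
The split between an ID cache and an object cache described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the code from the patch: only the two cache group names come from the comment. The in-memory `cache` dict, the `MEMBERSHIPS_TABLE` data, and every function name are hypothetical stand-ins.

```python
# Hypothetical in-memory stand-in for a persistent object cache.
# Only the two group names are taken from the ticket comment.
cache = {
    "bp_groups_memberships_for_user": {},  # user_id -> list of membership IDs
    "bp_groups_memberships": {},           # membership_id -> membership record
}

# Hypothetical stand-in for the group memberships database table.
MEMBERSHIPS_TABLE = {
    1: {"id": 1, "user_id": 10, "group_id": 100, "is_admin": False},
    2: {"id": 2, "user_id": 10, "group_id": 101, "is_admin": True},
    3: {"id": 3, "user_id": 11, "group_id": 100, "is_admin": False},
}

queries = []  # records every simulated database query


def query_membership_ids(user_id):
    """Simulated ID-only query for one user."""
    queries.append(("ids", user_id))
    return sorted(
        m["id"] for m in MEMBERSHIPS_TABLE.values() if m["user_id"] == user_id
    )


def query_memberships(ids):
    """Simulated query that fetches full rows for the given IDs."""
    queries.append(("objects", tuple(ids)))
    return [dict(MEMBERSHIPS_TABLE[i]) for i in ids]


def get_memberships_for_user(user_id):
    """Resolve IDs from the per-user cache, then objects from the per-ID cache."""
    ids_cache = cache["bp_groups_memberships_for_user"]
    obj_cache = cache["bp_groups_memberships"]

    if user_id not in ids_cache:
        ids_cache[user_id] = query_membership_ids(user_id)
    ids = ids_cache[user_id]

    # Only fetch the objects that are not cached yet.
    missing = [i for i in ids if i not in obj_cache]
    if missing:
        for membership in query_memberships(missing):
            obj_cache[membership["id"]] = membership

    return [obj_cache[i] for i in ids]


def invalidate_membership(membership_id, user_id):
    """Assumed invalidation rule: drop the object and the owner's ID list."""
    cache["bp_groups_memberships"].pop(membership_id, None)
    cache["bp_groups_memberships_for_user"].pop(user_id, None)
```

With this layout, changing one membership only forces a refetch of that one object plus the owning user's ID list; the other cached objects stay valid.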
> I don't quite understand enough why, for this, we would want to select
> everything and do the sorting in PHP, and everywhere else, do it across a
> few SQL queries. Would the "cache IDs and individual objects" approach
> mean changing this and doing in SQL?
Doing it in a few SQL queries is much easier than getting the caching
right :) Now that 2.patch implements a split cache strategy (cache the
IDs, then cache each individual object), we could theoretically choose not
to cache IDs at all, but that would negate most of the benefit of the
caching. So, if we are going to cache the ID query, we have to consider
the fact that there are different ways that the items might be requested,
based on the parameters. There are two general strategies for this:
a. Cache the maximal query, and then do sorting/filtering in PHP. That's
what I'm doing in this patch.
b. Generate a cache key based on the parameters, and then cache the IDs
for that combination of parameters. This is closer to what WP does in,
say, `get_terms()`: https://core.trac.wordpress.org/browser/tags/4.1.1/src
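
A minimal sketch of the two strategies listed above, written in Python for illustration only. The row data, the filter names (`is_admin`, `is_confirmed`), the key formats, and the function names are all assumptions made for this example; neither function is the patch's code or WP's actual `get_terms()` implementation.

```python
import json

cache = {}       # hypothetical stand-in for a persistent object cache
db_queries = []  # records every simulated database query

# Hypothetical rows: one user's group memberships.
ROWS = [
    {"group_id": 1, "is_admin": True, "is_confirmed": True},
    {"group_id": 2, "is_admin": False, "is_confirmed": True},
    {"group_id": 3, "is_admin": False, "is_confirmed": False},
]


def matches(row, filters):
    """True when the row satisfies every filter."""
    return all(row[name] == value for name, value in filters.items())


def run_query(user_id, **filters):
    """Simulated SQL query; the filtering here stands in for a WHERE clause."""
    db_queries.append((user_id, tuple(sorted(filters.items()))))
    return [row for row in ROWS if matches(row, filters)]


def strategy_a(user_id, **filters):
    """(a) Cache the maximal result once per user, then filter in application code."""
    key = f"user:{user_id}"
    if key not in cache:
        cache[key] = run_query(user_id)  # no filters: the maximal query
    return [row for row in cache[key] if matches(row, filters)]


def strategy_b(user_id, **filters):
    """(b) Build the cache key from the parameters and let the database filter."""
    key = f"user:{user_id}:" + json.dumps(filters, sort_keys=True)
    if key not in cache:
        cache[key] = run_query(user_id, **filters)
    return cache[key]
```

Calling `strategy_a` with different filters for the same user reuses one cache entry after the first query, while `strategy_b` creates a new entry (and runs a new query) for each distinct filter combination.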
(b) is nice because it offloads all of the work to MySQL, which is faster
at filtering/sorting than PHP, especially when the number of items to
sort/filter becomes very large.
However, (b) pollutes the cache in a pretty severe way. Every possible
combination of parameter values will result in a separate cache entry. In
the case of `bp_get_user_groups()`, there are 7 parameters, each of which
has at least 2 possible values. So that's at least 2^7 = 128 possible cache
entries *for each user_id*. In practice, of course, there won't be nearly
this amount of cache pollution, but it's still significant. Perhaps more
important is that strategy (b) means that changing a parameter even
slightly results in a cache miss, which negates a good deal of the
benefit of caching in the first place.
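
The key-count arithmetic above can be checked with a short sketch. It assumes seven parameters with exactly two values each, which is the lower bound from the comment; the parameter names and the key format are placeholders, not the real arguments of `bp_get_user_groups()`.

```python
from itertools import product

# Seven placeholder parameter names; the real argument names are not used here.
PARAM_NAMES = [f"param_{i}" for i in range(1, 8)]


def cache_key(user_id, args):
    """Hypothetical per-parameter-combination key, as in strategy (b)."""
    parts = [f"{name}={args[name]}" for name in sorted(args)]
    return f"user:{user_id}:" + "&".join(parts)


def all_keys_for_user(user_id):
    """Every distinct key when each parameter takes one of two values."""
    return {
        cache_key(user_id, dict(zip(PARAM_NAMES, values)))
        for values in product([False, True], repeat=len(PARAM_NAMES))
    }


print(len(all_keys_for_user(1)))  # 2 ** 7 == 128 keys for a single user
```

Flipping any single parameter yields a different key, which is why a slightly different request misses the cache under strategy (b).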
Strategy (a), on the other hand, results in a single cache key for each
user, and guarantees more cache hits. The downside is reproducing SQL-type
logic for filtering/sorting in PHP, but given the fairly small number of
parameters and the fairly small data sets we're working with here (people
are generally not members of more than a few dozen groups), I think it's a
Ticket URL: <https://buddypress.trac.wordpress.org/ticket/6327#comment:8>
BuddyPress Trac <http://buddypress.org/>