[buddypress-trac] [BuddyPress] #5210: Using DISTINCT in the select statement when retrieving activity data slows down the query
buddypress-trac
noreply at wordpress.org
Wed Oct 23 00:55:36 UTC 2013
#5210: Using DISTINCT in the select statement when retrieving activity data slows
down the query
-------------------------------+------------------------------
Reporter: kwerri | Owner:
Type: enhancement | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Activity | Version: 1.8.1
Severity: normal | Resolution:
Keywords: reporter-feedback |
-------------------------------+------------------------------
Changes (by boonebgorges):
* keywords: => reporter-feedback
Comment:
The DISTINCT keyword was added in r7318. See #5118 for background.
I'm surprised to hear that merely adding DISTINCT is forcing the creation
of temp tables. The activity tables are heavily indexed. Can you share
some more details about the queries?
- What do the full queries look like? Are there lots of WHERE clauses?
- Can you run an EXPLAIN on the query and share it?
- How big is your activity table? What's your server environment like?
(mysql version, RAM)
I wonder if we could avoid the temp table problem if we did `SELECT
DISTINCT a.id, a.*`, limiting the scope of the DISTINCT. I'm seeing
conflicting data around the web. See, eg,
https://dev.mysql.com/doc/refman/5.0/en/distinct-optimization.html, which
says that the problem may stem from coupling with `ORDER BY`; yet
http://stackoverflow.com/questions/11746062/performance-and-sorting-and-
distinct-unique-between-mysql-and-php insists that the DISTINCT logic is
far more efficient in MySQL than processing in PHP.
Any more info or thoughts you could share would be helpful. Thanks for the
report.
--
Ticket URL: <https://buddypress.trac.wordpress.org/ticket/5210#comment:1>
BuddyPress <http://buddypress.org/>
BuddyPress
More information about the buddypress-trac
mailing list