[wp-hackers] Extending query_posts?

Scott johnson fuzzygroup at gmail.com
Wed Feb 1 06:05:47 GMT 2006


Another way to handle ordering by random ness is to hash a key in table and
then sort by the hash. Since the data distribution is basically fairly even,
that actually works ok (provided you clear the table periodically).

And yes my perspective is forever warped by terabyte + size mysql stuff.
Many things to me are fraught with danger (and I bloody well love MySQL just
its damn easy to run amok with it).

One thing I'd comment on is be careful with perfomance if your server is
getting any kind of load.  Use explain on your queries and watch ram
consumption.  When you use a lot of rand stuff even w/ little data involved
my gut feeling is *be careful*.  Be very, very careful.

S

On 2/1/06, Mattias Winther <mattias at winthernet.se> wrote:
>
>
> Well, my thought is to get completely random from all posts in matching
> categor(y|ies), so I'm somewhere in between. But, since ordering by random
> works without hurting up to about 1000 hits, and considering the amount of
> posts I have, combined with the amount of posts/day I write, I figure I'll
> have a few years before this becomes an actual problem for me personally...
> And I'm guessing that once I start hitting the numbers, I will indeed choose
> to pick between more recent posts.
>
> Since I'm modifying the SQL ordering, from what I understand, the posts
> are first selected from the criteria, and then ordered according to, in my
> case, random order, which should mean that with selective query_posts-usage
> one should be able to use the simple version in a lot of cases, even with
> large number of posts, without suffering any real performance problems.
> Optimizing seems like a great idea, and once I get my new iMac, I'll have
> the hardware to test on too. :) What is the easiest way to populate a test
> instance of the database with, say, 10000 posts?
>
> On Tue, 31 Jan 2006 20:13:53 -0500, "David Chait" <davebytes at comcast.net>
> wrote:
> > If you are trying to do anything like CG-SameCat, the approach I used
> was
> > to
> > not use random in the select, but select posts (well, post ID and title
> > really...) where the category matches, ordered by post ID (or date)
> > descending, and limit of some factor time the number of random items
> asked
> > for... Thus, you want say three random, but recent, related posts -- so
> > grab
> > the last 10 or 20 and then use PHP to pick a random few from the set.
> > Just
> > a thought.
> >
> > Completely random from entire posts table != completely random from
> recent
> > posts in matching category... ;)
> >
> > -d
> >
> > ----- Original Message -----
> > From: "Mattias Winther" <mattias at winthernet.se>
> > To: <wp-hackers at lists.automattic.com>
> > Sent: Tuesday, January 31, 2006 7:12 PM
> > Subject: Re: [wp-hackers] Extending query_posts?
> >
> >
> > |
> > | Thanks, great link!
> > |
> > | Well, as for my own blog, I'm using to generate recent posts in the
> same
> > category, meaning that I'll have quite the long way to go before 1000
> hits
> > is becoming an issue. Then again, I would like to do a nice plug-in that
> > can
> > be trusted to perform at about the same speed with a ton of rows.
> > |
> > | I guess it's time to brush up on my SQL.
> > |
> > | /Mattias
> > |
> > | On Tue, 31 Jan 2006 13:36:45 -0500, "David Chait"
> > <davebytes at comcast.net>
> > wrote:
> > | > Yes, RAND is awful for ORDER BY.
> > | >
> > | > http://jan.kneschke.de/projects/mysql/order-by-rand/
> > | >
> > | > At the end, he shows times for 100, 1000, ... , 1M records.  RAND
> > starts
> > | > to
> > | > 'hurt' even when just 1000 records.  He walks through building an
> > | > alternate
> > | > approach, still in SQL (though using sub-selects, so MySQL 4.1+).
> > | >
> > | > Do you need a completely random entry from the entire data set?  Or
> > would
> > | > a
> > | > random-but-recent entry work? (i.e., select post_id limit 100, pick
> a
> > | > random
> > | > entry or entries, then do the 'full' query with post_id in {list of
> > | > selected
> > | > ids}...)
> > | >
> > | > -d
> > | >
> > | > ----- Original Message -----
> > | > From: "Mattias Winther" <mattias at winthernet.se>
> > | > To: <wp-hackers at lists.automattic.com>
> > | > Sent: Tuesday, January 31, 2006 12:27 PM
> > | > Subject: Re: [wp-hackers] Extending query_posts?
> > | >
> > | >
> > | > |
> > | > | I can't find any specific references to the performance problems
> you
> > are
> > | > referring to... Anyone else out there who recognizes this,
> preferrably
> > | > with
> > | > some kind of link for further analysis? Or, someone with a huge
> > database
> > | > that can run the numbers?
> > | > |
> > | > | /Mattias
> > | > |
> > | > | On Mon, 30 Jan 2006 08:10:03 -0500, Scott johnson
> > <fuzzygroup at gmail.com>
> > | > wrote:
> > | > | > I'd be careful with that.  My understanding of randomization in
> > mysql
> > | > | > is that the performance is awful.  I haven't done it myself but
> > that's
> > | > | > hwat I understand.
> > | > | >
> > | > | > Scott
> > | >
> > | > _______________________________________________
> > | > wp-hackers mailing list
> > | > wp-hackers at lists.automattic.com
> > | > http://lists.automattic.com/mailman/listinfo/wp-hackers
> > |
> > | _______________________________________________
> > | wp-hackers mailing list
> > | wp-hackers at lists.automattic.com
> > | http://lists.automattic.com/mailman/listinfo/wp-hackers
> >
> > _______________________________________________
> > wp-hackers mailing list
> > wp-hackers at lists.automattic.com
> > http://lists.automattic.com/mailman/listinfo/wp-hackers
>
> _______________________________________________
> wp-hackers mailing list
> wp-hackers at lists.automattic.com
> http://lists.automattic.com/mailman/listinfo/wp-hackers
>



--
-------------------------------------------------------
J. Scott Johnson
Ookles launches 2/28/06 - have you signed up yet?
new startup: http://ookles.com/
blog: http://fuzzyblog.com/
podcast: http://techwarstories.com/
fuzzygroup at gmail.com
aim: fuzzygroup
cell: 857 222 6459
-------------------------------------------------------


More information about the wp-hackers mailing list