[wp-hackers] questioning the efficiency of using custom post types
Marcus.Pope at springbox.com
Tue Nov 1 21:14:19 UTC 2011
Holly crap-tastic injection of white space by my iPhone batman!
From: wp-hackers-bounces at lists.automattic.com [mailto:wp-hackers-bounces at lists.automattic.com] On Behalf Of Marcus Pope
Sent: Tuesday, November 01, 2011 1:10 PM
To: Christopher O'Connell
Cc: wp-hackers at lists.automattic.com
Subject: Re: [wp-hackers] questioning the efficiency of using custom post types
Agreed Christopher, both on flame wars and use cases for joins. You can't make a blanket statement about joins being slower in all cases as that purely depends on how you slice the data. Row level scans can be horrifically slow. You can improve this by adding indexes on your key table columns but when those indexes become extremely large themselves the benefit is lost. Joins allow you, in many cases, to break those relationships down into more manageable chunks on several levels. So the result is a faster query. But the notion that all data should be relational is a misnomer too and has proven more troublesome (inflexible) for the data set sizes wordpress typically deals with. kvp architectures tend to support far more variations of data structures than problem domain specific relationships, so the result for a generalized community is a better platform. And given the performance of modern databases and price of modern technology it will continue to become even easier as a g
In reality wordpress is still built on those performance realizing joins, just not more than one or two join for this particular case (meta data is joined with posts and possibly taxonomies in most of the queries, just typically not between the data itself that people put in the meta table)
On Oct 30, 2011, at 2:06 PM, "Christopher O'Connell" <jwriteclub at gmail.com> wrote:
> I don't want to start a flame war, but, storing the data as key value
> pairs defeats the whole purpose of using a relational database. If you
> want a K-V store, then user a K-V store.
> In general, Wordpress' approach works, because any given post meta
> table is relatively short and/or the querying is of only modest
> complexity. However, relational schema and normalization exists for a
> reason. Yes, they're extremely complex and often overkill for a
> project. If you have large, complex data, however, using a well
> normalized database will result in significant speed improvements
> assuming that you need to perform arbitrary queries over the data.
> How's this apply to Wordpress? The K-V version of post meta doesn't
> scale past a couple hundred million entries. Once this limit is
> reached, the fact that a non-K-V data-store is being used causes the
> DB to fall over. In this case, storing metadata in something more
> suited (such as a CouchDB or Amazon SimpleDB) works far, far better.
> ~ Christopher
> P.S. Before anyone takes me to task for this post, I *don't* think
> that the WordPress core should change. As already stated for small to
> "modestly large" amounts of data it works splendidly and does not
> require developers to worry about the integrity or quality of their data model.
> On Sun, Oct 30, 2011 at 11:40, Mike Bijon <mike at etchsoftware.com> wrote:
>> One of the really non-intuitive things about relational DBs like
>> MySQL is that looking up data spread across more tables is
>> less-efficient than fewer tables with lots of rows. It got me at
>> first too, but doing JOINs just isn't fast, and more tables means
>> more For an example of how efficient long tables like WPs are, take
>> a look at Brett Taylor's documentation of FriendFeed's speedup:
>> * http://bret.appspot.com/entry/how-friendfeed-uses-mysql
>> The longer tables also give us a lot of flexibility and prevent errors.
>> It's odd until you get to know MySQL, and the coding to deal with
>> meta-table isn't simple. And, since these things are already solved
>> in WordPress, it's also why we use WP over-and-over and why a lot of
>> good devs contribute to the project.
>> For an example of flexibility:
>> If you had a table of restaurants and later on you wanted to store
>> extra info about ones that served coffee, then... In a multi-table
>> system you could add columns to the restaurants table, or add a new
>> table just for those cafes. In a meta-table system you would add a
>> meta-row and then add the rows to the data table.
>> On a live site the multi-table changes could cause locking and
>> performance issues while they're happening, and they might leave you
>> with either empty columns for non-cafes or a whole other table to
>> join in. On a live site with the meta-tables the update would be the
>> same as any other insert and performance would be almost unchanged.
>> For an example of preventing errors:
>> Let's say you make some changes on your dev server and need to apply
>> them to the live site. On a multi-table system this could be adding
>> columns or adding tables. On a meta-table system this would be adding rows.
>> While the update queries run on the live site, the multi-table system
>> could return no data when queries see columns or tables "missing"
>> because they aren't added yet. The meta-table will return the "old"
>> data, so there may be missing fields, but no errors.
>> In the event someone makes a mistake in the update process (which
>> happens way more than it should, unless you're doing test-migrations
>> and have staging, test, pre-deploy and/or other versions of the
>> site)... The multi-table system will return errors to live users up
>> until the mistake is corrected. The meta-table system just leaves a
>> few fields blank and (as long as your devs are checking whether or
>> nat a value exists) live site visitors should just see some empty fields on the live pages.
>> wp-hackers mailing list
>> wp-hackers at lists.automattic.com
> wp-hackers mailing list
> wp-hackers at lists.automattic.com
wp-hackers mailing list
wp-hackers at lists.automattic.com
More information about the wp-hackers