[wp-hackers] Meta tables: Take 5

Mike Schinkel mikeschinkel at newclarity.net
Sat Jul 25 06:48:19 UTC 2009


On Jul 25, 2009, at 1:29 AM, Otto wrote:
> I considered enums as well, but the alter table problem I think makes
> a big difference. We don't have anything doing alters except on
> upgrades at present, making it alter the table like this seems
> problematic to me.

The only time ALTER would be needed would be in the case a plugin  
needed an ENUM value that doesn't already exist in the wp_meta table  
and then only on plugin activation.

We can further minimize the need for ALTER by including ENUM values  
for all the obvious objects that might need meta, i.e. comments,  
links, categories, tags, etc.
That would make it rare that ALTER would be needed; much more rare in  
fact than upgrades which *practically* everyone will need to run.

> I'd be happier with strings. We already use strings for types
> elsewhere (comment_type is comment, ping, etc.. post_type is page,
> post, attachment...).
>
> I don't think the overhead is going to be enough to make any
> difference except in huge scales. Perhaps some testing is in order?


That comfort comes with less than trivial performance implications.

Given the index I proposed, an ENUM(2)+BIGINT(20)+4 bytes, each key is  
26 bytes long. With a 4K page filled 2/3s full[1] you get 105 records  
per index page.  That allows MySQL to retrieve record pointers for  
around 11,030 records in two index reads.  Assuming 10 meta fields per  
object, that allows you to have 1000 records and stay in two index  
page reads.  However, you can get to 1.16 million records in only  
three index page reads, or about 115,900 object records (10 meta  
records per object record.) And for 4 index reads you can get 127.6  
million meta records, or 12.17 million object records.  Not bad....

Now, let's look the option we'd choose it we didn't challenge the  
default assumptions: VARCHAR(255)+BIGINT(20)+4, or 279 bytes per index  
key, or only 9 meta records per index page, or less than 1 object  
record per index page!!! That means for two index page reads, you only  
get 95 meta records, or 9 object records.  Three index page reads  
gives you only 937 meta records, or 93 object records.  Four index  
pages reads give you 9,176 meta records, or 917 object records.  With  
five index page reads you can get up to 89,809 meta records, but only  
8980 object records.  To get to 87,899 object records you need 6 index  
page reads, and to get to over 1 million object records you have to  
have 7 index page reads.  UGH!

Holy crap!  ENUMs give you 115,900 object records in only 3 index page  
reads yet strings require 6 index page reads for only 87,899 object  
records.  That's < 1/2 the performance, and you'll find it's actually  
a lot worse than that (int indexes always perform better than string  
indexes.)

That overhead from strings makes WordPress a non-option in high  
traffic sites in the 100k record range (without *lots* of expensive  
tuning), and noticeably slow for even moderate size sites (>1000  
object records.)  And what benefit for?  All to avoid having to do an  
ALTER?  The more I research this and write about it, the more clear  
that moving to ENUMs could be a huge win for WordPress scalability  
moving forward.

My guess the reason most people are not screaming about the current  
performance is that tables that use strings like the taxonomy tables  
have so few records for most sites that it rarely becomes an issue.   
But we are not architecting WordPress for sites with 25 blog posts,  
are we?  We are trying to ensure that it can scale to be able to be  
used for even the largest sites, right?

JMTCW, anyway.

-Mike Schinkel
WordPress Custom Plugins
http://mikeschinkel.com/custom-wordpress-plugins/

P.S. BTW, I just ran came across this[2]:  "Disadvantages of Creating  
Many Tables in the Same Database": "If you have many MyISAM tables in  
the same database directory, open, close, and create operations are slow

[1] http://stackoverflow.com/questions/781873/how-to-figure-out-size-of-indexes-in-mysql/781888
[2] http://dev.mysql.com/doc/refman/5.1/en/creating-many-tables.html




More information about the wp-hackers mailing list