[wp-hackers] A "terms" table

Mark Jaquith mark.wordpress at txfx.net
Mon Apr 16 03:25:33 GMT 2007

On Apr 15, 2007, at 3:41 PM, Matt Mullenweg wrote:

> Let me do my best to make the case for putting category data and  
> tag data in separate tables, and feel free to chime in if you think  
> I've missed any points.
> * We shouldn't ship anything with a data schema people disagree on,  
> because plugins and themes will be written against it.

Correction: we shouldn't ship something with a data schema that one  
person supports and a multitude of others think is terrible.  It's  
less about the disagreement and more about the fact that the lone  
opinion was the one that decided the matter.

> * They're different things, so we should have them in different  
> tables.

This requires exposition (some of which is provided in the next  
point) and on its own seems like a bit of a straw man.

> * Tags can have things like synonyms, and don't need things like  
> hierarchy.

Plus: The monolithic taxonomy table isn't flexible.  It requires  
three *_count columns, two *_private count columns and a bitwise type  
column.  In order to allow for Link and Post hierarchies to differ,  
we'd need two *_parent columns.  *_parent and category_description  
are unusable by tags.  It's a shared apartment with very little  
communal space.
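
To make the "bitwise type column" concrete, here is a rough sketch in Python (not WordPress's PHP). The flag values and the helper name `is_type` are assumptions made up for illustration; they are not taken from the actual schema under discussion.

```python
from enum import IntFlag


class TermType(IntFlag):
    # Hypothetical bit assignments; the real column's values may differ.
    POST_CATEGORY = 1
    LINK_CATEGORY = 2
    TAG = 4


def is_type(type_column: int, wanted: TermType) -> bool:
    """Return True if the row's bitwise type column has the wanted bit set."""
    return bool(type_column & wanted)


# One shared row that is both a post category and a tag.
row_type = int(TermType.POST_CATEGORY | TermType.TAG)  # stored as 5

print(is_type(row_type, TermType.TAG))            # tag bit is set
print(is_type(row_type, TermType.LINK_CATEGORY))  # link bit is not set
```

Every reader of the table has to mask this column before it can tell what kind of row it is looking at, and the parent and description columns still sit on rows where only the tag bit is set.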

> 1. [The current code] performs faster.
> On front-end display, we have added ZERO QUERIES to support tags.  
> The query that grabs categories is also grabbing tags and we're  
> sorting them out in the code.

> A separate tag naming table and post2tag table would require at  
> least 2 additional queries and/or joins to the front page, which  
> I already think does too many queries and is too heavy.

> More importantly from a user's point of view, all that really  
> matters is that they have a box they can type tags in and that  
> their host doesn't tell them not to upgrade to 2.2 because it does  
> more queries.

I've addressed this before.  Look at UTW.  Tags don't need to add  
queries for normal post views.  Tags can and should be cached as a  
serialized array of nicename/original-name pairs in postmeta.  No  
additional queries.  The API flushes the cache as the post is updated  
or its tags are updated.  We can repopulate the cache then or  
repopulate on the fly.
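
A minimal sketch of that caching scheme, written in Python rather than PHP, with plain dicts standing in for the tag tables and for postmeta.  Everything named here (`TagStore`, `_tag_cache`, `set_post_tags`, `get_post_tags`, `slugify`) is hypothetical and is not the UTW or WordPress API.  JSON stands in for PHP serialization, and the sketch assumes postmeta is already loaded along with the post, so reading the cache costs no extra query.

```python
import json

CACHE_KEY = "_tag_cache"  # hypothetical postmeta key


def slugify(name: str) -> str:
    """Very rough stand-in for generating a tag nicename."""
    return name.strip().lower().replace(" ", "-")


class TagStore:
    def __init__(self):
        self.post_tags = {}   # post_id -> [(nicename, name)]; stands in for tag tables
        self.postmeta = {}    # (post_id, meta_key) -> serialized string
        self.tag_queries = 0  # counts simulated queries against the tag tables

    def _query_tags(self, post_id):
        # The expensive path: hits the tag tables.
        self.tag_queries += 1
        return list(self.post_tags.get(post_id, []))

    def flush(self, post_id):
        # Drop the cached pairs; they are rebuilt on the next read.
        self.postmeta.pop((post_id, CACHE_KEY), None)

    def set_post_tags(self, post_id, names):
        # Updating a post's tags flushes its cache.
        self.post_tags[post_id] = [(slugify(n), n) for n in names]
        self.flush(post_id)

    def get_post_tags(self, post_id):
        cached = self.postmeta.get((post_id, CACHE_KEY))
        if cached is not None:
            return json.loads(cached)  # cache hit: no tag query
        pairs = dict(self._query_tags(post_id))  # repopulate on the fly
        self.postmeta[(post_id, CACHE_KEY)] = json.dumps(pairs)
        return pairs
```

This version repopulates lazily on the first read after a flush.  Repopulating eagerly inside `set_post_tags` would be the other option mentioned above.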

> 3. There should be no user- or plugin-facing problems with how it's  
> currently implemented, or if we decide to change it.

The API won't be complete.  Custom queries will have to be made.   
Things WILL break.

> I do think there is something intrinsically better about shipping  
> and iterating than noodling without release in search of the  
> "perfect" implementation.

Yes, I agree in theory, but this wouldn't be an iteration.  It'd be a  
complete rebuild.

> 4. I'm open
> I'm not personally tied to any code written thus far and if I think  
> the best thing is.

I'm relieved to hear this.

> If we do delay I think we should laser-focus on tags and not allow  
> other pet-issues to creep in, and I will fully expect people to put  
> in as much time writing code and fixing bugs as they have arguing  
> points on mailing lists, IRC, and trac. At the very least I hope  
> we've learned a bit more about getting these things out of the way  
> early rather than a week or two before a release. Also if something  
> is sitting in trac, take it to the hackers list early.

Hopefully next time we'll have four months instead of three! :-)   
Three was too short.  As for me, I'll put my code where my mouth is,  
and I know a few others that will do the same.

Mark Jaquith

Covered Web Services
