[wp-hackers] Taxonomy Schema Proposal

Jamie Talbot wphackers at jamietalbot.com
Mon Apr 16 23:12:11 GMT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ryan Boren wrote:
> On 4/16/07, Jamie Talbot <wphackers at jamietalbot.com> wrote:
>> Basically the same, except that the term_relationships and
>> term_taxonomy tables are related by an
>> integer, instead of a tuple.  We're going to have to join the two
>> tables to get taxonomy data
>> anyway, and nothing beats joining on an int.  We also only have to
>> store the taxonomy field strings
>> once.  Net result of the same amount of fields, but a bit cleaner. 
>> Thoughts?
> 
> I had it like that in an earlier iteration, but how do plugins avoid
> battling for new taxonomy ints?  A string provides less chance of
> collision.  An int would be better for a join, however, so maybe we
> should go ahead and add a taxonomy table that will act as a broker for
> plugins to get a taxonomy id.
> 

You're right that it might be a bit trickier, especially as we can't bank on subqueries (yet).  But,
we can still minimise queries, using INSERT ... SELECT.

Say I want to add a post to a category, cities.  I already know the post_id, which is stored in
$post_id.  Internally, this might look like the following semi-pseudocode, though obviously we'd
break this up into functions:

// First, try to add the tag if it doesn't already exist.
// I'm assuming we'll have a unique key on slug, right?  This won't generate a duplicate.
// As of 4.1, we could use ON DUPLICATE KEY UPDATE, but that will have to wait.
REPLACE INTO $wpdb->terms (term_name, term_slug) VALUES ('City', 'city')

// Now add the term_taxonomy entry.
if ($count = $wpdb->query("SELECT tt.count FROM $wpdb->term_taxonomy AS tt INNER JOIN $wpdb->terms
AS t ON tt.term_id = t.term_id WHERE tt.taxonomy = 'post_category' AND t.term_slug = 'city')) {
	// We'll be updating whatever's there, and incrementing the count.  Not a huge fan of
	// storing counts, by the way, but accept that there might be valid optimisation reasons for 	//
doing so.
	UPDATE $wpdb->term_taxonomy SET desciption = 'Cities visited', count = $count + 1
} else {
	// There was no existing entry for this taxonomy type, so add a new one.  Parent kept as 0 		// for
now.
	INSERT INTO $wpdb->term_taxonomy SELECT term_id, 'post_category', 'Cities I\'ve visited', 0, 1 FROM
$wpdb->terms WHERE term_slug = 'city'
}

// Finally, associate the object with the new taxonomy term.  This query should always find
// the term_taxonomy_id correctly as we've just inserted or updated it.
INSERT INTO $wpdb->term_relationships (object_id, term_taxonomy_id) SELECT $post_id,
tt.term_taxonomy_id FROM $wpdb->term_taxonomy AS tt INNER JOIN $wpdb->terms AS t ON (t.term_id =
tt.term_id AND tt.taxonomy = 'post_category' AND t.term_slug = 'city')

You could of course rework this to add an extra query to determine the recently inserted term_id and
remove the need for the INSERT ... SELECTS, - not sure which would profile faster.  With correctly
applied indices and field sizes that match, I suspect it may be faster as above, even with many
terms.  I'm pretty sure that either of those approaches would be faster than a string join though.
At the end of the day, with 3 tables you're always going to need to do 3 or 4 queries.  But seeing
as this happens on the admin side, and we aren't overly worried about the performance of the admin
side, that shouldn't be a problem.

Cheers,

Jamie.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGJALLrovxfShShFARAtyRAJ4872/WgADBHA+EQ9WoLQOt0vkyzQCeJ+Zo
kED49DvXV4SaBSJZVoZ7Qgc=
=VvDi
-----END PGP SIGNATURE-----


More information about the wp-hackers mailing list