[wp-hackers] Grab a seat. On Delaying 2.2, separating tables

Robert Deaton false.hopes at gmail.com
Sun Apr 15 06:05:30 GMT 2007

Okay, I'll try to cut right to the point here, because it's getting late
and my brain function is quickly slipping away. WordPress 2.2 is
slated for release very shortly. Unfortunately, we're cutting it
pretty close again, with big changes going in only a week before it's
scheduled for final release. Not only does this mean we're once again
not leaving time for the testing that is likely needed (think WP 2.0),
it means we are only a week away from releasing into the wild what I
see as a mistake.

WordPress 2.2 features the new WP core tagging system. That's an
inevitability, it's not going anywhere, and removing it is not what I'm
suggesting. The problem is that, like link categories, it's been thrown
into the categories table as well. Now, without getting into specifics, it
is a given that tags and categories are not the same. So why are we
storing them in the same table? Post categories and link categories
are also not the same, so again, why?

I believe that we should take the time now, before a release of
WordPress is made with these database schema changes, to fix the
issue. I think that the categories table needs to be resplit into a
link categories table, a post categories table, and a tag table, each
set up to handle its own specific job instead of throwing them
together with legacy fields. Let's have a look at the categories
table as it currently stands:

$wp_queries="CREATE TABLE $wpdb->categories (
  cat_ID bigint(20) NOT NULL auto_increment,

  cat_name varchar(55) NOT NULL default '',
  category_nicename varchar(200) NOT NULL default '',
^^ those two lines do seem rather inconsistent, don't you think?

  category_description longtext NOT NULL,
  category_parent bigint(20) NOT NULL default '0',
  category_count bigint(20) NOT NULL default '0',

  link_count bigint(20) NOT NULL default '0',
  tag_count bigint(20) NOT NULL default '0',
^^ two different count fields only used for some of the things stored
in the table at different times

  posts_private tinyint(1) NOT NULL default '0',
  links_private tinyint(1) NOT NULL default '0',
^^ again

  type tinyint NOT NULL default '1',
^^ a bitfield, instead of an enum? brilliant.

  PRIMARY KEY  (cat_ID),
  KEY category_nicename (category_nicename)
) $charset_collate;

So, basically, it's all lumped together. No wonder WP is notorious for
slow and poor queries.
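
As a rough sketch of the split proposed above, something along these
lines would give each kind of thing its own table with only the fields
it uses. This is an illustration, not a finished design: the table
names (post_categories, link_categories, tags), the tag_* columns, and
the choice of which existing columns land in which table are all
assumptions. It uses Python's sqlite3 module as a stand-in for MySQL so
that it runs on its own.

```python
import sqlite3

# Hypothetical split of the combined categories table.
# Table names and column placement are assumptions for illustration;
# SQLite stands in for MySQL so the sketch is self-contained.
SCHEMA = """
CREATE TABLE post_categories (
    cat_ID               INTEGER PRIMARY KEY AUTOINCREMENT,
    cat_name             TEXT NOT NULL DEFAULT '',
    category_nicename    TEXT NOT NULL DEFAULT '',
    category_description TEXT NOT NULL DEFAULT '',
    category_parent      INTEGER NOT NULL DEFAULT 0,
    category_count       INTEGER NOT NULL DEFAULT 0,
    posts_private        INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX post_categories_nicename ON post_categories (category_nicename);

CREATE TABLE link_categories (
    cat_ID            INTEGER PRIMARY KEY AUTOINCREMENT,
    cat_name          TEXT NOT NULL DEFAULT '',
    category_nicename TEXT NOT NULL DEFAULT '',
    link_count        INTEGER NOT NULL DEFAULT 0,
    links_private     INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX link_categories_nicename ON link_categories (category_nicename);

CREATE TABLE tags (
    tag_ID       INTEGER PRIMARY KEY AUTOINCREMENT,
    tag_name     TEXT NOT NULL DEFAULT '',
    tag_nicename TEXT NOT NULL DEFAULT '',
    tag_count    INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX tags_nicename ON tags (tag_nicename);
"""


def create_split_schema(conn):
    """Create the three hypothetical tables on the given connection."""
    conn.executescript(SCHEMA)


def table_columns(conn, table):
    """Return the column names of a table, in declaration order."""
    # PRAGMA table_info yields one row per column; index 1 is the name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
create_split_schema(conn)
for name in ("post_categories", "link_categories", "tags"):
    print(name, table_columns(conn, name))
```

With a layout like this there is no type column to filter on and no
count or privacy field that applies to only some of the rows.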

While on the subject of splitting these out, I also believe that it is
about time that we created a proper schema and structure for managing
category hierarchies. I'm personally favoring the nested set model at
the moment, but really any proper schema that would allow us to manage
hierarchies easily with fewer queries would do.
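
To make the nested set idea concrete, here is a minimal sketch. Each
row stores a left and right bound (lft and rgt), and a node's
descendants are exactly the rows whose lft falls between its own lft
and rgt. The column names lft and rgt, the sample categories, and the
helper functions are all made up for illustration, and sqlite3 again
stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE categories (
           cat_ID   INTEGER PRIMARY KEY,
           cat_name TEXT NOT NULL,
           lft      INTEGER NOT NULL,
           rgt      INTEGER NOT NULL
       )"""
)

# Sample tree (hypothetical data), numbered by a depth-first walk:
#   News (1,8)
#     Tech (2,5)
#       PHP (3,4)
#     Sports (6,7)
#   Misc (9,10)
conn.executemany(
    "INSERT INTO categories (cat_name, lft, rgt) VALUES (?, ?, ?)",
    [("News", 1, 8), ("Tech", 2, 5), ("PHP", 3, 4),
     ("Sports", 6, 7), ("Misc", 9, 10)],
)


def subtree(conn, name):
    """Return a category and all of its descendants in one query."""
    rows = conn.execute(
        """SELECT child.cat_name
             FROM categories AS parent
             JOIN categories AS child
               ON child.lft BETWEEN parent.lft AND parent.rgt
            WHERE parent.cat_name = ?
            ORDER BY child.lft""",
        (name,),
    )
    return [r[0] for r in rows]


def path_to(conn, name):
    """Return the chain of ancestors from the root down to a category."""
    rows = conn.execute(
        """SELECT parent.cat_name
             FROM categories AS child
             JOIN categories AS parent
               ON child.lft BETWEEN parent.lft AND parent.rgt
            WHERE child.cat_name = ?
            ORDER BY parent.lft""",
        (name,),
    )
    return [r[0] for r in rows]


def descendant_count(conn, name):
    """Count descendants from the bounds alone: (rgt - lft - 1) / 2."""
    lft, rgt = conn.execute(
        "SELECT lft, rgt FROM categories WHERE cat_name = ?", (name,)
    ).fetchone()
    return (rgt - lft - 1) // 2


print(subtree(conn, "News"))
print(path_to(conn, "PHP"))
print(descendant_count(conn, "News"))
```

The tradeoff is that inserting or moving a category means renumbering
lft and rgt for part of the tree, so reads get cheaper at the cost of
more expensive writes. For categories, which are read far more often
than they are changed, that seems like the right trade.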

The hierarchy changes can probably wait for the next version,
as that's a large undertaking with a lot to test. However, I think it
definitely wouldn't hurt to push back 2.2 and get these tables
separated and sorted out before they ship in a release. Otherwise we
will have to worry about serious backwards compatibility issues when
trying to make these changes in the next version.

--Robert Deaton

