[wp-hackers] Time for a wp_post_relationships table?

Mike Schinkel mikeschinkel at newclarity.net
Tue Aug 3 22:42:24 UTC 2010


Hi all, 

This is a really long response to everyone's post since last night rather than me sending 10 replies to the list.

On Aug 3, 2010, at 10:06 AM, Casey Bisson wrote:
> I'm actually a bit sympathetic to your question, but many of the
> relationships you describe are taxonomy-style relationships.

Thanks for responding.  Can you explain why you think they are taxonomy-style relationships?  I looked at using taxonomy to start with but at least for most of these example I've not been able to get get functionality need from taxonomy terms that I get from post types. Taxonomies appear to me to be "true/false" flags that relate to the post, not independent entities in their own right such as the attorney+practice area and such as the movie+actor examples.  Can you explain how you'd be able to implement these merely with taxonomy relationships?

> I think what you're really looking for is how to durably attach a post
> (of any type) to a specific term/taxonomy pair. 

Maybe, but I currently don't see how that would work for the functionality.  Maybe you could provide SQL for proof-of-concept?

> Perhaps an extra field for post_id in the term_taxonomy table is what you need?

At first blush that could potentially work (although it feels like it's overloading the taxonomy system in ways that would cause more problems that it would solve.)

Still, here is a "proof-of-concept" query based on what I think you mean showing how to relate movie to actor based on taxonomy term 'role' 

SELECT
	movies.post_title AS movie,
	actors.post_title AS actor,
	roles.name AS role
FROM
	wp_posts actors
	INNER JOIN wp_term_relationships actor_movie ON actors.ID = actor_movie.object_id
	INNER JOIN wp_term_taxonomy movie_actor ON movie_actor.term_taxonomy_id = actor_movie.term_taxonomy_id
	INNER JOIN wp_terms roles ON roles.term_id = movie_actor.term_id
	INNER JOIN wp_posts movies ON movies.ID = movie_actor.post_id
WHERE
	movie_actor.taxonomy = 'role'

Is this what you meant?

Unfortunately the problem with this approach is that term_id+term_taxonomy_id is a primary key so you can only join to one post per role, effectively killing this idea unless we were to make that index non-unique but I think it is unique for a reason and changing it could cause unintended consequences.

I think thought, "Okay, maybe we could add a "post_id" to wp_term_relationships?"  Which would use this SQL as proof-of-concept:

EXPLAIN SELECT
	movies.post_title AS movie,
	actors.post_title AS actor,
	roles.name AS role
FROM
	wp_posts actors
	INNER JOIN wp_term_relationships actor_movie ON actors.ID = actor_movie.object_id
	INNER JOIN wp_term_taxonomy movie_actor ON movie_actor.term_taxonomy_id = actor_movie.term_taxonomy_id
	INNER JOIN wp_posts movies ON movies.ID = actor_movie.post_id
	INNER JOIN wp_terms roles ON roles.term_id = movie_actor.term_id
WHERE
	movie_actor.taxonomy = 'role'

Unfortunately, that TOO has a blocking unique key, this time object_id+term_taxomony_id.  And that too I fear would be a bad idea to make non-unique (but maybe I'm too cautious?)

Of the two, ignoring the unique key problem, the latter would make more sense to me.


> Some WP themes and sites have used the category/taxonomy description field
> very well.

I did exactly that on my conference website example.  I implemented "Rooms" and "Time Slots" as taxonomy terms and stored metadata in the description. But it was a hack and not scalable in complexity, and really felt like it as I worked with it.

> I see Michael Pretty's scenario to be an extension of that (more
> sophisticated, but not an entirely different animal).

I sure wish Michael would weight in because I think only he can really explain his circumstances.

> I can imagine that we could enable many uses of WP by moving that 
> description field into a new post type and replacing the that field in the
> table with a post_id number that links them.

That's an interesting idea but I think it would be blocked by the unique key requirement.

On Aug 3, 2010, at 11:37 AM, Otto wrote:
> On Tue, Aug 3, 2010 at 3:05 AM, Mike Schinkel
> <mikeschinkel at newclarity.net> wrote:
>> Why would we want to devise their own methods rather than use a standard approach built into core?  (It's an honest question.)
> 
> Two reasons.
> 1. Competition breeds better solutions.

Moderation in all things.   Competition is good until a definite pattern emerges and then it's better to collaborate on a standard so that competitors can compete at a higher level.  Throughout history there has been competitive chaos as competition improves the breed and then fundamental patterns emerge, standards are created, and chaos begins again at the next level.  But without those lower level standards being established, chaos simple continues. I proposed this idea because I sensed that the post-to-post requirement is an emerging need which is validated by many people commenting to the affirmative on this post albeit with all kind of different solutions.

> 2. The generic case is not a good fit for all use cases.

On point #2, you are wrong. :)

Seriously, you've not identified a use-case that it's not good for.  And since it's such a fundamental pattern in relational database it's hard for it to not apply. 

>> You are getting too use-case specific; they don't need to specify what the relationships mean only that they are related. Plugins can add the "meaning" layer but WordPress should probably offer the core relationship support.
> 
> The relationships have to be meaningful for there to be
> interoperability between differing implementations.

Taxonomies and terms have no meaning yet they are interoperable.  I simple see that as a basely assertion.

>> I think maybe I simply value standardization as a stepping stone for greater things and you don't (right?) and that difference in how we value standardization colors any discussion we have related to it.
> 
> I think standardization is a good thing, but the standard must be
> well-defined and have a well-defined purpose for it to be useful. This
> particular case just seems very vague and hand-wavy to me.

Honestly I think it is vague to you because you (probably) don't work with CMS use-cases.  In a blogging world it is vague and hand-wavy.  For CMS requirements it is extremely concrete; every site i'm working on needs it.

> I don't see how any of these relate to a table which relates posts to
> taxonomies (and to each other). Posts get related to terms inside
> taxonomies, not to taxonomies themselves.

Huh?  You must misunderstand the request.  It's not posts to taxonomies, it's posts to posts with taxonomy as a classifier.  The default case is no classification because most use cases only have a simple relationship.   For example, an attorney is associated with a practice area.  A movie is associate with an actor.   A doctor with a hospital department.  A restaurant with a location.  And in all cases there is significant information for each entity that makes custom post types the most 

BTW, when people say it should be custom taxonomy instead of custom post type I think it's because they view custom post types in a different manner.  I see lots of people who are creating custom post types that are just different types of posts instead of fundamentally different entities.  An attorney is not a type of practice area or vice versa.  A movie is not a type of actor of vice versa.   A doctor  is not a type of hospital department or vice versa.  A restaurant is not a type of  location of vice versa.  But I've seen people write about custom post types like "review", "opinion", "news", "case-study" etc. and they are all really just documents, aka most could be implemented as "posts" with different taxonomy terms. So I think that colors how people view usage of custom post types and custom taxonomies.

> The cases above seem like you want to be able to relate terms in one
> taxonomy to terms in another taxonomy, not to relate posts to whole
> taxonomies. IN which case I'd say your initial definitions for what
> constitutes a post vs. a term are wrong. If you do want to make
> something where a "thing" has both a post and a term for that thing,
> then you should do it with a custom post type and tie the term from
> the taxonomy to it. Let's consider a simple case of photo management.

Again, I don't want to relate taxonomies, I want to relate posts.  Let's use a simpler example for which all my points would still apply.  See, not taxonomies:

CREATE TABLE wp_post_relationships (
  post_id bigint(20) unsigned NOT NULL,
  related_post_id bigint(20) unsigned NOT NULL,
  PRIMARY KEY (parent_id,post_id),
  KEY post_id (post_id)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

> Photos are posts (attachments in our specific case, but let's just say
> "photos"). They have people in them. Every person is a term in the
> "people" taxonomy.
> But people have other information about them too. So we make a
> "people" custom post type to hold them. Maybe a bio, avatar, etc. To
> associate the person to the taxonomy, we simply give that post of them
> in "people" the term in the "people" taxonomy, thus relating the two.
> So now, if I'm looking at a photo, I can see a list of the people in
> the photo by looking at the terms in the people taxonomy. Click on one
> of those and I can see all the posts associated with that person. Only
> one of those posts is of the "people" type, and that's the one that
> gives more information about that person.

Ignoring that I think you misunderstand the fundamental aspect of the suggestion, what I think you suggest ends up requiring a significant amount of complexity, at least four table joined where one table would do.

> My point is that posts should connect to terms, and relate to other
> posts by virtue of having the same terms in a taxonomy, even across
> post-type  boundaries. This is a much more flexible way of connecting
> posts to other posts based on specific shared characteristics.

This is exactly what I implemented, and what I've found to be problematic trying to maintain synchronization between all the records.

Take a look at this thread (which generated very little response, btw):

http://lists.automattic.com/pipermail/wp-hackers/2010-July/033507.html

> For your lawyer example (first one):
> Post types: attorney, article, case study, event.
> Taxonomies: attorney, practice area.

> 
> Any of those post types can have terms attached to it from both
> taxonomies. There's a specific restriction (in your plugin code) to
> only allow one attorney post type per attorney term and only one
> attorney term in each attorney post type.


That's exactly what I have implemented.  Exactly.  And it is causing problems with corrupted data because there are so many cases where they can get out of sync.  Not to mention the three extra table joins required.
> 
> The way you seem to want to do it is to have a whole taxonomy for each
> attorney (I think), which is difficult and makes little sense to me.

Nope. You misunderstood. I currently have one term record of taxonomy "attorney" for every post record of post type "attorney."  And it is problematic.  

Note I wouldn't have even brought it up if it had not been for Michael Pretty proactively asserting the same after he implemented much the same as I had just implemented.  But after working with it for a little while I can see where it will cause problems just like he said it does.  Hence my proposal.

> Actually, it is. The postmeta table is indexed by post_id too. If
> you're searching through postmeta, you should always be tying to a
> specific post.

Except for when you are not.  I guess it's possible to maintain parallel postmeta such as this but I really hope you are not proposing trying to keep those in sync too:

post_id -- meta_key -- meta_value
123       -- actor         -- 456
456       -- movie       -- 123

Ugh.

> You are correct that postmeta is a crap way of relating posts to one
> another. But that is exactly what taxonomies and terms were designed
> to do. I think that because you're defining things as a term instead
> of a post type (or vice-versa), I think you're ending up with a
> distorted view of what the necessary queries should be.

I appreciate that taxonomy may have been designed for that but (currently at least) it's not working very well for it.  Honestly I think you misunderstand the use-cases. 

> In your movie example, a movie would be a post type. So would an
> actor. These would hold info about them. But they'd both also be
> taxonomies (movie and actor). How would the two different items be
> related? Term_relationships. A Movie post type would have actors
> assigned to it as terms related to it. And it'd have one term from the
> movies taxonomy related to it as well. An actor would be reversed,
> he's in a bunch of movies (and so is tagged with all those movie
> terms), and he's an actor (and so has his actor term tagged to him
> too).

Yes, I have a plugin implemented for that (which I've used for Attorneys and Practice Areas.)  It's problematic to maintain the synchronization between the two and not extremely scalable because of the larger number of joins and the type of joins, size of keys, etc. query analyzer output, etc.

On Aug 3, 2010, at 12:16 PM, Otto wrote:
> On Tue, Aug 3, 2010 at 3:05 AM, Mike Schinkel
> <mikeschinkel at newclarity.net> wrote:
>> 4.) A movie website could have movies and person-roles where a person-role could be an actor, producer, cinematographer, casting, film editing, costumer, set designer, composer, etc.
> 
> I think I'm starting to see what you're talking about here, and why
> you need the table the way you want it.
> 
> In your design for this one, for example, movies and persons would
> both be a custom post type. The person post would then be associated
> directly to the movie post and you'd create a "role" taxonomy with
> types of relationships to associate the two together in a meaningful
> way.
> 
> The problem is that I don't think this is the right way to do it. It
> seems to me that this is over-generalizing. Instead of "role" being a
> taxonomy and having things like "actor" be terms in it, I think using
> "actor" as a taxonomy is a better way to go.

So you are proposing a different taxonomy for each of: actor, producer, cinematographer, casting agent, production designer, set designer, film editor, art director, costume designer, etc.?  Seems like they'd make so much more sense as a single taxonomy "role" with multiple terms (and yes, I used the post type of "actor" when I should have said "person" but then I would have needed to explain even more.)

> Consider your setup. If I wanted to display a list of credits for a
> movie, I have to go pull all the related posts for that movie and
> possibly parse them to determine names and such. It's a select into
> the posts table based on the join, followed by a lot of code to do
> parsing.

And that's difficult because...?

> But if each role is a taxonomy itself, then the taxonomy name and
> terms both have displayable versions that I can use directly. It's no
> worse than displaying tags. And if I do want to drill deeper, I can
> get related posts by virtue of them sharing terms, and use custom post
> types to determine what each type of post is.

The only reason you can do that now is because there is code implemented in core to allow that.  But with this proposal it presumes that functional code would be added to allow a developer/themer to do the same but much more simply than with the taxonomy system.

> I don't think it's necessarily crazy to allow a "thing" to be both a
> post type *and* a taxonomy as well.

It's not at all crazy.  That was my first reaction and what I implemented.  However, I've found it problematic as did Michael Pretty and after working with it all the benefits I perceived that I would get from mirroring taxonomy posts are really not all that valuable.  Turns out it's not that hard to add post metaboxes to select from a list of related posts and it's not hard to retrieve related posts for post display.

> Managing your restrictions on a per-post basis
> is simpler than having to select a large number of posts 
> and then filter through them in different ways for simple tasks.

I don't know what that means nor how any of that relates to the problem at hand.

> Keeping the correct terms associated with the correct custom-posts is
> not difficult if you abstract out the creation of the posts. For
> example, when I make a new actor post, I'd automatically make the
> proper term for that actor and auto-tag the post for him accordingly.
> The plugin can do this. And then I simply don't give any way to remove
> that term from the actor. The taxonomy can be completely managed
> behind the scenes, the user doesn't have to manually manage it.


I already do that.  But I'm running into too many cases where duplicates are attempting to be created and fail because of duplicates or deleted terms. I'm also concerned that Michael, who works on some big systems, has said it was unworkable and I don't want to have to deal with a situation where I get a lot of sites deployed and clients are constantly running into data corruption scenarios because they've added a plugin that breaks the fragile bonds. And lastly I'm concerned about performance for large sites.

Worst case I'll probably live with this scenario as I loathe to add a non-standard table.  But I expect others will eventually bump into the problems I've been having and will finally address the issue, albeit later than sooner. Sometime it really hurts to be an early adopter.

On Aug 3, 2010, at 12:19 PM, Daniel Cameron wrote:
> We have about 10 post-types and a single taxonomy called 'nexus' that allows
> us to connect any of our posts with each other. The purpose for the
> connections is to build a custom menu of affiliated/associated posts. Also,
> since we're using simple post titles for terms I've built a function that
> uses wp_set_object_terms to automatically create the first affiliation on
> publish.

Mind sharing what your custom post types are?

On Aug 3, 2010, at 12:27 PM, Jeremy Clarke wrote:
> FWIW, I think a built-in system for associating metadata to taxonomy terms
> would solve/simplify a LOT of these scenarios, and make it easier to use the
> taxonomy system as intended without getting involved in post-on-post
> relationships.

Take a look at my comments about the limits of the taxonomy system above because of unique keys.  Do let me know if you see something I'm missing.

> If actors/attorneys/people, in their term format, could have more than one
> data field associated to them then you don't need the special single
> page-type post about them to exist, you can just depend on the term listing
> to show that content at the top with the related posts listed below. E.g.
> Brad Pitt is a term in the actors taxonomy and the term has his bio, photo
> etc. as meta fields. You then just use the get_category_link() style url to
> link to his page, where you show all the info as it if was a post at the
> top, along with other posts/post_types listed below.

In all main cases (attorney, practice area, restaurant, location, etc.) I have at least 5 custom fields now and probably more later. *Other* than relating the posts the custom post types work *extremely* well for the use-cases.

> This would make it a lot easier to have one core post type, in this case
> movies, and let the relationships happen with the detailed and
> metadata-powered taxonomy terms. This wouldn't solve the whole problem when
> you have multiple post types and taxonomies that all have to be related
> together,  but those scenarios will be a nightmare no matter what we do.

I don't agree.  Actors are independent entities which has a lot of information associated with them; have them be a taxonomy term with essentially a binary "related/not-related" just doesn't meet the needs of any system that's more than trivial.

Or maybe I'm missing something?

BTW, I don't think many of those scenarios would be a nightmare with a wp_post_relationships table.

> Of course, if done properly this would make terms feel a lot more like
> posts, but why would that be such a bad thing? They already have a screen to
> edit them like posts do, that screen just sucks right now.
> 
> Things that would make this a reality (mostly doable via. plugin right now,
> but would be much more likely to be used if standardized):
> 
>   - Visual editor and other enhancements in term editing screen.
>   - Standard means of associating meta to terms (term_meta table)
>   - Easy API for defining default custom fields to show up in term editing
>   UI (also needed for posts!)

Actually, a while back I argued that we deprecate wp_term_taxonomy and replace it with a wp_posts->post_type=="term_taxonomy" but nobody seemed interested at the time. ;-)

> IMHO this is a much more obvious need than post-to-post relationships and
> most large cms-y sites could think of ways that term_meta would be useful.
> It's an obvious need that also happens to solve a lot of the simpler class
> of problem Mike is frustrated with.


I think each person's "obvious need" relates to the problems they are recently trying to solve.  I also see benefit to what you propose in addition to a wp_post_relationships table, but not instead of. 

BTW, I'm not frustrated.  If people don't agree I'll just lament the lost opportunity for interoperability and having no one build on a standard post-to-post structure  but I'll go about my own business and just implement what I need.

On Aug 3, 2010, at 1:39 PM, scribu wrote:
> On Tue, Aug 3, 2010 at 7:35 PM, Jake Goldman <
> wphackers at jakemgold.myfastmail.com> wrote:
> 
>> I just released a plug-in that adds full, post-like meta data functionality
>> to taxonomy terms. I'd needed this in several projects recently, and finally
>> decided I should just package it up as an independent plug-in.
>> 
> 
> It seems you're not the only one. There seem to be 3 plugins that do the
> same thing ATM:
> 
> http://wordpress.org/extend/plugins/simple-term-meta/
> http://wordpress.org/extend/plugins/taxonomy-metadata/
> http://wordpress.org/extend/plugins/meta-for-taxonomies/

Now there's an example of 1.) there being a solid need and 2.) divergently incompatible solutions.  Take a look at the three different implementations!

CREATE TABLE wp_termmeta (
	meta_id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
	term_id bigint(20) unsigned NOT NULL DEFAULT '0',
	meta_key varchar(255) DEFAULT NULL,
	meta_value longtext,
	PRIMARY KEY (meta_id),
	KEY term_id (term_id),
	KEY meta_key (meta_key)	  
);		
CREATE TABLE wp_taxonomymeta (
	meta_id bigint(20) unsigned NOT NULL auto_increment,
	taxonomy_id bigint(20) unsigned NOT NULL default '0',
	meta_key varchar(255) default NULL,
	meta_value longtext,
	PRIMARY KEY  (meta_id),
	KEY taxonomy_id (taxonomy_id),
	KEY meta_key (meta_key)
); 
CREATE TABLE wp_term_taxonomy_meta (
	meta_id int(20) NOT NULL auto_increment,
	term_taxonomy_id INT( 20 ) NOT NULL ,
	meta_key VARCHAR( 255 ) NOT NULL ,
	meta_value LONGTEXT NOT NULL,
	PRIMARY KEY  (meta_id),
	KEY term_taxonomy_id (term_taxonomy_id),
	KEY meta_key (meta_key)
);

What's amazing it they are all similar plugins but the all implemented the new tables using different table names, field names, etc.  Here's an example where it would also make sense (in my mind) to address taxonomy meta so that other plugins could build on it (or better, replace wp_term_taxonomy with wp_posts->post_type=="term_taxonomy" ;-)

On Aug 3, 2010, at 2:08 PM, Steve Bruner [SlipFire] wrote:
> In RoloPress, we added the relationship between Companies and Contacts
> through custom taxonomies, and it worked great.   You can check it out the
> demo here:http://demo.rolopress.com/  Or download it here:
> http://rolopress.com/

I just scanned through rolopress-core looking for add_action() and add_filter() and didn't see any use of any filters you would need to do this on wp_insert_post() or wp_update_post().  I also scanned for "taxonomy" and also didn't see any code that did that. Finally I scanned for "postmeta" and only found it in 3 places, places where you are doing querying.

Can point me to where in the RoloPress code where you actually maintain synchronization?  Or are you requiring the end user to maintain synchronization?

On Aug 3, 2010, at 2:41 PM, Lox wrote:
> I'll point to that plugin that implements a post2post feature (from
> trac) : http://wordpress.org/extend/plugins/posts-to-posts/
> 
> I know it uses a "post meta" approach but you could create such a
> plug-in that creates your needed table and functions to use it... Then
> any plug-in dev that need that approach can just install your plug-in
> and benefit from it.

Thanks.  FYI, I've already created the plugin to mirror taxonomies.  I was instead proposing this because I think it would be valuable for the WordPress community and not just my specific needs.

> Otto, mirroring custom posts to taxonomies can be a bad ass, I mean,
> creating needs to be handled, as well as modification and deletion. So
> the question is: can't wordpress provides helpers for custom post
> mirroring to a taxonomy? Like adding a "mirror_taxonomy_id" parameter
> to custom post_type....

Lox, I have the code for that (actually it uses  "_mirrored_term_id" in wp_postmeta.)  Evidently many others have similar code too.  I wonder, however, if anyone else's code it actually more robust than mine of if they've just not come across the potential issues with it yet?

On Aug 3, 2010, at 4:14 PM, scribu wrote:
> I think I'm going to make my Posts 2 Posts plugin use a glue taxonomy, as
> described by Daniel Cameron and Otto.
> 
> It should prevent the scaling issues described by Mike Schinkel.
> 
> With the approach I'm envisioning, the only thing mirrored will be the post
> id.

Can you show some proof-of-concept SQL so I can see what you are intending?  Show queries for 1.) actors in a given movie, and 2.) movies that the actor acted in, please? 

Thanks in advance.

-Mike


More information about the wp-hackers mailing list