ClassicPress PetitionsClassicPress Petitions
This is a read-only archive. Post or comment on the live version of this page on our forums.

Add an Object-Relationship Table

October 30, 2018 · 23:37 · Mike Schinkel
Description

The two things missing from WordPress to make it a stellar CMS is Fields functionality and Posts (and other objects) Relationships functionality.

I requested this 8 years ago for WordPress, and we still don't have one (one of the reasons we can't have nice things.)

https://core.trac.wordpress.org/ticket/14513

Back then I called it posts_relationships but really it should be object_relationships to support many-to-many relationships for post-to-post, post-to-user, post-to-taxonomy, post-to-term, user-to-user, user-to-taxonomy, etc. etc.

Voters
+33 more
Tags
Request: Add feature
Discussion
James Nylen

I need to look deeper into the custom fields work that has already been started, but I imagined this would be part of it.

Fabian Wolf

The post relationship is being used by quite a lot plugins, including WPML, which mostly is accomplished by exactly the same method Mike is suggesting, ie. an addittional table for connecting objects which each other. The other way, which also quite a few plugins are using, is by adding the connection as a meta field.

James Nylen

No more abuse of meta fields please! We should be looking for common items to move out of the disastrous postmeta table.

Mike Schinkel

Yes, what @James said. MySQL cannot index the wp_postmeta.meta_value field so any foreign keys (e.g. ID, term_id, etc.) located there that that need to be used in the right side of a SQL join ends up being too slow to be viable for larger sites.

Artem Frolov

Great! I almost created a duplicate. Just wanted to suggest wp-lib-posts-to-posts lib as ready-made solution https://github.com/scribu/wp-lib-posts-to-posts.

Mike Schinkel

Artem - Glad you found this.

I am wary of Posts-2-Posts because when we needed something for a client it was missing some key features so we had to build our own. Basically what we needed was some form of classification for an association as opposed to just a relationship, and ideally also relationship meta too.

For example, if you have an Order post and an Address post you need to know if it is a billing or shipping address.

If you have a Person post and want to relate one person to another, you might need to know who is the manager of whom.

I would certainly like to get those needs addressed and am not sure Post2Posts is the best base to start from. #jmtcw

Artem Frolov

Yes, p2p resolves those needs definitely (and even more).

The docs for references https://github.com/scribu/wp-posts-to-posts/wiki

It has two tables in its model: p2p and p2pmeta.
First, defines the connection between something and something (post id or user id, etc.), type of this connection and gives it id.
Second table for meta attached to connection id

Please find the model here https://github.com/scribu/wp-posts-to-posts/wiki/Data-organization

So having this model you can create any type of connections between posts, users, and terms and store the meta related to the connection.

We used it widely, for example, for payment orders. Even for Escrow orders, where we need to connect Order post with Project post with Employer user with Employee user. And set connection meta of course.

It's a universal solution, masterpiece, so, please guys, don't refuse it until you know it.

Thanks!

Mike Schinkel

@Artem - Thanks for the update. It appears they have indeed made some additions since I last evaluated it.

Still, it does not appear to allow object-to-object relationship but instead only post-2-post, post-2-user and user-2-user. For example it does not seem to support many-to-many post-to-comment, post-to-blog, post-to-network, or post-to-custom table. Those are things I would definitely want to see.

As for this plugin it also it seems it has not been updated significantly in years:

https://github.com/scribu/wp-posts-to-posts/graphs/contributors

There are a large number of open pull requests for 5 years:

https://github.com/scribu/wp-posts-to-posts/issues

And a large number of open issues:

https://github.com/scribu/wp-posts-to-posts/issues

It also seems that it is not being updated and the author himself has suggested forking it:

https://github.com/scribu/wp-posts-to-posts/issues/535

Maybe it could be the base we fork and then add to? I know scribu left the WordPress community and so I think the only way forward with that plugin if that plugin is acceptable is to fork it.

Artem Frolov

@Mike, many-to-many post-to-comment, post-to-blog, post-to-network, or post-to-custom - supported for sure. The data model allows this. We use this ability.

cp_p2p:

  • p2p_id : 999
  • p2p_from : custom table 1 row id
  • p2p_to : custom table 2 row id
  • p2p_type : 'custom-1-to-custom-2'

cp_p2pmeta:

  • p2p_id : 999
  • meta_id : 1
  • meta_key : 'foo1'
  • meta_value : 'bar1'

It resolves everything. No matter whether it posts, users, terms, custom data - any random table with any random table, one-to-one, many-to-many, one-to-many. The key stuff here is in 'p2p_type' column which defines "what" is connected with "what".

It's actually pretty stable and self-sustained. Maybe this is why it's not updated for the last 5 years. We didn't need to change anything and it just worked for us last 7 years with no issues.

The referenced issues are mostly related to feature requests which are not consistent or not suitable to the dry structure of the lib. There are only three minor issues.

Anyway, this is an excellent base to start with. To fork it or not to fork is the second question. First of all, it's suitable.

Mike Schinkel

@Artem,

p2p_type : 'custom-1-to-custom-2'

I personally would like to see that structured so that two or more plugins written by different developers could be likely to work together, which would not be the case is developers just relied on their own developed conventions.

That is something I have learned over my decades of coding; developers can usually always hack something together, but to get something standardized where non-coordinating developers can write compatible code, that is where the value is.

But yeah, such structure can probably be added to it if we fork it.

Artem Frolov

Maybe I didn't get you, but there is no need any conventions. You can call connection type as you want and use it in your plugin. In example it named 'custom-1-to-custom-2', but can be 'favorite-listings', 'transaction' or whatever. You can even add unique prefixes to avoid collisions. This is an abstract thing, which you use inside your code for further requests.

Mike Schinkel

@Artem - How would my plugin know that one of your plugin's P2P connections were intended to be between user and the primary key of your custom table? Or between comments and multisite blogs? Or between posts and taxonomies?

I am assuming based on what I saw on the docs and what you have said that there is nothing in the relationship definition that definitively identifies that objective type of the relationship and that type is just a subject name that you as a site developer create and make sure to use correctly.

Or am I misunderstanding?

Artem Frolov

There is actually something that definitely identifies the object types in connection. It's a side object P2P_Side - https://github.com/scribu/wp-lib-posts-to-posts/blob/master/side.php. So your plugin can recognize the object by filtering registered connection using filter 'p2p_connection_type_args' https://github.com/scribu/wp-lib-posts-to-posts/blob/master/connection-type-factory.php#L44. But p2p_type is just a handle or simple name of connection for internal use.

Mike Schinkel

@Artem - So it seems P2P has a recognizer for Post, User and Attachment, but nothing else standardized. IMO that would need to be extended.

It also seems to recognize() by criteria that could easily generate a false positive. If I had a custom table with an primary key of 100 and there is also a post with an ID of 100 it appears that it could recognize my custom table as a WP_Post:

https://github.com/scribu/wp-lib-posts-to-posts/blob/master/side-post.php#L135

My preference would be to have something more robust if we were going to add to CP. Of course, one viable option could certainly be to based on a fork of P2P but improving on it where P2P is problematic.

Artem Frolov

@Mike surely it can be extended by adding new P2P_Side types and recognizer will work as expected without collisions with wp(cp)_posts. The wp-lib-posts-to-posts is the lib and shouldn't include all possible types. Instead, it provides a standardized way to register and use them.

PS: The false positive can't happen since you can't associate your custom table with posts table side. You have to register new P2P_Side for your table, which includes your custom recognizer.

Mike Schinkel

@Artem,
> recognizer will work as expected without collisions with wp(cp)_posts.

How can the recognizer avoid collisions if connection registration does not know enough to determine?

See, here is one of the issues I see with the current design. From:
https://github.com/scribu/wp-posts-to-posts/wiki/p2p_register_connection_type they have:

'from' - string|array The first end of the connection: post type name or 'user'

It says post type name or user. It does not say I can specify that I want to link, for example, comments to comments. How would that be done?

This is not to say P2P could not be extended if forked, just that it seems to have issues which is why we did not use it in the past.

Or maybe it is possible, but I don't see it. Maybe you should prepare a Gist as an example so it will be more clear, to be and to others? It would be helpful to see both how connections could be registered and then how Side classes would be implemented for at least a few additional object types.

As for standardization, there is a sweet spot between not including enough (which I think WordPress is extremely guilty of) and trying to include everything possible. IMO if they are not included they will not be standardized, and more than just Post, User and Attachment would benefit from standardization. Again, IMO.

Artem Frolov

Ok. It says post type name or user just because of there only three standard sides provided (for post user and attachment). But it's not limited. They haven't written a doc on how to add more sides and object types. Well, not a big deal, thanks for lib itself. There is a class diagram at least https://github.com/scribu/wp-posts-to-posts/wiki/Class-diagrams to see how it works and where it can be extended. To be consistent and clear I'll surely create a gist with example.

Artem Frolov

@Mike, please find an example here https://github.com/dikiyforester/custom-p2p-object-type. I hope this can help.

Greg Schoppe

I've been writing an object relationships plugin that attempts to emulate core standards for a couple of months now, and have the entire core functionality written. What would need to be added for completion is the registration and interface components. https://snippets.cacher.io/snippet/4d216577af5b713c1ac4

Mike Schinkel

@artem - Thank you so much for going to the effort to creating the custom “side” plugin so we could better understand how P2P extensibility mechanisms work.

I can tell you are obviously extremely enthusiastic about this plugin and have evidently used it extensively with great success. OTOH, as the author of this petition, someone who has been working around lack of relationships in WordPress since 2010 and an advocate of getting a solid solution in WordPress core previously (and now ClassicPress) I have a strong vision for what I would like to see added to ClassicPress, and unfortunately it is not P2P.



But also please realize that I am just like you; someone looking in from the outside who is not running ClassicPress so my opinion is just my input; my deciding I don’t think P2P should be included in ClassicPress is nothing more than my opinion, although I will be lobbying for an alternate solution.

To better understand what some of my more details objections are let me explain. Thus far I have spent about 3 hours reviewing your code plus working with the latest P2P since you posted your code here and these are my thoughts:

  • I first looked at the source code for your custom “Side” for over an hour before I realized I was going to have to install it and run it through XDEBUG before I could understand what it was doing. Even then it took me quite a white to wrap my head around how it worked. So I think this “Side” extensibility mechanism is overly complicated and I would prefer a much simplified extensibility mechanism. I would also want to build in 90+% of expected use-cases so that most people would not need to go to the effort to implement a “side.” And by building them in we could standardize on them so different plugin developers could depend on them being available and meaning the same thing.

  • The plugin requires scb-framework, and I don’t think we want to automatically include any external libraries or frameworks. (But if we do, then I want to include WPLib!)

  • Posts-2-Posts/p2p is a reasonable brand name when the functionality is to be provided by a plugin, but IMO it is confusing naming for a standard functionality. For example, what is a “p2p_type?” And what is a “P2P_Side?” And The “connection” terminology is made for me to reason-about; what is a connection? (I think of connecting on the network, not relating tables in a DB.) I’d much prefer to think in terms of “relationships” vs. “connections.”

  • P2P’s functionality is implemented mostly in functions whereas I think they would be better in classes with some helper functions like register_relationship(). P2P also has many “internal” functions which would be better as private methods of a class, and thus really protected from outside use.

  • Numerous P2P function return stdClass objects rather than instances of a declared class. For 2018, I think we’d be better off defining and declaring named classes.

  • P2P “Sides” must be created w/o PHP namespaces and names must start with P2P_Side_, and it uses a direct type name to class name naming convention which is too limiting IMO for inclusion in CP.

  • P2P “types” have a flat namespace: post names, ‘user’, and then “whatever” where the latter requires a P2P_Side_Whatever class to be defined. From painful experience related to this I’d much rather seen a two level “object type” baked into CP in a manner that could be identified via a single string like “post:post”, “post:page”, “post:{cpt}”, “user:{role}”, “table:{table_name}”, etc.

  • The external “Side” class (P2P_Side_Custom) does not reference a finite list of well-known object types (posts, users, comments, custom table, site, network, etc.) so it is practically impossible for a 3rd party plugins to leverage them as the Side functionality only recognizes (i.e. “Let me look at it and I will tell you if it is okay”), it does not identify (i.e. “Here is the identify you can use to look for it yourself.”) To me the latter is critical and the former, not sufficient.

  • Not critical important, but when you specified ’custom’ internally P2P first recognizes it as a post type and instantiates a P2P_Side_Post instance (see lines 57 to 65 in in create_side(): https://github.com/scribu/wp-lib-posts-to-posts/blob/master/connection-type-factory.php#L54) It later changes it to an instance of P2P_Item_Custom which seems like unecessary extra work, especially if a lot of connections are created in a loop.

  • All that said, P2P is a solid effort, a useful plugin that exists in the wild, and a great source for prior art so we don’t have to reinvent all the details as we leverage ~8.5 years of experience with P2P and other’s in parallel efforts (such as mine https://bitbucket.org/newclarity/sunrise-1/src/1a3df930085b154c7089c024d87e187ddbe7746a/modules/relationships/?at=master)

I hope you won’t be too upset with me for not having the same high opinion of P2P that you have. I just think we can do a lot better.

Mike Schinkel

@Greg - Thanks for including a link to your work. Reviewing it, I think you've got a really solid effort there, and what you have done is a lot more inline with what I have been envisioning.

There are, however, numerous issues I personally would want to see improved and expanded. Some of them include:

  1. Follow WordPress coding standards more closely, e.g. WP_Object_Relationships vs. WPObjectRelationships and get_instance() vs. Instance(), as well as other areas.

  2. Fix the non-viable i18l translation code, e.g. you cannot used a $domain property because the translation scanning software won't recognize it; it must be hardcode. Besides, for inclusion in CP I am guess we'd want a different domain anyway.

  3. Drop the use of class constants unless and until CP identified and defines a constant class constant naming and usage standard, and just hardcode these values.

  4. Implement the installation in a manner that would in integrated into CP's installation, if and when it because an merge candidate, of course.

  5. Consider using a different naming scheme besides object_1 and object_2 as they are harder to reason about and too easy to misread. Options to consider are from and to, parent and child, and my current preference left and right, as well as other potentials.

  6. Implement meta using WordPress' built-in meta functions instead of using roll-your-own functions for that same purpose.

  7. Add convenience functions such as register_relationship() that call the class methods so that it is less intimidating for people not yet comfortable with using classes in the sites they build.

  8. Likely break out some of the functionality into additional classes.

  9. Implement reciprocal relationships.

  10. Add relationship query capabilities to WP_Query

  11. Get CP to recognize "object types" and create standardized names for them (although many already exist): post, user, comment, site, network, table, and so on.

  12. Use those “object types” to allow specifying that which a relationship related to using a single string like “post:post”, “post:page”, “post:{cpt}”, “user:{role}”, “table:{table_name}”, etc. that could also be converted to a object instance (though that example is really old.)

Do you have this in a Git repository somewhere instead of just in a Gist?

Would you be interested in collaborating on an implementation we could present to the CP community and get feedback?

-Mike

Artem Frolov

@Mike thank you for your time and so detailed review.

We're probably looking on the same thing from the different angles.

I would not spend our time anymore on discussing surface things like terminus, namespaces, coding standards or extending mechanisms, etc. It's good for PRs, not here.
And strictly would not discuss my personal sympathies or upsets.

I still support your petition, I like its idea and with it a good luck.

I'm pretty sure, that applied solution (whatever it would be and if it actually will be) will be good enough and have the same or better data model.

Otherwise it can not be ;-)

PS: Sorry for flooding your petition.

Mike Schinkel

@artem - No worries at all. Discussion is what they petition comments are for, I think. Thanks for pushing me to explore P2P since I had not lately. And thanks for the courteous reply.

Greg Schoppe

@Mike I'm glad you like the general direction my proposed implementation is taking. It's definitely an early effort, so most of your criticism is definitely warranted. My original goal was to implement a feature plugin that could be proposed for merge into WordPress Core, but I see no reason not to make it available to WordPress and ClassicPress, knowing that ClassicPress is more likely to see the value it offers to professionals.

I responded to a few of your points below:

  1. Better Code Standards adherence is definitely an issue. I tried to match existing trends for function arguments as well as possible, but things like Snake Case and instance name will definitely need to be tightened up.
  2. i18n is definitely a weak point for me, so yeah, I'm sure it will need a thorough refactoring. I had no idea the PHP was parsed by something other than a PHP interpreter in the process, so the variable was an attempt to make that easily changeable in the future. mea culpa.
  3. The Class Constants would eventually want to be properties of the $wpdb class, like the other WordPress table names are. This was just a stopgap until I could take a look at whether a plugin could incorporate additional properties into $wpdb at runtime. (it can)
  4. Installation would definitely need to be reworked for a merge proposal, but until it gets to that point, I've been designing with a feature plugin as the end goal.
  5. object_1 and object_2 were definitely not my first pick. left and right might work, but it is important not to enforce any idea of primary and secondary, since at the moment, the class is only designed for bidirectional relationships (more on this later), so object_1 is really just whichever object had the lower id. in non-bidirectional mode (as of yet unimplemented), the order of object_1 and object_2 would become important, so it would be good to have a convention that could make sense for both cases... left and right could work well for that.
  6. The meta functions implemented were definitely first attempts, based on not initially knowing how to register a new table with wpdb. researching now, this looks pretty straightforward, so these functions could probably be replaced by calls to the standard metadata functions. I would need to review them to make sure there wasnt some other compelling reason to re-implement these.
  7. agreed, this was a planned part of phase 2
  8. My original plan involved creating one class to define the core functionality of relationships (most of what i have mostly written so far), another to handle registering a relationship type and adding the UI to the admin panel, and a third to implement a relationship query, and a fourth to handle injecting new relationship_query functionality into post/term/user/etc queries. Do you foresee any smaller delineations being needed?
  9. Actually, all relationships are currently reciprocal. What will need to be implemented are non-reciprocal relationships. also, ordering for weights in relationships will likely need to be revisited, as the current implementation requires a separate sorting step after relationships are returned from the DB (since with reciprocal relationships there is no way to know whether your value is in object_1 or object_2).
  10. see above, this definitely still needs to be implemented.
  11. although I'm not hard against the idea of defined object_types, there may be some value to being able to extend the capabilities of the relationship table, just by entering a new string as type... this would match with the functionality seen in posts, where you can create a post of any type, regardless of whether that type has been registered.
  12. this gets into the fine details of register_relationship_type, which hasn't been built yet, but it is probably going to be important to offer a little more flexibility than the proposed string format would allow. Relationships may need to allow for more than one type on either side... for example, a post may want to link "related content" that chooses from two custom post types. I can even imagine situations where you would want to link in users and terms as options for one side of a relationship (since they both behave as archive pages).

I currently have this code tracked in a private bitbucket repo, but I'll look into moving to a public github repo. I'd definitely be interested in collaborating to get this feature fleshed out and ready to present to both CP and WP.

Mike Schinkel

@Greg - Great to hear back from you. I think we are very much on the same page given your answers. But let me respond to just a few of the ones where you asked questions or where I think my points where evidently unclear:

3.) You can definitely add properties to $wpdb; we do so in our library.

5.) I like left and right because of left join and right join; e.g. it does not imply anything about the relationship, in only describes the order in which the objects were specified. object_1 and object_2 does that, but are harder to discern visually when scanning, so glad you like left and right; those might be our choices if someone does not come up with something else.

6.) Yeah, I think there would be strong reason not to reinvent meta functions for this, since post, user, term and comment meta all use the lower level meta functions.

8.) I actually did not have specific classes in mind. I tend to discover classes that I need as I work on a project. Rarely can I envision 100% what is needed in advance. When I try to, I am usually wrong and have to refactor so I just don't try to decided in advance anymore. Thus my comment was more of an abstract reference to multiple classes than anything specific.

11.) I think my wording did not make clear what I envisioned. Instead of a finite set of object types like we have a finite set of Post Formats (that "seemed like a good idea at the time" that nobody uses anymore, but I digress...) I was envisioning a set of known object types to be pre-defined but also allow custom object types, pretty much how we have pre-defined post types in WordPress and then also custom post types. Along these lines I am thinking we'd have both register_relationship() and register_relationship_type().

12.) Actually, what I was referring to was the identifier for the object type(s) used in a relationship, not the attributes of the relationship itself, just like a post_type is the identifier for a post type but you don't encode everything about the post type into the string; that is what the $args array is for:

register_relationship( 'post:order, 'post:order_item', array(
     'name' => 'Orders to Order Items',    
     'cardinality' => 'one-to-many',
     'reciprocal' => false,
));

Actually, looking at your code I just realize that your code does not make a distinction between defining what I would call a relationship (i.e. "Order to Order Items" as above) and apply that relationship to two related objects. I think separating those into two distinct process where the first is analogous to register_post_type() and the second to wp_insert_post()/wp_update_post() will be very important.

Anyway if we will collaborate on this I think we definitely need a public repo for it, and ideally on GitHub. I could add to my WPLib organization and give you committer access, or maybe ClassicPress would host it on their org and give us both commit access? (Though I don't want to blindside @nylen and make him feel obligated to set this up on their GitHub account if he is not ready to do so.)

Nodws

Needed this so bad last year! ended up creating another table to handle the relationships

invisnet

Lots of detail here, but mainly relating to current implementations.
If we're going to do this I'd like to actually use the database.
Ideally we'd have a table per relationship pair (there aren't that many), a 3-way flag for the direction of the relationship, and then some mechanism to allow transitive relationships.
To me, the actual names of classes and the various case styles of constants and so forth are much less important than a solid foundation.

James Nylen

Mike, Greg: you both have access to https://github.com/ClassicPress/object-relationships now.

Greg Schoppe

Thanks, James... I'll try to get an initial, structural commit up there tomorrow with the existing codebase (Just got back from Thanksgiving travel)

Greg Schoppe

@invisnet, I'm not sure I follow.. what are you describing as "relationship pairs"? I imagine that there may be situations where we want to link posts to a mixed list of users and posts (just one possible confounding situation). If I am understanding what you are describing, it seems like that use case would be impossible in your implementation.

invisnet

OK, the pairs would be something like:

  • post/post
  • post/user
  • term/term
  • user/user

and so on.

Ultimately we could do the same for post/term and post/comment so there's a common structure to all the relationships, but to do that in a way that doesn't break a lot of sites would be quite a challenge.

Greg Schoppe

The issue I see with the idea of a dedicated table for each of these relationships is that we currently have the objects of post, term, user, comment, site, network, and table, and plugins may need to be able to extend support to additional object types, such as gravity forms submissions. So, even without taking into account needs for custom relationship objects, we have 27 additional tables for all the possible pairings, and we wouldn't be able to support mixed relationships, where a post might be related to both users and posts in the same relationship type.

I get the reasoning, to allow for foreign keys on each of the tables, but I think it would massively complicate the database structure, to support a feature that will be pretty effectively supported by a table that stores both object id and object type for each side of the relationship.

invisnet

Well yes, there are in theory 27 pairs, but there are only about 10 that are interesting, mostly term/whatever (unless we change existing relationships like post/user).

  1. What's wrong with having 27 tables anyway? It's a database - that's not a lot of tables.
  2. I don't understand the problem with relating a post to both users and posts - there'd be entries in 2 tables, just query as needed - it's only a join.
  3. If we really want to define a relationship as a separate thing then let's create it as an object and add its id to each entry in the pairs tables, and to go full meta, add a relationship/relationship table.
  4. It would be far better for additional object types as the data wouldn't get spammed throughout the rest of the database - one table for object type, tables for the relationships that make sense - deleted when the plugin is uninstalled.
  5. With predictable table names the SQL is simplified.
  6. Performance! We might actually get some rather than having to rely on caching everything.
  7. Security! In future it'd be nice to have some. With multiple tables we can do column-level access control, whereas with a single table it'd need to be row-level (just shoot me now).

One of my biggest gripes about WP/CP is that we're sitting on top of a relational database and almost entirely treating it as a dumb key/value store, and as an added bonus, serialised data too. If we're going to go down that path then we should look at NoSQL instead.

Greg Schoppe

> there are only about 10 that are interesting

I disagree with this. Relationships have the additional context of the relationship_type (similar to a post type), which could make a second form of post->term relationship quite interesting... for example, you could define pages that list certain categories most recent posts on them. this is a post->term relationship, but wouldn't necessarily make sense to make the post part of the category. Once relationships are an option, we'll see all sorts of connections being made. We shouldn't limit these by our initial preconceptions.

In general, it sounds like you are advocating for a complete DB rewrite of standard WP tropes, into something more like Drupal. I'm not against that, but it seems more relevant for a greenfield CMS project than for a feature request on a pretty long-in-the-tooth project that focuses on stable back compatibility. That route is a lot more work than implementing relationships with existing tropes, and would require a complete split with the existing plugin base.

Short of that sort of complete rewrite, making this sort of change in a piecemeal way just creates a very confusing structure for developers to try to drop into, and gives a lot of reason for people to remain with consistent CMS systems like Drupal rather than wading into a project with a fragmented concept of data storage.

Mike Schinkel

@invisnet - I have to concur with Greg. The 10 "interesting" relationships ignore future ones we might add that become interesting, but more importantly ignore ones that a developer might add for their own projects. Their extensions should be able to participate in object relationships too.

WordPress itself went this direction with meta, and IMO that decision was a meta-shame (I was involved in the debate over this, way back when, and I was a dissenter to the direction they went.)

As for a complete rewrite to be like Drupal. Please let us not do that. That is one of the reasons I have found WordPress to be so much better than Drupal. Not to mention how deployment for that kind of thing can be a nightmare.

invisnet

I'm not advocating for changing what's there, nor am I wanting to rewrite everything to be more like Drupal.

I am advocating doing a better job than WP have done in the past - we're talking about a major new feature

Just because we've inherited something doesn't mean we have to keep doing things the same way, especially if a new way allows us to increase our security options. We shouldn't treat security as an afterthought in the same way WP appear to.

I'm also not suggesting anything is restricted - not sure where that idea came from. register_relationship() or whatever it gets called would handle creating tables as needed, so while the default install would only have 10 or so relationships defined, if someone wanted to do something esoteric they could.

As for backwards compatibility with plugins, if we were to change anything - and I don't think we should - only those that access the db directly would have a problem; they shouldn't be doing that in the first place, but I understand why they do. If we want to support other databases - and I'm very much in favour of that - we're going to have to deal with that mess anyway.

TL;DR: To me this sounds like "WP did it this way, we should too". We should certainly learn from their mistakes, but we should also remember that we're not WP.

Mike Schinkel

@invisnet -
> "so while the default install would only have 10 or so relationships defined, if someone wanted to do something esoteric they could."

Ah, sorry for my misunderstanding. Here I do concur.

> "As for backwards compatibility with plugins, only those that access the db directly would have a problem; they shouldn't be doing that in the first place, but I understand why they do."

Well,... As an aside, I write a lot of custom plugins, and I'd say a large amount of my code accesses the DB directly. Why? Because either 1.) what I need done is simply not possible using only the API functions in PHP provided by WordPress, 2.) it is not performant enough to use the provided API functions, 3.) I need to use custom tables and I need to relate them to existing tables (Note: I only add tables as a last resort when none of the tables provided by WordPress can be used instead.)

We could reduce the number of times someone needs to do #1 by adding more API functions, for sure, and I'd love to see that. #2 is a case-by-case basis which is often combined with #3. And #3 means either adding general purpose versions of those types of tables we need which could minimize the need to create custom tables but never eliminate it.

So IMO a blanket "You are doing it wrong if you access the DB directly" is unrealistic for people trying to meet client needs and can only be minimized but never eliminated.

> "TL;DR: To me this sounds like "WP did it this way, we should too". We should certainly learn from their mistakes, but we should also remember that we're not WP."

I can't speak for Greg, but reading his comments I didn't see that he implied that, and I explicitly stated the opposite, i.e. create one relationship table instead of many like WordPress did with meta.

That said, IMO ClassisPress won't be a "Classic" version of "Press" if it does not maintain at least reasonable compatibility with WordPress, otherwise then it is really something else.

And that means not going too far out away from WordPress, such as replacing existing tables with NoSQL, though an option for using NoSQL might make sense.

BTW, one of my hopes is that we can get WordPress to adopt some of the work we do for ClassicPress so the forks do not have to diverge too much and plugins can continue to easily work on both platforms.

> "One of my biggest gripes about WP/CP is that we're sitting on top of a relational database and almost entirely treating it as a dumb key/value store"

When I came to WordPress, I hated the key/value store. After all, I starting my career as a relational database instructor.

But now with almost 10 years of working with WordPress I have come to feel that is it one of WP's better features. It is one of the reasons that so many plugins can work together w/o conflict. It is also a major reason why WP is so much easier to deploy than Drupal, for example.

Back in the 80's and 90's I'd spent lots of time trying to get my database schemas correct, and they never was. I was constantly having to change them as new requirements arose, in part because relational database does not model real-world entities without tradeoffs, and those tradeoffs often require schema changes when the requirements change.

But WordPress' key-value store has been mostly immune to the need for constant schema changes. And that has allowed me as a developer to focus more on the requirements of the solution instead of how I store its data.

I am not saying there is no value in a nicely designed 3rd normal form relational database -- there definitely is, especially for performaces -- but there is also a value to key/value stores as I described above, and I would hate to loose those benefits in ClassicPress.

> "and as an added bonus, serialised data too."

I am 1000% percent with you there, though! Except in wp_options. It makes sense to me in wp_options (for the most part.)

P.S. All the above are just one man's opinion. I have no authority to make decisions related to ClassicPress, I am just someone who wants to contribute to make it better in the ways I see that if could be better.

James Nylen

Note: this repository has moved to https://github.com/ClassicPress-research/object-relationships after some discussion on Slack. This is functionally the same, but makes the plugin's "semi-official" status clearer.

Greg Schoppe

@mike I've committed a basic file structure and a somewhat cleaned up version of my POC code to the repo that should have some level of clarity on the current state of the code and the features that still need to be implemented wholecloth or extended

Greg Schoppe

It should also be noted that relationships may need meta (think actor->movie with a meta value of 'role'), so this has some crossover to the Fields API project.

https://petitions.classicpress.net/posts/76/finish-the-custom-fields-api

Mike Schinkel

@greg That's awesome. I have unfortunately had to take over a project that is late and am rushing to complete it so I will be out of pocket for at least a few days?

Are you on the ClassicPress Slack? It would be great to be able to have to real-time discussions with you there once I come up for air.

Pedro de Carvalho

maybe i'm misunderstanding the code, but it seems that there will be one table for all the relationships?

https://github.com/ClassicPress-research/object-relationships/blob/master/includes/class-wp-relationship.php#L27

Greg Schoppe

@pedro, you are correct that the current POC uses a single wp_relationships table and a single wp_relationshipmeta table, similar to the structure already used in wp_posts, wp_terms, wp_users. Relationships are intended to be a base object type, so they follow the patterns that are already used for all other objects.

In my opinion, the concept of monolithic object stores in WP/CP is way too deeply rooted to start isolating features on a piecemeal basis. Doing so would simply confuse developers who are used to standard WP/CP tropes.

We will never be able to isolate post_types or taxonomies, without losing back compatibility to literally every plugin and theme, and I believe there is a lot of value to conceptual consistency across the project's codebase.

Pedro de Carvalho

thanks @greg for the feeeback. But this will lead to a huge table. And one with variable fields, which will only make it slower.

If a site has 1000 products across 100 categories, plus post, contact, etc. all the product views will lock the huge table. Also this forces other relationships into a single huge table just for conceptual ( and in my opinion, wrong wp approach ) consistency.

i didn't quite get why breaking this new table into distinct ones would break back-compatibility. can you explain it further please?

If you find the time, can you describe the fields? seems you could get away without the varchar for the object_type.

Greg Schoppe

@pedro I'm not entirely sure we have the same view of what a relationships table would be used for. 1000 products in 100 categories wouldn't necessarily use the relationships table at all, since we already have the special "grouping" concept of a term.

The relationships table would exist to create a discrete, non-heirarchical connection between specific objects. This is not a replacement of any existing WP/CP functionality, as the concept currently just doesn't exist in the WP/CP Schema, so no default functionality uses it.

Also, a varchar with a max-length of 191 characters is under the key size for b-tree keys on a utf8_mb4 table, so it really won't slow down queries to any significant degree.

I'm also not certain what locking operations you imagine happening on this table, other than on object_save events, which would initiate a handful of row-level locks, that don't prevent read.

To use your shop example, it might be used to create a list of "related_products" on each product page. For a specific product, This might be a set of <10 entries in the object_relationship table for each product, each with the relationship_type of "related_products", object_left_type of 'post', and object_right_type of "post". In this use case, the fully-packed scenario would be around 5000 rows. (10 bidirectional connections on 1000 nodes = 10000/2 connections)

Since relationship_type, object_left_type, and object_right_type are all keyed in the table, the MySQL/MariaDB engine will optimize the query to first isolate relationship_type (unless it decides another route is more optimal, based on your specific b-tree sizes), and will immediately ignore any row not containing that type. After this first optimization, the search set would already be reduced to only our 5000 rows. Then, we use the combined object_x_type - object_x_id keys to limit to only those rows that involve the current object in either the left or right column (to account for bidirectionality), and we retrieve only the relevant rows, extremely quickly.

Describing it here, it might even make more sense to make the combined keys relationship_type - object_x_type - object_x_id, since all three will be present in any common query.

Mike Schinkel

@Pedro - I concur with @Greg on this.

The one area I do question is the length of the object_type key. Taxonomy is a varchar(32) and I would expect we could make object_type the same.

Greg Schoppe

@Mike, I was actually just thinking about that... we could reasonably reduce all the varchars in the base table to varchar(32).

As I was describing the keys I was realizing that varchar(191) is the max length for a simple key in utf8_mb4, but the combined keys likely need a much shorter length.... so yes, varchar(32) should be more than sufficient for labelling purposes.

I considered varchar(20), like post_types, but with two existing, competing standards for key length convention ( 20 vs 32 ), we should probably default to the more permissive option.