[wp-trac] [WordPress Trac] #18276: Implement URL Routing System to "Front-End" WordPress' existing Rewrite System

WordPress Trac wp-trac at lists.automattic.com
Thu Jul 28 07:35:05 UTC 2011


#18276: Implement URL Routing System to "Front-End" WordPress' existing Rewrite
System
--------------------------+------------------------------------
 Reporter:  mikeschinkel  |      Owner:
     Type:  enhancement   |     Status:  new
 Priority:  normal        |  Milestone:  Awaiting Review
Component:  Permalinks    |    Version:  3.2.1
 Severity:  normal        |   Keywords:  dev-feedback has-patch
--------------------------+------------------------------------
 As per
 [http://wpdevel.wordpress.com/2011/07/27/wordpress-3-3-proposed-scope/#comment-22028 scribu's reply to my comment on "WordPress 3.3 Proposed Scope"],
 I am attaching a plugin I called ''"WP URL Routes"'' in the file
 `wp-url-routes.php`. It is a '''proof-of-concept''' that works as a
 plugin today even though it was designed to be integrated into core
 ''(i.e. it does not use function name prefixes).''

 WP URL Routes uses `register_query_var()` and `register_url_path()`
 functions to build a tree of `URL_Node` objects where each URL Node
 ultimately contains the metadata required to represent a path segment
 where the path `'/'` is the root URL Node.  It is designed so that path
 matching can be done not only with regular expressions like the WordPress
 Rewrite system but also using keyed arrays, database lookups, potentially
 specific callbacks, as well as global hooks. For high traffic systems WP
 URL Routes could be optimized in a variety of ways including using
 Memcached and more.
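
 As a rough illustration of the tree idea, here is a minimal sketch. It
 assumes each node matches one path segment by literal string, by a list
 of allowed values, or by a callback. The names `Sketch_URL_Node` and
 `sketch_match_path()` are hypothetical and are not the plugin's actual
 classes or functions; the real `URL_Node` carries more metadata.

 {{{
 // Hypothetical sketch: Sketch_URL_Node and sketch_match_path() are
 // illustrative names, not the plugin's actual classes or functions.
 class Sketch_URL_Node {
   public $type;       // 'literal', 'list' or 'callback'
   public $value;      // String, array of allowed strings, or callable
   public $query_var;  // Query var to set on a match, or null
   public $children = array();

   function __construct( $type, $value, $query_var = null ) {
     $this->type      = $type;
     $this->value     = $value;
     $this->query_var = $query_var;
   }

   function add_child( $child ) {
     $this->children[] = $child;
     return $child;
   }

   // Does this node accept the given path segment?
   function matches( $segment ) {
     switch ( $this->type ) {
       case 'literal':
         return $segment === $this->value;
       case 'list':
         return in_array( $segment, $this->value, true );
       case 'callback':
         return (bool) call_user_func( $this->value, $segment );
     }
     return false;
   }
 }

 // Walk the tree one segment at a time, taking the first matching child.
 // Returns an array of query vars, or false when no child matches.
 // Simplification: no backtracking and no check that the last node is a
 // valid end point.
 function sketch_match_path( $root, $path ) {
   $segments   = array_values( array_filter( explode( '/', $path ), 'strlen' ) );
   $query_vars = array();
   $node       = $root;
   foreach ( $segments as $segment ) {
     $next = null;
     foreach ( $node->children as $child ) {
       if ( $child->matches( $segment ) ) {
         $next = $child;
         break;
       }
     }
     if ( null === $next ) {
       return false;
     }
     if ( null !== $next->query_var ) {
       $query_vars[ $next->query_var ] = $segment;
     }
     $node = $next;
   }
   return $query_vars;
 }

 // Usage: a tree for /CATEGORY/POST/ with made-up category slugs.
 $root     = new Sketch_URL_Node( 'literal', '' );
 $category = $root->add_child(
   new Sketch_URL_Node( 'list', array( 'news', 'reviews' ), 'category_name' )
 );
 $category->add_child( new Sketch_URL_Node( 'callback', function ( $segment ) {
   return (bool) preg_match( '/^[a-z0-9-]+$/', $segment );
 }, 'name' ) );

 $result = sketch_match_path( $root, '/news/hello-world/' );
 }}}

 In this sketch the list node stands in for a keyed array or database
 lookup, and the callback node for a specific validation callback.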

 WP URL Routes is designed to be as easy as possible for a themer to add to
 a theme's `functions.php` in an `'init'` hook. Here is what such an
 `'init'` hook might look like to define the oft-requested
 `/CATEGORY/POST/` URL structure:

 {{{
 add_action( 'init', 'mysite_url_init' );
 function mysite_url_init() {
   // Allow categories to be specified at the beginning of a path
   register_url_path( '%category_name%/%name%' );
 }
 }}}

 A very important aspect of its architecture is that about the
 '''''only''''' thing the WP URL Routes concept overrides is
 `$wp->parse_request()`, which it effectively replaces by subclassing the
 `WP` class ''(which of course WordPress core need not do).'' It is
 probably easier to show the code for the relevant parts of the
 `WP_Urls_WP` class than to explain it:

 {{{
 /*
  * Extends the WP class in WordPress and assigns an instance to the global
  * $wp variable.
  * Note: This is needed because WordPress does not (yet?) have a hook for
  * $wp->parse_request() as proposed in trac ticket #XXXXX
  */
 class WP_Urls_WP extends WP {
   static function on_load() {
     // 'setup_theme' is 1st hook run after WP is created.
     add_action( 'setup_theme', array( __CLASS__, 'setup_theme' ) );
   }
   static function setup_theme() {
     global $wp;
     $wp = new WP_Urls_WP();  // Replace the global $wp
   }
   function parse_request( $extra_query_vars = '' ) {
     if ( apply_filters( 'wp_parse_request', false, $extra_query_vars ) ) {
       WP_Urls::$result = 'routed';
     } else {
       WP_Urls::$result = 'fallback';
       if ( WP_Urls::$fallback ) {
         parent::parse_request($extra_query_vars); // Delegate to WP class
       } else {
         wp_die( 'URL Routing failed.' );
       }
     }
     return;
   }
 }
 WP_Urls_WP::on_load();

 }}}

 WP URL Routes is ''currently'' designed to ''"front-end"'' the WordPress
 Rewrite system: any URL path patterns defined take precedence over the
 standard rewrite system, but if no URL path pattern matches the HTTP
 request's URL then the WordPress Rewrite system takes over. A developer
 designing a custom CMS with WordPress also has the option to bypass the
 WordPress Rewrite system completely on a failed match so they can fully
 control the URLs of their site ''(if the WordPress team chooses to use
 this proof-of-concept as a base for WordPress 3.3 URL routing they could
 conceivably have a "no-backward-compatibility" mode for URL routing that
 could be enabled with a constant in `/wp-config.php`).''

 WP URL Routes allows URL paths to be defined using exactly the same URL
 path format found on the Permalinks page, e.g. `%year%/%month%/%day%`
 ''(although I don't think this route is implemented in my plugin just
 yet).'' It also relies heavily on `array_merge()` to allow several levels
 of metadata to be merged down to the actual metadata for each URL Node.
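
 To show what that layering might look like, here is a minimal sketch of
 merging three levels of metadata with `array_merge()`. The three level
 names and the `mysite_is_valid_page` callback name are assumptions made
 for illustration; the plugin's actual levels may differ.

 {{{
 // Hypothetical levels of metadata, from most general to most specific.
 $global_defaults = array(
   '@validate'      => null,
   '@multi_segment' => false,
 );
 $query_var_defaults = array(
   '@validate'      => 'is_valid_page',
   '@multi_segment' => true,
   'page'           => '',
   'pagename'       => '%this%',
 );
 $path_overrides = array(
   '@validate'      => 'mysite_is_valid_page', // Hypothetical callback name
 );

 // With string keys, later arrays override earlier ones key by key.
 $metadata = array_merge( $global_defaults, $query_var_defaults, $path_overrides );
 }}}

 After the merge, `'@validate'` comes from the most specific level while
 `'@multi_segment'`, `'page'` and `'pagename'` come from the query variable
 defaults.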

 Let me illustrate with `%pagename%`. With the existing WordPress Rewrite
 system `$wp->query_vars` ends up with `['pagename']` being set to the URL
 slug ''(single or multi-path segments)'' and `['page']` is set to `''`
 ''(this latter is unimportant, but we want to match WordPress' Rewrite
 behavior exactly.)'' So here is how we might define the `%pagename%` query
 variable and the `%pagename%` path:

 {{{
 add_action( 'init', 'mysite_url_init' );
 function mysite_url_init() {
   register_query_var( '%pagename%', array(
     // Keys prefixed with '@' are not copied into query_vars
     '@validate'       => 'is_valid_page', // Validation callback
     '@multi_segment'  => true,            // /foo/bar/baz/ is a valid path
     'page'            => '',              // Matches WordPress' behavior
     'pagename'        => '%this%',        // Becomes the current path segment
   ));
   register_url_path( '%pagename%' );
 }
 }}}

 But that requires a lot of learning on the themer's part, so I hardcoded
 the metadata for `%pagename%` into the function `register_query_var()` in
 a `switch-case` statement; here's the `case`:

 {{{
 case 'pagename':
 $defaults = array(
   // Keys prefixed with '@' are not copied into query_vars
   '@validate'       => 'is_valid_page', // Validation callback
   '@multi_segment'  => true,            // /foo/bar/baz/ is a valid path
   'page'            => '',              // Matches WordPress' behavior
   'pagename'        => '%this%',        // Becomes the current path segment
 );
 break;
 }}}

 Which means we can simplify it for the themer to be like this:

 {{{
 add_action( 'init', 'mysite_url_init' );
 function mysite_url_init() {
   register_query_var( '%pagename%' );
   register_url_path( '%pagename%' );
 }
 }}}

 Of course the plugin can check to see if the query variable `'%pagename%'`
 found in the path `'%pagename%'` has been registered yet and if not
 register it, so our `'init'` hook simply becomes:

 {{{
 add_action( 'init', 'mysite_url_init' );
 function mysite_url_init() {
   register_url_path( '%pagename%' );
 }
 }}}

 '''Very easy for the themer, no?  Of course, for the person that really
 needs power they can build all the metadata from scratch, but the
 functions `register_query_var()` and `__register_url_path()` contain all
 the default metadata for common query variables and common paths.'''
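
 To make the default-metadata idea concrete, here is a minimal sketch of
 how a path's `%tokens%` might be auto-registered and how the metadata
 might be turned into query vars: keys starting with `'@'` are skipped and
 `'%this%'` is replaced with the matched segment. All `sketch_*` function
 names are hypothetical, and the fallback default of
 `array( $name => '%this%' )` is my assumption rather than what the plugin
 does.

 {{{
 // Hypothetical stand-in for the hardcoded defaults.
 function sketch_default_metadata( $name ) {
   switch ( $name ) {
     case 'pagename':
       return array(
         '@validate'      => 'is_valid_page',
         '@multi_segment' => true,
         'page'           => '',
         'pagename'       => '%this%',
       );
     default:
       // Assumption: unknown query vars just capture the segment.
       return array( $name => '%this%' );
   }
 }

 // Find each %token% in the path and register defaults for any token not
 // registered yet. Returns the token names in path order.
 function sketch_register_url_path( $path, &$registered ) {
   preg_match_all( '/%([a-z_]+)%/', $path, $matches );
   foreach ( $matches[1] as $name ) {
     if ( ! isset( $registered[ $name ] ) ) {
       $registered[ $name ] = sketch_default_metadata( $name );
     }
   }
   return $matches[1];
 }

 // Build query vars from metadata for one matched path segment.
 function sketch_resolve_query_vars( $metadata, $segment ) {
   $query_vars = array();
   foreach ( $metadata as $key => $value ) {
     if ( '@' === substr( (string) $key, 0, 1 ) ) {
       continue; // '@' keys are routing metadata, not query vars
     }
     $query_vars[ $key ] = ( '%this%' === $value ) ? $segment : $value;
   }
   return $query_vars;
 }

 // Usage
 $registered = array();
 $tokens     = sketch_register_url_path( '%category_name%/%name%', $registered );
 sketch_register_url_path( '%pagename%', $registered );
 $vars       = sketch_resolve_query_vars( $registered['pagename'], 'about/team' );
 }}}

 Here `$vars` ends up holding only `'page'` and `'pagename'`, which is the
 shape `$wp->query_vars` needs for a page request.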

 What you have here is a fully''(?)'' working URL routing engine but only a
 handful of the standard query variables and standard paths have been
 defined yet. Remember, this is a '''proof-of-concept''', not something
 ready to be included into WordPress core ''(though I'll be happy to help
 get it ready for core once I get agreement from the team that it is
 wanted.)''

 A few other points to argue for this approach:

  1. The core code is working today.

  2. Building a tree of URL path pattern nodes is a very close fit to the
 structure of the URL path that it is modeling. Can we really do better?

  3. It is very flexible in its URL matching and does not rely solely on
 RegEx.

  4. It really fits the WordPress architecture because its entire goal is
 to populate `$wp->query_vars` correctly, '''''and nothing more'''''.

  5. It integrates with the existing WordPress architecture with very few
 changes; only URL rewrites are affected, and probably only the
 lower-level hooks.

  6. Very few hooks will need to be deprecated, and warnings can be
 generated for those hooks when `WP_DEBUG` is defined.

  7. The tree of URL Nodes also provides the metadata needed for
 ''automated breadcrumb generation'' and for ''automated sitemap
 generation'' (both for XML Sitemaps and sitemaps for humans).
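
 On point 7, here is a minimal sketch of how breadcrumbs might fall out
 of a tree, assuming each node knows its parent and carries a
 human-readable label. `Sketch_Crumb_Node`, its `label` field and
 `sketch_breadcrumbs()` are hypothetical; the attached plugin does not
 generate breadcrumbs.

 {{{
 // Hypothetical node holding a concrete path segment, label and parent.
 class Sketch_Crumb_Node {
   public $segment;
   public $label;
   public $parent;

   function __construct( $segment, $label, $parent = null ) {
     $this->segment = $segment;
     $this->label   = $label;
     $this->parent  = $parent;
   }

   // The root node is '/'; every other node appends its segment.
   function url() {
     if ( null === $this->parent ) {
       return '/';
     }
     return $this->parent->url() . $this->segment . '/';
   }
 }

 // Walk from the given node up to the root, collecting crumbs in
 // root-first order.
 function sketch_breadcrumbs( $node ) {
   $crumbs = array();
   for ( ; null !== $node; $node = $node->parent ) {
     array_unshift( $crumbs, array(
       'label' => $node->label,
       'url'   => $node->url(),
     ) );
   }
   return $crumbs;
 }

 // Usage
 $home   = new Sketch_Crumb_Node( '', 'Home' );
 $news   = new Sketch_Crumb_Node( 'news', 'News', $home );
 $post   = new Sketch_Crumb_Node( 'hello-world', 'Hello World', $news );
 $crumbs = sketch_breadcrumbs( $post );
 }}}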

 ----

 To try the attached plugin, copy the `wp-url-routes.php` file into your
 site's `/wp-content/plugins/` directory and then activate the ''"WP URL
 Routes"'' plugin. Also be sure to `define('WP_DEBUG',true);` in
 `/wp-config.php`.

 The two URL paths defined for the demo are `'%category_name%/%name%'` and
 `'%pagename%'`, so type in any URL that should match one of these and you
 should see ''"URL Routing Result: routed"'' displayed in the top left
 corner. What follows is the code for the test config you will find at the
 bottom of the `wp-url-routes.php` file:

 {{{
 /*
  * Define OMIT_URL_ROUTES_TEST_CONFIG if you want to omit this test
  * configuration.
  */
 if ( ! defined( 'OMIT_URL_ROUTES_TEST_CONFIG') ) {
   add_action( 'init', '_wp_url_routes_test_config' );
   function _wp_url_routes_test_config() {
     register_url_path( '%category_name%/%name%' );
     register_url_path( '%pagename%' );
   }
 }
 }}}

 I'd love to see this used as a base to finally ''"fix"'' the URL routing
 in WordPress to make it performant and to provide full flexibility. Again,
 it's a proof-of-concept so it is just a starting point and we can evolve
 it significantly if needed. However, if the team is not interested I'll be
 publishing this on wordpress.org sometime in the next 90-120 days, albeit
 with a different name. But it really needs to be integrated with core and
 not be a plugin in order to provide full value to everyone who needs
 something like this.

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/18276>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software

