<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[27359] trunk: Introduce get_site_by_path() and further rewrite the site detection process for multisite.</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta">
<dt>Revision</dt> <dd><a href="http://core.trac.wordpress.org/changeset/27359">27359</a></dd>
<dt>Author</dt> <dd>nacin</dd>
<dt>Date</dt> <dd>2014-03-02 22:24:50 +0000 (Sun, 02 Mar 2014)</dd>
</dl>

<h3>Log Message</h3>
<pre>Introduce get_site_by_path() and further rewrite the site detection process for multisite.

This is the first big step to supporting arbitrary domains and paths. In this new approach, sites are detected first where possible, then the network is inferred. Allows filtering for arbitrary path segments, smooths out some weirdness, and removes various restrictions. A sunrise plugin could do much of its work by adding filters, if those are even needed.

see <a href="http://core.trac.wordpress.org/ticket/27003">#27003</a>.</pre>
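
<p>The path handling described in the log message can be sketched in isolation. This is a minimal illustration under stated assumptions, not code from the changeset: candidate_paths() is a hypothetical helper name. It only mirrors the segment-trimming loop that get_network_by_path() and get_site_by_path() appear to use to build the list of paths to search, longest first. The filters and database queries are left out.</p>
<pre>
// Hypothetical helper (not part of the commit): lists the paths that would be
// searched for a given request path, ordered from longest to shortest.
function candidate_paths( $path, $segments = null ) {
	// Split the path into non-empty segments: '/a/b/' gives array( 'a', 'b' ).
	$path_segments = array_filter( explode( '/', trim( $path, '/' ) ) );

	// Keep only the leading segments when a limit is given.
	if ( null !== $segments ) {
		$path_segments = array_slice( $path_segments, 0, $segments );
	}

	$paths = array();
	while ( count( $path_segments ) ) {
		$paths[] = '/' . implode( '/', $path_segments ) . '/';
		array_pop( $path_segments );
	}

	// The root path is always the last candidate.
	$paths[] = '/';

	return $paths;
}

print_r( candidate_paths( '/network/site/page/', 2 ) );
</pre>
<p>With a limit of two segments, a request for /network/site/page/ would be checked against /network/site/, /network/ and / in that order. A limit of null keeps every segment of the path.</p>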

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunksrcwpadminincludesschemaphp">trunk/src/wp-admin/includes/schema.php</a></li>
<li><a href="#trunksrcwpincludesmsloadphp">trunk/src/wp-includes/ms-load.php</a></li>
<li><a href="#trunksrcwpincludesmssettingsphp">trunk/src/wp-includes/ms-settings.php</a></li>
<li><a href="#trunktestsphpunittestsmsphp">trunk/tests/phpunit/tests/ms.php</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunksrcwpadminincludesschemaphp"></a>
<div class="modfile"><h4>Modified: trunk/src/wp-admin/includes/schema.php (27358 => 27359)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/src/wp-admin/includes/schema.php   2014-03-02 22:22:41 UTC (rev 27358)
+++ trunk/src/wp-admin/includes/schema.php      2014-03-02 22:24:50 UTC (rev 27359)
</span><span class="lines">@@ -889,6 +889,8 @@
</span><span class="cx">          $wpdb->insert( $wpdb->site, array( 'domain' => $domain, 'path' => $path, 'id' => $network_id ) );
</span><span class="cx">  }
</span><span class="cx"> 
</span><ins>+       wp_cache_delete( 'networks_have_paths', 'site-options' );
+
</ins><span class="cx">   if ( !is_multisite() ) {
</span><span class="cx">          $site_admins = array( $site_user->user_login );
</span><span class="cx">          $users = get_users( array( 'fields' => array( 'ID', 'user_login' ) ) );
</span></span></pre></div>
<a id="trunksrcwpincludesmsloadphp"></a>
<div class="modfile"><h4>Modified: trunk/src/wp-includes/ms-load.php (27358 => 27359)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/src/wp-includes/ms-load.php        2014-03-02 22:22:41 UTC (rev 27358)
+++ trunk/src/wp-includes/ms-load.php   2014-03-02 22:24:50 UTC (rev 27359)
</span><span class="lines">@@ -115,6 +115,8 @@
</span><span class="cx"> /**
</span><span class="cx">  * Sets current site name.
</span><span class="cx">  *
</span><ins>+ * @todo deprecate
+ *
</ins><span class="cx">  * @access private
</span><span class="cx">  * @since 3.0.0
</span><span class="cx">  * @return object $current_site object with site_name
</span><span class="lines">@@ -138,15 +140,14 @@
</span><span class="cx">  *
</span><span class="cx">  * @since 3.9.0
</span><span class="cx">  *
</span><del>- * @param string $domain Domain to check.
- * @param string $path   Path to check.
</del><ins>+ * @param string $domain   Domain to check.
+ * @param string $path     Path to check.
+ * @param int    $segments Path segments to use. Defaults to null, or the full path.
</ins><span class="cx">  * @return object|bool Network object if successful. False when no network is found.
</span><span class="cx">  */
</span><del>-function get_network_by_path( $domain, $path ) {
</del><ins>+function get_network_by_path( $domain, $path, $segments = null ) {
</ins><span class="cx">   global $wpdb;
</span><span class="cx"> 
</span><del>-       $network_id = false;
-
</del><span class="cx">   $domains = $exact_domains = array( $domain );
</span><span class="cx">  $pieces = explode( '.', $domain );
</span><span class="cx"> 
</span><span class="lines">@@ -158,19 +159,96 @@
</span><span class="cx">          }
</span><span class="cx">  }
</span><span class="cx"> 
</span><del>-       if ( '/' !== $path ) {
-               $paths = array( '/', $path );
-       } else {
-               $paths = array( '/' );
</del><ins>+        /*
+        * If we've gotten to this function during normal execution, there is
+        * more than one network installed. At this point, who knows how many
+        * we have. Attempt to optimize for the situation where networks are
+        * only domains, thus meaning paths never need to be considered.
+        *
+        * This is a very basic optimization; anything further could have drawbacks
+        * depending on the setup, so this is best done per-install.
+        */
+       $using_paths = true;
+       if ( wp_using_ext_object_cache() ) {
+               $using_paths = wp_cache_get( 'networks_have_paths', 'site-options' );
+               if ( false === $using_paths ) {
+                       $using_paths = (bool) $wpdb->get_var( "SELECT id FROM $wpdb->site WHERE path <> '/' LIMIT 1" );
+                       wp_cache_add( 'networks_have_paths', (int) $using_paths, 'site-options'  );
+               }
</ins><span class="cx">   }
</span><span class="cx"> 
</span><ins>+       $paths = array();
+       if ( $using_paths ) {
+               $path_segments = array_filter( explode( '/', trim( $path, "/" ) ) );
+
+               /**
+                * Filter the number of path segments to consider when searching for a site.
+                *
+                * @since 3.9.0
+                *
+                * @param mixed  $segments The number of path segments to consider. WordPress by default looks at
+                *                         one path segment. The function default of null only makes sense when you
+                *                         know the requested path should match a network.
+                * @param string $domain   The requested domain.
+                * @param string $path     The requested path, in full.
+                */
+               $segments = apply_filters( 'network_by_path_segments_count', $segments, $domain, $path );
+
+               if ( null !== $segments && count($path_segments ) > $segments ) {
+                       $path_segments = array_slice( $path_segments, 0, $segments );
+               }
+
+               while ( count( $path_segments ) ) {
+                       $paths[] = '/' . implode( '/', $path_segments ) . '/';
+                       array_pop( $path_segments );
+               }
+
+               $paths[] = '/';
+       }
+
+       /**
+        * Determine a network by its domain and path.
+        *
+        * This allows one to short-circuit the default logic, perhaps by
+        * replacing it with a routine that is more optimal for your setup.
+        *
+        * Return null to avoid the short-circuit. Return false if no network
+        * can be found at the requested domain and path. Otherwise, return
+        * an object from wp_get_network().
+        *
+        * @since 3.9.0
+        *
+        * @param string $domain   The requested domain.
+        * @param string $path     The requested path, in full.
+        * @param mixed  $segments The suggested number of paths to consult.
+        *                         Default null, meaning the entire path was to be consulted.
+        * @param array  $paths    The paths to search for, based on $path and $segments.
+        */
+       $pre = apply_filters( 'pre_get_network_by_path', null, $domain, $path, $segments, $paths );
+       if ( null !== $pre ) {
+               return $pre;
+       }
+
+       // @todo Consider additional optimization routes, perhaps as an opt-in for plugins.
+       // We already have paths covered. What about how far domains should be drilled down (including www)?
+
</ins><span class="cx">   $search_domains = "'" . implode( "', '", $wpdb->_escape( $domains ) ) . "'";
</span><del>-       $paths = "'" . implode( "', '", $wpdb->_escape( $paths ) ) . "'";
</del><span class="cx"> 
</span><del>-       $networks = $wpdb->get_results( "SELECT id, domain, path FROM $wpdb->site
-               WHERE domain IN ($search_domains) AND path IN ($paths)
-               ORDER BY CHAR_LENGTH(domain) DESC, CHAR_LENGTH(path) DESC" );
</del><ins>+        if ( ! $using_paths ) {
+               $network = $wpdb->get_row( "SELECT id, domain, path FROM $wpdb->site
+                       WHERE domain IN ($search_domains) ORDER BY CHAR_LENGTH(domain) DESC LIMIT 1" );
+               if ( $network ) {
+                       return wp_get_network( $network );
+               }
+               return false;
</ins><span class="cx"> 
</span><ins>+       } else {
+               $search_paths = "'" . implode( "', '", $wpdb->_escape( $paths ) ) . "'";
+               $networks = $wpdb->get_results( "SELECT id, domain, path FROM $wpdb->site
+                       WHERE domain IN ($search_domains) AND path IN ($search_paths)
+                       ORDER BY CHAR_LENGTH(domain) DESC, CHAR_LENGTH(path) DESC" );
+       }
+
</ins><span class="cx">   /*
</span><span class="cx">   * Domains are sorted by length of domain, then by length of path.
</span><span class="cx">   * The domain must match for the path to be considered. Otherwise,
</span><span class="lines">@@ -179,7 +257,7 @@
</span><span class="cx">  $found = false;
</span><span class="cx">  foreach ( $networks as $network ) {
</span><span class="cx">          if ( $network->domain === $domain || "www.$network->domain" === $domain ) {
</span><del>-                       if ( $network->path === $path ) {
</del><ins>+                        if ( in_array( $network->path, $paths, true ) ) {
</ins><span class="cx">                           $found = true;
</span><span class="cx">                          break;
</span><span class="cx">                  }
</span><span class="lines">@@ -191,9 +269,7 @@
</span><span class="cx">  }
</span><span class="cx"> 
</span><span class="cx">  if ( $found ) {
</span><del>-               $network = wp_get_network( $network );
-
-               return $network;
</del><ins>+                return wp_get_network( $network );
</ins><span class="cx">   }
</span><span class="cx"> 
</span><span class="cx">  return false;
</span><span class="lines">@@ -221,61 +297,95 @@
</span><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> /**
</span><del>- * Sets current_site object.
- *
- * @access private
- * @since 3.0.0
- * @return object $current_site object
</del><ins>+ * @todo deprecate
</ins><span class="cx">  */
</span><span class="cx"> function wpmu_current_site() {
</span><del>-       global $wpdb, $current_site, $domain, $path;
</del><ins>+}
</ins><span class="cx"> 
</span><del>-       if ( empty( $current_site ) )
-               $current_site = new stdClass;
</del><ins>+/**
+ * Retrieve a site object by its domain and path.
+ *
+ * @since 3.9.0
+ *
+ * @param string $domain   Domain to check.
+ * @param string $path     Path to check.
+ * @param int    $segments Path segments to use. Defaults to null, or the full path.
+ * @return object|bool Site object if successful. False when no site is found.
+ */
+function get_site_by_path( $domain, $path, $segments = null ) {
+       global $wpdb;
</ins><span class="cx"> 
</span><del>-       // 1. If constants are defined, that's our network.
-       if ( defined( 'DOMAIN_CURRENT_SITE' ) && defined( 'PATH_CURRENT_SITE' ) ) {
-               $current_site->id = defined( 'SITE_ID_CURRENT_SITE' ) ? SITE_ID_CURRENT_SITE : 1;
-               $current_site->domain = DOMAIN_CURRENT_SITE;
-               $current_site->path   = $path = PATH_CURRENT_SITE;
-               if ( defined( 'BLOG_ID_CURRENT_SITE' ) )
-                       $current_site->blog_id = BLOG_ID_CURRENT_SITE;
-               elseif ( defined( 'BLOGID_CURRENT_SITE' ) ) // deprecated.
-                       $current_site->blog_id = BLOGID_CURRENT_SITE;
</del><ins>+        $path_segments = array_filter( explode( '/', trim( $path, "/" ) ) );
</ins><span class="cx"> 
</span><del>-       // 2. Pull the network from cache, if possible.
-       } elseif ( ! $current_site = wp_cache_get( 'current_site', 'site-options' ) ) {
</del><ins>+        /**
+        * Filter the number of path segments to consider when searching for a site.
+        *
+        * @since 3.9.0
+        *
</ins><span class="cx"> 
</span><del>-               // 3. See if they have only one network.
-               $networks = $wpdb->get_col( "SELECT id FROM $wpdb->site LIMIT 2" );
</del><ins>+         * @param mixed  $segments The number of path segments to consider. WordPress by default looks at
+        *                         one path segment following the network path. The function default of
+        *                         null only makes sense when you know the requested path should match a site.
+        * @param string $domain   The requested domain.
+        * @param string $path     The requested path, in full.
+        */
+       $segments = apply_filters( 'site_by_path_segments_count', $segments, $domain, $path );
</ins><span class="cx"> 
</span><del>-               if ( count( $networks ) <= 1 ) {
-                       $current_site = wp_get_network( $networks[0] );
</del><ins>+        if ( null !== $segments && count($path_segments ) > $segments ) {
+               $path_segments = array_slice( $path_segments, 0, $segments );
+       }
</ins><span class="cx"> 
</span><del>-                       $current_site->blog_id = $wpdb->get_var( $wpdb->prepare( "SELECT blog_id
-                               FROM $wpdb->blogs WHERE domain = %s AND path = %s",
-                               $current_site->domain, $current_site->path ) );
</del><ins>+        while ( count( $path_segments ) ) {
+               $paths[] = '/' . implode( '/', $path_segments ) . '/';
+               array_pop( $path_segments );
+       }
</ins><span class="cx"> 
</span><del>-                       wp_cache_set( 'current_site', 'site-options' );
</del><ins>+        $paths[] = '/';
</ins><span class="cx"> 
</span><del>-               // 4. Multiple networks are in play. Determine which via domain and path.
-               } else {
-                       // Find the first path segment.
-                       $path = substr( $_SERVER['REQUEST_URI'], 0, 1 + strpos( $_SERVER['REQUEST_URI'], '/', 1 ) );
-                       $current_site = get_network_by_path( $domain, $path );
</del><ins>+        /**
+        * Determine a site by its domain and path.
+        *
+        * This allows one to short-circuit the default logic, perhaps by
+        * replacing it with a routine that is more optimal for your setup.
+        *
+        * Return null to avoid the short-circuit. Return false if no site
+        * can be found at the requested domain and path. Otherwise, return
+        * a site object.
+        *
+        * @since 3.9.0
+        *
+        * @param string $domain   The requested domain.
+        * @param string $path     The requested path, in full.
+        * @param mixed  $segments The suggested number of paths to consult.
+        *                         Default null, meaning the entire path was to be consulted.
+        * @param array  $paths    The paths to search for, based on $path and $segments.
+        */
+       $pre = apply_filters( 'pre_get_site_by_path', null, $domain, $path, $segments, $paths );
+       if ( null !== $pre ) {
+               return $pre;
+       }
</ins><span class="cx"> 
</span><del>-                       // Option 1. We did not find anything.
-                       if ( ! $current_site ) {
-                               wp_load_translations_early();
-                               wp_die( __( 'No site defined on this host. If you are the owner of this site, please check <a href="http://codex.wordpress.org/Debugging_a_WordPress_Network">Debugging a WordPress Network</a> for help.' ) );
-                       }
-               }
</del><ins>+        // @todo
+       // get_blog_details(), caching, etc. Consider alternative optimization routes,
+       // perhaps as an opt-in for plugins, rather than using the pre_* filter.
+       // For example: The segments filter can expand or ignore paths.
+       // If persistent caching is enabled, we could query the DB for a path <> '/'
+       // then cache whether we can just always ignore paths.
+
+       if ( count( $paths ) > 1 ) {
+               $paths = "'" . implode( "', '", $wpdb->_escape( $paths ) ) . "'";
+               $site = $wpdb->get_row( $wpdb->prepare( "SELECT * FROM $wpdb->blogs
+                       WHERE domain = %s AND path IN ($paths) ORDER BY CHAR_LENGTH(path) DESC LIMIT 1", $domain ) );
+       } else {
+               $site = $wpdb->get_row( $wpdb->prepare( "SELECT * FROM $wpdb->blogs WHERE domain = %s and path = %s", $domain, $paths[0] ) );
</ins><span class="cx">   }
</span><span class="cx"> 
</span><del>-       // Option 2. We found something. Load up site meta and return.
-       wp_load_core_site_options();
-       $current_site = get_current_site_name( $current_site );
-       return $current_site;
</del><ins>+        if ( $site ) {
+               // @todo get_blog_details()
+               return $site;
+       }
+
+       return false;
</ins><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> /**
</span></span></pre></div>
<a id="trunksrcwpincludesmssettingsphp"></a>
<div class="modfile"><h4>Modified: trunk/src/wp-includes/ms-settings.php (27358 => 27359)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/src/wp-includes/ms-settings.php    2014-03-02 22:22:41 UTC (rev 27358)
+++ trunk/src/wp-includes/ms-settings.php       2014-03-02 22:24:50 UTC (rev 27359)
</span><span class="lines">@@ -22,109 +22,155 @@
</span><span class="cx"> 
</span><span class="cx"> if ( !isset( $current_site ) || !isset( $current_blog ) ) {
</span><span class="cx"> 
</span><del>-       $domain = addslashes( $_SERVER['HTTP_HOST'] );
-       if ( false !== strpos( $domain, ':' ) ) {
-               if ( substr( $domain, -3 ) == ':80' ) {
-                       $domain = substr( $domain, 0, -3 );
-                       $_SERVER['HTTP_HOST'] = substr( $_SERVER['HTTP_HOST'], 0, -3 );
-               } elseif ( substr( $domain, -4 ) == ':443' ) {
-                       $domain = substr( $domain, 0, -4 );
-                       $_SERVER['HTTP_HOST'] = substr( $_SERVER['HTTP_HOST'], 0, -4 );
</del><ins>+        // Given the domain and path, let's try to identify the network and site.
+       // Usually, it's easier to query the site first, which declares its network.
+       // In limited situations, though, we either can or must find the network first.
+
+       $domain = strtolower( stripslashes( $_SERVER['HTTP_HOST'] ) );
+       if ( substr( $domain, -3 ) == ':80' ) {
+               $domain = substr( $domain, 0, -3 );
+               $_SERVER['HTTP_HOST'] = substr( $_SERVER['HTTP_HOST'], 0, -3 );
+       } elseif ( substr( $domain, -4 ) == ':443' ) {
+               $domain = substr( $domain, 0, -4 );
+               $_SERVER['HTTP_HOST'] = substr( $_SERVER['HTTP_HOST'], 0, -4 );
+       }
+
+       $path = stripslashes( $_SERVER['REQUEST_URI'] );
+       if ( is_admin() ) {
+               $path = preg_replace( '#(.*)/wp-admin/.*#', '$1/', $path );
+       }
+       list( $path ) = explode( '?', $path );
+
+       // If the network is defined in wp-config.php, we can simply use that.
+       if ( defined( 'DOMAIN_CURRENT_SITE' ) && defined( 'PATH_CURRENT_SITE' ) ) {
+               $current_site = new stdClass;
+               $current_site->id = defined( 'SITE_ID_CURRENT_SITE' ) ? SITE_ID_CURRENT_SITE : 1;
+               $current_site->domain = DOMAIN_CURRENT_SITE;
+               $current_site->path = PATH_CURRENT_SITE;
+               if ( defined( 'BLOG_ID_CURRENT_SITE' ) ) {
+                       $current_site->blog_id = BLOG_ID_CURRENT_SITE;
+               } elseif ( defined( 'BLOGID_CURRENT_SITE' ) ) { // deprecated.
+                       $current_site->blog_id = BLOGID_CURRENT_SITE;
+               }
+
+               if ( $current_site->domain === $domain && $current_site->path === $path ) {
+                       $current_blog = get_site_by_path( $domain, $path );
+               } elseif ( '/' !== $current_site->path && $current_site->domain === $domain && 0 === strpos( $path, $current_site->path ) ) {
+                       // If the current network has a path and also matches the domain and path of the request,
+                       // we need to look for a site using the first path segment following the network's path.
+                       $current_blog = get_site_by_path( $domain, $path, 1 + count( explode( '/', trim( $current_site->path, '/' ) ) ) );
</ins><span class="cx">           } else {
</span><del>-                       wp_load_translations_early();
-                       wp_die( __( 'Multisite only works without the port number in the URL.' ) );
</del><ins>+                        // Otherwise, use the first path segment (as usual).
+                       $current_blog = get_site_by_path( $domain, $path, 1 );
</ins><span class="cx">           }
</span><ins>+
+       } elseif ( ! is_subdomain_install() ) {
+               /*
+                * A "subdomain" install can be re-interpreted to mean "can support any domain".
+                * If we're not dealing with one of these installs, then the important part is determining
+                * the network first, because we need the network's path to identify any sites.
+                */
+               if ( ! $current_site = wp_cache_get( 'current_network', 'site-options' ) ) {
+                       // Are there even two networks installed?
+                       $one_network = $wpdb->get_row( "SELECT * FROM $wpdb->site LIMIT 2" ); // [sic]
+                       if ( 1 === $wpdb->num_rows ) {
+                               $current_site = wp_get_network( $one_network );
+                               wp_cache_set( 'current_network', 'site-options' );
+                       } elseif ( 0 === $wpdb->num_rows ) {
+                               ms_not_installed();
+                       }
+               }
+               if ( empty( $current_site ) ) {
+                       $current_site = get_network_by_path( $domain, $path, 1 );
+               }
+
+               if ( empty( $current_site ) ) {
+                       ms_not_installed();
+               } elseif ( $path === $current_site->path ) {
+                       $current_blog = get_site_by_path( $domain, $path );
+               } else {
+                       // Search the network path + one more path segment (on top of the network path).
+                       $current_blog = get_site_by_path( $domain, $path, substr_count( $current_site->path, '/' ) );
+               }
+       } else {
+               // Find the site by the domain and at most the first path segment.
+               $current_blog = get_site_by_path( $domain, $path, 1 );
+               if ( $current_blog ) {
+                       $current_site = wp_get_network( $current_blog->site_id ? $current_blog->site_id : 1 );
+               } else {
+                       // If you don't have a site with the same domain/path as a network, you're pretty screwed, but:
+                       $current_site = get_network_by_path( $domain, $path, 1 );
+               }
</ins><span class="cx">   }
</span><span class="cx"> 
</span><del>-       $domain = rtrim( $domain, '.' );
</del><ins>+        // The network declared by the site trumps any constants.
+       if ( $current_blog && $current_blog->site_id != $current_site->id ) {
+               $current_site = wp_get_network( $current_blog->site_id );
+       }
</ins><span class="cx"> 
</span><del>-       $path = preg_replace( '|([a-z0-9-]+.php.*)|', '', $_SERVER['REQUEST_URI'] );
-       $path = str_replace ( '/wp-admin/', '/', $path );
-       $path = preg_replace( '|(/[a-z0-9-]+?/).*|', '$1', $path );
</del><ins>+        // If we don't have a network by now, we have a problem.
+       if ( empty( $current_site ) ) {
+               ms_not_installed();
+       }
</ins><span class="cx"> 
</span><del>-       $current_site = wpmu_current_site();
</del><ins>+        // @todo What if the domain of the network doesn't match the current site?
</ins><span class="cx">   $current_site->cookie_domain = $current_site->domain;
</span><span class="cx">  if ( 'www.' === substr( $current_site->cookie_domain, 0, 4 ) ) {
</span><span class="cx">          $current_site->cookie_domain = substr( $current_site->cookie_domain, 4 );
</span><span class="cx">  }
</span><span class="cx"> 
</span><del>-       if ( ! isset( $current_site->blog_id ) )
-               $current_site->blog_id = $wpdb->get_var( $wpdb->prepare( "SELECT blog_id FROM $wpdb->blogs WHERE domain = %s AND path = %s", $current_site->domain, $current_site->path ) );
-
-       if ( is_subdomain_install() ) {
-               $current_blog = wp_cache_get( 'current_blog_' . $domain, 'site-options' );
-               if ( !$current_blog ) {
-                       $current_blog = get_blog_details( array( 'domain' => $domain ), false );
-                       if ( $current_blog )
-                               wp_cache_set( 'current_blog_' . $domain, $current_blog, 'site-options' );
</del><ins>+        // Figure out the current network's main site.
+       if ( ! isset( $current_site->blog_id ) ) {
+               if ( $current_blog && $current_blog->domain === $current_site->domain && $current_blog->path === $current_site->path ) {
+                       $current_site->blog_id = $current_blog->blog_id;
+               } else {
+                       // @todo we should be able to cache the blog ID of a network's main site easily.
+                       $current_site->blog_id = $wpdb->get_var( $wpdb->prepare( "SELECT blog_id FROM $wpdb->blogs WHERE domain = %s AND path = %s",
+                               $current_site->domain, $current_site->path ) );
</ins><span class="cx">           }
</span><del>-               if ( $current_blog && $current_blog->site_id != $current_site->id ) {
-                       $current_site = $wpdb->get_row( $wpdb->prepare( "SELECT * FROM $wpdb->site WHERE id = %d", $current_blog->site_id ) );
-                       if ( ! isset( $current_site->blog_id ) )
-                               $current_site->blog_id = $wpdb->get_var( $wpdb->prepare( "SELECT blog_id FROM $wpdb->blogs WHERE domain = %s AND path = %s", $current_site->domain, $current_site->path ) );
-               } else
-                       $blogname = substr( $domain, 0, strpos( $domain, '.' ) );
-       } else {
-               $blogname = htmlspecialchars( substr( $_SERVER[ 'REQUEST_URI' ], strlen( $path ) ) );
-               if ( false !== strpos( $blogname, '/' ) )
-                       $blogname = substr( $blogname, 0, strpos( $blogname, '/' ) );
-               if ( false !== strpos( $blogname, '?' ) )
-                       $blogname = substr( $blogname, 0, strpos( $blogname, '?' ) );
-               $reserved_blognames = array( 'page', 'comments', 'blog', 'wp-admin', 'wp-includes', 'wp-content', 'files', 'feed' );
-               if ( $blogname != '' && ! in_array( $blogname, $reserved_blognames ) && ! is_file( $blogname ) )
-                       $path .= $blogname . '/';
-               $current_blog = wp_cache_get( 'current_blog_' . $domain . $path, 'site-options' );
-               if ( ! $current_blog ) {
-                       $current_blog = get_blog_details( array( 'domain' => $domain, 'path' => $path ), false );
-                       if ( $current_blog )
-                               wp_cache_set( 'current_blog_' . $domain . $path, $current_blog, 'site-options' );
-               }
-               unset($reserved_blognames);
</del><span class="cx">   }
</span><span class="cx"> 
</span><del>-       if ( ! defined( 'WP_INSTALLING' ) && is_subdomain_install() && ! is_object( $current_blog ) ) {
-               if ( defined( 'NOBLOGREDIRECT' ) ) {
-                       $destination = NOBLOGREDIRECT;
-                       if ( '%siteurl%' == $destination )
-                               $destination = "http://" . $current_site->domain . $current_site->path;
</del><ins>+        // If we haven't figured out our site, give up.
+       if ( empty( $current_blog ) ) {
+               if ( defined( 'WP_INSTALLING' ) ) {
+                       $current_blog->blog_id = $blog_id = 1;
+
+               } elseif ( is_subdomain_install() ) {
+                       // @todo This is only for an open registration subdomain network.
+                       if ( defined( 'NOBLOGREDIRECT' ) ) {
+                               if ( '%siteurl%' === NOBLOGREDIRECT ) {
+                                       $destination = "http://" . $current_site->domain . $current_site->path;
+                               } else {
+                                       $destination = NOBLOGREDIRECT;
+                               }
+                       } else {
+                               $destination = 'http://' . $current_site->domain . $current_site->path . 'wp-signup.php?new=' . str_replace( '.' . $current_site->domain, '', $domain );
+                       }
+                       header( 'Location: ' . $destination );
+                       exit;
+
</ins><span class="cx">           } else {
</span><del>-                       $destination = 'http://' . $current_site->domain . $current_site->path . 'wp-signup.php?new=' . str_replace( '.' . $current_site->domain, '', $domain );
-               }
-               header( 'Location: ' . $destination );
-               die();
-       }
-
-       if ( ! defined( 'WP_INSTALLING' ) ) {
-               if ( $current_site && ! $current_blog ) {
-                       if ( $current_site->domain != $_SERVER[ 'HTTP_HOST' ] ) {
</del><ins>+                        if ( 0 !== strcasecmp( $current_site->domain, $domain ) ) {
</ins><span class="cx">                           header( 'Location: http://' . $current_site->domain . $current_site->path );
</span><span class="cx">                          exit;
</span><span class="cx">                  }
</span><del>-                       $current_blog = get_blog_details( array( 'domain' => $current_site->domain, 'path' => $current_site->path ), false );
</del><ins>+                        ms_not_installed();
</ins><span class="cx">           }
</span><del>-               if ( ! $current_blog || ! $current_site )
-                       ms_not_installed();
</del><span class="cx">   }
</span><span class="cx"> 
</span><span class="cx">  $blog_id = $current_blog->blog_id;
</span><span class="cx">  $public  = $current_blog->public;
</span><span class="cx"> 
</span><del>-       if ( empty( $current_blog->site_id ) )
</del><ins>+        if ( empty( $current_blog->site_id ) ) {
+               // This dates to [MU134] and shouldn't be relevant anymore,
+               // but it could be possible for arguments passed to insert_blog() etc.
</ins><span class="cx">           $current_blog->site_id = 1;
</span><ins>+       }
+
</ins><span class="cx">   $site_id = $current_blog->site_id;
</span><ins>+       wp_load_core_site_options( $site_id );
+}
</ins><span class="cx"> 
</span><del>-       $current_site = get_current_site_name( $current_site );
-
-       if ( ! $blog_id ) {
-               if ( defined( 'WP_INSTALLING' ) ) {
-                       $current_blog->blog_id = $blog_id = 1;
-               } else {
-                       wp_load_translations_early();
-                       $msg = ! $wpdb->get_var( "SHOW TABLES LIKE '$wpdb->site'" ) ? ' ' . __( 'Database tables are missing.' ) : '';
-                       wp_die( __( 'No site by that name on this system.' ) . $msg );
-               }
-       }
-}
</del><span class="cx"> $wpdb->set_prefix( $table_prefix, false ); // $table_prefix can be set in sunrise.php
</span><span class="cx"> $wpdb->set_blog_id( $current_blog->blog_id, $current_blog->site_id );
</span><span class="cx"> $table_prefix = $wpdb->get_blog_prefix();
</span><span class="lines">@@ -134,5 +180,12 @@
</span><span class="cx"> // need to init cache again after blog_id is set
</span><span class="cx"> wp_start_object_cache();
</span><span class="cx"> 
</span><ins>+if ( ! isset( $current_site->site_name ) ) {
+       $current_site->site_name = get_site_option( 'site_name' );
+       if ( ! $current_site->site_name ) {
+               $current_site->site_name = ucfirst( $current_site->domain );
+       }
+}
+
</ins><span class="cx"> // Define upload directory constants
</span><span class="cx"> ms_upload_constants();
</span></span></pre></div>
<a id="trunktestsphpunittestsmsphp"></a>
<div class="modfile"><h4>Modified: trunk/tests/phpunit/tests/ms.php (27358 => 27359)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/tests/phpunit/tests/ms.php 2014-03-02 22:22:41 UTC (rev 27358)
+++ trunk/tests/phpunit/tests/ms.php    2014-03-02 22:24:50 UTC (rev 27359)
</span><span class="lines">@@ -1221,6 +1221,45 @@
</span><span class="cx">  }
</span><span class="cx"> 
</span><span class="cx">  /**
</span><ins>+        * @ticket 27003
+        */
+       function test_get_site_by_path() {
+               $ids = array(
+                       'wordpress.org/'              => array( 'domain' => 'wordpress.org',      'path' => '/' ),
+                       'wordpress.org/foo/'          => array( 'domain' => 'wordpress.org',      'path' => '/foo/' ),
+                       'wordpress.org/foo/bar/'      => array( 'domain' => 'wordpress.org',      'path' => '/foo/bar/' ),
+                       'make.wordpress.org/'         => array( 'domain' => 'make.wordpress.org', 'path' => '/' ),
+                       'make.wordpress.org/foo/'     => array( 'domain' => 'make.wordpress.org', 'path' => '/foo/' ),
+               );
+
+               foreach ( $ids as &$id ) {
+                       $id = $this->factory->blog->create( $id );
+               }
+               unset( $id );
+
+               $this->assertEquals( $ids['wordpress.org/'],
+                       get_site_by_path( 'wordpress.org', '/notapath/' )->blog_id );
+
+               $this->assertEquals( $ids['wordpress.org/foo/bar/'],
+                       get_site_by_path( 'wordpress.org', '/foo/bar/baz/' )->blog_id );
+
+               $this->assertEquals( $ids['wordpress.org/foo/bar/'],
+                       get_site_by_path( 'wordpress.org', '/foo/bar/baz/', 3 )->blog_id );
+
+               $this->assertEquals( $ids['wordpress.org/foo/bar/'],
+                       get_site_by_path( 'wordpress.org', '/foo/bar/baz/', 2 )->blog_id );
+
+               $this->assertEquals( $ids['wordpress.org/foo/'],
+                       get_site_by_path( 'wordpress.org', '/foo/bar/baz/', 1 )->blog_id );
+
+               $this->assertEquals( $ids['wordpress.org/'],
+                       get_site_by_path( 'wordpress.org', '/', 0 )->blog_id );
+
+               $this->assertEquals( $ids['make.wordpress.org/foo/'],
+                       get_site_by_path( 'make.wordpress.org', '/foo/bar/baz/qux/', 4 )->blog_id );
+       }
+
+       /**
</ins><span class="cx">    * @ticket 20601
</span><span class="cx">   */
</span><span class="cx">  function test_user_member_of_blog() {
</span></span></pre>
</div>
</div>

</body>
</html>