[wp-hackers] Two new, long-overdue plugins to make your wordpress life a little easier...

Fri Oct 28 21:23:28 UTC 2011

Otto, it's not that simple. Changing WP_SITEURL and WP_HOME doesn't fix the problem; the WordPress admin will still pull content from other sources.  Additionally, and this is a problem you have ignored since my first email, WP_SITEURL and WP_HOME are not even evaluated in Multisite setups. They are ignored.  This does not work.  The effort I put into my plugin shows what is required to make it work for 99% of the problem domain, and even then it only works in single-site installs.

1.  You still don't get the export content issue.  If you host your content (as opposed to exporting it away from your host) you can use root-relative URLs 100% of the time, no exceptions; every browser and every mobile device will work, because resolving relative references is part of the URL standard.  When you export your content, as an RSS feed, an email, or an XML export, you can process the content then, and only then, which saves you all this hassle of DNS tricks, search/replace scripts, WP filter and constant hacks.

2. You solve the export problem with one function.  You don't have to change the database; you feed what is in the database through that really simple filter and you're done for that half of your content.  As it stands, WordPress filters all of your URL content in every instance of its usage.  You don't need to keep those absolute URLs in your database. I promise, millions of websites use root-relative URLs without any issues.

3. Administering sites from multiple URLs is only part of the equation, and just because it doesn't make sense to you doesn't mean that it is stupid or that it shouldn't happen; in reality there are very legitimate reasons for it.  Remember that reality statement of yours?  See my DOD example: intranet users have to admin the site from a different URL on the internal network. Do you expect them to go outside the building to do this?  As it stands, WordPress will try to redirect them to the production URL on login and on every link click, and they cannot reach that production URL because of what we call a DMZ, or demilitarized zone: http://en.wikipedia.org/wiki/DMZ_(computing)
Yet they should also be allowed to administer the site from outside the zone (not likely in the military, but very likely in Fortune 500 corporate use, where DMZs are almost universal), and even then it is by no means restricted to that tier of customer.

4. If you think convincing the military, PCI-compliant organizations and the rest is as simple as standing up and shouting "you're stupid", you are missing both the impossibility of living outside the reality of the world we live in AND the value and purpose of why their NATs are set up that way. It is NOT STUPID, it's IMPORTANT.  These are not mistakes; they are solutions to a world of security problems, and they will not go away because you don't like them.  Other "sucky" frameworks do not violate these principles, which is why they are prevalent in enterprise environments.  I agree that they have their drawbacks, some more than others, but how they store URLs and serve content is not one of them.

5. Your simple solution for bypassing a load-balanced gateway doesn't always work. Try to do it with an iPhone or other mobile device, or even from a secured OS where you don't have root access to the hosts file to begin with.  If WordPress didn't have this problem you could do it without any hosts file configuration.  What security issue exists if I access wp-admin from 127.0.0.1 (or its production equivalent)?  (The answer is there isn't one.)

What it really comes down to in every situation you try to argue is this:

In my world, we use root-relative urls and munge them in one very specific way (add a domain to the beginning) for one very specific use-case (an export of content.)  Done, end of story.

In your world, to solve all of the problems I point out, you have to use dns tricks, rewrite hacks, content search & replacements, and process/munge the URL on every single page request every single time for every piece of content you serve (again whether you realize it or not, http/https) all the while architecturally restricting the use of IP address access, NAT traversal, DMZ configurations, DRY & KISS continuous integration practices and more issues than have even been presented thus far.

The onus is not on me to convince you that root-relative urls work without fail from an architectural standpoint, because the internet has been operating on this principle for the last 20 years.

So I will put the onus on you: explain one scenario in which root-relative URLs do not work and for which there is no viable solution that you do not already recommend as standard practice in WordPress today.  You say it won't work because we'd have to process the crap out of the data, but then you say the proper way to move environments is to process the crap out of the data.  You say it would break RSS feeds (a clearly defined entry and exit point in the WordPress core), yet we already process the crap out of the absolute URLs in RSS feeds to prevent them from breaking.  Why not just implement my "architecturally sound fix" in that very specific case?

The bottom line is that there isn't a scenario in which root-relative URLs are inferior to absolute URLs, because in any one of those cases we can simply prefix whatever matches ^(\/.*)$ with the domain and call it a day.  No manual or programmatic network settings, no global replacements in the database.  It just works in your world AND mine.
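
To make the "prefix with the domain on export" step concrete, here is a minimal sketch in Python. It is not WordPress code and not the plugin's implementation; the function name absolutize_for_export and the example hosts are hypothetical. It also assumes a narrower pattern than ^(\/.*)$: it only touches src/href attribute values and skips protocol-relative values that start with "//".

```python
import re

# Matches src="..." or href="..." whose value starts with a single "/"
# (root-relative). Values starting with "//" (protocol-relative) and
# values that already carry a scheme are left alone.
ROOT_RELATIVE_ATTR = re.compile(r'''(\b(?:src|href)=)(["'])(/(?!/)[^"']*)\2''')


def absolutize_for_export(html: str, site_root: str) -> str:
    """Prefix root-relative src/href values with site_root.

    Hypothetical helper: meant to run only where content leaves the site
    (feed, email, XML export), never on normal page views.
    """
    base = site_root.rstrip("/")

    def add_domain(match: re.Match) -> str:
        attr, quote, path = match.group(1), match.group(2), match.group(3)
        return f"{attr}{quote}{base}{path}{quote}"

    return ROOT_RELATIVE_ATTR.sub(add_domain, html)


stored = '<a href="/about/"><img src="/wp-content/uploads/a.png"></a>'
exported = absolutize_for_export(stored, "http://www.example.com/")
```

The stored content stays host-neutral; only the exported copy carries the domain, so the same database row can be served from any hostname.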

-----Original Message-----
From: wp-hackers-bounces at lists.automattic.com [mailto:wp-hackers-bounces at lists.automattic.com] On Behalf Of Otto
Sent: Friday, October 28, 2011 3:05 PM
To: wp-hackers at lists.automattic.com
Subject: Re: [wp-hackers] Two new, long-overdue plugins to make your wordpress life a little easier...

On Fri, Oct 28, 2011 at 2:33 PM, Marcus Pope <Marcus.Pope at springbox.com> wrote:
>> Sure you can. <img src="/root/relative/url.png"> works fine in a post.
>
> Otto, I see you understand less about this problem than I first thought.  Yes, you can add root-relative URLs in a post if you do it manually, or via a plugin that adds a filter to strip out the domain.  However, you cannot log in to a WordPress multisite via any of the following:
>
> http://127.0.0.1/wp-admin
> and
> http://localhost/wp-admin
> and
> http://mycomputer/wp-admin
>
> because WordPress core uses absolute URLs.  If you attempt to log in to that site, regardless of any DNS or htaccess tricks up your sleeve, you will be stuck in an infinite loop.

Of course you can't. If you have the site on a different URL, then you
need to tell WP what the new URL is.

Is that all that this is about? The Site and Home URLs? Jeez... just
set the WP_SITEURL and WP_HOME defines in the wp-config.php file.

> It is very simple.  You only ever process the content if you are exporting it.  You don't process the content if you are hosting it.

"Exporting" the content is what happens a large portion of the time.
You're exporting it to a feed, to somebody's feed reader, to
somebody's *browser*, which can be any of dozens of different types of
devices.

Viewing something in a browser is "exporting" it from the database to
the browser, because you don't necessarily know the context in which
the content is being viewed or used.

> As it stands wordpress processes the content hundreds of times in both situations (exports & hosting.)  There is zero maintenance with root-relative urls.  No host file hacks, no dns tweaks, no spoofing production urls in dev and staging. NO MAINTENANCE.

No, it's *constant* 100% all-the-time maintenance, because now I have
to add a filter to change those relative URLs you want in the database
to absolute URLs so that the damn things will actually work where the
content is displayed.

Something just under *half* of the views of the content I write are
not actually viewed on my websites themselves. So, how do your
relative URLs work then?

> Sure, add a filter to change example.com to example.co.uk, then try to administer that site from both urls.

Why in the hell are you trying to administer this site from both URLs?
Why should a site even HAVE multiple URLs for the same content? This
makes zero sense.

> Sometimes you don't get to create a proper NAT environment.  Sometimes you are forced to adhere to a company's intranet regulations.

Convince them otherwise. Seriously, when presented with a stupid rule,
stand the heck up and tell them that it's stupid. You may not get
every contract, but you'll feel better about yourself at the end of
the day.

And sometimes, you can convince them otherwise. Then the world improves.

> Explain how with wordpress on a load-balanced setup, you could use a DNS trick, to access server4 by IP address, click links, administer content without being redirected back to the gateway when wordpress hard-codes www.gatewayhost.com in every link ever sent back to the browser.  IT CANNOT BE DONE.  If it's so easy, just publish a really simple step by step example for others to understand how.

1. Edit the Hosts file.
2. 1.2.3.4 example.com
3. Done.

Seriously, the load balancer is directing traffic, but only if you hit
it. If the direct line to the server is available, then simply
bypass the load balancer. How you do that depends on your setup, but
assuming it's doing DNS-based balancing or even HTTP traffic
forwarding, then you can just hit that server directly instead of
hitting the load balancer. Hell, I've used this method before. On a
java server I set up, the load balancer was doing traffic forwarding
(all HTTP), but to hit the individual servers, all I had to do was use
their IP address directly.
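
For readers who cannot edit a hosts file, the same bypass can be sketched at the HTTP level: connect to one backend by IP while still naming the public site in the Host header. This is only an illustration in Python; direct_backend_request is a hypothetical helper, 1.2.3.4 and example.com are the placeholder values from the steps above, and the request is built but not sent. Whether the backend accepts direct connections at all depends on the network setup.

```python
import urllib.request


def direct_backend_request(backend_ip: str, site_host: str,
                           path: str = "/wp-admin/") -> urllib.request.Request:
    """Build (but do not send) a request aimed at a single backend.

    The connection target is the backend's IP address, while the Host
    header carries the public site name, which is the effect a hosts
    file entry such as "1.2.3.4 example.com" would have.
    """
    req = urllib.request.Request(f"http://{backend_ip}{path}")
    req.add_header("Host", site_host)
    return req


req = direct_backend_request("1.2.3.4", "example.com")
```

This only changes which machine receives the request. Any absolute links in the returned pages still point at the public hostname, which is the part of the disagreement the sketch does not resolve.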

> Again, you don't understand the problem domain if you think this has nothing to do with how WP sends URLs to the browser.  When on a WIFI network that doesn't support DNS tricks, you CANNOT ACCESS WP ADMIN WITH YOUR iPhone.

You cannot access wp-admin if you're using the wrong URL of the site,
period. And that's as it should be, because anything else is a
security issue.

> Root-relative urls do not require post-processing the crap out of anything.

They require converting the relative URLs to absolute ones for half of
my readers. How is that not post-processing?

> And nice to see that you ignore the DRY aspect of not repeating your domain name in a URL when it doesn't need to be repeated.

The domain name absolutely DOES need to be repeated. DRY doesn't enter into it.

The whole point of an absolute URL is that it is *absolute*. That is
*intentional*. That is a *good and desirable thing*. That's the whole
bloody point I'm making. Absolute URLs work in contexts outside of the
website. Relative ones don't.

> Using the same content everywhere is not possible, as expressed by the multitude of problem areas I'm pointing out.

Using the same content works fine as long as you don't tie the content
tightly to the context in which it is displayed.

On Fri, Oct 28, 2011 at 2:49 PM, Marcus Pope <Marcus.Pope at springbox.com> wrote:
> Yes, and if you use that content on staging.example.com AND www.example.com it is not useful and does not work in both places - hence it violates "being useful no matter where you put it."

No, it works in both places, because the URL shouldn't change just
because the content moved. I want a link to point to a specific place,
and I want an image to come from a specific place. If I move the
content, but not the image, and the URL changes, then the content is
now broken.
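
The difference both sides are describing can be shown with standard URL resolution. The sketch below uses Python's urllib.parse.urljoin; the hostnames and paths are made-up examples. A root-relative reference resolves against whichever host served the page, while an absolute URL resolves to the same place everywhere.

```python
from urllib.parse import urljoin

# Hypothetical page addresses for the same post on two hosts.
staging_page = "http://staging.example.com/2011/10/some-post/"
production_page = "http://www.example.com/2011/10/some-post/"

root_relative = "/wp-content/uploads/logo.png"
absolute = "http://www.example.com/wp-content/uploads/logo.png"

# A root-relative reference follows the host of the page it appears on.
on_staging = urljoin(staging_page, root_relative)
on_production = urljoin(production_page, root_relative)

# An absolute reference ignores the page it appears on.
absolute_on_staging = urljoin(staging_page, absolute)
```

Here on_staging points at staging.example.com and on_production at www.example.com, while absolute_on_staging still points at www.example.com. Which of those behaviours counts as "broken" is exactly what this thread disagrees about.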

> Ok, let's be more explicit.  How do you show your client, let's say Google.com, their staging site so they can review features you've added before they go into production?  That is to say, how do you show them staging.google.com, let them log in to the admin, and make post-development edits and test new features before pushing them live to www.google.com?  The moment they log in to staging.google.com and try to edit a post they will be redirected to www.google.com, and they'll be editing live posts because you stored and served absolute URLs.

You're making the assumption that staging.* and production.* are using
the same dataset, unaltered. They should not be. The staging
environment should be wholly separate from the production environment.
If you need to write a script to re-up staging every so often, then
you can filter/alter your data there as needed. But the content
between them should absolutely not be shared. In companies I've worked
at, it explicitly could not be such. That'd get you fired, as well as
being against the law in some ways.

Production data *stays* in production. You don't use it for testing.
You don't use it for staging. You can copy it for testing on the copy,
but you never ever operate on it live from another system.

> If you think considering your viewpoint as narrow is me being a dick, then I'm sorry you took offense.  But that doesn't mean you're any less of a dick.  I'm not here to convince YOU, I'm here to convince OTHERS who read both of our comments and may not necessarily have the technical skill sets to understand who's right and who's wrong.

No, I consider your attitude to be what is making you out to be a dick
here. You're espousing a point of view, and that's fine. But you're
completely ignoring what I see as the VAST MAJORITY of uses of the
software.

Yes, okay, you move sites around a lot. I get it. That's great and
maybe your ideas help you. Fine.

But most people *don't* do that. Most people write content in order to
have that content read by other people. And your ideas make that a)
harder and b) broken. That's what I'm trying to get across to you.
Your viewpoint is your own, and by far not the correct viewpoint for
the majority of people.

That's what I want you to see here. You think that relative URLs are
some kind of fucking godsend idea, whereas I'm trying to tell you that
using relative URLs would *not work* for me or the majority of people
who aren't building sites but are instead just publishing their
content. They cause myriad problems and make things fundamentally
broken. They screw up the whole notion of portable content, and
require a metric fuck-ton of code and CPU work for post-processing. In
other words, for the case where people aren't moving sites around
constantly, relative URLs are a total bitch and a half. That's what
I'm trying to explain to you.

Also, everybody on these mailing lists knows that I'm a dick already.
So I have no problems there.

> The reality here is that you don't do any of this, because if you did you would immediately recognize our headaches.  You might get away with what you've had to do in the past, but it would not fly with blue-chip industries.  WordPress isn't even a common selection in this industry because of this fact (sure, it may power their blog, but it doesn't power their intranet or extranet webapps).

Feel free to not use WordPress then.

-Otto
_______________________________________________
wp-hackers mailing list
wp-hackers at lists.automattic.com
http://lists.automattic.com/mailman/listinfo/wp-hackers