[wp-hackers] wp_remote_request not telling me the 301'd URL

Edward de Leau e at leau.net
Fri Mar 11 20:09:39 UTC 2011


I have implemented manual redirection for the wp-favicons plugin here:
http://plugins.trac.wordpress.org/browser/wp-favicons/trunk/includes/class-http.php

(part of the next version, 0.5.1, where a mouseover on a redirect/tiny URL
shows the URL it redirects to)
I follow at most 5 redirects.

e.g. (1) nu.nl (301) ---> (2) www.nu.nl (200) --> (3)
www.nu.nl/images/favicon.ico (200)

I needed the manual redirection because I need a base_href when none is
given in the HTML source. In that case I use the redirected URI as the
base_href.
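
For readers who want to see the idea in isolation, here is a minimal sketch of a manual redirect loop. It is written in Python rather than the plugin's PHP and is not the code from class-http.php. The `fetch` callable, the `follow_redirects` name and the set of status codes treated as redirects are all assumptions made for illustration; in WordPress, `fetch` would roughly correspond to a request made with automatic redirection turned off.

```python
from urllib.parse import urljoin

MAX_REDIRECTS = 5  # the limit mentioned above
REDIRECT_CODES = {301, 302, 303, 307, 308}  # assumption: codes treated as redirects


def follow_redirects(url, fetch, max_redirects=MAX_REDIRECTS):
    """Follow redirects by hand and return (final_url, status, chain).

    `fetch` is a hypothetical callable: fetch(url) -> (status, headers),
    where headers is a dict with lower-case keys.
    """
    chain = [url]
    for _ in range(max_redirects + 1):
        status, headers = fetch(url)
        location = headers.get("location")
        if status not in REDIRECT_CODES or not location:
            # Not a redirect: this URL is the one to use as the base href.
            return url, status, chain
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
        chain.append(url)
    raise RuntimeError("more than %d redirects" % max_redirects)
```

The returned `final_url` is what a relative favicon path found in the page would be resolved against, and `chain` keeps every intermediate URL so the 301'd location is not lost.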

The code is not completely done, since a use case like:

e.g. (1) newscred.com (301) --> (2) http://platform.newscred.com (200) (look
in page) --> (3) http://newscred.com/favicon.ico (301) (wtf? redirect of
content in page) -->
(4) http://newspapers.newscred.com/favicon.ico (200) --> (5)
http://newspapers.newscred.com//media/img/favicon.ico (200)

does not work yet, since this site gives (4) as the redirect URL while (4) is
actually a page. So I need to add another check for binary content at the
beginning.

But for every case where the favicon itself is not redirected, this should work.
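
A check for binary content at the start of a response could be sketched as below. This is only an illustration in Python, not the plugin's code: the `looks_like_icon` name and the choice of which formats to accept are assumptions. It compares the first bytes of the body against well-known file signatures, so an HTML page served at a favicon.ico URL is rejected.

```python
def looks_like_icon(body):
    """Return True if the first bytes match a common icon/image signature."""
    signatures = (
        b"\x00\x00\x01\x00",   # ICO
        b"\x89PNG\r\n\x1a\n",  # PNG
        b"GIF87a",             # GIF (1987 variant)
        b"GIF89a",             # GIF (1989 variant)
        b"\xff\xd8\xff",       # JPEG
    )
    # bytes.startswith accepts a tuple and matches any of its members.
    return body.startswith(signatures)
```

Only the first few bytes of the body are needed, so the check can run before the response is treated as either a page or an image.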





On Tue, Mar 1, 2011 at 12:31 AM, Scott Kingsley Clark <scott at skcdev.com> wrote:

> The spidering process can really take a lot of time for a large site, and
> can end up eating resources and adding time to the infamous php
> max_execution_time so I was looking to cut corners. If I've gotta do two
> requests to do this, I'll do it. Thanks for the advice and attention.
>
> -Scott
>
> On Monday, February 28, 2011 5:28:54 PM UTC-6, Jacob Santos wrote:
> >
> > Not really. The wp_remote_request simply defaults to GET; you can change it
> > to HEAD, which seems to be what you want anyway. You can check whether the
> > response is a redirect and then send another request. It does not sound
> > like speed is a concern (although it is one factor, since many sites can
> > quite frankly rack up redirects, given what Canonical URLs might give you).
> > Hint: it should be at most 2 requests, one for the redirect and one for
> > the actual page.
> >
> > You'll probably want to use wp_remote_head() instead, since
> > wp_remote_request() is a generic function made to accommodate the rest of
> > HTTP and the HTTP extensions (there isn't any built-in support for
> > Subversion or WebDAV calls).
> >
> > Jacob Santos
> >
> > On Mon, Feb 28, 2011 at 5:22 PM, Scott Kingsley Clark <sc... at skcdev.com> wrote:
> >
> > > Actually, this is in regards to a plugin I'm currently developing. It's
> > > in Beta right now but it's available on WP.org. It's called Search
> > > Engine and it's like a mini-Google on your site. It spiders your site
> > > (or other sites too) and indexes content into the DB.
> > >
> > > http://wordpress.org/extend/plugins/search-engine/
> > >
> > > The use case is that I want to be able to tell whether a page that's
> > > linked to on a site is really redirected elsewhere. Right now, since I
> > > switched to wp_remote_request, I only get the content of the final
> > > destination page, without any knowledge of the path it's taken. So the
> > > best my script (or any script) can tell is that when you get content
> > > using wp_remote_request and it's redirected, the page exists at the URL
> > > requested, oblivious to the real redirect happening. (Previously I was
> > > using a home-brewed version similar to wp_remote_request but calling
> > > cURL and others manually.)
> > >
> > > So it looks like right now I'll need to write a little extra code to
> > > make my own wp_remote_request-like function which does both the 301/302
> > > redirect header check and the body content return.
> > >
> > > -Scott
> > >
> > > On Monday, February 28, 2011 5:11:22 PM UTC-6, Dion Hulse (dd32) wrote:
> > > >
> > > > 2 separate requests will be 2 separate requests.
> > > > What's the use-case you're working on here?
> > > > Personally, I'd do a normal fetch if you want the body, followed by a
> > > > HEAD if it was an exceeded-redirects error; otherwise, just the URL.
> > > > But I can't think of a case where you'd want one or the other.
> > > >
> > > > On 1 March 2011 04:06, Scott Kingsley Clark <sc... at skcdev.com> wrote:
> > > >
> > > > > Not sure if anyone knows this, but does the page get loaded twice,
> > > > > or is the second load served from some sort of cache? I'm
> > > > > specifically referring to the idea of using wp_remote_head on a URL
> > > > > to check for a redirect, and then using wp_remote_request on the
> > > > > same URL to get the content, etc.
> > > > > _______________________________________________
> > > > > wp-hackers mailing list
> > > > > wp-h... at lists.automattic.com
> > > > > http://lists.automattic.com/mailman/listinfo/wp-hackers
> > > > >
> > > > >
> > > > _______________________________________________
> > > > wp-hackers mailing list
> > > > wp-h... at lists.automattic.com
> > > > http://lists.automattic.com/mailman/listinfo/wp-hackers
> > > >
> > > >
> > >
> > > _______________________________________________
> > > wp-hackers mailing list
> > > wp-ha... at lists.automattic.com
> > > http://lists.automattic.com/mailman/listinfo/wp-hackers
> > >
> > >
> > _______________________________________________
> > wp-hackers mailing list
> > wp-ha... at lists.automattic.com
> > http://lists.automattic.com/mailman/listinfo/wp-hackers
> >
> >
>
> _______________________________________________
> wp-hackers mailing list
> wp-hackers at lists.automattic.com
> http://lists.automattic.com/mailman/listinfo/wp-hackers
>
>

