[wp-hackers] wp_remote_request not telling me the 301'd URL

Jacob Santos wordpress at santosj.name
Sat Mar 12 03:50:33 UTC 2011


1. Check the Content-Type header, if it exists. If it is "text/html", then
run the filter to get the favicon.ico.

2. Oh my god, who would have thought a use case like this would have come
up?

3. You need to look for the "Refresh" header as well. Some web servers (IIS)
will send Refresh instead of Location, as will web sites that show a redirect
message for systems that do not support redirects.
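
A minimal sketch of points 1 and 3, in case it helps. It assumes the
response headers are available as an array keyed by lowercase header name
(roughly the shape you get back from wp_remote_head()). The function names
are made up for illustration, and the Refresh parsing only covers the
common "N; url=..." form.

```php
<?php
/**
 * Hypothetical helper: return the redirect target named by the response
 * headers, or null if there is none. $headers is assumed to be an
 * associative array keyed by lowercase header name.
 */
function example_redirect_target(array $headers) {
    // A Location header is the usual way to announce a redirect.
    if (!empty($headers['location'])) {
        return trim($headers['location']);
    }
    // Refresh looks like "0; url=http://example.com/" (the "url=" part is optional).
    $pattern = '/^\s*\d+\s*[;,]\s*(?:url\s*=\s*)?["\']?([^"\']+?)["\']?\s*$/i';
    if (!empty($headers['refresh']) && preg_match($pattern, $headers['refresh'], $m)) {
        return $m[1];
    }
    return null;
}

/**
 * Hypothetical helper: true when the Content-Type header says the body is
 * HTML, i.e. the page still has to be searched for the favicon link.
 */
function example_is_html(array $headers) {
    if (empty($headers['content-type'])) {
        return false;
    }
    // Ignore parameters such as "; charset=UTF-8".
    $pieces = explode(';', $headers['content-type']);
    return strtolower(trim($pieces[0])) === 'text/html';
}
```

A Refresh value with no URL in it (a plain reload) yields null here, which
is probably what you want for redirect detection.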

Jacob Santos

On Fri, Mar 11, 2011 at 2:09 PM, Edward de Leau <e at leau.net> wrote:

> I have implemented manual redirection for the wp-favicons plugin here:
>
> http://plugins.trac.wordpress.org/browser/wp-favicons/trunk/includes/class-http.php
>
> (part of the next version, 0.5.1, where a mouseover on a redirect/tiny URL
> shows the URL it redirects to)
> I redirect 5 times max.
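
For readers following along, capped manual redirection of this kind could
be sketched roughly as below. This is not the plugin's code. The $fetch
callable is a stand-in assumed for illustration: it takes a URL and returns
the status code and headers without following redirects itself. In
WordPress it could presumably wrap wp_remote_head() with the 'redirection'
argument set to 0. Relative Location values are not resolved in this sketch.

```php
<?php
/**
 * Hypothetical sketch: follow redirects by hand, at most $max_hops times,
 * and return every URL visited. $fetch is an assumed callable returning
 * array('code' => int, 'headers' => array) for a given URL.
 */
function example_follow_redirects($url, callable $fetch, $max_hops = 5) {
    $chain = array($url);
    for ($hop = 0; $hop < $max_hops; $hop++) {
        $response    = $fetch($url);
        $is_redirect = in_array($response['code'], array(301, 302, 303, 307, 308), true);
        if (!$is_redirect || empty($response['headers']['location'])) {
            break; // Final destination reached.
        }
        $url     = $response['headers']['location'];
        $chain[] = $url;
    }
    return $chain;
}

// Canned responses instead of real HTTP requests, so the sketch runs anywhere.
$fake_site = array(
    'http://nu.nl/'     => array('code' => 301, 'headers' => array('location' => 'http://www.nu.nl/')),
    'http://www.nu.nl/' => array('code' => 200, 'headers' => array('content-type' => 'text/html')),
);
$fetch = function ($url) use ($fake_site) {
    return $fake_site[$url];
};
$chain = example_follow_redirects('http://nu.nl/', $fetch);
// $chain now holds both URLs; the last entry is the one the page was served from.
```
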
>
> e.g. (1) nu.nl (301) ---> (2) www.nu.nl (200) --> (3)
> www.nu.nl/images/favicon.ico (200)
>
> I needed the manual redirection because I need a base_href when none is
> given in the HTML source. In that case I use the redirected URI as the
> base_href.
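
To make the base_href point concrete, here is a small sketch of resolving
an href found in the page against the URL the page was finally served from.
The function name is hypothetical and only the common cases are handled:
"../" segments are not normalized.

```php
<?php
/**
 * Hypothetical helper: resolve $href (as found in a page's HTML) against
 * $base, the final URL after redirects.
 */
function example_resolve_url($base, $href) {
    // Already absolute: nothing to do.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $href)) {
        return $href;
    }
    $parts  = parse_url($base);
    $scheme = isset($parts['scheme']) ? $parts['scheme'] : 'http';
    $host   = isset($parts['host']) ? $parts['host'] : '';
    $port   = isset($parts['port']) ? ':' . $parts['port'] : '';
    $origin = $scheme . '://' . $host . $port;

    // Protocol-relative: "//cdn.example.com/favicon.ico".
    if (substr($href, 0, 2) === '//') {
        return $scheme . ':' . $href;
    }
    // Root-relative: "/images/favicon.ico".
    if (substr($href, 0, 1) === '/') {
        return $origin . $href;
    }
    // Path-relative: keep the directory part of the base path.
    $path = isset($parts['path']) ? $parts['path'] : '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $href;
}
```

With the nu.nl example, resolving "/images/favicon.ico" against
"http://www.nu.nl/" gives "http://www.nu.nl/images/favicon.ico".
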
>
> The code is not completely done, since a use case like:
>
> e.g. (1) newscred.com (301) --> (2) http://platform.newscred.com (200)
> (look in page) --> (3) http://newscred.com/favicon.ico (301) (wtf? redirect
> of content in page) -->
> (4) http://newspapers.newscred.com/favicon.ico (200) --> (5)
> http://newspapers.newscred.com//media/img/favicon.ico (200)
>
> does not work yet, since this site gives (4) as the redirect URL while (4)
> is actually a page. So I need to add another check for binary content at
> the beginning.
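
One possible shape for such a binary-content check is to compare the first
bytes of the body against well-known image signatures. This is a sketch
under that assumption, not what the plugin does; the function name is made
up, and the list of formats is just the ones a favicon is likely to use.

```php
<?php
/**
 * Hypothetical helper: guess from the first bytes of a response body
 * whether it is an icon/image rather than an HTML page.
 */
function example_looks_like_image($body) {
    $signatures = array(
        "\x00\x00\x01\x00",   // ICO
        "\x89PNG\r\n\x1a\n",  // PNG
        "GIF87a",             // GIF
        "GIF89a",             // GIF
        "\xff\xd8\xff",       // JPEG
    );
    foreach ($signatures as $signature) {
        // strncmp() is binary safe, so NUL bytes in the signature are fine.
        if (strncmp($body, $signature, strlen($signature)) === 0) {
            return true;
        }
    }
    return false;
}
```

Anything that fails this check can then be treated as a page and searched
for a favicon link instead.
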
>
> But for all non-favicon self-redirection this should work.
>
> On Tue, Mar 1, 2011 at 12:31 AM, Scott Kingsley Clark <scott at skcdev.com>
> wrote:
>
> > The spidering process can really take a lot of time for a large site, and
> > can end up eating resources and adding time to the infamous php
> > max_execution_time so I was looking to cut corners. If I've gotta do two
> > requests to do this, I'll do it. Thanks for the advice and attention.
> >
> > -Scott
> >
> > On Monday, February 28, 2011 5:28:54 PM UTC-6, Jacob Santos wrote:
> > >
> > > > Not really. wp_remote_request() simply defaults to GET; you can change
> > > > it to HEAD, which seems to be what you want anyway. You can check
> > > > whether the response is a redirect and then send another request. It
> > > > does not sound like speed is a concern (albeit it is one factor, since
> > > > many sites can quite frankly get up there with the number of redirects
> > > > that canonical URLs might give you). Hint: it should be at most 2
> > > > requests, one for the redirect and one for the actual page.
> > >
> > > > You'll probably want to use wp_remote_head() instead, since
> > > > wp_remote_request() is a generic function made to accommodate the rest
> > > > of the HTTP methods and HTTP extensions (there isn't any built-in
> > > > support for Subversion or WebDAV calls).
> > >
> > > Jacob Santos
> > >
> > > > On Mon, Feb 28, 2011 at 5:22 PM, Scott Kingsley Clark
> > > > <sc... at skcdev.com> wrote:
> > >
> > > > > Actually, this is in regard to a plugin I'm currently developing.
> > > > > It's in beta right now but it's available on WP.org. It's called
> > > > > Search Engine and it's like a mini-Google on your site. It spiders
> > > > > your site (or other sites too) and indexes content into the DB.
> > > >
> > > > http://wordpress.org/extend/plugins/search-engine/
> > > >
> > > > > The use case is that I want to be able to tell whether a page that's
> > > > > linked to on a site is really redirected elsewhere. Right now, since
> > > > > I switched to wp_remote_request, I only get the content of the final
> > > > > destination page, without any knowledge of the path it's taken. So
> > > > > the best my script (or any script) can tell is that when you get
> > > > > content using wp_remote_request and it's redirected, the page exists
> > > > > at the URL requested -- oblivious to the real redirect happening.
> > > > > (Previously I was using a home-brewed version similar to
> > > > > wp_remote_request but calling cURL and others manually.)
> > > >
> > > > > So it looks like right now I'll need to write a little extra code to
> > > > > make my own wp_remote_request-like function which does both the
> > > > > 301/302 redirect header check and the body content return.
> > > >
> > > > -Scott
> > > >
> > > > > On Monday, February 28, 2011 5:11:22 PM UTC-6, Dion Hulse (dd32)
> > > > > wrote:
> > > > >
> > > > > > > 2 separate requests will be 2 separate requests.
> > > > > > > What's the use case you're working on here?
> > > > > > > Personally, I'd do a normal fetch, followed by a HEAD if it was
> > > > > > > an exceeded-redirects error, if you want the body; otherwise,
> > > > > > > the URL.
> > > > > > > But I can't think of a case where you'd want one or the other.
> > > > >
> > > > > > > On 1 March 2011 04:06, Scott Kingsley Clark <sc... at skcdev.com>
> > > > > > > wrote:
> > > > >
> > > > > > > > Not sure if anyone knows this, but does the page get loaded
> > > > > > > > twice, or is the second load served from some sort of cache?
> > > > > > > > I'm specifically referring to the idea of using wp_remote_head
> > > > > > > > on a URL to check for a redirect, and then using
> > > > > > > > wp_remote_request on the same URL to get the content, etc.
> > > > > > _______________________________________________
> > > > > > wp-hackers mailing list
> > > > > > wp-h... at lists.automattic.com
> > > > > > http://lists.automattic.com/mailman/listinfo/wp-hackers
> > > > > >
> > > > > >
>

