[wp-hackers] Blocking SEO robots
danielx386 at gmail.com
Thu Aug 7 04:31:16 UTC 2014
I like to use a nice tool from http://www.spambotsecurity.com/ but it
may cause issues for some people. The best thing is that it's very fast
and doesn't slow the site down the way .htaccess rules can.
On Thu, Aug 7, 2014 at 2:28 PM, Daniel <malkir at gmail.com> wrote:
> Almost forgot: the link should be in a subdirectory that is marked in
> robots.txt to be ignored, so anything that ignores robots.txt is what's hit.
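The robots.txt side of that trap might look like the following; the `/trap/` directory name is a hypothetical example, not something from the original message:

```
User-agent: *
Disallow: /trap/
```

Well-behaved crawlers will never request anything under `/trap/`, so any client that does has either ignored robots.txt or is a human deliberately digging.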
> On Wed, Aug 6, 2014 at 9:26 PM, Daniel <malkir at gmail.com> wrote:
>> Set up a trap: a link hidden by CSS on each page. If it's hit, the IP gets
>> blacklisted for a period of time. No human will ever come across the link
>> unless they're digging, and no bot actually renders the entire page before
>> deciding what to use.
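The trap described above can be sketched in a few lines; this is a minimal stand-alone Python example, where the trap path, ban duration, and class names are illustrative assumptions rather than anything from the original suggestion:

```python
import time

TRAP_PATH = "/trap/"      # hypothetical honeypot path, linked via a CSS-hidden link
BAN_SECONDS = 24 * 3600   # assumed ban duration; tune to taste

class Blacklist:
    """Tracks banned IPs with an expiry time."""
    def __init__(self, ban_seconds=BAN_SECONDS):
        self.ban_seconds = ban_seconds
        self._banned = {}  # ip -> expiry timestamp

    def ban(self, ip, now=None):
        now = time.time() if now is None else now
        self._banned[ip] = now + self.ban_seconds

    def is_banned(self, ip, now=None):
        now = time.time() if now is None else now
        expiry = self._banned.get(ip)
        if expiry is None:
            return False
        if now >= expiry:          # ban expired; forget the IP
            del self._banned[ip]
            return False
        return True

def handle_request(path, ip, blacklist):
    """Return an HTTP status: 403 if banned; hitting the trap triggers a ban."""
    if blacklist.is_banned(ip):
        return 403
    if path.startswith(TRAP_PATH):
        blacklist.ban(ip)          # the client followed the hidden link
        return 403
    return 200
```

In practice you would hook something like this into the web server or a plugin rather than run it standalone; the point is only that the trap logic itself is tiny.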
>> On Wed, Aug 6, 2014 at 5:31 AM, Jeremy Clarke <jer at simianuprising.com> wrote:
>>> On Wednesday, August 6, 2014, David Anderson <david at wordshell.net> wrote:
>>> > The issue's not about how to write blocklist rules; it's about having a
>>> > reliable, maintained, categorised list of bots such that it's easy to
>>> > automate the blocklist. Turning the list into .htaccess rules is the easy
>>> > bit; what I want to avoid is having to spend a long time churning through
>>> > log files to obtain the source data, because it feels very much like
>>> > there 'ought' to be pre-existing data out there for this, given how many
>>> > resources the world's servers must be wasting on such bots.
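Turning a categorised list into .htaccess rules really is the easy bit; here is a small Python sketch that generates Apache 2.2-style deny rules from a list of user agents. The `BAD_BOTS` entries are placeholder examples, not a recommendation of what to block:

```python
import re

BAD_BOTS = ["AhrefsBot", "MJ12bot", "SemrushBot"]  # example user agents only

def htaccess_rules(user_agents):
    """Build a mod_setenvif + deny block matching the given user agents."""
    lines = []
    for ua in user_agents:
        # Escape regex metacharacters so the UA string is matched literally.
        lines.append(f"SetEnvIfNoCase User-Agent {re.escape(ua)} bad_bot")
    lines += ["Order Allow,Deny", "Allow from all", "Deny from env=bad_bot"]
    return "\n".join(lines)
```

With a maintained source list, regenerating the .htaccess block becomes a one-liner; the hard part remains finding that reliable source list in the first place, which is exactly the original complaint.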
>>> The best answer is the htaccess-based blacklists from PerishablePress. I
>>> think this is the latest one:
>>> He uses a mix of blocked user agents, blocked IPs and blocked requests
>>> (e.g. /admin.php, intrusion scans for other software). He's been updating
>>> it for years and it's definitely a WP-centric project.
>>> In the past some good stuff has been blocked by his lists (the Facebook
>>> spider was blocked because it had an empty user agent, and common spiders
>>> used by academics were blocked), but that's bound to happen, and I'm sure
>>> every UA was used by a spammer at some point.
>>> I run a ton of sites on my server, so I hate the .htaccess format (which
>>> is a pain to implement alongside WP Super Cache rules). If I used
>>> multisite it would be less of a big deal. Either way, know that you can
>>> block UAs for all virtual hosts if that's relevant.
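Blocking a user agent across every virtual host can be done once in the main Apache config instead of per-site .htaccess files. A sketch, using the same Apache 2.2-era syntax the .htaccess discussion implies; the UA string is an example:

```apache
# In httpd.conf, outside any <VirtualHost>, so all vhosts inherit it
SetEnvIfNoCase User-Agent "BadBot" bad_bot
<Location />
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot
</Location>
```

This also avoids the per-request .htaccess lookup cost, which matters on a server with many sites.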
>>> Note that IP blocking is a lot more effective below the server level,
>>> because blocking with Apache still uses a ton of resources (though at
>>> least no MySQL etc.). On Linux an iptables-based block is much more
>>> efficient.
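An iptables block of that kind might look like the following; the address 192.0.2.1 is a documentation-range example, and the `ipset` variant is an assumption about how one would scale this to a long list, not something stated in the thread:

```shell
# Drop all traffic from a known-bad address before it ever reaches Apache
iptables -A INPUT -s 192.0.2.1 -j DROP

# With many addresses, an ipset keeps the rule count constant:
ipset create badbots hash:ip
ipset add badbots 192.0.2.1
iptables -A INPUT -m set --match-set badbots src -j DROP
```

Packets dropped here never spawn an Apache worker at all, which is why this is so much cheaper than .htaccess-level denial.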
>>> Jeremy Clarke
>>> Code and Design • globalvoicesonline.org
>>> wp-hackers mailing list
>>> wp-hackers at lists.automattic.com