[wp-hackers] Blocking SEO robots

David Anderson david at wordshell.net
Thu Aug 7 08:20:18 UTC 2014

Jeremy Clarke wrote:
> The best answer is the htaccess-based blacklists from PerishablePress. I
> think this is the latest one:
> http://perishablepress.com/5g-blacklist-2013/
This looks like an interesting list, but doesn't fit the use case. The 
proprietor says "the 5G Blacklist helps reduce the number of malicious 
URL requests that hit your website" - and reading the list confirms 
that's what he's aiming for. I'm aiming to block non-malicious actors 
who are running their own private search engines - i.e. those who want 
to spider the web as part of creating their own non-public products 
(e.g. databases of SEO back-links). It's not about site security; it's 
about not being spidered every day by search engines that Joe Public 
will never use. If you have a shared server hosting many sites for your 
managed clients, then this crawler load quickly adds up.

At the moment the best solution I have is adding a robots.txt to every 
site with "Crawl-delay: 15" in it, to slow down the rate of compliant 
bots and spread the load around a bit.
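Concretely, the robots.txt I'm deploying looks roughly like this (the named bots are illustrative; Crawl-delay and Disallow are only honoured by compliant crawlers):

```text
# Slow down all compliant crawlers site-wide
User-agent: *
Crawl-delay: 15

# Optionally shut out known SEO spiders entirely (example names)
User-agent: AhrefsBot
Disallow: /

User-agent: MJ12bot
Disallow: /
```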

Best wishes,

UpdraftPlus - best WordPress backups - http://updraftplus.com
WordShell - WordPress fast from the CLI - http://wordshell.net
