[wp-hackers] Blocking SEO robots
hendronix at gmail.com
Wed Aug 6 09:58:00 UTC 2014
This is not a bad idea at all, and I'd like to second the request if
anyone has researched this previously. David is correct: I've seen the
same drain on valuable server resources, especially when you're running
a handful of heavy WP sites.
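For illustration only, here is a minimal Python sketch of the kind of check being asked about: refuse a request if its User-Agent contains a known crawler token, or if its address falls inside a listed network. Everything in it is an assumption rather than a vetted list. The tokens are guesses at how the two services David names identify themselves, the networks are documentation-only placeholder ranges, and is_blocked is a hypothetical helper name.

```python
import ipaddress

# Hypothetical blocklists. The tokens are assumed User-Agent substrings for
# the services named in this thread; the networks are reserved documentation
# ranges standing in for real crawler addresses, which are not known here.
BLOCKED_AGENT_TOKENS = ("ahrefsbot", "seokicks")
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),     # TEST-NET-1 placeholder
    ipaddress.ip_network("198.51.100.0/24"),  # TEST-NET-2 placeholder
]


def is_blocked(remote_ip: str, user_agent: str) -> bool:
    """Return True if the request matches either blocklist."""
    agent = user_agent.lower()
    # Cheap check first: case-insensitive substring match on the User-Agent.
    if any(token in agent for token in BLOCKED_AGENT_TOKENS):
        return True
    try:
        address = ipaddress.ip_address(remote_ip)
    except ValueError:
        # Unparseable address: do not block on this basis.
        return False
    return any(address in network for network in BLOCKED_NETWORKS)
```

A User-Agent match only stops crawlers that identify themselves honestly, which is why the address check is kept alongside it. In practice the same two tests would more likely live in the web server or firewall configuration, so that blocked requests never reach WordPress at all.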
So, bot experts, what say you?
On Wed, Aug 6, 2014 at 5:50 AM, David Anderson <david at wordshell.net> wrote:
> This isn't specifically a WP issue, but I think it will be relevant to
> lots of us, trying to maximise our resources...
> Issue: I find that a disproportionate amount of server resources is
> consumed by a certain subset of crawlers/robots which contribute nothing. I'd
> like to just block them. I have in mind the various semi-private search
> engines run by SEO companies/backlink-checkers, e.g.
> http://en.seokicks.de/, https://ahrefs.com/. These things happily spider
> a few thousand pages: every author, tag, category, etc. archive. Some of
> them refuse to obey robots.txt (the case that specifically annoys me is when
> they ignore the Crawl-Delay directive. I even came across one that proudly
> had a section on its website explaining that robots.txt was a stupid idea,
> so they always ignored it!).
> I'd like to just block such crawlers. So: does anyone know where a
> reliable list of the IP addresses used by these services is kept?
> Specifically, I want to block the semi-private or obscure crawlers that do
> nothing useful for my sites. I don't want to block mainstream search
> engines, of course. I've done some Googling, and haven't managed to find
> something that makes this distinction.
> Or alternatively - anyone think this is a bad idea?
> Best wishes,
> UpdraftPlus - best WordPress backups - http://updraftplus.com
> WordShell - WordPress fast from the CLI - http://wordshell.net
*Eric A. Hendrix*
*USA, MSG(R)*
hendronix at gmail.com
*"Non Timebo Mala"*