[wp-hackers] Blocking SEO robots

Haluk Karamete halukkaramete at gmail.com
Wed Aug 6 10:06:47 UTC 2014


Could this list help you? -> http://www.robotstxt.org/db/all.txt

Source:

http://stackoverflow.com/questions/1717049/tell-bots-apart-from-human-visitors-for-stats
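
A minimal sketch of how a list like that could be put to use, in Python for illustration only. It matches on the User-Agent header rather than on IP addresses, which is a different technique from the IP list David asked about, and a User-Agent can be forged. The `robot-useragent:` field name and the sample records are assumptions about the database format, and `parse_user_agents` and `is_blocked` are hypothetical helper names:

```python
# Sketch only: field name "robot-useragent" and the sample data below are
# assumptions, not a verified description of the robots database format.

def parse_user_agents(db_text):
    """Collect a short token from each 'robot-useragent:' line."""
    tokens = []
    for line in db_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "robot-useragent":
            # Keep the part before any version suffix, e.g. "Bot/1.0" -> "Bot".
            token = value.strip().split("/")[0].strip()
            if token:
                tokens.append(token)
    return tokens


def is_blocked(user_agent, blocked, allowed=("googlebot", "bingbot")):
    """Return True if the User-Agent matches a blocked token.

    Mainstream crawlers named in `allowed` are never blocked.
    """
    ua = user_agent.lower()
    if any(token in ua for token in allowed):
        return False
    return any(token.lower() in ua for token in blocked)


# Made-up records in the assumed format.
SAMPLE = """\
robot-id: examplebot
robot-useragent: ExampleBot/1.0

robot-id: othercrawler
robot-useragent: OtherCrawler
"""

if __name__ == "__main__":
    blocked = parse_user_agents(SAMPLE)
    print(is_blocked("Mozilla/5.0 (compatible; ExampleBot/1.0)", blocked))
    print(is_blocked("Mozilla/5.0 (compatible; Googlebot/2.1)", blocked))
```

The allowlist is checked first so that a broad blocked token can never catch a mainstream search engine by accident. A crawler that ignores robots.txt may also send a fake User-Agent, so this would only filter out the ones that identify themselves honestly.
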

This is not a bad idea at all, and I'd like to second the request if
anyone has researched this previously. David is correct: I've seen the
same drain on valuable server resources, especially when you're running
a handful of heavy WP sites.

So, bot experts, what say you?


On Wed, Aug 6, 2014 at 5:50 AM, David Anderson <david at wordshell.net> wrote:

> This isn't specifically a WP issue, but I think it will be relevant to
> lots of us, trying to maximise our resources...
>
> Issue: I find that a disproportionate amount of server resources is
> consumed by a certain subset of crawlers/robots which contribute nothing. I'd
> like to just block them. I have in mind the various semi-private search
> engines run by SEO companies/backlink-checkers, e.g.
> http://en.seokicks.de/, https://ahrefs.com/. These things happily spider
> a few thousand pages: every author, tag, category, etc. archive. Some of
> them refuse to obey robots.txt (the case that specifically annoys me is when
> they ignore the Crawl-Delay directive; I even came across one that proudly
> had a section on its website explaining that robots.txt was a stupid idea,
> so they always ignored it!).
>
> I'd like to just block such crawlers. So: does anyone know where a
> reliable list of the IP addresses used by these services is kept?
> Specifically, I want to block the semi-private or obscure crawlers that do
> nothing useful for my sites. I don't want to block mainstream search
> engines, of course. I've done some Googling, and haven't managed to find
> something that makes this distinction.
>
> Or alternatively - anyone think this is a bad idea?
>
> Best wishes,
> David
>
> --
> UpdraftPlus - best WordPress backups - http://updraftplus.com
> WordShell - WordPress fast from the CLI - http://wordshell.net
>
> _______________________________________________
> wp-hackers mailing list
> wp-hackers at lists.automattic.com
> http://lists.automattic.com/mailman/listinfo/wp-hackers
>



--


*Eric A. Hendrix, USA, MSG(R)*
hendronix at gmail.com
(910) 644-8940

*"Non Timebo Mala"*

