Moz campaign works around my robots.txt settings
-
My robots.txt file looks like this:
User-agent: *
Disallow: /*?
Disallow: /search
So it should block crawling of all dynamic URLs (and of /search).
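A quick way to sanity-check those two rules is to translate them into regexes and test URLs against them. This is a hypothetical sketch (not Moz's or Google's actual matcher): it assumes Googlebot-style wildcard matching, where `*` matches any run of characters and rules are compared against the path plus query string.

```python
import re

# The two Disallow rules from the robots.txt above.
RULES = ["/*?", "/search"]

def rule_to_regex(rule):
    # Escape regex metacharacters, then turn the escaped '*' back into '.*'.
    pattern = re.escape(rule).replace(r"\*", ".*")
    return re.compile("^" + pattern)

def is_blocked(path):
    # A URL is blocked if any Disallow rule matches it from the start.
    return any(rule_to_regex(r).match(path) for r in RULES)

print(is_blocked("/search/page-1.html?author=47"))  # True: matches both rules
print(is_blocked("/about.html"))                    # False: no rule matches
```

Under these assumptions, the example URL above is blocked twice over: it starts with /search and it contains a query string.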
If I check this url in Google:
site:http://www.webdesign.org/search/page-1.html?author=47
Google tells me:
A description for this result is not available because of this site's robots.txt – learn more.
So far so good.
Now, I ran a Moz SEO campaign and I got a bunch of duplicate page content errors.
One of the links is this one:
http://www.webdesign.org/search/page-1.html?author=47
(the same I tested in Google and it told me that the page is blocked by robots.txt which I want)
So it makes me think that Moz campaigns crawl pages regardless of what robots.txt says? It's my understanding that User-agent: * should forbid Rogerbot from crawling as well. Am I missing something?
-
Hello Vince, thank you for reaching out to us! This seems quite odd; our crawler usually obeys all robots.txt files. Let's try this. Add this code to your robots.txt:
User-agent: Rogerbot
Disallow: /
This should specifically instruct us to follow these rules. Once you have tried this, if it does not work, please send an email to help@moz.com and we will have our engineers dig in a bit further. Sorry for the inconvenience, I hope the above fix works for you.
-
Thanks Abe.
I guess I'll try this:
User-agent: Rogerbot
Disallow: /*?
Because if I use Disallow: / I'll lose my current Moz reports, since Rogerbot would skip my entire site, right?
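That reasoning matches the usual robots.txt convention: a crawler that finds a group naming it specifically uses only that group and ignores the generic * group. A minimal sketch of that group-selection logic (assumed names, not Rogerbot's real parser) looks like this:

```python
# Hypothetical robots.txt with both a wildcard group and a Rogerbot group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /*?
Disallow: /search

User-agent: Rogerbot
Disallow: /*?
"""

def rules_for(agent, robots_txt):
    # Collect Disallow rules per User-agent group.
    groups = {}
    current = None
    for line in robots_txt.splitlines():
        line = line.strip()
        if line.lower().startswith("user-agent:"):
            current = line.split(":", 1)[1].strip().lower()
            groups.setdefault(current, [])
        elif line.lower().startswith("disallow:") and current is not None:
            groups[current].append(line.split(":", 1)[1].strip())
    # A named group wins; the '*' group is only a fallback.
    return groups.get(agent.lower(), groups.get("*", []))

print(rules_for("Rogerbot", ROBOTS_TXT))   # ['/*?'] - query URLs only
print(rules_for("Googlebot", ROBOTS_TXT))  # ['/*?', '/search'] via the * group
```

So under this convention, giving Rogerbot its own group with Disallow: /*? keeps the rest of the site crawlable for it, whereas Disallow: / in that group would shut it out entirely.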
-
That worked, thanks!