Moz campaign works around my robots.txt settings
-
My robots.txt file looks like this:
User-agent: *
Disallow: /*?
Disallow: /search
So it should block crawling of all dynamic URLs (and of /search).
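A quick way to sanity-check those two rules is to translate them into regexes and test URLs against them. This is a hypothetical sketch (not Moz's or Google's actual matcher): it assumes Googlebot-style wildcard matching, where `*` matches any run of characters and rules are compared against the path plus query string.

```python
import re

# The two Disallow rules from the robots.txt above.
RULES = ["/*?", "/search"]

def rule_to_regex(rule):
    # Escape regex metacharacters, then turn the escaped '*' back into '.*'.
    pattern = re.escape(rule).replace(r"\*", ".*")
    return re.compile("^" + pattern)

def is_blocked(path):
    # A URL is blocked if any Disallow rule matches it from the start.
    return any(rule_to_regex(r).match(path) for r in RULES)

print(is_blocked("/search/page-1.html?author=47"))  # True: matches both rules
print(is_blocked("/about.html"))                    # False: no rule matches
```

Under these assumptions, the example URL above is blocked twice over: it starts with /search and it contains a query string.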
If I check this url in Google:
site:http://www.webdesign.org/search/page-1.html?author=47
Google tells me:
A description for this result is not available because of this site's robots.txt – learn more.
So far so good.
Now, I ran a Moz SEO campaign and I got a bunch of duplicate page content errors.
One of the links is this one:
http://www.webdesign.org/search/page-1.html?author=47
(the same I tested in Google and it told me that the page is blocked by robots.txt which I want)
So it makes me think that Moz campaigns crawl pages regardless of what robots.txt says? It's my understanding that User-agent: * should forbid Rogerbot from crawling as well. Am I missing something?
-
Hello Vince, thank you for reaching out to us! This seems quite odd; our crawler usually obeys all robots.txt files. Let's try this. Add this code to your robots.txt:
User-agent: Rogerbot
Disallow: /
This should specifically instruct us to follow these rules. Once you have tried this, if it does not work, please send an email to help@moz.com and we will have our engineers dig in a bit further. Sorry for the inconvenience, I hope the above fix works for you.
-
Thanks Abe.
I guess I'll try this:
User-agent: Rogerbot
Disallow: /*?
Because if I use Disallow: / I'll lose my current Moz reports, since Rogerbot would skip my entire site, right?
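That reasoning matches the usual robots.txt convention: a crawler that finds a group naming it specifically uses only that group and ignores the generic * group. A minimal sketch of that group-selection logic (assumed names, not Rogerbot's real parser) looks like this:

```python
# Hypothetical robots.txt with both a wildcard group and a Rogerbot group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /*?
Disallow: /search

User-agent: Rogerbot
Disallow: /*?
"""

def rules_for(agent, robots_txt):
    # Collect Disallow rules per User-agent group.
    groups = {}
    current = None
    for line in robots_txt.splitlines():
        line = line.strip()
        if line.lower().startswith("user-agent:"):
            current = line.split(":", 1)[1].strip().lower()
            groups.setdefault(current, [])
        elif line.lower().startswith("disallow:") and current is not None:
            groups[current].append(line.split(":", 1)[1].strip())
    # A named group wins; the '*' group is only a fallback.
    return groups.get(agent.lower(), groups.get("*", []))

print(rules_for("Rogerbot", ROBOTS_TXT))   # ['/*?'] - query URLs only
print(rules_for("Googlebot", ROBOTS_TXT))  # ['/*?', '/search'] via the * group
```

So under this convention, giving Rogerbot its own group with Disallow: /*? keeps the rest of the site crawlable for it, whereas Disallow: / in that group would shut it out entirely.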
-
That worked, thanks!