Misbehaving Robots

Andrew-SEO

I have a Disallow entry in my robots.txt file, which is

/*src=

I have URLs that look like this:

www.website.co.uk/example?value=1&src=2

I would expect these URLs to be blocked.

However, both Moz and Ahrefs show these in their crawl logs:

www.website.co.uk/example?value=1

It is as if the bots read the first parameter, get blocked when they reach the second, and then log only the part of the URL before it.

Is this standard behavior for Roger (Mozbot) or the Ahrefs bot? Does Google act in the same way?
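
For reference, here is a minimal sketch of how the pattern above would be evaluated under the wildcard rules Google documents for robots.txt, where `*` matches any run of characters and a trailing `$` anchors the end of the URL. The function names are hypothetical, and the sketch ignores Allow rules, rule precedence and percent-encoding normalization. It only shows which paths the pattern matches; it says nothing about how Mozbot or AhrefsBot record URLs in their crawl logs.

```python
import re

def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex (simplified sketch)."""
    # A trailing "$" anchors the match to the end of the URL.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # "*" matches any sequence of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_disallowed(path_and_query: str, pattern: str) -> bool:
    """Return True if the path (including query string) matches the pattern."""
    return robots_pattern_to_regex(pattern).match(path_and_query) is not None

# The two URL forms from the question, with the host removed.
print(is_disallowed("/example?value=1&src=2", "/*src="))
print(is_disallowed("/example?value=1", "/*src="))
```

Under these assumptions the first URL matches the rule and the second does not, so a crawler that requests the shorter URL is not in conflict with the Disallow line.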

eli.myers

Hey Andrew,

Would you be able to reach out to help@moz.com so we can have a look at your specific campaign/website?

Thanks!

Eli

Rajesh.Prajapati

No, Google does not act in the same way.

Andrew-SEO

I have gone into Search Console > Crawl > URL Parameters.

I found the src parameter and told Google to crawl it (as opposed to "Let Google decide").

Google has been indexing the pages without the parameter, so I think it chose to ignore it (because it doesn't alter the page content).

Google now states: "This will be crawled unless overridden by other commands". So robots.txt should kick in now and stop the URLs being crawled.

Why Moz and Ahrefs are also ignoring this parameter is beyond me.

Andrew-SEO

OK, the URL looks like this:

www.site.com /engine/referrer.asp?web=http%3a%2f%2fwww.example.co.uk&src=3078c98d2da385d5468f562

However, Google, Moz and Ahrefs only get this far:

www.site.com /engine/referrer.asp?web=http%3a%2f%2fwww.example.co.uk

Could it be because there is a domain name in the parameter?

Could it be because they cannot parse the ampersand?

Could it be because src is blocked in robots.txt?

Any suggestions would be most welcome.
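
As a sanity check on the first two questions, here is a small sketch that runs the URL through Python's standard URL parser. It assumes an `http://` scheme and removes the space after the host, neither of which is confirmed above. It only shows that the URL is well formed for a standards-based parser; it does not show how any particular crawler handles it.

```python
from urllib.parse import urlsplit, parse_qsl

# Assumption: scheme added and the space after the host removed.
url = (
    "http://www.site.com/engine/referrer.asp"
    "?web=http%3a%2f%2fwww.example.co.uk&src=3078c98d2da385d5468f562"
)

# Split off the query string, then decode it into (name, value) pairs.
params = parse_qsl(urlsplit(url).query)

for name, value in params:
    print(name, "=", value)
```

Both parameters come out separately, with the percent-encoded domain decoded inside `web`, so neither the encoded domain nor the ampersand is a problem for a standard parser.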

ThompsonPaul

My suspicion would be that those URLs are accessible somewhere without the second variable. (I.e. That second variable isn't always present when the first variable is.)
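
One way to test that suspicion is to scan a page's HTML for links to the script that lack the second parameter. The following is only a rough sketch: the helper name `links_missing_param` and its defaults (`src`, `referrer.asp`) are assumptions based on the URLs in this thread, and it only sees links in `<a href>` attributes, not URLs built by JavaScript, redirects or sitemaps.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qs

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def links_missing_param(html, param="src", path_fragment="referrer.asp"):
    """Return links to the given script whose query string lacks `param`."""
    collector = LinkCollector()
    collector.feed(html)
    missing = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        # keep_blank_values=True so "src=" with an empty value counts as present.
        query = parse_qs(parts.query, keep_blank_values=True)
        if path_fragment in parts.path and param not in query:
            missing.append(href)
    return missing

sample = (
    '<a href="/engine/referrer.asp?web=x&amp;src=1">with src</a>'
    '<a href="/engine/referrer.asp?web=x">without src</a>'
)
print(links_missing_param(sample))
```

Running this over each template that links to the script would show whether any page emits the shorter URL.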

Andrew-SEO

Hi Paul

Thanks for your response. I'm always happy to hear your advice.

I've been through the page code line by line.

I've also fetched the page in Google, and the HTML does not contain a URL without the src value.

I'm really stuck.
