404 Crawl Error on Homepage
-
Hi - I'm new to using Moz Pro, so hopefully someone can help. I'm now in charge of a corporate site that was developed 10+ years ago, and I know there are some quirky issues right off the bat.
In setting up a campaign I used the homepage domain (mysite.com). When Moz tries to crawl the site each week, it returns a 404 on the homepage and stops crawling. The homepage is definitely not a 404, but it is set up with a redirect. For example, entering the URL mysite.com redirects to mysite.com/en-us/pages/default, which is the actual homepage URL.
The MozBar recognizes the redirect and returns a 200 page status. As this is a corporate site I can't change the homepage URL, and the sub-folder /en-us/pages/ contains only the homepage.
There have been no issues with the site being indexed in Google, including the homepage, which consistently ranks in the top 3 for branded terms. But since Moz only crawls 1 page, I can't get data about site health from the crawl or drill into specific page optimizations.
Any ideas why Moz thinks it's a 404 or what I can do to remedy this situation? I'm anxious to use Moz, but I can't justify the expense if all the tools aren't available and functioning.
-
Try checking the HTTP status with other checkers like httpstatus.io or web-sniffer.net.
It could be that your server is blocking access to Rogerbot. Change your user agent (using a browser plugin) to
Mozilla/5.0 (compatible; rogerBot/1.0; UrlCrawler; http://www.seomoz.org/dp/rogerbot)
and try checking the status of your homepage.
If that doesn't help, try contacting Moz directly (help@moz.com).
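If you'd rather script the user-agent check than use a browser plugin, here is a rough Python sketch of the same idea. It requests a URL once with a browser-like user agent and once with the Rogerbot string above, following redirects, so you can compare the final status codes. The function name, the timeout, and the browser user agent string are just my own placeholders, and the real crawler may differ in other ways (IP range, other headers), so treat a match or mismatch as a hint rather than proof.

```python
import urllib.error
import urllib.request

# User agent string quoted earlier in this thread.
ROGERBOT_UA = (
    "Mozilla/5.0 (compatible; rogerBot/1.0; UrlCrawler; "
    "http://www.seomoz.org/dp/rogerbot)"
)
# Placeholder browser-like string (an assumption, any normal browser UA will do).
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def fetch_status(url, user_agent, timeout=10):
    """Return (status_code, final_url) for url, following any redirects."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.url
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; the code is still useful.
        return err.code, err.url
```

Calling fetch_status("http://mysite.com/", BROWSER_UA) and then fetch_status("http://mysite.com/", ROGERBOT_UA) and comparing the two results should show whether the server answers differently depending on the user agent.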
rgds
Dirk
-
Do you fancy popping me a PM over and I'll have a quick look for you? It could be a few things, including Moz getting confused.
-Andy
-
Hi there! I just ran the site through our crawler using a few different cURL requests, and it does look like the homepage URL is returning a 404 to our crawler: http://www.screencast.com/t/fgoeCGHu and http://www.screencast.com/t/SRzcBXJIZmw. It may be a user-agent-specific issue: the server may be trying to block our crawler specifically but is configured with the wrong HTTP status, so it returns a 404 rather than a 403. If you look at your server logs for the time of the crawl, you should be able to see the exact response your server gave to our crawler, rogerbot. I would recommend working with your webmaster to look further into how the site is responding to our crawler.
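For the server log check, a small script along these lines could pull out just the rogerbot requests and the status codes they received. This is only a sketch: it assumes the logs are in the Apache/nginx "combined" format, and the function name and regular expression are illustrative. Logs in another layout (IIS W3C logs, for example) would need a different pattern.

```python
import re

# Assumed layout: Apache/nginx "combined" log format.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def rogerbot_hits(lines):
    """Return (timestamp, path, status) for each request made by rogerbot."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        # Lines that do not fit the assumed format are skipped silently.
        if match and "rogerbot" in match.group("agent").lower():
            hits.append(
                (match.group("time"), match.group("path"), int(match.group("status")))
            )
    return hits
```

Passing an open log file to rogerbot_hits (a file object iterates line by line) and looking at the entries for "/" around the crawl time should show whether the server itself sent the 404.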
If you have more questions or need specific information, I would recommend emailing help@moz.com so that we can look into the issue further for you.