Moz crawler is not able to crawl my website
-
Hello All,
I'm facing an issue with the Moz crawler. Every time it crawls my website, I get this error message: "**Moz was unable to crawl your site on Sep 13, 2017.** Our crawler was not able to access the robots.txt file on your site. This often occurs because of a server error from the robots.txt. Although this may have been caused by a temporary outage, we recommend making sure your robots.txt file is accessible and that your network and server are working correctly. Typically errors like this should be investigated and fixed by the site webmaster."
We changed the robots.txt file and checked it, but the issue is still not resolved.
URL: https://www.khadination.shop/robots.txt
Please let me know what went wrong and what needs to be done.
Any suggestions are appreciated.
Thank you.
-
Hi there! Tawny from Moz's Help Team here!
I think I can help you figure out what's going on with your robots.txt file. First things first: we're not starting at the robots.txt URL you list. Our crawler always starts from your Campaign URL and goes from there, and it can't start at an HTTPS URL, so it starts at the HTTP version and crawls from there. So, the robots.txt file we're having trouble accessing is khadination.shop/robots.txt.
I ran a couple of tests, and it looks like this robots.txt file might be inaccessible from AWS (Amazon Web Services). When I tried to curl your robots.txt file from AWS, I got a 302 temporary redirect (https://www.screencast.com/t/jy4MkDZQNbQ), and when I ran it through hurl.it, which also runs on AWS, it returned an internal server error (https://www.screencast.com/t/mawknIyaMn).
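If you'd like to repeat a check like this yourself, here is a minimal Python sketch of the same idea: build the plain-HTTP robots.txt URL from a site address and request it without following redirects, so a 302 or a 5xx shows up as-is. The function names (robots_url, describe_status, check_robots) are made up for illustration and are not part of any Moz tool. Keep in mind that running it from your own machine won't reproduce an AWS-specific block; for that you would need to run it from an AWS host.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit, urlunsplit


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Keep urllib from silently following 3xx responses."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the redirect.
        return None


def robots_url(site):
    """Return the plain-HTTP robots.txt URL for a site (hypothetical helper)."""
    if "://" not in site:
        site = "http://" + site
    host = urlsplit(site).netloc
    return urlunsplit(("http", host, "/robots.txt", "", ""))


def describe_status(code):
    """Map an HTTP status code to a rough category."""
    if 200 <= code < 300:
        return "ok"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "unknown"


def check_robots(site, timeout=10):
    """Fetch robots.txt once, without following redirects."""
    url = robots_url(site)
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            code = resp.status
    except urllib.error.HTTPError as err:
        # Redirects and 4xx/5xx responses arrive here as HTTPError.
        code = err.code
    return url, code, describe_status(code)
```

Calling check_robots("khadination.shop") returns the URL that was requested, the raw status code, and its category. Anything other than "ok" for the HTTP version of the file is worth looking into on the server side.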
One more thing: it looks like you have a wildcard character ( * ) for the user-agent as the first line in this robots.txt file. Best practices indicate that you should put all your specific user-agent disallow commands before a wildcard user-agent; otherwise those specific crawlers will stop reading your robots.txt file after the wildcard user-agent line, since they'll assume that those rules apply to them.
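To check the ordering quickly, a small sketch like the one below can help. It lists the User-agent lines in the order they appear and flags a file where the wildcard comes before a specific crawler. The function names and both sample files are hypothetical, made up for illustration; they are not your actual robots.txt.

```python
def user_agent_order(robots_txt):
    """Return the User-agent values in the order they appear."""
    agents = []
    for line in robots_txt.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() == "user-agent":
            agents.append(value.strip())
    return agents


def wildcard_before_specific(robots_txt):
    """True if a '*' group appears before any named user-agent group."""
    agents = user_agent_order(robots_txt)
    if "*" not in agents:
        return False
    first_wildcard = agents.index("*")
    return any(agent != "*" for agent in agents[first_wildcard + 1:])


# Hypothetical sample: wildcard first, specific crawler second.
WILDCARD_FIRST = """\
User-agent: *
Disallow: /cart/

User-agent: rogerbot
Disallow: /private/
"""

# Hypothetical sample: specific crawler first, wildcard last.
SPECIFIC_FIRST = """\
User-agent: rogerbot
Disallow: /private/

User-agent: *
Disallow: /cart/
"""

print(wildcard_before_specific(WILDCARD_FIRST))   # True
print(wildcard_before_specific(SPECIFIC_FIRST))   # False
```

If the check returns True for your file, moving the wildcard group to the end is the change suggested above.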
I think if you fix up those things, we should be able to access your robots.txt and crawl your site!

If you still have questions or run into more trouble, shoot us a note at help@moz.com and we'll do everything we can to help you sort everything out.
