Why RogerBot can't crawl site https://unplag.com
-
Hello,
Please help me solve this problem.
The On-Page Grader and Crawl Test are not working for the unplag.com website. Both say they can't access the URL. Yes, I've tried different variants, like unplag.com and http://unplag.com.
One more thing: RogerBot was disallowed in the robots.txt file. I removed that rule a week ago, so maybe the Moz index hasn't been refreshed yet.
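For reference, a robots.txt that no longer disallows RogerBot might look like this (a sketch only; the exact rules in the live file may differ):

```
# Allow Moz's crawler explicitly
User-agent: rogerbot
Allow: /

# Default rule for all other crawlers
User-agent: *
Allow: /
```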
-
It's possible that your site hasn't been crawled yet (since you changed the robots.txt). You can see in your campaign dashboard (upper right corner) when the next crawl is scheduled.
Do you see any specific error codes on your dashboard?
Dirk
-
After creating a fresh test campaign for the site, I'm still seeing a 400 response being served to rogerbot from https://unplag.com/. While I'm not able to pinpoint the exact setting that is causing the site to serve that response, I'd recommend checking your server logs to verify the response that is being served.
-
Thanks.
The last crawl was after the robots.txt change.
And I don't see any errors in the dashboard.
-
Follow Jordan's advice above and check your log files to see what the server response is when rogerbot tries to visit the site.
I noticed some DNS issues with your site - check http://dnscheck.pingdom.com/?domain=unplag.com - the nameservers don't seem to be configured correctly. I also noticed that you have a 302 redirect from http to https; this should be a 301. Probably not related to your main issue, but worth checking.
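If the site happens to run nginx (an assumption on my part - I can't see your server), the permanent redirect could be configured along these lines:

```nginx
# Hypothetical sketch: send a permanent 301 (not a temporary 302)
# from http to https for the domain.
server {
    listen 80;
    server_name unplag.com;
    return 301 https://unplag.com$request_uri;
}
```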
-
Thanks a lot!
-
The logs look like this:
"GET / HTTP/1.0" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"
and of course rogerbot is also requesting the robots.txt file from time to time:
"GET /robots.txt HTTP/1.1" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"
To me it looks like rogerbot is disallowed in robots.txt, but the file is here: https://unplag.com/robots.txt
-
The trouble is not with your robots.txt: in the server config you block rogerbot completely and serve a 400 for every request it makes.
If you use a user agent switcher plugin in your browser and change the user agent to rogerbot (rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)), the server returns a 400 Bad Request.
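The same behavior can be reproduced outside the browser. Below is a minimal, self-contained Python sketch that simulates a server blocking requests by user agent and then checks the responses; the blocking rule, addresses, and handler are illustrative assumptions, not the site's actual configuration:

```python
# Sketch: a local HTTP server that returns 400 for rogerbot's user agent
# (mimicking the behavior observed on the live site) and a small client
# helper to check the status code for a given user agent string.
import http.server
import threading
import urllib.error
import urllib.request

class UABlockingHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        ua = self.headers.get("User-Agent", "")
        if "rogerbot" in ua.lower():
            self.send_response(400)  # block the crawler outright
        else:
            self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet

server = http.server.HTTPServer(("127.0.0.1", 0), UABlockingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

def status_for(user_agent):
    """Return the HTTP status code served to the given user agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        return urllib.request.urlopen(req).status
    except urllib.error.HTTPError as e:
        return e.code

print(status_for("Mozilla/5.0"))   # normal browser UA: 200
print(status_for("rogerbot/1.1"))  # crawler UA: 400 Bad Request
```

Running `status_for` with your real domain and rogerbot's full user agent string (instead of the local test server) is an easy way to confirm what the server actually returns.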
Dirk
-
Thank you. I'll try to solve the problem.