Moz Crawler Causing Server Timeouts... Crawling thousands of non-existent pages with query parameters
-
The Moz crawler is crawling pages like these:
- http://www.xxxx.com/?product_count=100&product_order=desc&product_orderby=date
- http://www.xxxx.com/?product_count=100&product_order=desc&paged=1
- http://www.xxx.com/?product_count=100&product_order=desc&product_view=grid
Last month it crawled 80,000 pages on a site with fewer than 100 pages. Is there a way to select only certain pages to be crawled? It has been crawling this site since Monday morning, and it's now midday Tuesday. Every Monday it causes timeouts from high bandwidth use on our server. We are getting ready to delete this client from the account unless someone can give us a solution.
Thanks.
-
Pamela, the immediate solution is to use your robots.txt file to block the Moz crawler from crawling URLs with parameters:
User-agent: rogerbot
Disallow: /*?utm
Those pages are coming from the bot trying to follow links to all the different ways product pages can be sorted. You'll want to ensure Googlebot isn't having the same problem.
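Before relying on a Disallow pattern, it can help to check which of the reported URLs it would actually cover. The sketch below is only an illustration, not Moz's own matcher: it assumes the crawler follows the common wildcard convention (`*` matches any run of characters, a trailing `$` anchors the end of the URL), and the helper names `rule_to_regex` and `is_blocked` are made up for this example. Under that assumption, a pattern such as `/*?utm` only matches query strings that begin with `utm`, while `/*?` matches any URL that has a query string.

```python
import re
from typing import Iterable
from urllib.parse import urlsplit


def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Turn a robots.txt path rule into a regex (hypothetical helper).

    Assumes '*' matches any run of characters and a trailing '$'
    anchors the end of the URL; everything else is literal.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(url: str, disallow_rules: Iterable[str]) -> bool:
    """Return True if the URL's path plus query matches any Disallow rule.

    Allow rules and rule precedence are ignored in this sketch.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match anchors at the start, like a robots.txt prefix match.
    return any(rule_to_regex(rule).match(target) for rule in disallow_rules)


# The three URLs reported in the question.
urls = [
    "http://www.xxxx.com/?product_count=100&product_order=desc&product_orderby=date",
    "http://www.xxxx.com/?product_count=100&product_order=desc&paged=1",
    "http://www.xxx.com/?product_count=100&product_order=desc&product_view=grid",
]

for rule in ("/*?utm", "/*?", "/*?product_"):
    print(rule, [is_blocked(u, [rule]) for u in urls])
```

If the output shows that a rule leaves the sorting URLs unblocked, a broader pattern such as `/*?` (all parameterized URLs) or a narrower one such as `/*?product_` (only the sorting parameters) may be closer to the intent; which one is appropriate depends on whether the site has any parameterized pages that should still be crawled.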
Hope that helps,
Paul