Why is 4XX (Client Error) shown for valid pages?
-
My Crawl Diagnostics Summary says I have 5,141 errors of the 4XX (Client Error) variety. Yet when I view the list of URLs they all resolve to valid pages.
Here is an example.
http://www.ryderfleetproducts.com/ryder/af/ryder/core/content/product/srm/key/ACO 3018/pn/Wiper-Blade-Winter-18-Each/erm/productDetail.do
These pages are all dynamically generated from search or browse against a database of the 36,000 products we offer.
Can someone help me understand why these are errors?
-
You need to look at the pages the way a spider would. Here is a great tool to check and view the server response. In this case, you will need to bring in your developers and have them look at this as well.
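For what it's worth, you can see a strict client's view of the URL without any special tool. A minimal sketch in Python (the URL here is a shortened, hypothetical stand-in for the one in the question, and `urllib` is just one strict client; crawlers may behave differently):

```python
# A strict HTTP client rejects a raw space in a URL outright, which is
# roughly how a picky crawler may treat it. No request is ever sent:
# the URL fails validation before the connection is even opened.
from http.client import InvalidURL
import urllib.request

# Hypothetical shortened stand-in for the product URL in the question.
bad_url = "http://www.example.com/key/ACO 3018/productDetail.do"

try:
    urllib.request.urlopen(bad_url)
except InvalidURL as err:
    print("rejected before any request was made:", err)
```

A browser quietly papers over this by encoding the space for you; a stricter client just refuses.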
-
I think something is going on with a space vs a %20
- Copy and paste the url that you listed above into Brent's recommended tool, and you get a 404 response.
- If you copy and paste that url into a browser, however, your page comes up.
- Now take THAT url and paste it into the tool Brent recommends, and you get a 200 (good) response.
The only difference that I see is that when I copy and paste that url into a browser (Chrome in my case), it adds a %20 where you have a space.
Since this is the thing that makes these other url checkers work, I am guessing that the crawl diagnostics tool is having a similar problem. See the comparison below (much abbreviated, showing only the area in question):
ACO 3018 (from your post, and gives you the error)
ACO%203018 (when it resolves in the browser, and shows a good response in the tools)
I am just smart enough to tell you that these are different, but not smart enough to know why it causes problems for crawlers, but not for browsers.
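If it helps, the "why" is that a literal space is not a legal character in a URL path (per RFC 3986), so the browser silently percent-encodes it before sending the request, while a crawler may take the URL exactly as written. A quick sketch using Python's standard library:

```python
# Browsers silently percent-encode characters that are illegal in a
# URL path before sending the request; a literal space becomes %20.
from urllib.parse import quote

raw_segment = "ACO 3018"       # as it appears in the posted URL
encoded = quote(raw_segment)   # what the browser actually sends
print(encoded)  # -> ACO%203018
```

So the two strings in the comparison above really are two different URLs on the wire, and only the encoded one is valid.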
The good news is that your pages work for users. The bad news is that Google probably never sees them.
-
We had spaces in the URL that browsers handled well but spiders did not. We replaced the spaces with dashes in our dynamic code and... it's off to the next problem.
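For anyone finding this thread later, the fix can be as simple as slugging the raw key before it goes into the URL. A minimal sketch (the function name `slugify` is mine, not from the actual site code):

```python
import re

def slugify(segment: str) -> str:
    """Turn a raw product key like 'ACO 3018' into a dash-separated,
    crawler-safe URL segment with no characters that need escaping."""
    # Collapse any run of whitespace into a single dash.
    return re.sub(r"\s+", "-", segment.strip())

print(slugify("ACO 3018"))  # -> ACO-3018
```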
Thanks