Tool for Bulk Index Checking
-
Hey mozzers,
I'm having trouble finding a tool that lets you input a large batch of URLs and returns results telling you whether each page is indexed in Google.
We have some large ecommerce sites that I would like to crawl with Xenu and then match the results against the output of such a tool.
Thanks.
-
Search Guys,
I just came across this and sent it to our dev team; it may be worth checking out: http://www.indexbear.com/index-checker/
Now, a lot of these tools come with the proverbial "get free backlinks" extras, and this one does too. But I like that they've separated those out, and that you can store the results as indexed, not indexed, or errors. It's a free tool and does appear to do bulk checks for Google, Bing, and Yahoo.
Given that we love Xenu, I am hoping this will mesh with your idea. Hope all is well in Adelaide,
Best,
-
Here's a YouMoz post that might also provide some ideas for checking the index. http://www.seomoz.org/ugc/solving-new-content-indexation-issues-for-large-b2b-websites
-
The URL Keri posted has a lot of helpful info, but it does not answer Luke's original question. I am very curious about such a tool as well.
Here is my research on the subject:
There are numerous sites offering to check 20-50 URLs for index status (search for "bulk indexation checker"), but none that can check the thousands of pages on a typical enterprise site.
Why not just trust WMT for the list? A) Large, complex sites have too many pages excluded via robots.txt or noindex tags, so important pages can slip through the cracks. More importantly, B) I do not trust WMT data.
Why not just use Screaming Frog, IIS, SEO Tools for Excel, Xenu, or another site scraper to check whether a cached copy exists? It seems simple: plug in a list of cache URLs like http://webcache.googleusercontent.com/search?q=cac... and report which ones throw 404 errors. Unfortunately, Google dislikes this and forces CAPTCHAs after 30 or so requests.
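For what it's worth, here's a minimal Python sketch of that cache-check idea. All names here are illustrative, and the caveat above applies: Google starts serving CAPTCHAs after roughly 30 of these requests, so this does not scale without rotating proxies.

```python
import time
import urllib.request
import urllib.error

def cache_url(url):
    """Build the Google cache URL for a page."""
    return "http://webcache.googleusercontent.com/search?q=cache:" + url

def is_cached(url):
    """True if Google serves a cached copy, False if it returns a 404."""
    req = urllib.request.Request(cache_url(url),
                                 headers={"User-Agent": "Mozilla/5.0"})
    try:
        urllib.request.urlopen(req, timeout=10)
        return True
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return False
        raise  # a 429/503 here usually means you've hit the CAPTCHA wall

def check_batch(urls, delay=2.0):
    """Check a small batch of URLs, pausing between requests."""
    results = {}
    for u in urls:
        results[u] = is_cached(u)
        time.sleep(delay)  # be polite; still expect a block around ~30 URLs
    return results
```

Even with the delay, this is only workable for spot checks, which is exactly why the thread turns to Scrapebox-with-proxies for anything enterprise-sized.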
After much research, I have only uncovered two ways to check large numbers of URLs:
1) Scrapebox, with rotating proxies.
2) Using the Moz toolbar and this search: https://www.google.com/search?q=site:www.zappos.co... Export the SERP to CSV using the Mozbar, then repeat for the next 200 results: https://www.google.com/search?q=site:www.zappos.co... Merge and purge.
The idea would be to use either method to generate a list of indexed URLs, then check it against a site crawl/sitemap file to see whether all important pages are indexed.
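The "merge & purge, then diff against a crawl" step could be sketched like this in Python. I'm assuming the Mozbar SERP export has a "URL" column and that the crawl/sitemap is already a plain list of URLs; both are assumptions about the file layout, not something the thread specifies.

```python
import csv

def load_serp_urls(paths):
    """Merge the URL columns of several SERP CSV exports, dropping duplicates.

    Assumes each CSV has a header row with a "URL" column.
    Trailing slashes are stripped so minor URL variants match.
    """
    indexed = set()
    for path in paths:
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                indexed.add(row["URL"].rstrip("/"))
    return indexed

def missing_pages(crawl_urls, indexed):
    """Pages present in the site crawl/sitemap but absent from the SERP export,
    i.e. candidates for "important but not indexed"."""
    return sorted(u for u in crawl_urls if u.rstrip("/") not in indexed)
```

The output of `missing_pages` is the list you'd actually investigate: pages your crawl says matter but that never showed up in the site: results.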
QUESTION: Does anyone know of a better method than 1 or 2 above? I'd be happy to share the results of my research with anyone similarly interested.
Thanks,
Carl
-
Just wrote a post exploring the various indexation-checking options. I settled on URL Profiler as the best solution for enterprise-sized sites.
http://seattlesearchnetwork.org/seo-tools/check-enterprise-website-google-cache-status/