Google Search Results...
-
I'm trying to download every Google search result for my company using site:company.com. The most I can get is 100. I tried using SEOquake, but it also stops at 100.
The reason for this? I would like to see which pages are indexed. The www pages and subdomain pages should only make up about 7,000, but the search shows 23,000 results. I would like to see what the rest of the 23,000 are.
Any advice on how to go about this? I can check subdomains individually with site:www.company.com and site:static.company.com, but I don't know all the subdomains.
Has anyone cracked this? I tried using a scraper tool, but it was only able to retrieve 200.
-
Hi Cyto. Why don't you try exporting the pages receiving google/organic visits from Google Analytics, using Landing Page as a secondary dimension? It won't be all-inclusive, but it will give you a good idea of which pages are indexed and drawing in visitors. You can then compare that data against your sitemaps.
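If it helps, the comparison step could look roughly like this in Python. This is only a sketch: it assumes you already have both exports as lists of full URLs (GA often reports landing pages as paths, so you may need to prepend the host first), and the function names are made up for illustration.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Lowercase the host and drop query string, fragment and trailing
    slash, so that minor variants of the same page compare equal."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}"


def compare_url_sets(landing_pages, sitemap_urls):
    """Return which URLs appear in both exports and which in only one."""
    landing = {normalize(u) for u in landing_pages}
    sitemap = {normalize(u) for u in sitemap_urls}
    return {
        "in_both": sorted(landing & sitemap),
        "only_in_analytics": sorted(landing - sitemap),
        "only_in_sitemap": sorted(sitemap - landing),
    }
```

Anything that lands in "only_in_analytics" is getting organic visits without being in a sitemap, so that is where I would look first.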
-
My GA is only focused on a single domain, as subdomains hold just PDFs, images etc. Traffic reports from GA are focused on www.company.com pages.
The only way I can know exactly which URLs have been indexed seems to be going through the Google search results, but they cap out after 7 pages.

-
Ok, but what's your goal with this? And why don't you know your own subdomains that you've created? It seems like you could work backwards from a better starting point by applying those things.
-
The goal is to identify which pages Google is indexing and whether any of them shouldn't be. (We don't want search pages indexed, nor basket or checkout pages.)
I do know all of the subdomains, and searching them individually doesn't add up to the total result count I get for site:company.com.
My Moz reports don't show duplicate pages, so it can't be that. If I were able to download a full Google search result into a spreadsheet, I could quickly filter it and see which pages are being indexed that shouldn't be.
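To show the kind of filtering I mean, here is a rough Python sketch. It assumes the indexed URLs are already in a list (that export is exactly the part I can't get), and the path fragments are placeholders, not our real URL structure.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical path fragments for pages that should not be indexed;
# replace these with the site's real search/basket/checkout paths.
UNWANTED_PATTERNS = ("/search", "/basket", "/checkout")


def flag_unwanted(urls, patterns=UNWANTED_PATTERNS):
    """Return the URLs whose path contains any unwanted fragment."""
    flagged = []
    for url in urls:
        path = urlsplit(url).path.lower()
        if any(pattern in path for pattern in patterns):
            flagged.append(url)
    return flagged


def count_by_host(urls):
    """Count URLs per hostname, to see which subdomain holds the extras."""
    return Counter(urlsplit(url).netloc.lower() for url in urls)
```

The per-host counts would also tell me straight away whether the extra results sit on a subdomain I haven't accounted for.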
-
I see. If you have some idea of which section of your site might be in there that you don't want, you can use site:company.com inurl:whatever to narrow it down. You should know the file name or URL path used for your search and shop pages, and you can put that after the inurl: modifier.
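If you have a handful of sections to check, you could generate the queries in one go and paste them into Google one at a time. A minimal Python sketch, where the fragment list is just an example and not your real URL structure:

```python
def build_site_queries(domain, url_fragments):
    """Build one 'site:... inurl:...' query string per URL fragment."""
    return [f"site:{domain} inurl:{fragment}" for fragment in url_fragments]


# Example fragments only; use the names your search and shop pages really use.
for query in build_site_queries("company.com", ["search", "basket", "checkout"]):
    print(query)
```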