The Moz Q&A Forum

    Is there a way to get a list of Total Indexed pages from Google Webmaster Tools?

    Intermediate & Advanced SEO
    • sparrowdog
      sparrowdog last edited by

      I'm doing a detailed analysis of how Google sees and indexes our website and we have found that there are 240,256 pages in the index which is way too many. It's an e-commerce site that needs some tidying up.

      I'm working with an SEO specialist to set up URL parameters and add rules to the robots.txt file so the excess pages aren't indexed (we shouldn't have more than around 3,000 - 4,000 pages). We're struggling to find a way to get a list of these 240,256 pages, which would help us decide what to put in the robots.txt file and which URLs to ask Google to remove.

      Is there a way to get a list of the indexed URLs? We can't find it in Google Webmaster Tools.

      • FedeEinhorn
        FedeEinhorn last edited by

        Joanne,

        I'm afraid there's no way to see which pages are actually indexed from within Webmaster Tools. You can run a simple search in Google, site:domain.com, and it will list "all" your indexed pages; however, there's no way to export that as a report.

        You can create a report using a bit of a "hack". Log in to your Google Drive, create a new spreadsheet and use the following formula to populate rows:

        =importXml("https://www.google.com/search?q=site:www.yourdomainnamehere.com&num=100&start=1"; "//cite")

        This will load the first 100 results. You will need to repeat the process for every 100 results you have, changing the last variable from "start=1" to "start=100", then "start=200", etc. (you see where I'm going). This could really be a pain in the butt for a site of your size.
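To avoid editing the formula by hand for each page, the repetition described above can be sketched in Python. This is a minimal sketch, not a tested scraping method: the function name `import_xml_formulas` and the `total` and `page_size` parameters are made up for illustration, and it assumes the `start` offset is zero-based (so the first page uses `start=0`). It also keeps the `;` argument separator from the formula above, which some spreadsheet locales replace with `,`.

```python
from urllib.parse import quote


def import_xml_formulas(domain, total=1000, page_size=100):
    """Build one IMPORTXML formula per page of `site:` results.

    `total` is how many results to cover and `page_size` is the `num`
    value; both are illustrative defaults, not limits confirmed by Google.
    """
    formulas = []
    for start in range(0, total, page_size):
        url = (
            "https://www.google.com/search"
            f"?q=site:{quote(domain)}&num={page_size}&start={start}"
        )
        # Same shape as the formula in the post: URL, then the XPath query.
        formulas.append(f'=importXml("{url}"; "//cite")')
    return formulas


formulas = import_xml_formulas("www.yourdomainnamehere.com")
for formula in formulas:
    print(formula)
```

Each printed line can be pasted into its own column or sheet, one per block of 100 results.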

        My recommendation is that you navigate your own site, decide which pages should be removed and then create the robots.txt regardless of what Google has indexed. Once you complete your robots.txt, it will take a few weeks (or even a month) to have the blocked pages removed.

        Hope that helps!

        • DeanAndrews
          DeanAndrews last edited by

          Hi,

          I'm going to assume that, as you have said it's an e-commerce site, the URL parameters are created by product variations, filters, sorts, etc. If so, you must already be seeing those parameters in the URLs of your site as you navigate, and in your analytics or search results.

          Your SEO specialist should easily be able to add those parameters to the robots.txt file. Then, personally, I would resubmit a sitemap for completeness and wait for the changes to take effect.
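As a rough illustration of what adding parameters to robots.txt might look like, here is a small Python sketch that turns a list of parameter names into Disallow rules. The parameter names (`sort`, `color`) and the function name `disallow_rules` are hypothetical, and the patterns rely on the `*` wildcard that Google's crawler supports in robots.txt paths; other crawlers may treat them differently. Review the output before using it, since an overly broad rule can block pages you want crawled.

```python
def disallow_rules(param_names):
    """Return robots.txt text that blocks URLs carrying the given parameters.

    Two rules are emitted per parameter: one for when it is the first
    query parameter (after `?`) and one for when it follows another (`&`).
    """
    lines = ["User-agent: *"]
    for name in sorted(set(param_names)):  # de-duplicate, stable order
        lines.append(f"Disallow: /*?{name}=")
        lines.append(f"Disallow: /*&{name}=")
    return "\n".join(lines)


# Hypothetical filter/sort parameters for an e-commerce site.
rules = disallow_rules(["sort", "color", "sort"])
print(rules)
```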

          • sparrowdog
            sparrowdog @FedeEinhorn last edited by

            Thanks. There's a lot of auto-generated content and duplicate pages, and we've set up the robots.txt file to exclude a large number of them. Now we wait.

            Very helpful and greatly appreciated. Thank you.

            • sparrowdog
              sparrowdog @DeanAndrews last edited by

              Correct. I have gone into URL Parameters already and set Crawl to 'No URLs' for those we don't want crawled.

              We haven't added the parameters listed there to the robots.txt file yet, but I will do that now. I had an initial consult today and we ran way over time when we discovered all this, so I have another appointment in a couple of weeks.

              We have a sitemap of all the category pages and relevant static pages on the site already and Google has those indexed nicely. We just need to get rid of the 240,000 pages it has indexed that we don't want in there (frightening I know - it's a really high number).

              I greatly appreciate you taking the time to respond. Thank you.

              • sparrowdog
                sparrowdog @FedeEinhorn last edited by

                I'm finally getting around to doing this and noticed that when I change the start number to anything above 900, it doesn't work, i.e. it's only letting me look at the first 1,000 results for some reason.

                The list of 1,000 has given me some good URLs to search from for the filtering thingy that was generating all the garbage URLs, but I'd love to get past 1,000 if I can.

                Does anyone know how?

                • sparrowdog
                  sparrowdog @sparrowdog last edited by

                  Looks like I can only do the first thousand. It's a start though. Thank you for the information.

                  Many of the URLs on my list, when put into Google search, are giving me 80-100 other variants I can remove by hand.

                  http://www.mathewporter.co.uk/list-a-domains-indexed-pages-in-google-docs/ for anyone else following.
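For anyone working through a similar list, one way to spot which filter is generating the unwanted variants is to count how often each query parameter appears across the collected URLs. The sketch below is only an illustration: the sample URLs and the function name `count_parameters` are made up, and it assumes the list holds full URLs including their query strings.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def count_parameters(urls):
    """Count how many URLs contain each query parameter name."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # A set, so a parameter repeated within one URL is counted once.
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


# Hypothetical sample; replace with the URLs exported from the spreadsheet.
sample_urls = [
    "http://www.example.com/shoes?color=red&sort=price",
    "http://www.example.com/shoes?color=blue",
    "http://www.example.com/shoes",
]
param_counts = count_parameters(sample_urls)
for name, n in param_counts.most_common():
    print(name, n)
```

Parameters that show up on a large share of the URLs are likely candidates for URL Parameters settings or robots.txt rules.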
