The Moz Q&A Forum


    How can I best find out which URLs from large sitemaps aren't indexed?

    Technical SEO Issues
    • rango

      I have about a dozen sitemaps containing just over 300,000 URLs in total. They were carefully built to include only the content that I feel is above a certain threshold.

      However, Google reports that it has indexed only 230,000 of these URLs. How can I best work out which URLs haven't been indexed? Webmaster Tools (WMT) shows no errors related to these pages.

      I could obviously start checking them manually, but surely there's a better way?

      • Audiohype

        Hi Peter,

        I'd export both the indexed URLs and the full list of sitemap URLs into an Excel file, then match the two lists and remove the duplicates. Whatever is left over is what hasn't been indexed.

        You'd need to look into the details, but I'm sure there's a way of matching the lists and removing duplicates.

        Other than that I wouldn't know.

        Ben
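The matching step described above doesn't have to happen in Excel. Here is a minimal Python sketch of the same idea, assuming you already have the indexed URLs as a plain list (getting that list is the hard part and isn't solved here). The function names `urls_from_sitemap` and `not_indexed` and the example.com sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(xml_text):
    """Return the set of <loc> values found in one sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text}


def not_indexed(sitemap_urls, indexed_urls):
    """Return the sitemap URLs that are missing from the indexed list, sorted."""
    return sorted(set(sitemap_urls) - set(indexed_urls))


# Hypothetical sample data standing in for the real sitemap and index export.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/a</loc></url>
  <url><loc>http://www.example.com/b</loc></url>
  <url><loc>http://www.example.com/c</loc></url>
</urlset>"""
sample_indexed = ["http://www.example.com/a", "http://www.example.com/c"]

missing = not_indexed(urls_from_sitemap(sample_sitemap), sample_indexed)
print(missing)
```

With a dozen sitemaps you would call `urls_from_sitemap` once per file and union the results before comparing. The comparison is an exact string match, so differences such as trailing slashes or http vs. https would need normalizing first.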

          • rango @Audiohype

          Any ideas on how to go about exporting the indexed URLs?

            • Audiohype @rango

            There's no obvious export function in Webmaster Tools, but after having a look around I found this option:

            http://www.aspfree.com/c/a/BrainDump/Extracting-Google-Indexed-Web-Site-Pages-Using-MS-Excel/

            However, Google only displays the first 1,000 URLs for a site: query, so you would need to adapt it and run it many times. From the looks of it, there isn't an easy way.
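One way to plan that "many times" is to group the sitemap URLs by path prefix and run one narrower site: query per group. The sketch below only builds the query strings and counts the URLs behind each; it doesn't query Google. It assumes the site: operator accepts a path prefix, and the names `group_by_prefix`, `site_queries` and `RESULT_CAP` are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit

RESULT_CAP = 1000  # per-query display limit mentioned in the post


def group_by_prefix(urls, depth=1):
    """Group URLs by host plus their first `depth` path segments."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        segments = [s for s in parts.path.split("/") if s]
        prefix = "/".join([parts.netloc] + segments[:depth])
        groups[prefix].append(url)
    return dict(groups)


def site_queries(urls, depth=1, cap=RESULT_CAP):
    """Return (query, url_count, fits_under_cap) tuples, one per prefix."""
    return [
        ("site:" + prefix, len(members), len(members) <= cap)
        for prefix, members in sorted(group_by_prefix(urls, depth).items())
    ]


# Hypothetical sample URLs standing in for the real sitemap contents.
urls = [
    "http://www.example.com/blog/post-1",
    "http://www.example.com/blog/post-2",
    "http://www.example.com/shop/item-9",
]
for query, count, fits in site_queries(urls):
    print(query, count, "ok" if fits else "split further")
```

Any group flagged "split further" holds more sitemap URLs than one query can display, so it would need a larger `depth`. With 300,000 URLs that's still a lot of queries, which backs up the point that there's no easy way.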

            There may be a tool out there similar to Xenu that also checks index status in Google. I've never needed one, so I'm not aware of any, but chances are something exists.

            Good luck!
