The Moz Q&A Forum

    Huge Google index on E-commerce site

    Intermediate & Advanced SEO
    • ssiebn7

      Hi Guys,

      I have a question about something I can't figure out.

      I'm working on an e-commerce site that recently got a CMS update, including URL changes.
      We set up a lot of 301s for the old URLs (around 3,000 to 4,000, I guess) and submitted a new sitemap (around 12,000 URLs, of which 10,500 are indexed).

      The strange thing is that when I check the index status in Webmaster Tools, Google tells me there are over 98,000 URLs indexed.
      A site:domainx.com search reports 111,000 URLs indexed.

      Another strange thing is one that another forum member describes here:

      Cache date has been reverted

      On top of that, old URLs (which have had a 301 for about a month now) keep showing up in the index.

      Does anyone know what I could do to solve the problem?

      • LynnPatchett

        Hi,

        A couple of things could be, and probably are, at work in this situation.

        1. For the 301 redirects: if the site is big (12,000 URLs), then depending on how often and how much Google crawls it, it could easily take more than a month for Google to find all the new URLs and 301 redirects and then update its index. So in this case it is a matter of patience. If the 301s are implemented correctly, the new URLs will eventually replace the old ones in the index.

        2. You have done 3,000 or 4,000 301s. What are you returning for the rest of the old 12,000 URLs, a 404? Redirecting that many pages is a big undertaking, but it is worth thinking about the technical side of what is happening: part of your 98,000 indexed URLs could be a mix of old and new if the old ones do not return a response that clearly states the page is either somewhere else (301) or no longer available (404).

        3. A common problem with e-shops is duplicate content caused by things like product filters and search string variables that produce pages that are indexable and have no rel canonical tag. A good way to check for this is to look for likely URL parts in your CMS that could cause the issue (maybe you have filters that result in URLs like xxx?search=123 or xxx?manufacturer=23) and then do a Google search along the lines of site:xxx.com inurl:manufacturer, which should give a good idea of whether and where you have this problem. The duplication could be even more pronounced if it occurred on your old CMS URLs AND your new CMS URLs, and a combination of both is in your 98,000 total.
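To get a first impression of how many URLs carry filter or search parameters (point 3), a crawl export or server log can be tallied by query parameter name. This is only a minimal sketch: the function name `count_query_parameters` and the sample URLs are made up for illustration, and the xxx.com addresses reuse the placeholder domain from above.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def count_query_parameters(urls):
    """Count how many URLs carry each query-string parameter name."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values=True so "?search=" still counts as using "search".
        # A set is used so a parameter repeated in one URL is counted once.
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


# Hypothetical sample; in practice this would come from a crawl or log export.
sample_urls = [
    "http://xxx.com/shoes",
    "http://xxx.com/shoes?manufacturer=23",
    "http://xxx.com/shoes?manufacturer=23&search=123",
    "http://xxx.com/boots?search=123",
]

parameter_counts = count_query_parameters(sample_urls)
for name, count in parameter_counts.most_common():
    print(f"{name}: {count} URLs")
```

Parameters that show up on a large share of URLs are the ones worth checking with a site:xxx.com inurl: search.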

        Hope that helps!

        • AJPro

          We had similar issues with too many indexed pages (about 100,000) for a site with about 3,500 pages.

          By setting a canonical URL on each page and also preventing Google from indexing and crawling some of the URLs (robots.txt and meta noindex), we are now down to 3,500 URLs. The benefit, besides less duplicate content, is much faster indexing of new pages.

          http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394
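As a rough illustration of the canonical approach, the sketch below strips filter-style parameters from a URL and builds the matching link tag. The parameter names in `FILTER_PARAMETERS` and both function names are assumptions for the example, not anything taken from the site's CMS.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of parameters that only filter or sort a listing.
FILTER_PARAMETERS = {"search", "manufacturer", "sort"}


def canonical_url(url):
    """Return the URL without filter parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in FILTER_PARAMETERS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the <link> element to place in the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


print(canonical_link_tag("http://xxx.com/shoes?manufacturer=23&page=2"))
```

One thing to keep in mind when combining the methods: a URL blocked in robots.txt is not crawled, so Google cannot read a canonical or meta noindex tag on that page.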

          • SEOAndy

            Something to check in WMT: if you go to the advanced section of the index status chart, you should see both the number currently in the index and the number ever indexed. It sounds like you are just seeing the ever-indexed number, which could be huge for almost any website.

            • ssiebn7

              All right guys, thanks a lot for the answers.

              I'm going to try some things out this coming Monday.

              Canonical URLs and pagination (rel=prev) will work, I guess.

              The hard part is that I'm working on this site with a development company that tells me they can only redirect all the 404s to the homepage, whereas they should be redirected to other products or to category pages.

              So the only solution is for me to do it by hand, one by one, in a tool they built. But it's a hell of a job!
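Part of that manual work could perhaps be pre-filled with a script. The sketch below assumes, and this may not hold after the CMS update, that the old URLs still share their category path with pages that exist on the new site. It walks up the old path until it finds a live page and only falls back to the homepage when nothing matches. The names `redirect_target` and `live_paths` and the sample paths are hypothetical.

```python
from urllib.parse import urlsplit


def redirect_target(old_url, live_paths, fallback="/"):
    """Pick the closest live parent path of an old URL as its 301 target."""
    path = urlsplit(old_url).path.rstrip("/")
    segments = [segment for segment in path.split("/") if segment]
    # Start at the direct parent and walk up one level at a time.
    for depth in range(len(segments) - 1, 0, -1):
        candidate = "/" + "/".join(segments[:depth])
        if candidate in live_paths:
            return candidate
    # Nothing matched, so this URL still needs a manual decision.
    return fallback


# Hypothetical set of paths that exist on the new site.
live_paths = {"/shoes", "/shoes/running", "/boots"}

old_urls = [
    "http://domainx.com/shoes/running/old-product-123",
    "http://domainx.com/shoes/trail/old-product-456",
    "http://domainx.com/hats/old-product-789",
]

for old_url in old_urls:
    print(old_url, "->", redirect_target(old_url, live_paths))
```

URLs that end up on the fallback are the ones left to map by hand, which should be a shorter list than the full set of 404s.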

              @Andy, I checked it and it actually says:

              Total indexed: 98,000
              Ever crawled: 929,762

              And when I hover over the question mark next to "Total indexed" it says:
              Total number of URLs added to Google's index.

              Thanks again for your answers 🙂
