The Moz Q&A Forum

    Lots of incorrect urls indexed - Googlebot found an extremely high number of URLs on your site

    Intermediate & Advanced SEO
    • SarahCollins

      Hi,

      Any assistance would be greatly appreciated.

      Basically, our rankings and traffic have been dropping massively recently, and Google sent us a message stating "Googlebot found an extremely high number of URLs on your site".

      This first alerted us to the problem: for some reason our eCommerce site has recently generated loads (potentially thousands) of rubbish URLs, giving us duplication everywhere, which Google is obviously penalizing us for in terms of rankings dropping etc.

      Our developer is trying to find the root cause of this, but my concern is: how do we get rid of all these bogus URLs? If we use GWT (Google Webmaster Tools) to remove URLs it's going to take years.

      We have just amended our robots.txt file to exclude them going forward, but they have already been indexed, so I need to know: do we put a 301 redirect on them and also an HTTP 404 status code to tell Google they don't exist? Do we also put a noindex on the pages, or what?

      What is the best solution?

      A couple of examples of our problems are here:

      In Google, type:

      site:bestathire.co.uk inurl:"br"

      You will see 107 results. This is one of many sets of URLs we need to get rid of.

      Also:

      site:bestathire.co.uk intitle:"All items from this hire company"

      This shows 25,300 indexed pages we need to get rid of.

      Another thing that would help tidy this mess up going forward is to improve our pagination. Our site uses rel=next and rel=prev but no canonical.

      As a belt and braces approach, should we also put canonical tags on our category pages where there is more than one page? I was thinking of doing it on page 1 of our most important pages, or on the view-all page, or both. What's the general consensus?

      Any advice on both points greatly appreciated.

      thanks

      Sarah.
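As a quick sanity check on the robots.txt change described above, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The `Disallow: /br/` rule and the `/br/some-page` path are made up for illustration, since the thread does not show the real file. This parser matches rules by simple path prefix, so it may not behave exactly like Googlebot for wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file is not shown in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /br/
"""


def is_blocked(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows `agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.bestathire.co.uk/br/some-page",  # hypothetical bogus URL
        "http://www.bestathire.co.uk/rent/Vacuum_cleaners/Walsall/250",
    ):
        print(url, "blocked" if is_blocked(ROBOTS_TXT, url) else "allowed")
```

One thing to bear in mind when reading the result: robots.txt controls crawling, so a URL that is blocked there cannot be re-fetched by the crawler to pick up a redirect or a tag placed on it.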

      • emediaSEO

        In the short term I would definitely use canonicals to let Google know which are the right pages until you can fix your problem. Also, have you submitted a sitemap to Webmaster Tools?

        • SarahCollins (replying to @emediaSEO)

          Yes, we originally submitted mini sitemaps to Webmaster Tools a couple of months back. As our site is 60K pages, we broke it down into categories etc.

          We have not submitted a new sitemap since finding this problem.

          We are in the process of using the sitemap generator to generate a new sitemap to see if it picks up anything unusual.

          Are you suggesting we resubmit?

          thanks

          Sarah
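One rough way to spot "anything unusual" in a regenerated sitemap is to count URLs per site section and look for a section that has ballooned. The sketch below is only an illustration: the sample sitemap, the `example.com` URLs and the choice of grouping by the first two path segments are all assumptions, not details from the thread.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlsplit

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sample; a real sitemap would be read from a file instead.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/rent/Vacuum_cleaners/TownA/250</loc></url>
  <url><loc>http://www.example.com/rent/Vacuum_cleaners/TownB/250</loc></url>
  <url><loc>http://www.example.com/rent/Ladders/TownA/250</loc></url>
</urlset>
"""


def sitemap_urls(xml_text):
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]


def count_by_section(urls, depth=2):
    """Count URLs grouped by their first `depth` path segments."""
    counts = Counter()
    for url in urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        counts["/" + "/".join(segments[:depth])] += 1
    return counts


if __name__ == "__main__":
    for section, total in count_by_section(sitemap_urls(SAMPLE_SITEMAP)).most_common():
        print(section, total)
```

A section whose count is far above the number of real pages you expect there is a reasonable place to start looking for generated duplicates.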

          • emediaSEO (replying to @SarahCollins)

            As long as you think the sitemap is done right it should be fine.

            • RuthBurrReedy

              Oh how frustrating!

              There are a couple of things that you can do. Updating your robots.txt is a good start since the next time your site is crawled, Google should find that and drop at least some of the offending pages from the index. I would also go into every page of your site and add a rel=canonical tag pointing to the original version of the URL. That way, even if your ecommerce platform is generating odd versions of the URL, that canonical tag will be on the duplicate versions, letting engines know they're not the original page.

              For the existing pages, you could just 301 them all back to the original versions, or add the canonical tag pointing back to the original versions. I would also add the <meta name="robots" content="noindex"> tag to these pages to let Google know not to include them in the index.

              With pagination and canonicalization there are a few different approaches, and each has its pros and cons. Dr. Pete wrote a really great post on canonicalization that just went out; you can read it here: http://www.seomoz.org/blog/which-page-is-canonical. I also recommend reading Adam Audette's post on pagination options at Search Engine Land: http://searchengineland.com/the-latest-greatest-on-seo-pagination-114284. I hope that helps!
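To make the two tags suggested above concrete, here is a small Python sketch that builds them as strings for a page's head section. The function names and the `is_duplicate` flag are hypothetical; how your platform's templates actually emit head tags is not described in the thread.

```python
from html import escape


def canonical_tag(original_url: str) -> str:
    """Build a rel=canonical link element pointing at the original URL."""
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


def noindex_tag() -> str:
    """Build a robots meta element asking engines not to index the page."""
    return '<meta name="robots" content="noindex">'


def head_tags(original_url: str, is_duplicate: bool) -> list[str]:
    """Tags for a page's <head>: always the canonical, plus noindex on known duplicates."""
    tags = [canonical_tag(original_url)]
    if is_duplicate:
        tags.append(noindex_tag())
    return tags


if __name__ == "__main__":
    original = "http://www.bestathire.co.uk/rent/Vacuum_cleaners/Walsall/250"
    for tag in head_tags(original, is_duplicate=True):
        print(tag)
```

The key point is that the `href` always carries the original URL, no matter which URL the page is being served from.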

              • SarahCollins (replying to @RuthBurrReedy)

                Thanks, Ruth, for the very comprehensive answer. Greatly appreciated!

                Just to clarify your suggestion about the rel=canonical tag: put it on the preferred pages. When the duplicate odd URLs get generated, they won't have a canonical tag, so Google will know they are not the original page? Is that correct?

                Sorry, I just got a bit confused as you said the duplicate pages will have a canonical tag as well?

                As for the existing pages, they are very recent, so I wouldn't assume they have any PR to warrant a 301 as opposed to a 404, but I guess either would be OK.

                Also, adding the meta noindex tag as you suggested sounds very wise, so will get that done too.

                We also can't find how these URLs were created and then indexed, so just hoping a debug file we just created may shed some light.

                Will keep you posted.

                Many thanks

                Sarah

                • RuthBurrReedy (replying to @SarahCollins)

                  Since your ecommerce platform is (I assume) duplicating the entire page, code and all, and putting it at these new URLs, having a canonical tag with the original page's URL in the code of the right/real page means that, when the page gets duplicated, the canonical tag gets duplicated as well and points back to the original URL. Make sense?

                  Can you talk to your ecommerce platform provider?  This can't be an intended feature!

                  • SarahCollins

                    Ahhh, I see what you mean now. Yes, good idea.

                    Will get that implemented too.

                    Yes, everything is duplicated. It's all the same apart from the URL, which seems to be bringing in two different locations instead of one.

                    Odd URL generated (notice it has 2 locations in it):

                    http://www.bestathire.co.uk/rent/Vacuum_cleaners/Walsall/250/Alfreton

                    Correct location-specific URLs:

                    http://www.bestathire.co.uk/rent/Vacuum_cleaners/Walsall/250

                    http://www.bestathire.co.uk/rent/Vacuum_cleaners/Alfreton/250

                    thanks

                    Sarah.
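Based on the example URLs above, the odd URLs could be detected by their extra path segment. The Python sketch below assumes the correct shape is always `/rent/<category>/<location>/<number>` and, purely as an assumption, treats the first location as the intended one; the thread does not say which of the two locations is the right target.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed shape of a correct URL: /rent/<category>/<location>/<number>
EXPECTED_SEGMENTS = 4


def path_segments(url: str) -> list[str]:
    """Split a URL's path into its non-empty segments."""
    return [s for s in urlsplit(url).path.split("/") if s]


def is_odd_url(url: str) -> bool:
    """True if a /rent/ URL carries more segments than the assumed correct shape."""
    segments = path_segments(url)
    return bool(segments) and segments[0] == "rent" and len(segments) > EXPECTED_SEGMENTS


def canonical_url(url: str) -> str:
    """Drop the extra segments from an odd URL; return other URLs unchanged."""
    if not is_odd_url(url):
        return url
    parts = urlsplit(url)
    path = "/" + "/".join(path_segments(url)[:EXPECTED_SEGMENTS])
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    odd = "http://www.bestathire.co.uk/rent/Vacuum_cleaners/Walsall/250/Alfreton"
    print(is_odd_url(odd))
    print(canonical_url(odd))
```

The same mapping could feed either a 301 redirect or the canonical tag, whichever approach is chosen.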
