The Moz Q&A Forum


    Are there any negative side effects of having millions of URLs on your site?

    Technical SEO Issues
    • Deluxe

      After a site upgrade, we found that we have over 3.7 million URLs on our site. Many of these URLs are due to the facet options. Each facet combination yields a different URL. However, we need to do a deeper analysis into these URLs to see if this is the only reason why so many are returning.

      Does anyone know if there are any negatives of having so many URLs crawled, other than the fact that Google only spends so much time crawling a site? Is the number of URLs something that should be concerning?

      Any insight appreciated!

      • Toddfoster

        There are several concerns to be addressed with this scenario:

        1. Organization

        A site this size is going to be very difficult to keep track of. If you are well organized, or the pages will not need much adjusting, this is probably okay.

        2. Duplicate Content

        This is going to be a pain in the behind. That being said, most site auditing tools will allow you to make adjustments as necessary.

        3. Broken Links

        With a site of this size, broken links and 404s are going to be inevitable. They could lead to some negative SEO impacts, so you will have to stay on top of them.

        4. Hacking

        This is a big reason why some sites have enormous numbers of URLs. It would likely be the biggest concern on my mind and is worth looking into. Going through that many pages by hand will be impossible, so it might be worth taking a look at the link profile and determining where most of your links are coming from. If they are coming from spammy sites, you may have a problem there.

        All this being said, the size of a website is normally not a cause for concern. Just make sure that your main pages (Home, Landing Pages) are properly handled and optimized and you shouldn't have too much trouble. I would add that large, unwieldy .htaccess files can result in slower loading times, which can impact your rankings with Google.

        Let me know if there is anything specific concerning you and I will be happy to help. Congrats on the upgrade and hope it works out!

        Rob

        • MichaelC-15022

          I'll echo Robert's concern about duplicate content.  If those facet combinations are creating many pages with very similar content, that could be an issue for you.

          If, let's say, there are 100 facet combinations that create essentially the same basic page content, then consider taking facet elements that do NOT substantially change the page content, and use rel=canonical to tell Google that those are all really the same page.  For instance, let's say one of the facets is packaging size, and product X comes in boxes of 1, 10, 100, or 500 units.  Let's say another facet is color, and it comes in blue, green, or red. Let's say the URLs for these look like this:

          www.mysite.com/product.php?pid=12345&color=blue&pkgsize=1

          www.mysite.com/product.php?pid=12345&color=green&pkgsize=10

          www.mysite.com/product.php?pid=12345&color=red&pkgsize=100

          You would want to set the rel=canonical on all of these to:

          www.mysite.com/product.php?pid=12345
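
As a rough sketch of that mapping (not a definitive implementation), the canonical URL can be computed by dropping the facet parameters that do not change the page content. The parameter names `color` and `pkgsize` come from the example URLs above; the `https://` scheme, the function names, and the choice of Python are assumptions made for illustration.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Facet parameters assumed NOT to change the page content substantially.
# Taken from the example URLs; a real site would maintain its own list.
NON_CANONICAL_PARAMS = {"color", "pkgsize"}


def canonical_url(url: str) -> str:
    """Return the URL with the non-canonical facet parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in NON_CANONICAL_PARAMS
    ]
    # Rebuild the URL without a fragment; remaining parameters keep their order.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for the page's <head>."""
    href = html.escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    variant = "https://www.mysite.com/product.php?pid=12345&color=green&pkgsize=10"
    print(canonical_link_tag(variant))
```

Every facet variant of product 12345 then emits the same canonical tag, which is the signal that tells Google they are really one page.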

          Be sure that your XML sitemap, your on-page meta robots, and your rel=canonicals are all in agreement.  In other words, if a page has meta robots "noindex,follow", it should NOT show up in your XML sitemap.  If the pages above have their rel=canonicals set as described, then your sitemap should contain www.mysite.com/product.php?pid=12345 and NONE of the three example URLs with the color and pkgsize parameters above.
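
One way to check that agreement is sketched below, assuming you already have crawl data with each page's URL, meta robots value, and rel=canonical target. The `Page` record and the `sitemap_conflicts` function are hypothetical names for illustration, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    meta_robots: str  # e.g. "index,follow" or "noindex,follow"
    canonical: str    # the rel=canonical target found on the page


def sitemap_conflicts(pages, sitemap_urls):
    """List (url, reason) pairs for sitemap entries that contradict on-page signals."""
    sitemap = set(sitemap_urls)
    conflicts = []
    for page in pages:
        if page.url not in sitemap:
            continue  # only URLs listed in the sitemap can conflict with it
        directives = {d.strip().lower() for d in page.meta_robots.split(",")}
        if "noindex" in directives:
            conflicts.append((page.url, "noindex page listed in sitemap"))
        elif page.canonical != page.url:
            conflicts.append((page.url, "non-canonical URL listed in sitemap"))
    return conflicts


if __name__ == "__main__":
    base = "https://www.mysite.com/product.php?pid=12345"
    pages = [
        Page(base, "index,follow", base),
        Page(base + "&color=blue&pkgsize=1", "index,follow", base),
    ]
    for url, reason in sitemap_conflicts(pages, [p.url for p in pages]):
        print(f"{reason}: {url}")
```

An empty result means the sitemap lists only indexable, self-canonical URLs.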

          • CleverPhD

            Agree with the points above, with one exception. Yes, you have to find a way to deal with duplicate content and content quality at scale. Yes, robots.txt, nofollow links and index sitemaps are your friends. But I would not use rel=canonical unless I had to. Better to get those extra pages de-indexed and then not let Google crawl the URLs with the extra parameters to start with. Why waste Google's time crawling pages that are just re-sorted versions of another? If you use the directives wisely, you probably "only" have 200,000 pages worth crawling, if you have that many sort parameters.
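
To get a feel for how much of the URL inventory is actually worth crawling, a sketch like the following can split a URL list by whether it carries parameters you intend to block. The parameter names `sort` and `order` are hypothetical placeholders, as are the function names; substitute whatever your facets actually use.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical parameters that only re-sort or re-order an existing page.
BLOCKED_PARAMS = {"sort", "order"}


def is_worth_crawling(url: str) -> bool:
    """True if the URL carries none of the parameters marked for blocking."""
    query = urlsplit(url).query
    params = parse_qsl(query, keep_blank_values=True)
    return not any(key in BLOCKED_PARAMS for key, _ in params)


def crawl_summary(urls):
    """Count how many URLs would stay crawlable and how many would be blocked."""
    crawlable = sum(1 for url in urls if is_worth_crawling(url))
    return {
        "total": len(urls),
        "crawlable": crawlable,
        "blocked": len(urls) - crawlable,
    }


if __name__ == "__main__":
    sample = [
        "https://www.mysite.com/category.php?cid=7",
        "https://www.mysite.com/category.php?cid=7&sort=price",
        "https://www.mysite.com/category.php?cid=7&sort=name&order=desc",
    ]
    print(crawl_summary(sample))
```

Running this over an export of the full URL list gives a rough estimate of how far the crawlable set shrinks before you commit to any robots.txt rules.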

            Good luck!
