The Moz Q&A Forum

    Indexing a new website with several million pages

    • Pureshore

      Hello everyone,

      I am currently working for a huge classifieds website that will launch in France in September 2013.

      The website will have up to 10 million pages. I know that a website of this size should be indexed step by step rather than all at once, to reduce the risk of a long sandbox period and to keep more control over the process.

      Do you have any recommendations or good practices for such a task? Maybe some personal experience to share?

      The website will cover about 300 jobs:

      • In all regions (= 300 * 22 pages)
      • In all departments (= 300 * 101 pages)
      • In all cities (= 300 * 37,000 pages)
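
A quick check of the arithmetic above, as a minimal Python sketch. The variable names are illustrative; the figures (300 jobs, 22 regions, 101 departments, 37,000 cities) come from the post. It shows that the city level accounts for almost all of the pages, and that the three levels together come to a little over 11 million, slightly more than the "up to 10 million" estimate.

```python
# Page counts implied by the post: 300 jobs crossed with each locality level.
JOBS = 300
LOCALITIES = {"region": 22, "department": 101, "city": 37_000}

# One landing page per (job, locality) pair at each level.
pages_per_level = {level: JOBS * count for level, count in LOCALITIES.items()}
total_pages = sum(pages_per_level.values())

for level, pages in pages_per_level.items():
    print(f"{level:<12}{pages:>12,}")
print(f"{'total':<12}{total_pages:>12,}")
```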

      Do you think it would be wiser to index a few jobs at a time (for instance, 10 jobs every week) or to index by page level (for example, a first step with jobs by region, a second step with jobs by department, etc.)?
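
To make the two options concrete, here is a rough Python sketch of how many pages each step would release. It assumes the figures from the post and the example batch size of 10 jobs per week; the function names are hypothetical, and nothing here says which schedule Google would prefer.

```python
JOBS = 300
LOCALITIES = {"region": 22, "department": 101, "city": 37_000}
JOBS_PER_WEEK = 10  # example batch size taken from the question


def schedule_by_job_batches(jobs=JOBS, batch=JOBS_PER_WEEK):
    """Option A: each week, open every locality level for a few more jobs."""
    pages_per_job = sum(LOCALITIES.values())
    return [
        min(batch, jobs - start) * pages_per_job
        for start in range(0, jobs, batch)
    ]


def schedule_by_level(jobs=JOBS):
    """Option B: open one locality level at a time for all jobs."""
    return [(level, jobs * count) for level, count in LOCALITIES.items()]


weekly = schedule_by_job_batches()
print(f"Option A: {len(weekly)} weeks, {weekly[0]:,} pages in the first week")
for level, pages in schedule_by_level():
    print(f"Option B: {level} step releases {pages:,} pages")
```

With these figures, option A releases about 371,000 pages a week for 30 weeks, while option B starts with 6,600 pages and ends with a single step of 11.1 million city pages, so the city level would probably need to be split further.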

      More generally, how would you go about avoiding penalties from Google while getting the whole site indexed as fast as possible?

      One more detail: we'll rely on (hopefully substantial) press coverage and on a link building effort that has yet to be defined.

      Thanks for your help!

      Best Regards,

      Raphael

      • EGOL

        If you plan to get a website that big indexed you will need to have a few things in order...

        First, you will need thousands of deep links that connect to hub pages deep within the site. These will force spiders down there and make them chew their way out through the unindexed pages. These must be permanent links. If you remove them, spiders will stop visiting and Google will forget your pages. For a 10 million page site you will need thousands of links hitting thousands of hub pages.

        Second, for a site this big.... are you going to have substantive amounts of unique content?  If your pages are made from a cookie cutter and look like this....

        "yada yada yada yada yada yada yada yada SEO job in Paris yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada send application to Joseph Blowe, 11 Anystreet, Paris, France yada yada yada yada yada yada yada yadayada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada"

        .... then Google will index these pages, and a few weeks to a few months later your entire site might receive a Panda penalty and drop from Google.

        Finally... all of those links needed to get the site in the index... they need to be Penguin proof.

        It is not easy to get a big site in the index.  Google is tired of big cookie cutter sites with no information or yada yada content.  They are quickly toasted these days.

        • iboxsecurityltd

          We worked in partnership with a similar large-scale site last year and found exactly the same thing. Google simply dropped 60% of our pages from the index because they were cookie cutter.

          You have to ensure that pages have relevant, unique and worthwhile content. If all you're doing is replacing the odd word here and there for the locality and job name, it's not going to work.

          Focus on running an ongoing SEO campaign for each target audience, for example by job type or by locality.
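
The point above about swapping only the locality and job name can be checked roughly before launch. The sketch below is an assumption-laden illustration, not how Google evaluates pages: it compares two pages by Jaccard similarity over three-word shingles. The template text, sample pages and function names are all made up for the example.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two texts, from 0.0 (disjoint) to 1.0 (same shingles)."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical cookie-cutter template: only the job and the city change.
TEMPLATE = (
    "Browse the latest {job} offers in {city}. Post your own ad for free, "
    "compare local professionals, read reviews and contact them directly "
    "from this page."
)
page_a = TEMPLATE.format(job="plumber", city="Paris")
page_b = TEMPLATE.format(job="baker", city="Lyon")

# Hypothetical page built from real classifieds instead of the template.
page_c = (
    "Experienced pastry chef wanted for a family bakery near the old port, "
    "early shifts, weekends off, training provided for apprentices."
)

print(f"templated pair: {jaccard(page_a, page_b):.2f}")
print(f"templated vs unique: {jaccard(page_a, page_c):.2f}")
```

A high score between two templated pages suggests they differ only by the substituted words. Any cut-off for acting on it would be a judgment call.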

          • jonnyholt

            I really don't think Google likes it when you release a website that big all at once. It would much rather you build it up slowly. I would urge you to launch with the main pages indexable and to noindex the subcategories.
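
One way to apply the noindex suggestion above is the standard robots meta tag. The sketch below assumes a hypothetical page model with three levels (region, department, city) and a hypothetical set of levels opened at launch; only the meta tag itself is standard.

```python
# Hypothetical launch setting: only region pages are open to indexing.
INDEXABLE_LEVELS = {"region"}


def robots_meta(level, indexable_levels=INDEXABLE_LEVELS):
    """Return the robots meta tag for a page at the given level.

    Pages that are not opened yet get "noindex, follow", so crawlers can
    still follow their links without indexing the page itself.
    """
    content = "index, follow" if level in indexable_levels else "noindex, follow"
    return f'<meta name="robots" content="{content}">'


for level in ("region", "department", "city"):
    print(level, robots_meta(level))
```

Opening a new level later would then only mean adding it to `INDEXABLE_LEVELS`.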

            • Pureshore

              Hello everyone,

              Thanks for sharing your experience and your answers; it's greatly appreciated.

              The website is built to avoid cookie-cutter pages: each page will have unique content taken from classifieds (unique because the individual classifieds won't be indexed at first, to avoid having too many pages).

              The internal linking is also designed so that each page has permanent internal links arranged logically.

              I understand from your answers that it is better to take our time and index the site step by step, mostly according to the number and quality of the classifieds (and thus the content) for each job and locality. It's not worth indexing pages without any classifieds (and thus without unique content), as Google will drop them in the near future.
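
The rule described above (only expose pages that already have classifieds) could be sketched as follows. The threshold, the example URLs and the function names are assumptions for illustration. The 50,000 URL limit per sitemap file comes from the sitemaps.org protocol.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # limit set by the sitemaps.org protocol
MIN_CLASSIFIEDS = 5  # hypothetical threshold, to be tuned


def indexable_urls(pages, min_classifieds=MIN_CLASSIFIEDS):
    """Keep only the URLs of pages that already list enough classifieds.

    `pages` is an iterable of (url, classifieds_count) pairs.
    """
    return [url for url, count in pages if count >= min_classifieds]


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Split the URL list into groups that each fit in one sitemap file."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap(urls):
    """Serialize one group of URLs as a sitemap XML string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Made-up sample data: (url, number of classifieds on the page).
pages = [
    ("https://example.com/plombier/paris", 12),
    ("https://example.com/plombier/petit-village", 0),
]
for group in chunk(indexable_urls(pages)):
    print(build_sitemap(group))
```

Re-running this as classifieds come in would widen the set of submitted pages gradually, in line with the step-by-step approach discussed in the thread.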
