The Moz Q&A Forum


    Crawl efficiency - Page indexed after one minute!

    • Mr.bfz

      Hey guys,

      A site has 5+ million pages indexed and adds 300 new pages a day. I hear a lot that for sites at this level, it's all about efficient crawlability. New pages on this site get indexed one minute after they go online.

      1) Does this mean the site is already being crawled efficiently and there is not much else to do about it?

      2) By increasing crawl efficiency, should I expect Google to crawl my site less (taking less bandwidth from my site for the same amount of crawling) or to crawl my site more often?

      Thanks

      • AndrewAtMGXCopy

        You can actually let Google know about a large batch of new pages through the sitemap. The sitemap is a single file that can be parsed to produce a large list of links.

        Google can discover new pages by comparing the list of links with what they know about.

        Here's an intro link that covers the sitemap: http://blog.kissmetrics.com/get-google-to-index/
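As a rough illustration of the sitemap idea above, here is a minimal Python sketch that builds a sitemap file for a day's batch of new pages. The URLs, dates, and the `build_sitemap` helper are hypothetical and not from this thread; only the `urlset`/`url`/`loc`/`lastmod` structure comes from the sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return a sitemap XML string for an iterable of (url, lastmod) pairs.

    build_sitemap is a hypothetical helper written for this example.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical example data: two of the day's new pages.
new_pages = [
    ("https://www.example.com/products/1001", "2024-01-15"),
    ("https://www.example.com/products/1002", "2024-01-15"),
]

sitemap_xml = build_sitemap(new_pages)
print(sitemap_xml)
```

A single sitemap file is limited to 50,000 URLs, so 300 new pages a day fit comfortably in one file, while a full listing of 5 million pages would need to be split across many files tied together by a sitemap index.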

        • anthonydnelson

          Crawl efficiency isn't exactly the same as indexation speed. It is normal for a new page to be indexed quickly; often it is linked to from the blog home page, shared on social networks, etc.

          Crawl efficiency has a lot to do with making sure your most important pages are crawled as frequently as possible. Let's use the example of your site with 5,000,000 pages indexed. Perhaps there are 100,000 of those pages that are extremely important for your website. Your top categories, all of your products, your content, etc.

          Then you are left with 4,900,000 pages that are not that important, but are needed for the functionality of your website (pagination, filtering, sorting, etc.). You have to determine whether it is a good thing that Google has 5 million pages of your site indexed. Do you want Google regularly crawling those 4,900,000 pages, potentially at the expense of your more important pages?

          Next, you check your Google Webmaster Tools and see that Google is crawling about 130,000 pages/day on your site. At that rate, it would take Google about 38 days (more than a month) to crawl your entire site. Of course, it doesn't actually work that way. Google will crawl your site in a logical manner, crawling the pages with high authority (well linked to internally/externally) much more often. The point is, you can see that not all of your pages are being crawled every day. You want your best content crawled as frequently as possible.
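The arithmetic behind that estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the example figures from this post; as noted above, Google does not actually crawl a site in one even pass, and the variable names are made up for illustration.

```python
# Example figures from the post above (illustrative, not measured data).
total_pages = 5_000_000
important_pages = 100_000
crawled_per_day = 130_000

# Days needed if every page were crawled exactly once, in turn.
days_for_full_pass = total_pages / crawled_per_day

# Upper bound on how often each important page could be crawled per day
# if the whole daily crawl were spent on important pages only.
max_daily_crawls_per_important_page = crawled_per_day / important_pages

print(f"Full pass: {days_for_full_pass:.1f} days")
print(f"Important pages: up to {max_daily_crawls_per_important_page:.1f} crawls/day each")
```

The gap between the two numbers is the point of the example: the same daily crawl volume covers the important pages more than once a day, or the whole site only about once every five to six weeks.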

          "To be more blunt, if a page hasn't been crawled recently, it won't rank well." This quote is taken from one of my favorite resources on this topic, this post by AJ Kohn: http://www.blindfiveyearold.com/crawl-optimization

          Crawl efficiency is guiding the search spiders to your best content and helping them learn what types of pages they can ignore. You do this primarily through: Site Structure, Internal Linking, robots.txt, the NoFollow attribute and Parameter Handling in Google Webmaster Tools.
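To make the robots.txt item in that list concrete, here is a small sketch that uses Python's standard `urllib.robotparser` to check which URLs a set of rules would keep crawlers away from. The rules and URLs are hypothetical examples for a site with search and filter pages, not a recommendation for any particular site. Note that this parser does simple prefix matching, so the example avoids wildcard patterns.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules: keep crawlers out of low-value
# search and filter pages while leaving everything else crawlable.
robots_txt = """\
User-agent: *
Disallow: /search
Disallow: /filter/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Hypothetical URLs to test against the rules above.
urls = [
    "https://www.example.com/products/1001",
    "https://www.example.com/search?q=shoes",
    "https://www.example.com/filter/color-red",
]

crawlable = {url: parser.can_fetch("Googlebot", url) for url in urls}
for url, allowed in crawlable.items():
    print("crawl" if allowed else "skip ", url)
```

Running a check like this over a sample of real URLs before deploying a robots.txt change is one way to confirm that important pages are not blocked by accident.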

          • Mr.bfz

            Thanks Anthony,

            Your explanation was very helpful.

            Assume that 3 million of my 5 million pages are not so important for Google to crawl or index.

            What would be the best way to optimize my crawl efficiency in relation to the number of pages?

            Just noindex 3 million pages on the site? I believe this could be a risky move.

            Perhaps robots.txt, but that would not de-index the existing pages.

            • anthonydnelson

              This is a complicated question that I can't give a simple answer for, as every site is set up differently and has its own challenges. You will likely use a variety of the techniques mentioned in the last paragraph of my previous reply. Good luck.
