The Moz Q&A Forum

    Googlebot on steroids... Why?

    Intermediate & Advanced SEO
    • Olaf

      We launched a new website (www.gelderlandgroep.com). The site contains 500 pages, but some pages (like https://www.gelderlandgroep.com/collectie/) contain filters, so there are a lot of possible URL parameters. Last week we noticed a tremendous amount of traffic (25 GB!!) and CPU usage on the server.

      2017-12-04 16:11:57 W3SVC66 IIS14 83.219.93.171 GET /collectie model=6511,6901,7780,7830,2105-illusion&ontwerper=henk-vos,foklab 443 - 66.249.76.153 HTTP/1.1 Mozilla/5.0+(Linux;+Android+6.0.1;+Nexus+5X+Build/MMB29P)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/41.0.2272.96+Mobile+Safari/537.36+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) - - www.gelderlandgroep.com 200 0 0 9445 501 312
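A log line like the one above can be tallied with a short script to see how many distinct URLs the crawler requested and how many bytes were served. This is a minimal sketch: the field order in `IIS_FIELDS` is an assumption inferred from the sample line (the log's `#Fields:` header is not shown in this thread), and `summarize_googlebot` is a hypothetical helper name.

```python
# Assumed W3C extended log field order, inferred from the sample line.
# Check it against the "#Fields:" header of the real log before relying on it.
IIS_FIELDS = [
    "date", "time", "s-sitename", "s-computername", "s-ip", "cs-method",
    "cs-uri-stem", "cs-uri-query", "s-port", "cs-username", "c-ip",
    "cs-version", "cs(User-Agent)", "cs(Cookie)", "cs(Referer)", "cs-host",
    "sc-status", "sc-substatus", "sc-win32-status", "sc-bytes", "cs-bytes",
    "time-taken",
]


def parse_iis_line(line):
    """Split one log line into a dict keyed by the assumed field names."""
    values = line.split()  # IIS writes spaces inside a field as '+'
    if len(values) != len(IIS_FIELDS):
        raise ValueError(f"expected {len(IIS_FIELDS)} fields, got {len(values)}")
    return dict(zip(IIS_FIELDS, values))


def summarize_googlebot(lines):
    """Return (distinct URL count, total response bytes) for requests whose
    user agent claims to be Googlebot. The claim itself is not verified here."""
    urls = set()
    total_bytes = 0
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header directives
        entry = parse_iis_line(line)
        if "Googlebot" not in entry["cs(User-Agent)"]:
            continue
        urls.add(entry["cs-uri-stem"] + "?" + entry["cs-uri-query"])
        total_bytes += int(entry["sc-bytes"])
    return len(urls), total_bytes
```

Calling `summarize_googlebot(open(path))` on a log file (with `path` pointing at your own log) gives a rough picture of how much of the traffic comes from requests that identify themselves as Googlebot.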

      We found out that "Googlebot" was firing many, many requests. First we did an nslookup on the IP address, and it does seem to be the real Googlebot.
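The nslookup check described above can also be scripted. The sketch below assumes the usual two-step approach: a reverse DNS lookup on the IP, a check that the hostname ends in `googlebot.com` or `google.com`, then a forward lookup to confirm the hostname resolves back to the same IP. The function names and the injectable lookup parameters are hypothetical, added here so the logic can be tried without network access.

```python
import socket

# Hostname suffixes accepted as Google crawler hosts (an assumption for this
# sketch; consult Google's current crawler verification guidance).
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_hostname(hostname):
    """True if the hostname sits under one of the accepted Google domains."""
    return hostname.rstrip(".").lower().endswith(GOOGLE_CRAWLER_SUFFIXES)


def verify_googlebot(ip,
                     reverse_lookup=socket.gethostbyaddr,
                     forward_lookup=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the domain, then forward-resolve the
    hostname and confirm it maps back to the same IP."""
    try:
        hostname, _aliases, _addresses = reverse_lookup(ip)
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    if not is_google_hostname(hostname):
        return False
    try:
        _name, _aliases, addresses = forward_lookup(hostname)
    except OSError:
        return False
    return ip in addresses
```

With the default arguments, `verify_googlebot("66.249.76.153")` performs real DNS queries for the client IP from the log line. The user agent string alone proves nothing, since anyone can send it.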

      Second, we checked Google Search Console, and I was really surprised... Googlebot on steroids? Googlebot requested 922,565 different URLs and made combinations for every filter/parameter combination on the site. Why? The sitemap.xml contains 500 URLs... The authority of the site isn't very high, and there is no other signal that this is a special website... Why so many "Google resources"?
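A rough count shows how a handful of multi-select filters on a single page can produce a URL total of this size. The sketch assumes each filter lets you pick any subset of its options, so a filter with n options has 2^n possible selections and the filters multiply. The option counts below are hypothetical, not taken from the site.

```python
from math import prod


def count_filter_urls(option_counts):
    """Number of distinct filter selections for one page when every filter
    is multi-select: 2**n subsets per filter, multiplied across filters.
    Different orderings of the same selection are not counted."""
    return prod(2 ** n for n in option_counts)


# Hypothetical example: three filters with 10, 6 and 4 options each.
combinations = count_filter_urls([10, 6, 4])
print(f"{combinations:,} possible filter URLs for one page")
```

Three modest filters already give over a million combinations for one listing page. If the URLs also accept the same selection in a different order, the count grows further.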

      Of course we will exclude the parameters in Search Console, but I have never seen this much Googlebot activity for a small website like this before! Does anybody have a clue?

      Regards Olaf

      searchconsole.png nslookup.png

      • seoman10

        I would say your filters are creating pages in their own right, or at least that is how Googlebot sees it. I have seen a similar thing happen on a site redesign. Potentially, if each filter can be reached through its own URL, that URL could be listed as an individual page, assuming the content is different.
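One way filter URLs turn into extra "pages" is when the same selection can be written in more than one order. The thread does not say whether that is the case on this site, so treat the following as an illustration only. It assumes comma-separated multi-value parameters like those in the log line, and `canonical_filter_url` is a hypothetical helper that maps every ordering of a selection to a single URL, which could then be used for a redirect or a canonical link.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_filter_url(url):
    """Sort parameters by name and sort/de-duplicate the comma-separated
    values inside each parameter, so equivalent selections share one URL."""
    parts = urlsplit(url)
    normalized = []
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        options = sorted({opt for opt in value.split(",") if opt})
        if options:  # drop parameters with no selected options
            normalized.append((name, ",".join(options)))
    normalized.sort()
    query = urlencode(normalized, safe=",")  # keep commas readable
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


a = canonical_filter_url(
    "https://www.gelderlandgroep.com/collectie?ontwerper=henk-vos,foklab&model=7780,6511")
b = canonical_filter_url(
    "https://www.gelderlandgroep.com/collectie?model=6511,7780&ontwerper=foklab,henk-vos")
print(a == b)
```

Both inputs describe the same selection, so they normalize to the same URL.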

        The first time Google crawls your site, it will try to find everything it possibly can to put in the index. Google will eat data like there's no tomorrow 🙂

        At this stage I wouldn't be too worried about it; just keep an eye out for duplicate content. I guess you'll see both graphs dip back down to normal levels within a few days.

        • Olaf
          Olaf @seoman10 last edited by

          Mmm, is that correct? I thought that the amount of resources Google puts into crawling your (new) website also depends on its authority. 9 million URLs, for four days now... It seems to be so much for this small website...

          • seoman10

            As far as I know, Google will attempt to find every single page it can possibly find regardless of authority.  The frequency after the initial crawl will be affected by the site authority, volume and frequency of updates.

            Virtually every page on every website that is publicly accessible will be indexed and ranked somewhere; where you rank will be determined by Google's ranking factors.

            Keep in mind that Search Console stats will be a few days out of date (2 or 3 days), and it will normally take two or three days to crawl.

            • Olaf

              Thanks for your help!

              I think you're probably right. The initial crawl must be complete if Google wants to put everything into the right perspective. But we manage and host more than 300 sites, including large A-brand sites, and even on those sites I have not seen this kind of volume before.

              The server logs also show the same number of requests last night (day five). I will keep you posted if this continues after the weekend.

              • seoman10

                Glad to help!

                The large volume could well be to do with the way the filters are set up. There is also a possibility that you are somehow sending some sort of authority signal to Google, for instance if the site shares a Search Console account or WHOIS information with other valued brands.

                My gut feeling is that after the initial crawl the traffic will reduce. If it doesn't, it probably means Google is finding something new to index, maybe dynamically created pages?

                • Olaf

                  We got an answer from JohnMu, Webmaster Trends Analyst at Google. The reason for the crawling is (as we found out) the filters, which have infinite variations (one of our developers was asleep); we will correct this. Disallowing the filter URLs in robots.txt is advised as the quickest fix to stop the mega-crawling. This case will be used for further research because of the disproportionate capacity usage. You're right, Google will initially crawl everything, but they don't want Googlebot's crawling to look like a "mini-DDoS-like attack".
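Before deploying a robots.txt rule like the one advised above, it helps to check which URLs a Disallow pattern would cover. The sketch below is a simplified matcher for patterns using the `*` wildcard and the `$` end anchor. It ignores Allow rules, user-agent groups and longest-match precedence, so it is not a full robots.txt implementation. The patterns shown are hypothetical, since the actual rule added to the site is not given in this thread.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex:
    '*' matches any run of characters, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path_and_query, disallow_patterns):
    """True if any non-empty Disallow pattern matches from the start of the
    URL path plus query string. An empty Disallow blocks nothing."""
    return any(
        robots_pattern_to_regex(pattern).match(path_and_query)
        for pattern in disallow_patterns
        if pattern
    )


# Hypothetical rule: block every /collectie URL that carries a query string.
rules = ["/collectie?"]
print(is_disallowed("/collectie?model=6511,6901&ontwerper=henk-vos", rules))
print(is_disallowed("/collectie/", rules))
```

Under this rule the filtered URL from the log line is blocked, while the plain listing page stays crawlable.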
