The Moz Q&A Forum


    Increase in pages crawled per day

    • max.favilli
      max.favilli last edited by

      What does it mean when GWT (Google Webmaster Tools) abruptly jumps from 15k to 30k pages crawled per day?

      I am used to seeing spikes, like a 10k average and, a couple of times per month, 50k pages crawled.

      But in this case it moved from 15k to 30k per day 10 days ago and it's staying there. I know it's a good sign: the crawler is crawling more pages per day, so it's picking up changes more often. But I have no idea why it is doing it. What good signals usually drive the Google crawler to increase the number of pages crawled per day?

      Does anyone know?

      • AFW1179
        AFW1179 last edited by

        There are two variables in play and you are picking up on one.

        If there are 1,000 pages on your website, then Google may index all 1,000 if it is aware of all the pages. As you indicated, it is also Google's decision how many of your pages to index.

        The second factor, which is most likely the case in your situation, is that Google only has two ways to find your pages. One is a sitemap, submitted in GWT, that lists all of your known pages. Google would then have the choice to index all 1,000, as it would be aware of their existence. The other is following links, and it sounds like your website is relying on links. If you have 1,000 pages and a home page with one link leading to an "about us" page, then Google is only aware of two pages on your entire website. Your website has to have an internal link structure that Google can crawl.

        Imagine your website as a tree root structure. For Google to get to every page and index it, it has to have clear, defined, and easy access. A website whose home page links to page A, which links to page B, which links to page C, which links to page D, which then links to 500 pages, can easily lose those 500 pages if there is an obstruction anywhere along the path to page D, because Google can't crawl to page D to see all the pages linked from it.
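
The tree picture above can be sketched as a breadth-first walk over a link graph. This is only an illustration, not how Google's crawler is implemented: the `links` dictionary, the page paths, and the `reachable_pages` helper are all hypothetical names made up for the example.

```python
from collections import deque

def reachable_pages(links, start):
    """Return {page: click depth} for every page reachable from `start`."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:  # first visit gives the shortest click depth
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

# Hypothetical site: home -> A -> B -> C -> D -> 500 pages
links = {
    "/": ["/a"],
    "/a": ["/b"],
    "/b": ["/c"],
    "/c": ["/d"],
    "/d": [f"/item-{i}" for i in range(500)],
}
found = reachable_pages(links, "/")

# Simulate an obstruction: page B no longer links to page C
broken = {**links, "/b": []}
found_broken = reachable_pages(broken, "/")

print(len(found), "pages reachable with the chain intact")
print(len(found_broken), "pages reachable with the chain broken at /b")
```

With the chain intact every page is found, five clicks deep at most. Once one link in the chain is removed, everything behind it disappears from the walk, which is the "lost 500 pages" case described above.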

        • RyanPurkey
          RyanPurkey last edited by

          There could be several factors. Maybe your brand-based search is prompting Google to capture more of your site. Maybe you got a link from a very high authority site that prompts higher crawl volumes. Queries that prompt freshness related to your site could also spur Google on. It is a lot of guesswork, but it can be whittled down some by a close look at Analytics and perhaps tomorrow's OSE update (Fresh Web Explorer might provide some clues in the meantime). At least you're moving in the right direction. Cheers!

          • MickEdwards
            MickEdwards last edited by

            I would also check that you have not got a spike of URL parameters becoming available. I recently had a similar issue: although I had these set up in GWT, the crawler was actively wasting its time on them. Once I added them to robots.txt, the crawl level went back to 'normal'.
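
One way to check for a parameter spike is to count how often each query-string parameter shows up in the URLs the crawler requested. The sketch below uses only Python's standard `urllib.parse`; the `parameter_counts` helper and the sample URLs are hypothetical, and in practice the list would come from your own server logs.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl

def parameter_counts(urls):
    """Count, per parameter name, how many URLs carry that parameter."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # A set, so a parameter repeated within one URL is counted once
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts

# Hypothetical crawled URLs
crawled = [
    "/search?sort=price&page=2",
    "/search?sort=name",
    "/about",
    "/item?id=7&sort=price",
]
counts = parameter_counts(crawled)
for name, n in counts.most_common():
    print(name, n)
```

A parameter that suddenly dominates the count is a candidate for the kind of robots.txt rule mentioned above. The exact rule used is not given in this thread.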

            • AFW1179
              AFW1179 @MickEdwards last edited by

              When you say URL variables, do you mean query string variables like ?key=value?

              That is really good advice. You can check in your GWT. If you let Google crawl and it runs into a loop, it will not index that section of your site. It would be costly for them.

              • MickEdwards
                MickEdwards @AFW1179 last edited by

                Yes, I updated my post to say "parameters" just before you posted.

                • max.favilli
                  max.favilli @AFW1179 last edited by

                  I am not sure I understand what you mean. That website has a total of 35k pages submitted to GWT through the sitemap, of which only 8k are indexed. The number of indexed sitemap pages has always been increasing slowly over time; it moved from 6k to 8k in the last couple of months, with no spikes.

                  That's not the total number of pages served by the site, since dynamic search result pages amount to around 150k pages in total. We deliberately do not submit all of them in the sitemap, and GWT shows 70k as the total number of indexed pages.

                  I have analyzed Google crawler activity through server logs in the past: it picks a set of (apparently) random pages every night and crawls them. I have never actually analyzed what percentage of those pages are in the sitemap.

                  The internal link structure was built on purpose to favor the ranking of pages we considered more important.

                  The point is that we didn't change anything in the website structure recently. User-generated content has been lowering the duplicate page count slowly over time, without any recent spike. We have a PR campaign that is increasing backlinks at an average rate of around 3 links per week, and we didn't have any high-DA backlinks appearing in the last few weeks.

                  So I am wondering what made the Google crawler start crawling so many more pages per day.
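
The log analysis mentioned above (daily crawl volume, and the share of crawled pages that are in the sitemap) could be sketched roughly like this. It assumes the server writes the common "combined" access log format. The `LOG_LINE` pattern, the helper names, and the sample lines are all made up for illustration. Matching on the user agent string alone can be spoofed, so treat the numbers as approximate.

```python
import re
from collections import Counter

# Assumes combined log format: ip - - [day:time zone] "METHOD path proto" status size "referer" "agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines):
    """Yield (day, path) for each request whose user agent mentions Googlebot."""
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            yield match.group("day"), match.group("path")

def crawl_summary(lines, sitemap_paths):
    """Return (hits per day, share of Googlebot hits whose path is in the sitemap)."""
    per_day = Counter()
    in_sitemap = 0
    total = 0
    for day, path in googlebot_hits(lines):
        per_day[day] += 1
        total += 1
        if path in sitemap_paths:
            in_sitemap += 1
    share = in_sitemap / total if total else 0.0
    return per_day, share

GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_log = [
    f'66.249.66.1 - - [10/Mar/2015:02:11:09 +0000] "GET /page-1 HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Mar/2015:02:11:12 +0000] "GET /search?q=x HTTP/1.1" 200 7300 "-" "{GOOGLEBOT}"',
    '10.0.0.5 - - [10/Mar/2015:09:30:00 +0000] "GET /page-1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/36.0"',
    f'66.249.66.1 - - [11/Mar/2015:02:05:44 +0000] "GET /page-2 HTTP/1.1" 200 4100 "-" "{GOOGLEBOT}"',
]
sitemap_paths = {"/page-1", "/page-2"}  # hypothetical sitemap contents

per_day, share = crawl_summary(sample_log, sitemap_paths)
print(dict(per_day))
print(f"{share:.0%} of Googlebot hits were sitemap pages")
```

Comparing `per_day` before and after the jump, together with the sitemap share, would show whether the extra crawling lands on sitemap pages or on the dynamic search result pages.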

                  • max.favilli
                    max.favilli @RyanPurkey last edited by

                    Hi Ryan,

                    • GWT (Search Traffic->Search Queries) shows a 6% drop in impressions for brand-based searches (Google Trends shows a similar pattern).
                    • GWT is not showing any recent backlink with an abnormally high DA.
                    • We actually had a couple of unusually high traffic spikes from Facebook thanks to a couple of particularly successful posts, but we are talking about spikes of just 5k visits, and they both started after the rise in pages crawled per day.

                    If you have any other ideas, they are more than welcome. I wish I could understand the source of that change so I could replicate it on other websites.

                    • max.favilli
                      max.favilli last edited by

                      Two things I forgot to mention are:

                      1. Something like 2 weeks ago we made the website responsive. Could it be that the Google mobile crawler is increasing the number of crawled pages? I have to analyze the logs to see if the requests are coming from the Google mobile crawler.
                      2. The total number of indexed pages didn't change, which makes me wonder whether a rise in the number of pages crawled per day is all that relevant.
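
Checking whether the extra requests come from the mobile crawler could look something like the sketch below. It relies on a heuristic that is an assumption here: a user agent containing both "Googlebot" and "Mobile" is counted as the mobile crawler, and any other Googlebot agent as desktop. The helper names and the sample user agent strings (the mobile one is abbreviated) are illustrative only.

```python
from collections import Counter, defaultdict

def classify_agent(agent):
    """Heuristic split of user agents (assumption: mobile Googlebot agents contain 'Mobile')."""
    if "Googlebot" not in agent:
        return "other"
    return "googlebot-mobile" if "Mobile" in agent else "googlebot-desktop"

def crawler_split(hits):
    """hits: iterable of (day, user_agent). Returns {day: Counter of crawler kinds}."""
    split = defaultdict(Counter)
    for day, agent in hits:
        kind = classify_agent(agent)
        if kind != "other":
            split[day][kind] += 1
    return split

DESKTOP = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Abbreviated, illustrative mobile agent string
MOBILE = "Mozilla/5.0 (iPhone) Mobile Safari (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

hits = [
    ("10/Mar/2015", DESKTOP),
    ("10/Mar/2015", MOBILE),
    ("10/Mar/2015", MOBILE),
    ("10/Mar/2015", "Mozilla/5.0 (Windows NT 6.1) Firefox/36.0"),
]
split = crawler_split(hits)
for day, kinds in split.items():
    print(day, dict(kinds))
```

If the mobile count roughly accounts for the jump from 15k to 30k while the desktop count stays flat, that would support the responsive redesign as the cause.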
                      • RyanPurkey
                        RyanPurkey @max.favilli last edited by

                        Ah, the responsive change could be a big part of it. You're probably getting crawls from the mobile crawler. GWT wouldn't be the best source for recent backlinks, as it isn't always that quick to report links. I'd actually look for spikes via referrers in Analytics. Still, it looks like the responsive redesign is a likely candidate for this, especially with Google's looming April 21st deadline.

                        • max.favilli
                          max.favilli @RyanPurkey last edited by

                          I usually analyze backlinks with both GWT and Ahrefs, and Ahrefs doesn't show any abnormally high DA backlink either.

                          I agree the responsive change is the most probable candidate. I have a couple of other websites I want to make responsive before April 21st, so that's an opportunity to test whether that is the reason.

                          • RyanPurkey
                            RyanPurkey @max.favilli last edited by

                            Agreed. Especially since Google's own Gary Illyes responded to the following question:

                            How long is the delay between making it mobile friendly and it being reflected in the search results?

                            Illyes says “As soon as we discover it is mobile friendly, on a URL by URL basis, it will be updated.”

                            Sounds like when you went responsive they double-checked each URL to confirm. From: http://www.thesempost.com/googles-gary-illyes-qa-upcoming-mobile-ranking-signal-change/. Cheers!

                            • AFW1179
                              AFW1179 @RyanPurkey last edited by

                              Nice find Ryan.

                              © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.