The Moz Q&A Forum

    How can I tell Google, that a page has not changed?

    Technical SEO Issues
    • bimp

      Hello,

      We have a website with many thousands of pages. Some of them change frequently, some never. Our problem is that Googlebot is generating far too much traffic: half of our page views come from Googlebot.

      We would like to tell Googlebot to stop crawling pages that never change. This one, for instance:

      http://www.prinz.de/party/partybilder/bilder-party-pics,412598,9545978-1,VnPartypics.html

      As you can see, there is almost no content on the page, and the picture will never change. So I am wondering whether it makes sense to tell Google that there is no need to come back.

      The following header fields might be relevant. Currently our webserver answers with the following headers:

      Cache-Control:  no-cache, must-revalidate, post-check=0, pre-check=0, public
      Pragma: no-cache
      Expires: Thu, 19 Nov 1981 08:52:00 GMT

      Does Google honor these fields? Should we remove no-cache, must-revalidate, and Pragma: no-cache, and set Expires to, e.g., 30 days in the future?
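For illustration, here is a minimal Python sketch of the header change the question describes: dropping the no-cache directives and setting Expires 30 days ahead. The function name `static_page_headers` and the 30-day default are assumptions made for this example, and the sketch only shows how the header values would be built. Whether Google uses them to schedule crawls is a separate question.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def static_page_headers(now, max_age_days=30):
    """Build cache headers for a page that is not expected to change.

    `now` must be a timezone-aware UTC datetime. The function name and the
    30-day default are illustrative assumptions, not a recommendation.
    """
    max_age_seconds = max_age_days * 24 * 60 * 60
    expires = now + timedelta(days=max_age_days)
    return {
        # Replaces "no-cache, must-revalidate, ..." with an explicit lifetime.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # HTTP dates use the fixed GMT format, e.g. "Thu, 19 Nov 1981 08:52:00 GMT".
        "Expires": format_datetime(expires, usegmt=True),
    }


if __name__ == "__main__":
    for name, value in static_page_headers(datetime.now(timezone.utc)).items():
        print(f"{name}: {value}")
```

Note that the Pragma: no-cache header is simply left out, since it would contradict the new Cache-Control value.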

      I also read that a page that has not changed should answer with 304 instead of 200. Does it make sense to implement that? Unfortunately, that would be quite hard for us.
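To make the 304 idea concrete: a 304 Not Modified is only sent in reply to a conditional request, typically one carrying an If-Modified-Since header. The following Python sketch shows that decision in isolation. The function `conditional_response` and its signature are hypothetical; a real site would hook this logic into its own web framework and would need a reliable last-modified timestamp per page.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(request_headers, last_modified, body):
    """Return (status, headers, body) for a GET request.

    `last_modified` must be a timezone-aware UTC datetime. This helper is a
    hypothetical illustration, not part of any framework.
    """
    # HTTP dates have one-second resolution, so drop microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    since_raw = request_headers.get("If-Modified-Since")
    if since_raw:
        try:
            since = parsedate_to_datetime(since_raw)
        except (TypeError, ValueError):
            since = None  # unparseable date: fall through to a full response
        if since is not None and since.tzinfo is not None and last_modified <= since:
            # Page unchanged since the client's copy: empty body, status 304.
            return 304, headers, b""

    return 200, headers, body


if __name__ == "__main__":
    modified = datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    status, headers, _ = conditional_response({}, modified, b"<html>...</html>")
    print(status, headers)
    status, _, _ = conditional_response(
        {"If-Modified-Since": headers["Last-Modified"]}, modified, b"<html>...</html>"
    )
    print(status)
```

The saving comes from skipping the body, and ideally the page rendering as well. The crawler still makes the request, so this reduces bandwidth and server load per hit rather than the number of hits.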

      Maybe Google would then also spend more time on pages that have actually changed, instead of wasting it on unchanged pages.

      Do you have any other suggestions for how we can reduce Googlebot's traffic on irrelevant pages?

      Thanks for your help

      Cord

      • john4math

        If you have Google Webmaster Tools set up, go to Site configuration > Settings, and you can set a custom crawl rate for your site. That will change it site-wide, so if you have other pages that change frequently, that might not be so great for you.

        Another thing you could try is to generate a sitemap and set a change frequency of never (or yearly) for all of the pages you don't expect to change. That also might slow down Google's crawl rate of those pages.
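As a rough sketch of the sitemap suggestion, the Python snippet below builds a small XML sitemap with a `changefreq` value per URL, following the sitemaps.org 0.9 schema. The helper `build_sitemap` and the second URL are made up for the example; the first URL is the one from the question. Keep in mind that `changefreq` is only a hint to crawlers.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Values allowed for <changefreq> by the sitemaps.org protocol.
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def build_sitemap(entries):
    """Return sitemap XML for an iterable of (url, changefreq) pairs.

    Hypothetical helper for illustration; a large site would split its
    output into several files and add a sitemap index.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq in entries:
        if changefreq not in VALID_CHANGEFREQ:
            raise ValueError(f"invalid changefreq: {changefreq!r}")
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        # Picture page from the question: never expected to change.
        ("http://www.prinz.de/party/partybilder/bilder-party-pics,412598,9545978-1,VnPartypics.html", "never"),
        # Made-up example of a frequently updated page.
        ("http://www.example.com/news/", "daily"),
    ]))
```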

        • RobMay

          Your best bet is to build an Excel report using a crawl tool (like Xenu, Screaming Frog, Moz, etc.) and export that data. Then map out the pages you want to log and mark as 'not changing'.

          Make sure to build (or already have) a functioning XML sitemap file for the site and, as John said, state which URLs NEVER change. Over time, this will tell Googlebot that it isn't necessary to crawl those URLs, as they never change.

          You could also place a META REFRESH tag on those individual pages, and set that to never as well.

          Hope some of this helps! Cheers 🙂

          • bimp

            Thanks for the answers so far. The tips are not really solving my problem yet, though: I don't want to lower the general crawl rate in Webmaster Tools, because pages that change frequently should also be crawled frequently. We do have XML sitemaps, although we did not include picture pages like the one in our example. There are tens, maybe hundreds, of thousands of these pages. If everyone agrees on this, we can of course include these pages in our XML sitemaps. Using "meta refresh" to indicate that a page never changes seems a bit odd to me, but I'll look into it.

            But what about the HTTP headers I asked about? Does anyone have any ideas on that?

            • Dr-Pete

              Unfortunately, I don't think there are many reliable options, in the sense that Google will always honor them. I don't think they gauge crawl frequency by the "expires" field - or, at least, it carries very little weight. As John and Rob mentioned, you can set the "changefreq" in the XML sitemap, but again, that's just a hint to Google. They seem to frequently ignore it.

              If it's really critical, a 304 probably is a stronger signal, but I suspect even that's hit or miss. I've never seen a site implement it on a large scale (100s or 1000s of pages), so I can't speak to that.

              Three broader questions/comments:

              (1) If you currently list all of these pages in your XML sitemap, consider taking them out. The XML sitemap doesn't have to contain every page on your site, and in many cases, I think it shouldn't. If you list these pages, you're basically telling Google to re-crawl them (regardless of the changefreq setting).

              (2) You may have overly complex crawl paths. In other words, it may not be the quantity of pages that's at issue, but how Google accesses those pages. They could be getting stuck in a loop, etc. It's going to take some research on a large site, but it'd be worth running a desktop crawler like Xenu or Screaming Frog. This could represent a site architecture problem (from an SEO standpoint).

              (3) Should all of these pages even be indexed at all, especially as time passes? More and more (especially post-Panda), more indexed pages is often worse. If Googlebot is really hitting you that hard, it might be time to canonicalize some older content or 301-redirect it to newer, more relevant content. If it's not active at all, you could even NOINDEX or 404 it.

              © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.