The Moz Q&A Forum


    Indexing non-indexed content and Google crawlers

    Intermediate & Advanced SEO
    • Alex-Harford
      Alex-Harford last edited by

      On a news website we have a system where articles are given a publish date which is often in the future. The articles were showing up in Google before the publish date despite us not being able to find them linked from anywhere on the website.

      I've added a 'noindex' meta tag to articles that shouldn't be live until a future date.

      When the date comes for them to appear on the website, the noindex disappears. Is anyone aware of any issues with doing this? Say Google crawls a page that is noindex, then 2 hours later finds out it should now be indexed: will it still appear in Google Search, News etc. as normal, as a new page?
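
For reference, the setup described above can be sketched roughly as follows. This is only an illustration of the idea, not the site's actual code: the function name `robots_meta` and the example dates are made up, and a real CMS would do this in its template layer.

```python
from datetime import datetime, timezone


def robots_meta(publish_at: datetime, now: datetime) -> str:
    """Return the robots meta tag for an article, or '' once it is live.

    Hypothetical helper: before the publish date the page asks search
    engines not to index it; from the publish date on, no tag is emitted.
    """
    if now < publish_at:
        return '<meta name="robots" content="noindex">'
    return ""


# Made-up publish date, for illustration only.
publish_at = datetime(2030, 1, 1, 9, 0, tzinfo=timezone.utc)

before = robots_meta(publish_at, datetime(2030, 1, 1, 8, 0, tzinfo=timezone.utc))
after = robots_meta(publish_at, datetime(2030, 1, 1, 9, 0, tzinfo=timezone.utc))

print(repr(before))
print(repr(after))
```

Note that the tag only changes what a crawler sees on its next visit, so the timing of that visit decides when the page can be picked up.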

      Thanks. 🙂

      • Highland
        Highland last edited by

        Do you have an automated sitemap? On at least one occasion, I've found that to be a culprit.

        Noindex means it won't be kept in the index. It doesn't mean it won't be crawled. I'm not sure how it would affect crawl timing, though. I would assume Google crawls pages marked noindex less frequently, on the theory that you don't want them indexed. Something to try is the GWT Fetch as Googlebot tool, to force a new crawl of the page and see if that gets it into the index any faster.

        http://googlewebmastercentral.blogspot.com/2011/08/submit-urls-to-google-with-fetch-as.html

        • Alex-Harford
          Alex-Harford @Highland last edited by

          There is no automated sitemap. We checked every page we could, including feeds.

          • CleverPhD
            CleverPhD last edited by

            I like the automated sitemap answer for the cause (as this has bitten me before), but you mentioned you do not have one. I would still bet that somewhere on your website you are linking to the page that you do not want indexed. It could be a tag cloud page or some other index page. We had a site that would accidentally publish articles on our home page ahead of schedule. The point is that when you have a dynamic site with a CMS, you really have to be on your toes, as the automation can get you into situations like this.

            I would not use the noindex tag and remove it later. My concern would be that you are sending conflicting signals to Google. noindex tells Google to remove this page from the index.

            "When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it." from GWT

            When I read that - it sounds like this is not what you want for this page. 🙂

            You could also set up your system to return a 404 on the URL until the content is live and then let it return a 200, but you run into the same issue of Google getting two opposite signals for the same page. Either way, if you first give Google the signal that you do not want something indexed, you are at the mercy of the next crawl to see if Google looks at it again.
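
To make the 404-then-200 idea concrete, here is a minimal sketch as a plain WSGI app. Everything in it is an assumption for illustration: the path, the publish date, and the `make_app` / `status_of` helpers are hypothetical, and a real site would hook this into whatever its CMS uses.

```python
from datetime import datetime, timezone

# Hypothetical schedule: URL path -> publish time (made-up values).
SCHEDULE = {
    "/news/embargoed-story": datetime(2030, 1, 1, 9, 0, tzinfo=timezone.utc),
}


def make_app(schedule, clock):
    """Build a WSGI app that returns 404 for articles until their publish time."""

    def app(environ, start_response):
        path = environ.get("PATH_INFO", "/")
        publish_at = schedule.get(path)
        # Unknown paths and not-yet-published articles look identical: 404.
        if publish_at is None or clock() < publish_at:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Not found"]
        start_response("200 OK", [("Content-Type", "text/html")])
        return [b"<h1>Article</h1>"]

    return app


def status_of(app, path):
    """Call the app directly and return the status line it sent."""
    seen = []
    app({"PATH_INFO": path}, lambda status, headers: seen.append(status))
    return seen[0]


early = make_app(SCHEDULE, lambda: datetime(2030, 1, 1, 8, 0, tzinfo=timezone.utc))
live = make_app(SCHEDULE, lambda: datetime(2030, 1, 1, 10, 0, tzinfo=timezone.utc))

print(status_of(early, "/news/embargoed-story"))
print(status_of(live, "/news/embargoed-story"))
```

Passing the clock in as a function keeps the sketch easy to check at different times. It does not change the caveat above: a crawler that saw the 404 still has to come back before it sees the 200.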

            Regardless, you need to get to the crux of the issue, how is Google finding this URL?

            I would use a 3rd party spider tool. We have used Screaming Frog SEO Spider. There are others out there. You would be amazed what they find. The key to this tool is that when it finds something, it also tells you on what page it found it. We have big sites with thousands of pages, and we have used it to find broken links to images and links to pages on our site that now 404. Really handy for cleaning things up. I bet it would find the page (or pages) on your site that link to the content. You can then update that page and not have to worry about using noindex etc. Also note that spiders are much better than humans at finding this stuff. Even if you have looked, the spider looks at things differently.
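
If you want a feel for what such a spider does, here is a rough sketch using only the Python standard library. It is not how Screaming Frog works internally; the `find_referrers` function, the `SITE` dictionary, and the example.com URLs are all made up, and the pages are held in memory so the example runs without a network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_referrers(start_url, fetch, target_url):
    """Breadth-first crawl from start_url; return the pages linking to target_url.

    `fetch` maps a URL to its HTML, or None if the page cannot be loaded.
    """
    seen = {start_url}
    queue = deque([start_url])
    referrers = set()
    while queue:
        page = queue.popleft()
        html = fetch(page)
        if html is None:
            continue
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop any #fragment.
            link = urldefrag(urljoin(page, href)).url
            if link == target_url:
                referrers.add(page)  # remember WHERE the link was found
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(referrers)


# Hypothetical three-page site held in memory.
SITE = {
    "https://example.com/": '<a href="/tags/">Tags</a> <a href="/about">About</a>',
    "https://example.com/tags/": '<a href="/news/embargoed-story">Story</a>',
    "https://example.com/about": "<p>No links here.</p>",
}

found = find_referrers(
    "https://example.com/", SITE.get, "https://example.com/news/embargoed-story"
)
print(found)
```

A real crawl would swap `SITE.get` for an HTTP fetch and restrict itself to your own hosts. It would also need a separate start URL for each subdomain, since nothing guarantees the main site links to them.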

            It may also be as simple as searching for the URL on the web with the link: operator. Google may show you where it is finding the link.

            Good luck and please post back what you find.  This is kind of like one of those "who dun it?" mystery shows!  🙂

            • Alex-Harford
              Alex-Harford @CleverPhD last edited by

              Thanks. I agree I need to get rid of that noindex. The site is new and doesn't have much in the way of tag clouds etc. yet, so it's not like we have a lot of pages to check.

              I've used the link: operator to try to find the offending links each time, but nothing showed up. I use Xenu Link Sleuth rather than Screaming Frog, and I can't find a way to find backlinks with Xenu. Do you know if you can with the free version of Screaming Frog? I've seen the free version described as "almost fully functional" - the number of crawlable links seems to be the main restriction.

              • CleverPhD
                CleverPhD @Alex-Harford last edited by

                I think Screaming Frog has a trial version. I forget if it limits the total number of pages etc., as we bought it a while ago. At least you can try it out and see. There may be others here who know of more tools as well.

                • Alex-Harford
                  Alex-Harford @CleverPhD last edited by


                  Sorted! The link was from a mobile version of the site on an m. subdomain - and only in a Facebook share link, as follows:

                  Post to Facebook
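
For anyone chasing a similar leak: share widgets usually carry the article URL inside the link's query string, so it is easy to miss when scanning for plain links. Below is a small sketch for pulling such embedded URLs out of an href. The actual share link is not shown in this thread, so the `sharer.php?u=...` URL here is a made-up example of the general shape, and `embedded_urls` is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlsplit


def embedded_urls(href):
    """Return absolute URLs carried in the query string of href."""
    found = []
    # parse_qsl percent-decodes each value for us.
    for _name, value in parse_qsl(urlsplit(href).query):
        if value.startswith(("http://", "https://")):
            found.append(value)
    return found


# Made-up share link; the real one from the site is not shown above.
share = (
    "https://www.facebook.com/sharer.php"
    "?u=https%3A%2F%2Fexample.com%2Fnews%2Fembargoed-story"
)

urls = embedded_urls(share)
print(urls)
```

Running a check like this over every link on every subdomain would flag any unpublished article URL that is being handed to a third party.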

                  • CleverPhD
                    CleverPhD @Alex-Harford last edited by

                    Wow!  Nice detective work!   I could see how that one would slip under the radar.

                    Congrats on finding a needle in a haystack!

                    You should buy yourself the adult beverage of your choice and have a little toast!

                    Cheers!
