The Moz Q&A Forum

    Crawl and Indexation Error - Googlebot can't/doesn't access specific folders on microsites

    Intermediate & Advanced SEO
    • ImpericMedia

      Hi,

      This is my first time posting here. I am looking for feedback on an indexation issue we have with a client, and on possible next steps or items I may have overlooked.

      To give some background, our client operates a website for the core brand and also a number of microsites based on specific business units, so you have corewebsite.com along with bu1.corewebsite.com, bu2.corewebsite.com and so on.

      The content structure isn't ideal, as each microsite follows a structure of bu1.corewebsite.com/bu1/home.aspx, bu2.corewebsite.com/bu2/home.aspx and so on.

      In addition to this, each microsite has duplicate folders from the other microsites. So bu1.corewebsite.com has the indexable folder bu1.corewebsite.com/bu1/home.aspx but also bu1.corewebsite.com/bu2/home.aspx, and in the same way bu2.corewebsite.com has bu2.corewebsite.com/bu2/home.aspx but also bu2.corewebsite.com/bu1/home.aspx. There are 5 different business units, so you have this duplicate content scenario for all microsites.
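For illustration, the layout described above can be enumerated with a short Python sketch. It assumes five units named bu1 to bu5 (the post names only bu1 to bu4, so bu5 is a placeholder) and an https scheme, neither of which is confirmed here.

```python
from itertools import product

# Hypothetical unit names: five business units are mentioned, but only
# bu1 to bu4 are named, so "bu5" is a placeholder.
UNITS = ["bu1", "bu2", "bu3", "bu4", "bu5"]
DOMAIN = "corewebsite.com"


def classify_urls(units=UNITS, domain=DOMAIN):
    """Split every host/folder combination into intended and duplicate URLs."""
    intended, duplicates = [], []
    for host_unit, folder_unit in product(units, repeat=2):
        # The https scheme is an assumption made for this sketch.
        url = f"https://{host_unit}.{domain}/{folder_unit}/home.aspx"
        # A folder only belongs on the microsite that shares its name.
        if host_unit == folder_unit:
            intended.append(url)
        else:
            duplicates.append(url)
    return intended, duplicates


intended, duplicates = classify_urls()
print(len(intended), "intended URLs,", len(duplicates), "duplicate URLs")
```

With five units this gives 5 intended home pages and 20 duplicates, which shows how heavily the duplicate copies outnumber the pages that are meant to rank.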

      This situation is being addressed in the medium-term development roadmap and will be rectified in the next iteration of the site, but that is still a ways out.

      The issue
      About 6 weeks ago we noticed a drop-off in search rankings for two of our microsites (bu1.corewebsite.com and bu2.corewebsite.com). Over a period of 2-3 weeks, pretty much all our terms dropped out of the rankings and search visibility dropped to essentially 0.

      I can see that pages from the websites are still indexed, but oddly it is the duplicate content pages: bu1.corewebsite.com/bu3/home.aspx and bu1.corewebsite.com/bu4/home.aspx are still indexed, and similarly on the bu2.corewebsite.com microsite, bu2.corewebsite.com/bu3/home.aspx and bu4.corewebsite.com/bu3/home.aspx are indexed. However, no pages from the BU1 or BU2 content directories seem to be indexed under their own microsites.

      Logging into Webmaster Tools, I can see there is a "Google couldn't crawl your site because we were unable to access your site's robots.txt file." error. This was a bit odd, as there was no robots.txt in the root directory, but I got some weird results when I checked the BU1/BU2 microsites in the technicalseo.com robots.txt tool.
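One way to narrow this down is to look at the HTTP status the server actually returns for /robots.txt. As far as Google's published guidance goes, a plain 404 is treated as "no restrictions", whereas server errors (5xx), 429 responses and network failures are treated as the file being unreachable, which can pause crawling. The Python sketch below encodes that reading; the label strings and function names are made up for illustration, and the mapping is a simplification rather than Google's exact logic.

```python
import urllib.error
import urllib.request


def robots_fetch_outcome(status_code):
    """Roughly classify a robots.txt HTTP status (None means no response)."""
    if status_code is None:
        return "crawl_blocked"      # timeout, DNS failure, reset connection
    if 200 <= status_code < 300:
        return "rules_apply"        # file fetched, its rules are used
    if status_code == 429 or 500 <= status_code < 600:
        return "crawl_blocked"      # treated as a temporary server error
    if 400 <= status_code < 500:
        return "no_restrictions"    # e.g. 404: handled as if no file exists
    return "other"                  # anything else, e.g. an unfollowed 3xx


def fetch_status(url, timeout=10):
    """Return the final HTTP status for url, or None if nothing came back."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except OSError:
        return None


# Example call (not run here, since it needs network access):
# print(robots_fetch_outcome(fetch_status("https://bu1.corewebsite.com/robots.txt")))
```

If the missing file had simply produced a 404, the error in question would not normally appear, so a "crawl_blocked" result would point toward timeouts or server errors rather than the absence of the file.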

      Also, because there is a redirect from bu1.corewebsite.com/ to bu1.corewebsite.com/bu4.aspx, I thought there could be something there, so we removed the redirect and added a basic robots.txt file to the root directory for both microsites.

      After this we saw a small pickup in site visibility: a few terms popped into our Moz campaign rankings but dropped out again pretty quickly. The error message in GSC also persisted.

      Steps taken so far after that

      1. In Google Search Console, I confirmed there are no manual actions against the microsites.
      2. Confirmed there are no instances of noindex on any of the pages for BU1/BU2
      3. A number of the main links from the root domain to microsite BU1/BU2 have a rel="noopener noreferrer" attribute, but we looked into this and found it has no impact on indexation
      4. Looking into this issue, we saw some people had similar issues when using Cloudflare, but our client doesn't use this service
      5. Using a redirect/response header checker tool, we noticed a timeout when trying to mimic Googlebot accessing the site
      6. Following on from point 5, we got hold of a week of server logs from the client. I can see Googlebot successfully pinging the site and not getting 500 response codes from the server, but I couldn't see any instance of it requesting microsite BU1/BU2 content
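The log review in point 6 can be made repeatable with a small script. The Python sketch below assumes the logs are in IIS-style W3C extended format (a guess based on the .aspx URLs) with a `#Fields:` header that includes `cs-host`, `cs-uri-stem`, `sc-status` and `cs(User-Agent)`. The sample lines are invented for illustration and are not taken from the client's logs.

```python
from collections import Counter


def googlebot_hits(lines):
    """Count Googlebot requests per (host, top-level folder, status)."""
    fields, counts = [], Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns used by the lines that follow.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        if "googlebot" not in row.get("cs(User-Agent)", "").lower():
            continue
        stem = row.get("cs-uri-stem", "/")
        folder = stem.strip("/").split("/")[0].lower() or "(root)"
        host = row.get("cs-host", "?").lower()
        counts[(host, folder, row.get("sc-status", "?"))] += 1
    return counts


# Invented sample data, only to show the expected input shape.
sample = [
    "#Fields: date time cs-method cs-uri-stem sc-status cs(User-Agent) cs-host",
    "2020-01-01 00:00:01 GET /bu3/home.aspx 200 Mozilla/5.0+(compatible;+Googlebot/2.1) bu1.corewebsite.com",
    "2020-01-01 00:00:02 GET /bu1/home.aspx 200 Mozilla/5.0+(Windows+NT+10.0) bu1.corewebsite.com",
    "2020-01-01 00:00:03 GET /robots.txt 500 Mozilla/5.0+(compatible;+Googlebot/2.1) bu1.corewebsite.com",
]
counts = googlebot_hits(sample)
for key, total in sorted(counts.items()):
    print(key, total)
```

Grouping by host and folder makes it easy to see whether Googlebot requests /bu1/ on bu1.corewebsite.com at all, and what status /robots.txt returns to it. Matching on the user-agent string alone can include spoofed traffic, so a reverse DNS check on the client IP is a sensible follow-up.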

      So it seems to me that the issue could be something server side, but I'm at a bit of a loss as to what next steps to take.
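A quick way to test the server-side theory is to request the same URL with a browser user agent and with Googlebot's user agent, then compare status and timing. The Python sketch below does that with the standard library; the browser string is an arbitrary example, and `compare_user_agents` and `fetch` are names invented for this sketch. A request from your own IP with a spoofed user agent will not reproduce any filtering that is based on Google's IP ranges, so a clean result here does not rule out a firewall rule.

```python
import time
import urllib.error
import urllib.request

USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}


def fetch(url, user_agent, timeout=10):
    """Return (status or None, seconds elapsed) for one GET request."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    started = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        status = error.code
    except OSError:
        status = None  # timeout, DNS failure or refused connection
    return status, time.monotonic() - started


def compare_user_agents(url, fetcher=fetch):
    """Fetch url once per user agent and report whether the statuses differ."""
    results = {name: fetcher(url, agent) for name, agent in USER_AGENTS.items()}
    statuses = {status for status, _ in results.values()}
    return results, len(statuses) > 1


# Example call (not run here, since it needs network access):
# print(compare_user_agents("https://bu1.corewebsite.com/robots.txt"))
```

If the Googlebot request times out or errors while the browser request succeeds, the server, a load balancer or a security module is probably treating the two user agents differently.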

      Any advice at all is much appreciated!

      • Everett

        Hello ImpericMedia,

        If you can share the site with me (private message is OK) I'll look into it. If you don't want to do that, here are some things I would look at:

        1. If you have verified that the robots.txt file is not blocking the pages you want indexed, and the pages are still not indexed (or are indexed with a message about the robots.txt file), you should check for a robots noindex meta tag on the page. If the source code looks strange, you may have to use the Chrome Inspect tool to see the fully rendered page.

        2. If there are no blocking robots meta tags on the page, you should check the HTTP response for an X-Robots-Tag header.

        3. If there is no X-Robots-Tag header, it's probably because of the duplicate content and spammy-seeming subdomain setup.
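Checks 1 and 2 above can be scripted once you have a page's HTML and response headers. The Python sketch below is a minimal version using only the standard library; `find_noindex` and `RobotsMetaParser` are names invented here. It reads the raw HTML only, so a tag injected by JavaScript would still need the rendered-page check described in point 1.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if tag == "meta" and name in ("robots", "googlebot"):
            content = attributes.get("content") or ""
            self.directives += [part.strip().lower() for part in content.split(",")]


def find_noindex(html, headers):
    """Report whether the HTML or the response headers carry a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_value = ""
    for key, value in headers.items():
        if key.lower() == "x-robots-tag":
            header_value = value.lower()
    return {
        # "none" is shorthand for "noindex, nofollow".
        "meta_noindex": "noindex" in parser.directives or "none" in parser.directives,
        "header_noindex": "noindex" in header_value or "none" in header_value,
    }


page = '<html><head><meta name="ROBOTS" content="NOINDEX, follow"></head></html>'
print(find_noindex(page, {"X-Robots-Tag": "noarchive"}))
```

Running this against both an intended page (for example /bu1/ on bu1.corewebsite.com) and one of the duplicates that is still indexed would show whether the two are served with different directives.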

        Sorry about the wait. If you include the site URL, it will get other community members curious enough to check it out next time.

        I hope this helps. If not, feel free to message me.
