The Moz Q&A Forum


    Google can't access/crawl my site!

    Intermediate & Advanced SEO
    • granitgash
      granitgash last edited by

      Hi

      I've been dealing with this problem for a few days. In fact, I didn't realize it was this serious until today, when I saw that most of my site had been "de-indexed" and had lost most of its rankings.

      [URL Errors: 1st photo]

      On 8/21/14 there were only 42 errors, but on 8/22/14 the number went up to 272, and it just keeps climbing.

      The site I'm talking about is gazetaexpress.com (media news, custom CMS), with lots of pages.

      After some research, I concluded that the problem is the firewall, which might have blocked Google's bots from accessing the site. But the server administrator says this isn't true and that no Google bots have been blocked.

      Also, when I go to WMT (Webmaster Tools) and try Fetch as Google on the site, this is what I get:

      [Fetch as Google: 2nd photo]

      Out of more than 60 tries, only 2-3 showed Complete (and only for the homepage, never for articles).

      What could the problem be? Can I get Google to crawl my site properly, and is there a chance that I will lose my previous rankings?

      Thanks a lot
      Granit


      • KeriMorgret
        KeriMorgret last edited by

        If I do a site:gazetaexpress.com in Google, I get some results that are http, and some results that are https. The https ones say there is an SSL connection error.

        Are you looking at the http or https version in GWT?

        • granitgash
          granitgash @KeriMorgret last edited by

          I'm looking at the http version in GWT

          • KeriMorgret
            KeriMorgret @granitgash last edited by

            Unfortunately, I don't have a quick answer for you. Looking forward to seeing what other community members have to say on this one!

            • granitgash
              granitgash @KeriMorgret last edited by

              No problem. Thanks a lot for your time. Let's just hope that someone in the community can help with a solution 🙂

              • Andy.Drinkwater
                Andy.Drinkwater last edited by

                Hi Granit,

                Has any work been done to the site in the last 2-3 months? Have you had any warnings in webmaster tools at all? I did once see a strange problem where Google wasn't crawling a site correctly because it had been compromised, but after checking, there is nothing like this on yours.

                -Andy

                • granitgash
                  granitgash @Andy.Drinkwater last edited by

                  In mid-March the website changed its CMS, but I don't think that could be the reason, because until this week everything was working perfectly. I don't think it has been compromised either. I still suspect the firewall is blocking bots from crawling the site, but the server administrator couldn't find any evidence of this.

                  • Andy.Drinkwater
                    Andy.Drinkwater @granitgash last edited by

                    It doesn't look like a firewall, as I can crawl it with Screaming Frog. However, the server logs will be able to answer that one for you.

                    Without looking in depth, I'm not seeing anything that stands out to me - do you think that there have been changes to the server that could cause issues? What firewall is the server running? Also, if there were errors in crawling the site, you would see a warning about this.

                    -Andy
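
For anyone who wants to follow up on the server-log suggestion, here is a minimal sketch of how the check could look. It assumes the access log uses the common "combined" format; the sample lines, IP addresses, and the function name are hypothetical and not taken from the site. Note that if a proxy or CDN in front of the server rejects a request, that request may never appear in the origin log at all, so a sudden drop in Googlebot lines can be a signal in itself.

```python
import re
from collections import Counter

# Matches the request, status, and user-agent fields of a combined-format log line.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_status_counts(lines):
    """Count HTTP status codes for requests whose user agent mentions Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "googlebot" in match.group("agent").lower():
            counts[match.group("status")] += 1
    return counts


# Hypothetical sample lines; real input would come from the server's access log.
SAMPLE_LOG = [
    '66.249.66.1 - - [22/Aug/2014:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [22/Aug/2014:10:00:05 +0000] "GET /mistere/ HTTP/1.1" 403 312 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [22/Aug/2014:10:00:09 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (Windows NT 6.1) Firefox/31.0"',
]

if __name__ == "__main__":
    for status, count in sorted(googlebot_status_counts(SAMPLE_LOG).items()):
        print(status, count)
```

A user-agent match alone does not prove a request came from Google, since the header can be spoofed, so this is only a first pass.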

                    • granitgash
                      granitgash @Andy.Drinkwater last edited by

                      We suspect that CloudFlare might be causing these troubles. We are trying everything; in the meantime I'm looking here to see if anyone has had a similar experience or has an idea for a solution.

                      As for warnings, the only one we had was last week (8/23/14), saying that Googlebot can't access our site:

                      Over the last 24 hours, Googlebot encountered 316 errors while attempting to connect to your site. Your site's overall connection failure rate is 7.5%.

                      -Granit
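
As a rough sanity check on the numbers in that warning, the error count and failure rate together imply an approximate number of connection attempts. This is only a sketch: it assumes the 316 errors and the 7.5% rate describe the same 24-hour window, which the message does not state explicitly, and the function name is made up for illustration.

```python
def implied_attempts(errors, failure_rate):
    """Estimate total connection attempts from an error count and a failure rate (0-1)."""
    if not 0 < failure_rate <= 1:
        raise ValueError("failure_rate must be in (0, 1]")
    return errors / failure_rate


if __name__ == "__main__":
    # 316 errors at a 7.5% failure rate
    print(round(implied_attempts(316, 0.075)))
```

Under that assumption, the result is on the order of a few thousand attempts per day, which means most of Googlebot's connections were still getting through at the time of the warning.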

                      • Andy.Drinkwater
                        Andy.Drinkwater @granitgash last edited by

                        Ah OK - well keep us updated with what you find. Someone else will chip in with other info if they have some 🙂

                        -Andy

                        • Travis_Bailey
                          Travis_Bailey last edited by

                          A friend of mine just got back from Kosovo. It was the last stop on a tour of the Balkans. He had a pretty good time. Moving along...

                          I crawled about 12K URLs and hit almost 90 Internal Server Errors (500). It's probably not your core problem, but it's something to look at. Here are a few examples:

                          http://www.gazetaexpress.com/blihet/?search_category_id=1&searchFilter=1

                          http://www.gazetaexpress.com/shitet/?category_id=134&searchFilter=1

                          http://www.gazetaexpress.com/me-qera/?category_id=131&searchFilter=1

                          There was one actual page that threw a 500 at the time of crawl:

                          http://www.gazetaexpress.com/mistere/edhe-kesaj-i-thuhet-veze-22591/

                          The edhe kesaj page now resolves fine. (I'm not even going to pretend to understand or write Albanian.)

                          So there may be some issues with the server or hosting. If you haven't already, try this troubleshooter from Cloudflare.
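
One way to look for a pattern in failing URLs like the ones listed above is to tally the query parameters they share. The sketch below does that for the four URLs quoted in this post; the function name is hypothetical, and a shared parameter is only a hint about where to look, not proof of the cause.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def query_param_counts(urls):
    """Count how often each query parameter name appears across the given URLs."""
    counts = Counter()
    for url in urls:
        for name, _value in parse_qsl(urlsplit(url).query):
            counts[name] += 1
    return counts


# The four URLs quoted above that returned a 500 at crawl time.
FAILED_URLS = [
    "http://www.gazetaexpress.com/blihet/?search_category_id=1&searchFilter=1",
    "http://www.gazetaexpress.com/shitet/?category_id=134&searchFilter=1",
    "http://www.gazetaexpress.com/me-qera/?category_id=131&searchFilter=1",
    "http://www.gazetaexpress.com/mistere/edhe-kesaj-i-thuhet-veze-22591/",
]

if __name__ == "__main__":
    for name, count in query_param_counts(FAILED_URLS).most_common():
        print(name, count)
```

On this small sample, searchFilter appears in three of the four URLs, so the filtered listing pages would be a reasonable place to start checking.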

                          • granitgash
                            granitgash @Travis_Bailey last edited by

                            Hi Travis, thank you for your time.

                            That's great for your friend. I'd also suggest you visit Kosovo someday; you will have a great time here, for sure 🙂

                            Back to the issue:

                            Here is an interesting issue that is happening with the crawler.

                            Our own CMS uses .htaccess for rewrite purposes. I created 2 new files that are independent of the CMS and tried to fetch them with WMT, and it worked like a charm.

                            These 2 independent files are:

                            www.gazetaexpress.com/test_manaferra.php

                            www.gazetaexpress.com/xhezidja.php

                            Then I created an AJAX page with our CMS, containing only plain text, and tried to fetch it with WMT. Strangely enough, it didn't work. To make sure the .htaccess file wasn't affecting this behavior, I deleted the .htaccess file and tried to fetch the page again, but it still didn't work.

                            The AJAX page is: www.gazetaexpress.com/page/xhezidja/?pageSEO=false

                            The site works perfectly for humans who access it via a browser.

                            I'm more than confused now!
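
When a page works in a browser but fails in Fetch as Google, one quick comparison is to request the same URL with a browser User-Agent and with Googlebot's User-Agent string and see whether the status codes differ. The sketch below uses only Python's standard library; the function names and the browser User-Agent string are illustrative assumptions. It cannot reproduce rules based on Google's IP addresses, so identical results here do not rule out a block further upstream.

```python
import urllib.error
import urllib.request

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Firefox/31.0"


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code for url when requested with the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx/5xx responses are raised as exceptions by urllib


def compare_user_agents(url, fetch=fetch_status):
    """Fetch url as a browser and as Googlebot; report both statuses and whether they differ."""
    browser = fetch(url, BROWSER_UA)
    googlebot = fetch(url, GOOGLEBOT_UA)
    return {"browser": browser, "googlebot": googlebot, "differs": browser != googlebot}
```

For example, `compare_user_agents("http://www.gazetaexpress.com/page/xhezidja/?pageSEO=false")` would return both status codes for the AJAX test page. Connection-level failures (timeouts, resets) are not caught here and will surface as exceptions, which is useful information in its own right.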


                            • granitgash
                              granitgash last edited by

                              Hi all

                              Just wanted to let you know that we fixed the problem. We disabled CloudFlare, which we found was blocking Google's bots. More about this issue can be found at: https://support.cloudflare.com/hc/en-us/articles/200169806-I-m-getting-Google-Crawler-Errors-What-should-I-do-

                              • KeriMorgret
                                KeriMorgret @granitgash last edited by

                                Great, thanks for letting us know what happened with this!

                                • Travis_Bailey
                                  Travis_Bailey @KeriMorgret last edited by

                                  This applies to the guy from Albania.

                                  Oh, this IS the guy from Albania. Never mind.

                                  • Travis_Bailey
                                    Travis_Bailey @granitgash last edited by

                                    What did you do specifically to mitigate the problem? You can PM me, if you would like.
