The Moz Q&A Forum


    Wordpress error

    Intermediate & Advanced SEO
    • NileCruises

      In Google Webmaster Tools I'm getting a Severe Health Warning about our robots.txt file, which reads:

      User-agent: *
      Crawl-delay: 20

      User-agent: 008
      Disallow: /

      I'm wondering how I can fix this and stop it happening again.

      The site was hacked about 4 months ago but I thought we'd managed to clear things up.

      Colin

      • TroyCarlson

        I think, if you have a robots.txt reading what you show above:

        User-agent: *
        Crawl-delay: 20

        User-agent: 008
        Disallow: /

        That basically says, "Don't crawl my site at all." (The "Disallow: /" means: I'm not allowing anything to be crawled by any search engine that pays attention to robots.txt.)

        So...I'm guessing that's not what you want?

        (Bah, ignore that. I missed the "User-agent" lines. I'm a fool.)

        Actually, this seems to have solved your issue... make sure you explicitly tell all other user-agents that they are allowed:

        User-agent: *
        Disallow:
        Crawl-delay: 20

        User-agent: 008
        Disallow: /

        The extra "Disallow:" under User-agent: * says, "I'm not disallowing anything for most user-agents." Then the Disallow under User-agent: 008 applies only to that crawler.
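A quick way to sanity-check this fix is Python's standard-library robots.txt parser (just a sketch; Google's own tester may differ on edge cases, and the URL is the site from this thread):

```python
from urllib.robotparser import RobotFileParser

# The corrected robots.txt: explicit empty Disallow for all agents,
# full Disallow only for user-agent 008.
robots_txt = """\
User-agent: *
Disallow:
Crawl-delay: 20

User-agent: 008
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Googlebot falls back to the * group, which disallows nothing.
print(parser.can_fetch("Googlebot", "http://nile-cruises-4u.co.uk/"))  # True
# 008 matches its own group and is blocked from everything.
print(parser.can_fetch("008", "http://nile-cruises-4u.co.uk/"))        # False
```

Note that the stdlib parser ignores Crawl-delay for the allow/deny decision; it only affects `crawl_delay()` queries.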

        • Dr-Pete

          I'm honestly not sure what user-agent "008" is, but it seems harmless. Why the crawl delay? If a crawler is giving you trouble, there are better ways to handle that than robots.txt.

          Was there a specific message/error in GWT?

          • TroyCarlson @Dr-Pete

            I'm not 100% sure what he's seeing, but when I plug his robots.txt into the robots analysis tool, I get this back:

            Googlebot blocked by line 5: Disallow: /

            Detected as a directory; specific files may have different restrictions

            However, when I gave the top "User-agent: *" group the empty "Disallow:", it seemed to fix the problem. It's like the tool didn't understand that the "Disallow: /" was meant only for the 008 user-agent?

            • Dr-Pete @TroyCarlson

              That's odd: "008" appears to be the user agent for "80legs", a custom crawler platform. I'm seeing it in other robots.txt files.

              • SEOKeith

                I would simplify your robots.txt to read something like:

                User-agent: *
                Disallow: /wp-admin/
                Disallow: /wp-includes/

                Sitemap: http://www.your-domain.com/sitemap.xml
                
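If it helps, Keith's suggested file can be checked the same way with Python's stdlib parser (the domain is a placeholder, as in the example above):

```python
from urllib.robotparser import RobotFileParser

# Keith's suggested WordPress robots.txt (placeholder domain).
robots_txt = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Sitemap: http://www.your-domain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Admin and core directories are blocked; everything else is crawlable.
print(parser.can_fetch("Googlebot", "/wp-admin/options.php"))  # False
print(parser.can_fetch("Googlebot", "/any-content-page/"))     # True
print(parser.site_maps())  # ['http://www.your-domain.com/sitemap.xml']
```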
                • NileCruises

                  Hi Guys,

                  Thanks so much for your help. As you say, Troy, that's definitely not what I want.

                  I assumed when we were hacked (twice in 8 months) that it might have been a competitor, as we are in a very competitive niche. I might be very wrong there, but we have certainly lost our top ranking on Google.co.uk for our main key phrases and are now at about position 7 for those phrases after about 3 years at number 1.

                  So when I saw in Google Webmaster Tools yesterday that we had a severe health warning and that Googlebot was being prevented from crawling our site, I thought it might be the after-effects of the hack.

                  Today, even though I changed the robots.txt file yesterday, GWT is showing 1,000 pages with errors (285 Access Denied and 719 Not Found) and this message: Googlebot is blocked from http://nile-cruises-4u.co.uk/

                  I've just tested the robots.txt via GWT and now get this message:

                  Allowed. Detected as a directory; specific files may have different restrictions.

                  So maybe the pages will be accessible to Googlebot shortly and the Access Denied message will disappear. I've changed the robots.txt file to:

                  User-agent: *
                  Crawl-delay: 20

                  But should I change it to a better version? Sorry guys, I'm an online travel agent and not great at coding and really techie stuff, although I'm learning pretty quickly about the bad stuff! I seem to have a few problems getting this sorted and wonder if this is part of why our position is dropping?

                  • SEOKeith @NileCruises

                    I gave you an example of a basic robots.txt file that I use on one of my Wordpress sites above; I would suggest using that for now.

                    I would not bother messing around with crawl delay in robots.txt; as Peter said above, there are better ways to achieve this... plus I doubt you need it anyway.

                    Google caches the robots.txt info for about 24 hours normally, in my experience... so it's possible the old cached version is still being used by Google.

                    • NileCruises @SEOKeith

                      Thanks Keith.

                      Only part of our site is WP based. Would that be a problem using the example you kindly suggested?

                      • SEOKeith

                        Use this:

                        User-agent: *
                        Disallow: /blog/wp-admin/
                        Disallow: /blog/wp-includes/

                        Sitemap: http://nile-cruises-4u.co.uk/sitemap.xml
                        

                        And FYI, you have the following error on your blog:

                        Warning: is_readable() [function.is-readable]: open_basedir restriction in effect. File(D:\home\nile-cruises-4u.co.uk\wwwroot\blog/wp-content/plugins/D:\home\nile-cruises-4u.co.uk\wwwroot\blog\wp-content\plugins\websitedefender-wordpress-security/languages/WSDWP_SECURITY-en_US.mo) is not within the allowed path(s): (D:\home\nile-cruises-4u.co.uk\wwwroot) in D:\home\nile-cruises-4u.co.uk\wwwroot\blog\wp-includes\l10n.php on line 339

                        Get your web guy to look at that; it appears at the top of every blog page for me...

                        Hope that helps,

                        Keith

                        • NileCruises @SEOKeith

                          Thanks very much Keith. I've just edited the file as suggested.

                          I see the error but, as I am the web guy, I can't figure out how to get rid of it.

                          I think it might be a plugin that's causing it, so I'm going to disable them and re-enable them one at a time.

                          I've just PM'd you by the way.

                          Thanks for your help Keith.

                          Colin

                          • NileCruises @SEOKeith

                            Mind you, the whole blog is now showing an error message and can't be viewed, so it looks like an afternoon of trial and error!

                            • SEOKeith

                              Looks like a 403 permissions problem; that's a server-side error... Make sure you have the correct permissions set on the blog folder in IIS. Personally, I always host on Linux...

                              • NileCruises @SEOKeith

                                Thanks Keith. Just contacting our hosts.

                                Nightmare!

                                • NileCruises

                                  The blog isn't showing now, and my hosts say that the index.php file is missing from the directory, but I can see it.

                                  Strange.

                                  Have contacted them again to see what the problem can be.

                                  Bit of a wasted Saturday!

                                  🙂

                                  • Dr-Pete

                                    Google is seeing the same robots.txt content (in GWT) that you show in the physical file, right? I just want to make sure that, when the site was hacked, no changes were made that are showing different versions of files to Google. It sounds like that's not the case here, but it definitely can happen.

                                    • NileCruises @Dr-Pete

                                      Hi Peter,

                                      I've tested the robots.txt file in Webmaster Tools and it now seems to be working as it should, and it seems Google is seeing the same file as I have on the server.

                                      I'm afraid this side of things isn't my area of expertise, so it's been a bit of a minefield.

                                      I've taken out a subscription with sucuri.net and taken various other steps that hopefully will help with security. But who knows?

                                      Thanks,

                                      Colin

                                      • SEO-Pump.com @TroyCarlson

                                        This will be my first post on SEOmoz so bear with me 😉

                                        The way I understand it is that robots read the robots.txt file from top to bottom, and once they find a rule that applies to them they stop reading and begin crawling. So a robots.txt written as:

                                        User-agent: *
                                        Disallow:
                                        Crawl-delay: 20

                                        User-agent: 008
                                        Disallow: /

                                        would not have the desired result as user-agent 008 would first read the top guideline:

                                        User-agent: *
                                        Disallow:
                                        Crawl-delay: 20

                                        and then begin crawling your site, since that group tells all user-agents that no pages or directories are disallowed.

                                        The corrected way to write this would be:

                                        User-agent: 008
                                        Disallow: /

                                        User-agent: *
                                        Disallow:
                                        Crawl-delay: 20
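For what it's worth, parsers that follow the robots exclusion standard match a crawler to the most specific User-agent group rather than stopping at the first one, so group order shouldn't matter to a well-behaved bot; reordering as above is still a harmless safety measure. A sketch with Python's stdlib parser showing both orderings behave identically:

```python
from urllib.robotparser import RobotFileParser

original = """\
User-agent: *
Disallow:
Crawl-delay: 20

User-agent: 008
Disallow: /
"""

# Same groups, with the specific 008 group moved to the top.
reordered = """\
User-agent: 008
Disallow: /

User-agent: *
Disallow:
Crawl-delay: 20
"""

for text in (original, reordered):
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    # A named group wins over *, regardless of where it appears in the file.
    print(parser.can_fetch("008", "/"), parser.can_fetch("Googlebot", "/"))
# Both orderings print: False True
```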
