The Moz Q&A Forum

    What's wrong with this robots.txt

    Technical SEO Issues
    • Leonie-Kramer
      Leonie-Kramer last edited by

      Hi, I'm really struggling with my robots.txt file.
      This is it:

      User-agent: *
      Disallow: /product/

      #old sitemap
      Disallow: /media/name.xml

      When I test it at w3c.org everything looks good, but after uploading it to the server, Google Webmaster Tools reports 3 errors. I checked it with my colleague, and neither of us knows what's wrong.

      Can someone take a look at this and suggest a solution?
      Thanks in advance!

      Leonie

      • Martijn_Scheijbeler
        Martijn_Scheijbeler last edited by

        Hi Leonie, what are the 3 errors? The robots.txt syntax looks correct to me.

        • Leonie-Kramer
          Leonie-Kramer last edited by

          Hi, sorry, I forgot to mention that 😉

          syntax error @ User-agent: *

          no user agent @ Disallow: /product/

          no user agent @ Disallow: /media/name.xml

          Thanx, Leonie

          • DeanAndrews
            DeanAndrews last edited by

            Hi,
            Lines containing only a comment are discarded completely, so they do not mark a record boundary. However, you may need to remove the blank line (I'm not 100% sure, but it's worth testing):
            
            User-agent: *
            Disallow: /product/
            Disallow: /media/bcc.xml
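
A minimal sketch of the blank-line question, assuming Python's standard-library `urllib.robotparser` as a stand-in parser. It is not Google's parser, and Google may treat blank lines differently. This parser skips comment-only lines but treats a truly empty line as the end of a record, so a `Disallow` that follows the blank line is no longer attached to any user agent. The `blocked` helper and the `/product/shoes` test path are illustrative names, not part of the thread.

```python
from urllib.robotparser import RobotFileParser

# The file as originally posted: a blank line and a comment split the rules.
ORIGINAL = """\
User-agent: *
Disallow: /product/

#old sitemap
Disallow: /media/name.xml
"""

# The same rules with the blank line and the comment removed.
COMPACT = """\
User-agent: *
Disallow: /product/
Disallow: /media/name.xml
"""


def blocked(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Return True if `agent` may not fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Print how this particular parser reads each version of the file.
    for label, text in (("original", ORIGINAL), ("compact", COMPACT)):
        for path in ("/product/shoes", "/media/name.xml"):
            verdict = "blocked" if blocked(text, path) else "allowed"
            print(f"{label:8} {path:18} {verdict}")
```

With this parser the compact version blocks both paths, while the original version blocks only `/product/`. Keeping all rules for one user agent in a single unbroken group avoids depending on how a given tool handles blank lines.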
            
            • Martijn_Scheijbeler
              Martijn_Scheijbeler @Leonie-Kramer last edited by

              Sounds more like a bug in the tool that you're using, as I just tested the syntax in Google Webmaster Tools and it isn't causing any issues there.

              • Leonie-Kramer
                Leonie-Kramer @Martijn_Scheijbeler last edited by

                Okay, but I got these errors in Webmaster Tools. Very strange 😉

                • Leonie-Kramer
                  Leonie-Kramer last edited by

                  If I test the blocked URLs, they are blocked, so it looks like the file is doing what it's supposed to do. It's still strange that I got these errors.

                  @DeanAndrews, thanks, I will test it without empty lines, though I have to wait for another deployment 😉

                  • BlueprintMarketing
                    BlueprintMarketing @Leonie-Kramer last edited by

                    If you want to find out everything that could possibly be wrong with it, this tool is, in my opinion, the holy grail for diagnosing robots.txt issues. Just expect a lot more information from it than a simple response.

                    http://tools.seochat.com/tools/robots-txt-validator/

                    Sincerely,

                    Thomas

                    • DeanAndrews
                      DeanAndrews @BlueprintMarketing last edited by

                      Thomas,

                      That's an awesome tool, thank you for sharing.

                      • Leonie-Kramer
                        Leonie-Kramer @BlueprintMarketing last edited by

                        Thanks for the URL. It gives a warning on

                        Disallow: /product/
                        and
                        Disallow: /media/bcc.xml

                        I wonder why?

                        • BlueprintMarketing
                          BlueprintMarketing @DeanAndrews last edited by

                          Hi Dean, happy to be of help!

                          • BlueprintMarketing
                            BlueprintMarketing @Leonie-Kramer last edited by

                            Hi Leonie,

                            I believe you should create a robots.txt file with a single user-agent record that disallows the /media/ folder and the .xml file. If you make the unwanted XML file return a 410, it will be dead to Google. In any case, I think I have come up with a solution below; please try pasting that in if your current file still does not work.

                            Another tool for building robots.txt files and comparing them with the existing file, from the same company believe it or not, is right here:

                            http://www.internetmarketingninjas.com/seo-tools/robots-txt-generator/

                            Please note that you are disallowing more than just media; the rules should look more like the ones below. Since this is for the old XML sitemap, why not just set it to a 410, killing the link in Google's eyes? Then you will not have to disallow it.

                            User-agent: *
                            Disallow: /product/
                            Disallow: /media/
                            Disallow: /bcc.xml

                            Sitemap: http://example.com/sitemap_index.xml

                            Put your new sitemap's URL where I have placed the example Sitemap line. That tells Google where your new sitemap resides, along with, of course, submitting it in Google Webmaster Tools and fetching it as Googlebot.

                            I would like to look at the architecture of your site. If you're still getting errors with what you showed me and you are not comfortable posting the URL on Q&A, you can send me a private message and I promise I will respond.

                            I hope this is of help,

                            Thomas
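
The suggested file above can be checked locally before deployment. This is only a sketch, assuming Python 3.8 or later and its standard-library `urllib.robotparser`, which is not Google's parser. The `load` helper and the `/media/other.jpg` and `/about` test paths are made up for illustration; `http://example.com/sitemap_index.xml` is the placeholder from the post.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt suggested in the post, including the placeholder sitemap URL.
SUGGESTED = """\
User-agent: *
Disallow: /product/
Disallow: /media/
Disallow: /bcc.xml

Sitemap: http://example.com/sitemap_index.xml
"""


def load(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = load(SUGGESTED)

# Sitemap lines are collected separately from the user-agent records.
print("sitemaps:", parser.site_maps())

# "Disallow: /media/" is a prefix rule, so it covers everything in the folder.
for path in ("/media/bcc.xml", "/media/other.jpg", "/bcc.xml", "/about"):
    verdict = "allowed" if parser.can_fetch("*", path) else "blocked"
    print(f"{path:18} {verdict}")
```

Because `Disallow` rules match by prefix, `/media/` blocks the whole folder, while `/bcc.xml` only matches a file at the site root.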

                            • BlueprintMarketing
                              BlueprintMarketing last edited by

                              By the way, here is the report for one of my own, outdated robots.txt files. What look like errors are really warnings telling me that putting a sitemap inside a robots.txt file is not part of the official standard, even though Google and Bing support it. I truly feel it is helpful, so I do it. I've also added extra video sitemaps from an external host, which is what's triggering the warnings. The red color of the Disallow lines is not an error; it just lets you know they are being blocked. Hopefully this will be of help.

                              A bigger screenshot is right here as well. Please show me which errors you are getting.

                              http://i.imgur.com/Xg7EXwO.png

                              http status: 200

                              Syntax check robots.txt on http://www.blueprintmarketing.com/robots.txt (359 bytes)

                              | Line | Severity | Code |
                              | 6 | Warning | The official standard does not include Sitemap support even though major crawlers (Google and Bing) support it. It is still nonstandard. |
                              | 7 | Warning | The official standard does not include Sitemap support even though major crawlers (Google and Bing) support it. It is still nonstandard. |
                              | 8 | Warning | The official standard does not include Sitemap support even though major crawlers (Google and Bing) support it. It is still nonstandard. |
                              | 9 | Warning | The official standard does not include Sitemap support even though major crawlers (Google and Bing) support it. It is still nonstandard. |
                              | 10 | Warning | The official standard does not include Sitemap support even though major crawlers (Google and Bing) support it. It is still nonstandard. |

                              Warnings Detected: 5

                              Errors Detected: 0

                              robots.txt source code for http://

                              | Line | Code |
                              | 1 | User-agent: * |
                              | 2 | Disallow: /wp-content/plugins/ |
                              | 3 | Disallow: /wp-admin/ |
                              | 4 | Disallow: /wp-includes/ |
                              | 5 |   |
                              | 6 | Sitemap: http://www.blueprintmarketing.com/sitemap_index.xml |
                              | 7 | Sitemap: http://app.wistia.com/sitemaps/11323.xml |
                              | 8 | Sitemap: http://app.wistia.com/sitemaps/4339.xml |
                              | 9 | Sitemap: http://app.wistia.com/sitemaps/14213.xml |
                              | 10 | Sitemap: http://app.wistia.com/sitemaps/23283.xml |

                              • Leonie-Kramer
                                Leonie-Kramer @BlueprintMarketing last edited by

                                Hi, thanks for your reply. I'm not sure I understand what you mean by "please note you are disallowing more than just media".

                                The thing is, the XML file is an old file that is still somewhere in the Google archive. I tried to remove it with WMT, but it keeps returning. It's not on the server anymore. The "media" directory doesn't exist anymore either; it is also from an old website.

                                Because the file keeps returning in WMT, I thought I'd try it with the robots.txt.

                                The new robots.txt is not tested yet; I'm waiting for deployment 😉

                                Oh, call me stupid, but how do I make a 410?

                                Grtz, Leonie

                                • Leonie-Kramer
                                  Leonie-Kramer last edited by

                                  Hi,

                                  I got it working with a proper sitemap. Special thanks to Thomas for the great effort in his answers!

                                  • BlueprintMarketing
                                    BlueprintMarketing @Leonie-Kramer last edited by

                                    Hi Leonie,

                                    That's very kind of you. I am very happy that you got it working correctly.

                                    All the best,

                                    Thomas

                                    • BlueprintMarketing
                                      BlueprintMarketing @Leonie-Kramer last edited by

                                      Believe me, it took me plenty of time to figure out how to do this, but if you're handy with SFTP or SSH you can change the .htaccess file, as described here:

                                      http://stackoverflow.com/questions/1975904/htaccess-to-redirect-all-traffic-to-one-page-410-gone?rq=1

                                      And for the ultimate in ease, if you're using WordPress there is actually a plug-in for 410s, so clearly this wasn't something anyone found easy to do.

                                      https://wordpress.org/plugins/wp-410/

                                      Sincerely,

                                      Thomas
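
For readers on other platforms, the idea behind "making a 410" is simply that the server answers requests for the retired URL with HTTP status 410 (Gone) instead of 404 or 200. The following is a language-neutral illustration written with Python's standard `http.server`, not an Apache, WordPress, or .NET configuration. `GONE_PATHS`, `status_for`, `GoneHandler`, and `serve` are hypothetical names, and `/media/name.xml` is the path from the original question.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical list of retired URLs; /media/name.xml is the file from the thread.
GONE_PATHS = {"/media/name.xml"}


def status_for(path: str) -> HTTPStatus:
    """Pick the response status for a request path, ignoring any query string."""
    path_only = path.split("?", 1)[0]
    return HTTPStatus.GONE if path_only in GONE_PATHS else HTTPStatus.OK


class GoneHandler(BaseHTTPRequestHandler):
    """Answer retired paths with 410 Gone and everything else with 200 OK."""

    def do_GET(self) -> None:
        status = status_for(self.path)
        if status is HTTPStatus.GONE:
            # send_error writes the status line, headers, and a short error body.
            self.send_error(status, "This resource has been permanently removed")
            return
        body = b"ok\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Run the demo server locally; call serve() by hand to try it out."""
    HTTPServer(("127.0.0.1", port), GoneHandler).serve_forever()
```

Calling `serve()` and then requesting `http://127.0.0.1:8000/media/name.xml` should return a 410 response; a production setup would express the same mapping in the web server's or framework's own configuration.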

                                      • Leonie-Kramer
                                        Leonie-Kramer @BlueprintMarketing last edited by

                                        Ah thanks, it's an Azure platform, so no SFTP, SSH or .htaccess. But I'll give the Stack Overflow link to the technical guys; they will have to translate it to our environment (.NET).

                                        • BlueprintMarketing
                                          BlueprintMarketing @Leonie-Kramer last edited by

                                          I think that's a great idea. .NET is not my thing.

                                          All the best!

                                          Tom
