The Moz Q&A Forum

    Seomoz bar: No Follow and Robots.txt

    Moz Tools
    • squareplug
      squareplug last edited by

      Should the MozBar pick up "nofollow" links that are handled in robots.txt?

      The robots.txt blocks categories, but they still show as followed (green) links when using the MozBar.

      Thanks!

      Holly

      ETA: I'm assuming that "disallow: myblog.com/category/" is comparable to the nofollow tag on category?

      • PracticeFusion
        PracticeFusion last edited by

        The nofollow attribute and robots.txt file serve different purposes.

        Nofollow Attribute

        This attribute is used to tell search engines, "Don't follow this link", or even "Don't follow any links on this page." It doesn't prevent pages from being indexed, just prevents the search engines from following that link from that particular page.

        Robots.txt

        This file lists the paths that search engine crawlers should not access. It controls crawling rather than indexing, so a blocked URL can still appear in search results.

        To read more about robots.txt check out this page: http://googleblog.blogspot.com/2007/01/controlling-how-search-engines-access.html

        For more on Nofollow, check out this page: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=96569
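To make the distinction concrete, here is a minimal Python sketch using only the standard library. It is not how the MozBar works internally (the thread doesn't say); it only illustrates that a link's rel="nofollow" attribute and a robots.txt Disallow rule are checked independently. The names ROBOTS_TXT, POST_HTML, LinkCollector, and audit_links are made up for this example, and the rule is written as a path relative to the site root, which is what Disallow expects.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Disallow takes a path, not a full domain.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
"""

# Hypothetical blog-post HTML with one plain link and one nofollow link.
POST_HTML = """
<a href="/category/milk">milk</a>
<a href="/sponsor" rel="nofollow">sponsor</a>
"""


class LinkCollector(HTMLParser):
    """Collects (href, is_nofollow) for every <a> tag in the page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        self.links.append((attr.get("href"), "nofollow" in rel_tokens))


def audit_links(html, robots_txt, user_agent="*"):
    """Report, per link, the nofollow flag and the robots.txt verdict separately."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    collector = LinkCollector()
    collector.feed(html)
    return [
        {
            "href": href,
            "nofollow": nofollow,
            "crawl_allowed": robots.can_fetch(user_agent, href),
        }
        for href, nofollow in collector.links
    ]


for row in audit_links(POST_HTML, ROBOTS_TXT):
    print(row)
```

In this sketch the category link is not nofollow (so a link highlighter keyed on the rel attribute would treat it as followed) even though robots.txt disallows crawling its target.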

        Hope this helps!

        • squareplug
          squareplug @PracticeFusion last edited by

          I know I may wake up one morning and this will all click, but for now perhaps an example will help me get past this initial hurdle.

          Squarespace disallows categories in the robots.txt, but using the MozBar I see the category links are green.

          So if I understand (partly, anyway), the disallow in robots.txt keeps the bots from crawling those pages when they come knocking at my site. However, the category links in a blog post are still being crawled? Or what's the point?

          I'm just trying to understand the reasoning behind disallowing categories and how that should impact the tagging and categorizing of blog posts.

          Perhaps I should have started a new question? Or is it appropriate to leave it here?

          • PracticeFusion
            PracticeFusion @squareplug last edited by

            Thanks for providing some more detail, Holly. I definitely think it's fine to leave it here, and I'm happy to help.

            Some people like to prevent search engines from crawling category pages out of a fear of duplicate content. For example, say you have a post that's at this URL:

            site.com/blog/chocolate-milk-is-great.html

            and it's also the only post in the category "milk", which has this URL:

            site.com/blog/category/milk

            then search engines see the same exact content (your blog post) on two different URLs. Since duplicate content is a big no-no, many people choose to prevent the engines from crawling category pages. Although, in my experience, it's really up to you. Do you feel like your category pages will provide value to users? Would you like them to show up in search results? If so, then make sure you let Google crawl them.

            If you DON'T want category pages to be indexed by Google, then I think there's a better choice than using robots.txt. Your best bet is applying the meta robots "noindex, follow" tag to these pages. This tag tells the engines NOT to index the page, but to follow all of the links on it. This is better than robots.txt because robots.txt won't always prevent your pages from showing up in search results (that's another long story), but the noindex tag will.
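As a rough illustration of the tag described above, the following Python sketch reads the meta robots directives out of a page's HTML. The sample page and the names CATEGORY_PAGE, MetaRobotsParser, and robots_directives are hypothetical; the only thing taken from the thread is the "noindex, follow" value itself.

```python
from html.parser import HTMLParser

# Hypothetical category page carrying the tag discussed above.
CATEGORY_PAGE = """<html><head>
<meta name="robots" content="noindex, follow">
</head><body>Posts filed under "milk"</body></html>"""


class MetaRobotsParser(HTMLParser):
    """Gathers the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated; normalize case and whitespace.
        self.directives.update(
            token.strip().lower() for token in content.split(",") if token.strip()
        )


def robots_directives(html):
    """Return the set of meta robots directives found in the HTML."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


directives = robots_directives(CATEGORY_PAGE)
print("noindex" in directives, "follow" in directives)
```

A check like this can be handy for confirming that a template actually emits the tag on category pages, assuming the platform lets you add it at all.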

            If I'm not making sense at all then please just let me know :).

            Lastly, from what I can see on your site and blog, it doesn't look like the category pages for your blog are actually in your robots.txt file. Have someone do a double check.

            To check this myself, I just did a Google search for this URL:

            http://blog.squarespace.com/blog/?category=Roadmap

            And it showed up in Google right away. Looks like something isn't going according to plan. Don't worry though, that happens all of the time and it should be an easy fix.

            • squareplug
              squareplug @squareplug last edited by

              Thank you so much for the detailed reply. It's REALLY appreciated. The blog you are referring to is the Squarespace company's blog. The disallow on categories IS, however, on every site that uses their service. But I've done a similar search with my personal blog on Squarespace and a couple of categories still show up in the SERPs anyway. You can edit the robots.txt file if you want, but you have to do a redirect as you don't have root access.

              Unfortunately, we can't (at least I don't think we can) include meta tags for noindex on a page-by-page basis. You can use it in robots.txt.

              It seems there would be many more duplicate content issues with tags than with categories, since tags are more granular.

              The point of all this is I'm creating new websites for some of our homeschool students and want to get it right from the start with the site architecture and how we use tags and categories, with a balanced focus on usability as well as optimizing for search. These kids are super interested in all the reasoning behind things and their questions are tougher than any client's! Ha!

              Again, thanks so much and take care,

              Holly

              • Cyrus-Shepard
                Cyrus-Shepard @squareplug last edited by

                As Phil pointed out, blocking a URL with robots.txt may keep search engines from crawling your pages, but that doesn't mean they won't index those pages. The meta robots NOINDEX, FOLLOW tag is a much better choice.

                Highly recommend the following article that explains this in more detail:

                http://www.seomoz.org/blog/serious-robotstxt-misuse-high-impact-solutions
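The crawl-versus-index point can be sketched as a toy model in Python. This is a deliberate simplification under stated assumptions, not a description of how any real search engine decides what to index: the function crawler_outcome and its return strings are invented for illustration, and ROBOTS_TXT mirrors the category rule discussed in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, mirroring the category block discussed in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
"""


def crawler_outcome(robots_txt, path, page_has_noindex, user_agent="*"):
    """Simplified model of a well-behaved crawler; real engines are more nuanced."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    if not robots.can_fetch(user_agent, path):
        # The page is never fetched, so any noindex tag on it goes unread.
        return "blocked: not crawled, but the URL can still be indexed via links"
    if page_has_noindex:
        return "crawled: noindex seen, page left out of the index"
    return "crawled: eligible for indexing"


# With the Disallow in place, the noindex tag is never read.
print(crawler_outcome(ROBOTS_TXT, "/category/milk", page_has_noindex=True))
# With an empty Disallow (nothing blocked), the tag can do its job.
print(crawler_outcome("User-agent: *\nDisallow:\n", "/category/milk", page_has_noindex=True))
```

The takeaway of the model is that the two mechanisms should not be stacked: a noindex tag only works on pages the crawler is allowed to fetch.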

                Unfortunately, Squarespace isn't all that flexible when it comes to meta tags. For the most part, Google is getting better at figuring this kind of duplicate content out, but it's best to address it when you can.

                • squareplug
                  squareplug @squareplug last edited by

                  Thank you, Cyrus, for that great article link. As the article notes near the end, it touches on a common problem for those of us who assume all the info at SEOmoz is accurate even though it may not be current (and not only SEOmoz, to be fair). I've found several instances where even authorities change their minds, or Google changes them for them.

                  But anyway, it appears using canonical or meta tags would be the better solution. Unfortunately, neither is possible in Squarespace. I had just about decided to change the robots.txt, get rid of the disallow: /category/, and call it a day. But then I found an example where noindex was used in the robots.txt file of a Squarespace website (one specializing in SEM, among other things). Probably the longest robots list I've ever seen!

                  http://www.hunchfree.com/robots.txt

                  Would it be a good idea to use noindex, FOLLOW in the robots.txt for /category/ (if that's even possible), or should I just stick with my "call it a day" solution, at least where robots.txt is concerned?

                  BTW, I posted a similar question on the reasoning behind the robots.txt for Squarespace websites at the developers forum: nothing but crickets. Unless it's about design, things pretty much drop like a rock. Oh well.
