The Moz Q&A Forum


    How to use robots.txt to block areas on page?

    Technical SEO Issues
    • LauraHT

      Hi,

      Across the categories/product pages on our site there is an archives/shipping info section, and the text is always the same. Would this be treated as duplicate content and be harmful for SEO?

      How can I alter robots.txt to tell Google not to crawl those particular texts?

      Thanks for any advice!

      • Kingof5

        Google is smart enough to recognize what it is; it won't get you penalized for duplicate content.

        • GPainter

          Hiya,

          First off, the main answer is here: http://moz.com/learn/seo/robotstxt

          An alternative solution might be the canonical tag, meaning you keep all the link juice rather than letting it fall off the radar. I wouldn't be overly worried about duplicate content; it's not a big bad wolf that will annihilate your website.

          The best idea, if you're worried about duplicate content, is the canonical tag: it has the benefit of keeping link juice, whereas robots.txt tends to mean you lose some. One thing to remember, though, is that the canonical tag means the pages will not be indexed (same as the robots tag in the end), so if they are ranking (or getting page views), that's something to keep in mind.
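          As a rough sketch, the canonical tag is a single line in the page's `<head>`; the URL below is a hypothetical example, not from the original question:

```html
<!-- In the <head> of a duplicate/variant page; hypothetical example URL -->
<link rel="canonical" href="https://www.example.com/category/widgets/" />
```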

          hope that helps.

          Good luck.

          • Wasabihound

            Hi Laura

            I am not sure that you can use robots.txt to prevent a search engine bot from crawling a part of a page. Robots.txt is usually used to exclude a whole page.
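            To illustrate the point above: robots.txt works at the URL level, so you can only block whole pages or directories, not a section within a page. A minimal sketch (the paths are hypothetical examples):

```text
User-agent: *
Disallow: /shipping-info.html
Disallow: /archives/
```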

            The effect of the duplicate content on your search engine optimisation depends in part on how extensive the duplication is. In many cases it seems that Google won't penalise duplicate content (it understands that some content will of necessity be duplicated) - see this video by Matt Cutts from Google.

            Duplicate Content is Small (Short Paragraph)

            From your question it sounds like you are talking about part of a page, and a relatively small part at that - I assume you are not a shipping company, so the shipping info would be a small part of the page.

            In that case it may not affect your search engine optimisation at all (assuming you are not trying to rank for the shipping info), as long as the content on the rest of the page is unique or different from other pages on the site.

            Duplicate Content is Large (but not a page)

            If the shipping info is substantial (say a couple of paragraphs or half the content on the page) then Google suggests you create a separate page with the substantial info on it and use a brief summary on other pages with a link to the separate page:

            • Minimize boilerplate repetition: For instance, instead of including lengthy copyright text on the bottom of every page, include a very brief summary and then link to a page with more details. In addition, you can use the Parameter Handling tool to specify how you would like Google to treat URL parameters.

            (from Google Webmaster: Duplicate Content)
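            Google's suggestion above can be sketched as a one-line summary plus a link; the URL, wording, and figures here are hypothetical examples:

```html
<!-- On each product page: a brief summary instead of the full shipping text -->
<p>
  Free shipping on qualifying orders; most orders dispatch within 2 business days.
  <a href="/shipping-information">Read our full shipping policy</a>.
</p>
```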

            Duplicated Pages

            Much of the discussion about duplicated content is really about whole pages of duplicated content. The risk with these pages is that search engines may not know which one to rank (or, more to the point, may rank the one you don't want to rank). This is where you might use a rel=canonical tag or a 301 redirect to direct, or hint to, the search engine which page to use.

            Moz has a good article on Duplicate Content.

            All the best

            Neil

            • LauraHT

              Thanks, the info above is quite detailed.

              We are not a shipping company; those texts are just to inform visitors accordingly. The shipping info is quite long because we want to provide as much as we can, to avoid customers leaving the current page to search for it.

              • LesleyPaone

                Here is a tip that I use for my clients and would recommend. Most CMS / ecommerce platforms let you put a category description on the page. But when the page paginates, they reuse the same category description with just different products on each page (some use a query string on the URL, others use a shebang, others use other things).

                What I recommend to my clients, to escape any thin content issues, is to point the canonical URL of all of the paginated pages back to the first category page. At the same time I add a noindex, follow tag to the header of the paginated pages. This is counter to what a lot of people do, I think, but the reason I do it is thin content; also, you don't want your page 3 results cannibalizing your main category landing page results. Since no CMS that I know of lets you specify a different category description for each pagination of a category, it seems like the only real choice. It also means you do not really need to add rel=next and rel=prev to the paginated pages.
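                That setup, on a paginated page such as page 3 of a category, would look roughly like this (the URLs are hypothetical examples):

```html
<!-- In the <head> of /category/widgets?page=3 (hypothetical URL) -->
<link rel="canonical" href="https://www.example.com/category/widgets/" />
<meta name="robots" content="noindex, follow" />
```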

                • Wasabihound @LauraHT

                  Hi Laura

                  I have not used lazy loading except with images, but I did some reading around and it might be a solution. There is a large section in Google Webmasters that talks about how to make AJAX readable by a crawler/bot, which suggests it is not normally readable (Google Webmaster on AJAX crawling).

                  The other option is to provide a summary on the product page for shipping info and link to a larger shipping info page (as suggested earlier) and get it to open on a new page/tab. At least this keeps the product page open too.

                  (Note: good UX practice recommends you tell the user they will open a new page if they click the link - this could be as simple as using the anchor text: "More Detailed Shipping Information (opens new page)".)
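                  That suggestion as markup (the URL is a hypothetical example):

```html
<!-- target="_blank" opens the shipping page in a new tab;
     rel="noopener" is good practice alongside target="_blank" -->
<a href="/shipping-information" target="_blank" rel="noopener">
  More Detailed Shipping Information (opens new page)
</a>
```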

                  cheers

                  Neil

                  • LauraHT

                    Thanks for the info above. I think I'll find out if I can cut the text and try putting in a popup link instead.
