The Moz Q&A Forum


    Using Meta Header vs Robots.txt

    Intermediate & Advanced SEO
    • evan89
      evan89 @AlanMosley last edited by

      Hey Alan,

      Again, thank you for your feedback. Unfortunately, rel prev/next is not relevant in this circumstance. Also, it is all unique content on my client's own site, and I know that it is a duplicate content problem because I have 2 similar pages with slightly different facets ranking 14 and 15 in the SERPs. If search engines were choosing one over the other, they would not rank them back to back.

      For clarification, this is an e-commerce application with faceted navigation. Not a pagination issue.

      Thanks for your input.

      • AlanMosley
        AlanMosley @evan89 last edited by

        I'm not sure you have a problem. Why not let them all get indexed?

        • evan89
          evan89 @AlanMosley last edited by

          It is a problem in the SERPs because if I run a query for the brand, I can see that a faceted variation of that brand (say "brand" "blue") is ranking right below the brand page, but neither of them is ranking on the first page. I won't NOINDEX all pages, just those that don't provide value for customers searching, and those that compete for competitive terms and cause the preferred page to rank lower.

          It was brought to my attention through Moz analytics, and once I began to investigate it further, I found many sources mentioning that this is very common for e-commerce. Common practice is robots.txt and a plugin, but we are using a different plugin. So, for this reason, I am trying to figure out if NOINDEX meta headers are a good option.

          Does that make sense?

          • CraigBradford
            CraigBradford last edited by

            Hi Evan, this is quite a common problem. There are a couple of things to consider when deciding if noindex is the solution rather than robots.txt.

            Unless there is a reason the pages need to be crawled (for example, there are pages on the site that are only linked to from those pages), I would use robots.txt. Noindex doesn't stop search engines from crawling those pages, only from putting them in the index. So in theory, search engines could spend all their time crawling pages that you don't want in the index.
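To illustrate the point about noindex, here is a minimal Python sketch that reads the robots directive out of a page's HTML. The class name `RobotsMetaParser`, the helper `robots_directives`, and the sample markup are all made up for illustration. The thing to notice is that the directive lives inside the page itself, so a crawler has to fetch the page before it can see it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots"> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.append(directive)


def robots_directives(html):
    """Return the list of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Sample markup, invented for this example.
page = '<html><head><meta name="robots" content="NOINDEX, FOLLOW"></head><body></body></html>'
print(robots_directives(page))  # ['noindex', 'follow']
```

One consequence worth keeping in mind: if the same URL is also disallowed in robots.txt, a compliant crawler never downloads the page, so it never reads the noindex directive.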

            Here's what I'd do:

            Decide on a reasonable number of facets. For example, if you're selling TVs, people might search for:

            1. Sony TV (brand search)
            2. 50 inch Sony TV (size + brand)
            3. Sony 50 inch HD TV (brand + size + specification)

            But anything past 3 facets tends to get very little search volume (do keyword research for your own market).

            In this case I'd create a rule that appends something to the URL after 3 facets that would make it easy to block in robots.txt. For example, I might make my structure:

            1. example.com/tv/sony
            2. example.com/tv/sony/50
            3. example.com/tv/sony/50/HD

            But as soon as I add a 4th facet, for example 'colour', I add in the filter subfolder:

            • example.com/filter/tv/sony/50/HD/white

            I can then easily block all these pages in robots.txt using:

            Disallow: /filter/
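The rule above can be sketched in Python as follows. The function name `facet_url` and the constant `MAX_INDEXABLE_FACETS` are hypothetical, and the threshold of 3 is taken from the example rather than from any real CMS. The check uses the standard library's `urllib.robotparser` to confirm that only the deeper URLs match the Disallow rule.

```python
from urllib.robotparser import RobotFileParser

# Assumption: 3 facets is the cut-off, as in the TV example above.
MAX_INDEXABLE_FACETS = 3


def facet_url(category, facets):
    """Build a URL path, adding the /filter/ prefix once a 4th facet is selected."""
    path = "/".join([category, *facets])
    if len(facets) > MAX_INDEXABLE_FACETS:
        return "/filter/" + path
    return "/" + path


# Parse the robots.txt rules directly from a list of lines (no network access).
robots = RobotFileParser()
robots.parse(["User-agent: *", "Disallow: /filter/"])

shallow = facet_url("tv", ["sony", "50", "HD"])
deep = facet_url("tv", ["sony", "50", "HD", "white"])

print(shallow)  # /tv/sony/50/HD
print(deep)     # /filter/tv/sony/50/HD/white
print(robots.can_fetch("*", "https://example.com" + shallow))  # True
print(robots.can_fetch("*", "https://example.com" + deep))     # False
```

Because the rule keys off the URL prefix rather than individual facet values, it does not need to know which words are colours, sizes, or brands.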

            I hope this helps.

            • AlanMosley
              AlanMosley @CraigBradford last edited by

              The problem with robots.txt is that any link pointing to a blocked page passes link juice that will never be returned; it is wasted. robots.txt is the last resort. IMO it should never be used.

              • CraigBradford
                CraigBradford last edited by

                Hi Alan, I understand that, but the problem Evan is describing seems to be related to duplicate content and crawl allowance. There's no perfect answer but in my experience the types of pages that Evan is describing aren't often linked to. Taking that into consideration, IMO robots.txt is the correct solution.

                • AlanMosley
                  AlanMosley last edited by

                  They will be linked to by internal links.

                  There is no penalty for having duplicates of your own content, but having links pouring away link juice is a self-imposed penalty.

                  • evan89
                    evan89 @CraigBradford last edited by

                    Hey Craig,

                    Thanks for your response. This is the common answer that I have found. Here is the challenge I am having (I will use your example above):

                    Let's say that example.com/tv/sony is the main category page for this brand, but I only carry a few Sony TVs. Therefore, the only difference between that page and example.com/tv/sony/50 is a category description that disappears when further facets are chosen.

                    When I search for "Sony TVs", rather than one of these pages ranking well, both rank moderately well, but not well enough for first-page results. I would also think it is confusing for customers to find two very closely related pages side by side.

                    So, while I agree that robots.txt is a tool that I can apply for limiting search engines from getting dizzy with the facets by limiting them to (say) 4, is NOINDEX the best solution for controlling duplicate content issues that are not that deep, and more case-by-case?

                    One more thing I might add is that these issues don't happen site-wide. If I carry many products from Samsung, then example.com/tv/samsung and example.com/tv/samsung/50 and even example.com/tv/samsung/50/HD will produce very different results. The issue usually occurs where there are few products for a brand, and they filter the same way with many facets.

                    Does that make sense? I agree with you wholeheartedly. I am just trying to figure out how to deal with the shallow duplicate issues.

                    Cheers,

                    • AlanMosley
                      AlanMosley @evan89 last edited by

                      This sounds like a job for a canonical tag.

                      • evan89
                        evan89 @AlanMosley last edited by

                        As mentioned initially, the CMS doesn't allow me to edit canonicals for individual pages that are created via facets. The other question I had about canonicals is that a rel canonical is meant to help bots understand that different variations of the same page are, in fact, the same page: example.com = example.com/. But, for the user (which ultimately bots care about), example.com/sony/50 may not always be the same as example.com/sony.

                        Anyways, that is a little beside the point. I have visited the options of canonicals, but I am not sure it can be done.

                        • evan89
                          evan89 @CraigBradford last edited by

                          Hey Craig,

                          I agree with you regarding robots.txt. However, how does one isolate parameters that are also commonly used within product names, and therefore appear in product URLs as well? We are using a plugin that makes the URLs more user-friendly, but it makes it tough to isolate "small" or "blue" because the parameters no longer include a "?sort=" or "color=" prefix.

                          This is why I am considering using the meta header to help control the duplicate content and crawl allowance issues.

                          Since I can edit the meta headers on a variety of pages, is it a viable option to use NOINDEX,FOLLOW?
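One way to picture the case-by-case approach is the Python sketch below. The function `robots_meta_for_facet` and the product IDs are hypothetical, and the rule it applies (a faceted page that lists exactly the same products as its parent counts as a shallow duplicate) is an assumption drawn from the Sony and Samsung examples earlier in the thread, not a feature of any particular CMS or plugin.

```python
def robots_meta_for_facet(parent_product_ids, facet_product_ids):
    """Pick a robots meta tag for a faceted page (illustrative rule only).

    Assumption: if the facet lists exactly the same products as its parent
    page, it is a shallow duplicate, so keep it out of the index while still
    letting crawlers follow its links.
    """
    if set(facet_product_ids) == set(parent_product_ids):
        return '<meta name="robots" content="noindex, follow">'
    return '<meta name="robots" content="index, follow">'


# Few Sony TVs: the "50 inch" facet shows the same products as the brand page.
sony_all = [101, 102]
sony_50 = [101, 102]
print(robots_meta_for_facet(sony_all, sony_50))
# <meta name="robots" content="noindex, follow">

# Many Samsung TVs: the facet narrows the list, so the page stays indexable.
samsung_all = [201, 202, 203, 204, 205, 206]
samsung_50 = [201, 202]
print(robots_meta_for_facet(samsung_all, samsung_50))
# <meta name="robots" content="index, follow">
```

This keeps the decision tied to what the page actually shows rather than to words in the URL, which sidesteps the problem of facet values that also appear in product names.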

                          • evan89
                            evan89 @AlanMosley last edited by

                            "There is no penalty for having duplicates of your own content"

                            Alan,

                            I must respectfully disagree with this statement. Perhaps Google will not penalize you directly, but it is easy to self-cannibalize key terms if one has many facets that differ only slightly. I have seen this on a site that doesn't rank on the first page but has 3-4 pages on the second page of the SERPs. This is the exact issue that I am trying to resolve.

                            Evan

                            ps. sorry I hit the wrong button, but you got a good answer out of it 😛
