The Moz Q&A Forum

    Best use of robots.txt for "garbage" links from Joomla!

    Technical SEO Issues
    • teleman

      I recently started out on SEOmoz and am trying to do some cleanup according to the campaign report I received.

      One of my biggest gripes is the point about "Duplicate Page Content".

      Right now I have over 200 pages flagged with duplicate page content.

      Now, this is triggered because SEOmoz has snagged auto-generated links from my site.

      My site has a "send to friend" feature, and every time someone wants to send an article or a product to a friend via email, a pop-up appears.

      It seems the pop-up pages have been snagged by the SEOmoz spider; however, these pages are something I would never want indexed in Google.

      So I just want to get rid of them.

      Now to my question:

      I guess the best solution is to make a general rule via robots.txt, so that these pages are not indexed or considered by Google at all.

      But how do I do this? What should my syntax be?

      A lot of the links look like this, but have different ID numbers according to the product being sent:

      http://mywebshop.dk/index.php?option=com_redshop&view=send_friend&pid=39&tmpl=component&Itemid=167

      I guess I need a rule that makes Google ignore links that contain the following:

      view=send_friend

      • chris.kent

        You're right. I would disallow via robots.txt with a wildcard (*) wherever a unique item ID number could be generated.

        • teleman @chris.kent

          So my link example would look like this in robots.txt?

          Disallow: /index.php?option=com_redshop&view=send_friend&pid=&tmpl=component&Itemid=

          Or

          Disallow: /view=send_friend/

          • chris.kent @teleman

            A version of your second one, "Disallow: /*view=send_friend" (note the wildcard), will prevent Googlebot from crawling any URL with that string in it. So that should take care of your problem.
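            As a quick sanity check on why the wildcard matters: in Google's robots.txt matching, a pattern without `*` is a simple prefix match against the URL path plus query string, so `Disallow: /view=send_friend` on its own can never match a URL whose path starts with `/index.php`. A minimal sketch of that matching logic (Python's built-in `urllib.robotparser` does not support wildcards, so this is a hand-rolled illustration, not a production parser):

            ```python
            import re
            from urllib.parse import urlsplit

            def robots_rule_matches(pattern: str, url: str) -> bool:
                """Return True if a Google-style robots.txt Disallow pattern
                matches the URL. Matching runs against path + query string;
                '*' in the pattern matches any sequence of characters, and
                the pattern is anchored at the start of the path."""
                parts = urlsplit(url)
                target = parts.path + ("?" + parts.query if parts.query else "")
                # Escape every literal piece; each '*' becomes '.*'
                regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
                return re.match(regex, target) is not None

            url = ("http://mywebshop.dk/index.php?option=com_redshop"
                   "&view=send_friend&pid=39&tmpl=component&Itemid=167")

            print(robots_rule_matches("/*view=send_friend", url))  # True: wildcard rule blocks it
            print(robots_rule_matches("/view=send_friend", url))   # False: prefix rule never matches
            ```

            This is exactly why adding the `*` between `/` and `view=send_friend` is the difference between the rule working and not working.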

            • teleman

              I just tried adding

              Disallow: /view=send_friend

              (I removed the trailing /.)

              However, a crawl gave me the duplicate content problem again.

              Is my syntax wrong?

              • Cyrus-Shepard @teleman

                Hi Henrik,

                It can take up to a week for SEOmoz crawlers to process your site, which may be an issue if you only recently added the rule. Did you also remember to include the user-agent line at the top of your robots.txt?

                User-agent: *
                

                Be sure to test your robots.txt file in Google Webmaster Tools to ensure everything is correct.
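                Putting the pieces from this thread together, the complete robots.txt would look roughly like this (a sketch assuming the wildcard rule chris.kent suggested earlier; verify it against your own URLs before relying on it):

                ```
                User-agent: *
                Disallow: /*view=send_friend
                ```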

                A couple of other things you can do:

                1. Add rel="nofollow" to your send-to-friend links.

                2. Add a meta robots "noindex" to the head of the pop-up HTML.

                3. And/or add a canonical tag to the pop-up. Since I don't have a working example, I don't know what to point the canonical to (whatever content it is duplicating), but this is also an option.
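                In markup, the three options above would look roughly like this (a hedged sketch: the link text, file layout, and the canonical target URL are placeholders, not taken from the actual site):

                ```
                <!-- 1. On the page that links to the pop-up -->
                <a href="/index.php?option=com_redshop&amp;view=send_friend&amp;pid=39&amp;tmpl=component&amp;Itemid=167"
                   rel="nofollow">Send to a friend</a>

                <!-- 2. and 3. In the <head> of the pop-up template itself -->
                <head>
                  <meta name="robots" content="noindex">
                  <!-- The canonical should point at whatever page the
                       pop-up duplicates (placeholder URL here) -->
                  <link rel="canonical" href="http://mywebshop.dk/some-product-page">
                </head>
                ```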
