The Moz Q&A Forum

    Robots.txt to disallow /index.php/ path

    Technical SEO Issues
    • Mikkehl
      Mikkehl last edited by

      Hi SEOmoz,

      I have a problem with my Joomla site (yeah - me too!). I get a large number of /index.php/ URLs despite using a program to handle these issues. The URLs cause indexation errors (404) with Google. I fixed this issue once before, but the problem persists. So I thought, instead of wasting more time, couldn't I just disallow all paths containing /index.php/?

      I don't use that extension, but would it cause me any problems from an SEO perspective?

      How do I disallow all index.php's? Is it a simple: Disallow: /index.php/

      • SanketPatel
        SanketPatel last edited by

        Hi Mikkel,

        Do you have inbound links pointing to your index.php pages? If so, blocking them might affect your SEO. Disallow: /index.php/ is fine, but after implementing it, don't link to those index.php pages internally. Can you share your website URL so that I can show you how to do it with an example?
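
As a rough way to check what a rule like Disallow: /index.php/ would and wouldn't block, here is a minimal Python sketch using the standard library's urllib.robotparser. The robots.txt text, the is_allowed helper, and the example.com URLs are assumptions made up for illustration; only the Disallow line itself comes from this thread. The standard-library parser does plain prefix matching, which is enough for this rule but does not model every crawler's behavior.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; only the Disallow line comes from the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php/
"""

def is_allowed(url, robots_txt=ROBOTS_TXT, agent="*"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# example.com stands in for the real site.
for url in [
    "http://example.com/index.php/some-page",
    "http://example.com/index.php",
    "http://example.com/some-page",
]:
    print(url, "allowed" if is_allowed(url) else "blocked")
```

Note that the trailing slash matters: a URL that ends in /index.php with nothing after it is not covered by this rule.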

        • cogbox
          cogbox last edited by

          Couldn't you rewrite those /index.php/ urls to remove the /index.php/?

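          Something like this in .htaccess, which internally routes the clean URLs to index.php (the RewriteCond skips real files, so index.php itself isn't rewritten again in a loop):

          RewriteCond %{REQUEST_FILENAME} !-f
          RewriteRule ^(.*)$ /index.php/$1 [L]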

          I've only used Joomla once, but there must be a way to configure Joomla to just use "/" instead of "/index.php/"?

          Update:

          Here's a solution to your /index.php/ issue:

          http://www.eprcreations.com/remove-index-php-from-joomla-urls/

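          Once you've updated that, and have your URLs working properly without the /index.php/, you could add a rule that goes the other way, so that all your old /index.php/ URLs are 301'd to your new ones. The RewriteCond checks the original request line, so the redirect only fires for URLs that were actually requested with /index.php/ in them:

          RewriteCond %{THE_REQUEST} \s/index\.php/
          RewriteRule ^index\.php/(.*)$ /$1 [R=301,L]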

          Put it underneath the RewriteBase / line they describe in that post.
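
To sanity-check which old URLs a redirect of this kind would catch, here is a small Python sketch that mimics a pattern along the lines of ^index\.php/(.*)$ mapped to /$1. The pattern, the redirect_target name, and the sample paths are illustrative assumptions, not taken from the linked post, and the sketch does not model how Apache itself evaluates rules or conditions.

```python
import re

# Hypothetical pattern: a path whose first segment is "index.php/".
OLD_PREFIX = re.compile(r"^/index\.php/(.*)$")

def redirect_target(path):
    """Return the clean path an old /index.php/ URL would be sent to, or None."""
    match = OLD_PREFIX.match(path)
    if match is None:
        return None  # not an old-style URL, so no redirect
    return "/" + match.group(1)

# Sample paths are made up for illustration.
for path in ["/index.php/products/item-1", "/products/item-1", "/index.php"]:
    print(path, "->", redirect_target(path))
```

Only paths that start with /index.php/ get a target; clean paths and a bare /index.php are left alone.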

          • Mikkehl
            Mikkehl @SanketPatel last edited by

            Sure, the website in question is www.vauni.dk

            I don't think that there are any inbound links to the index.php pages. They are not easily found.

            • Mikkehl
              Mikkehl @cogbox last edited by

              Well, I tried the sensible solution of redirecting to the correct URL instead. However, the SEF program is quite limited and keeps creating new URLs regardless of my modifications. I'm looking for a more permanent solution, and the disallow seems a bit simpler, as I'm not a super programmer.

              By the way - thanks for the quick replies, kudos to both of you!

              • SanketPatel
                SanketPatel @Mikkehl last edited by

                Hi Mikkel, I have checked your robots.txt and it looks perfect. If you redirect /index.php to the home page, either with the .htaccess file or with a Joomla plugin, that would be great for you. It's also a permanent solution. 🙂

                • cogbox
                  cogbox @Mikkehl last edited by

                  If I spider your site I'm not seeing any /index.php urls. Does that mean you did get Joomla to cooperate with your rewriting?

                  Or was your problem that you'd previously had urls indexed with /index.php/ paths and you needed to remove them?

                  • Cyrus-Shepard
                    Cyrus-Shepard last edited by

                    Hi Mikkel,

                    Like Chris, I spidered your site and couldn't find any links to /index.php files, which probably indicates one of three things:

                    1. You've fixed the problem - Yay!
                    2. Google is finding those links from external sources.
                    3. Google found those links at one time in the past, and is still trying to crawl them.

                    In the Crawl Errors report in Google Webmaster Tools, if you click on the link of each 404, there's often a "linked from" source where you can see where Google discovered the broken link. This is really helpful in rooting out the cause.

                    Regardless, I'm going to go with #1 and optimistically believe that you were able to fix the problem. 🙂

                    • Mikkehl
                      Mikkehl @Cyrus-Shepard last edited by

                      Hi Cyrus,

                      Thanks for your reply!

                      Unfortunately, the problem is yet to be fixed. I hope that my disallow will work shortly.

                      It seems that most of the index.php pages link to each other internally (and are linked from old /index.php/ pages that no longer exist), which is super weird. How Google found them does not make any sense to me.

                      I don't believe that external sources are linking to these pages either - I mean, how would they find these links anyway?
