The Moz Q&A Forum

    Crawlers crawl weird long urls

    Moz Tools
    • r.nijkamp
      r.nijkamp last edited by

      I started a crawl for the first time and I get many errors, but the weird thing is that the crawler follows long, duplicated URLs that don't exist.

      For example (to be clear):

      there is a page: www.website.com/dogs/dog.html

      but then the crawler keeps going:
      www.website.com/dogs/dog.html
      www.website.com/dogs/dogs/dog.html
      www.website.com/dogs/dogs/dogs/dog.html
      www.website.com/dogs/dogs/dogs/dogs/dog.html
      www.website.com/dogs/dogs/dogs/dogs/dogs/dog.html

      What can I do about this? Screaming Frog gave me the same issue, so I know it's something with my website.

      • WesleySmits
        WesleySmits last edited by

        Are you somehow linking to www.website.com/dogs/dog.html from the page itself? There could be something wrong with that link.
        I made a small mistake not so long ago with a redirection plugin. I told it to go to domain.com, but the plugin took the base URL and appended what I had entered, so it went to domain.com/domain.com. Perhaps you made a similar mistake.

        Maybe you can send me the URL and I can take a look at it?
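
The redirect mistake described above can be reproduced with Python's standard `urllib.parse.urljoin`, which resolves a reference against a base URL roughly the way a browser or crawler does. This is only a sketch: the plugin's actual logic is not shown in the thread, and `resolve_target` is a hypothetical helper name.

```python
from urllib.parse import urljoin


def resolve_target(base: str, target: str) -> str:
    """Resolve a link or redirect target against the page's base URL."""
    return urljoin(base, target)


# Without a scheme, "domain.com" is treated as a relative path.
print(resolve_target("http://domain.com/", "domain.com"))
# -> http://domain.com/domain.com

# With a scheme, the target is absolute and the base is ignored.
print(resolve_target("http://domain.com/", "http://domain.com"))
# -> http://domain.com
```

So a target meant as a host name needs its scheme (`http://` or `https://`) to be read as an absolute URL.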

        • r.nijkamp
          r.nijkamp last edited by

          Good point! It's true that I have links like that pointing to the page itself. I will remove all links of that kind first and crawl again. I'll keep you posted!

          • PaddyDisplays
            PaddyDisplays last edited by

            I think Screaming Frog will tell you the page on which it found the weird URL. Then you can check that page's source and find out what's producing the link.

            • WesleySmits
              WesleySmits @r.nijkamp last edited by

              You don't necessarily have to remove the link, as long as you can verify that it points to the right page.

              But curious to see what caused the problem 🙂

              • r.nijkamp
                r.nijkamp @WesleySmits last edited by

                Please feel free to check: http://tinyurl.com/lox7le9

                • WesleySmits
                  WesleySmits @r.nijkamp last edited by

                  Which URL(s) is/are causing problems?

                  • r.nijkamp
                    r.nijkamp @PaddyDisplays last edited by

                    I can't see any source. The pages look like this:

                    | URL | www.website.com/page/ |
                    | Status Code | 200 |
                    | Status | OK |
                    | Type | text/html; charset=utf-8 |
                    | Size | 55811 |
                    | Title |   |
                    | Level | 10 |
                    | In Links | 9 |
                    | Out Links | 38 |

                    • r.nijkamp
                      r.nijkamp @WesleySmits last edited by

                      Every link, except the homepage itself.

                      bugurl.png

                      • PaddyDisplays
                        PaddyDisplays last edited by

                        OK, I did a quick Screaming Frog crawl and I think I have an idea: you just have to follow the breadcrumbs.

                        You said in your example "In Links 9". You need to find out which pages those are and follow them back to the point of origin, as I think it's just one bad link that causes this nested link effect.

                        E.g.

                        http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/OverOdin/HeutinkICT.aspx

                        is being linked from

                        http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/StationtoStation.aspx  (as well as others)

                        You just have to follow that trail until you find the source of the problem.
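
Following the trail by hand gets tedious on a large crawl, so here is a minimal sketch of the same idea in Python. It assumes you have exported the crawler's in-link data into a mapping from each URL to the pages that link to it. The names `has_repeated_segment`, `find_origin_links` and `sample_inlinks` are hypothetical, and the sample data reuses the placeholder URLs from the first post rather than real crawl output.

```python
from urllib.parse import urlsplit


def has_repeated_segment(url: str) -> bool:
    """True if two consecutive path segments are identical, e.g. /dogs/dogs/."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return any(a == b for a, b in zip(segments, segments[1:]))


def find_origin_links(inlinks: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (source, target) pairs where a clean page links to a nested URL."""
    return sorted(
        (source, target)
        for target, sources in inlinks.items()
        if has_repeated_segment(target)
        for source in sources
        if not has_repeated_segment(source)
    )


# ASSUMPTION: illustrative data only, shaped like an in-links export.
sample_inlinks = {
    "http://www.website.com/dogs/dogs/dogs/dog.html": [
        "http://www.website.com/dogs/dogs/dog.html",
    ],
    "http://www.website.com/dogs/dogs/dog.html": [
        "http://www.website.com/dogs/dog.html",
    ],
}

for source, target in find_origin_links(sample_inlinks):
    print(f"{source} -> {target}")
```

Only the pair whose source is a normal URL is reported, which is the point of origin. A repeated segment can also occur in a legitimate URL, so treat the result as a list of candidates to inspect.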

                        • PaddyDisplays
                          PaddyDisplays @PaddyDisplays last edited by

                          OK, found one problem.

                          On this page

                          http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx

                          you have a link to

                          http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/LesscherIT.aspx

                          which I think should be

                          http://www.odin-groep.nl/Home/ctl/OverOdin/LesscherIT.aspx
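
One way such a link can arise is a relative `href` that repeats the directory the page already lives in. The sketch below uses Python's `urllib.parse.urljoin` to show the effect. The `href` value is an assumption, since the page's actual markup is not quoted in the thread, and `nest` is a hypothetical helper. The compounding also assumes the server answers nested URLs with the same page and the same relative link.

```python
from urllib.parse import urljoin

page = "http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx"

# ASSUMPTION: the real href is not shown in the thread; this value is a guess
# that would produce the link reported above.
suspect_href = "OverOdin/LesscherIT.aspx"

# The relative href is appended to the page's own directory.
broken = urljoin(page, suspect_href)

# Two possible fixes: a bare filename, or a root-relative path.
fixed_sibling = urljoin(page, "LesscherIT.aspx")
fixed_rooted = urljoin(page, "/Home/ctl/OverOdin/LesscherIT.aspx")


def nest(start: str, href: str, depth: int) -> list[str]:
    """Resolve the same relative href repeatedly, as a crawler following it would."""
    urls = []
    current = start
    for _ in range(depth):
        current = urljoin(current, href)
        urls.append(current)
    return urls


print(broken)
print(fixed_sibling)
for url in nest(page, suspect_href, 3):
    print(url)
```

Each pass adds one more `OverOdin/` segment, which matches the ever-longer URLs the crawlers report.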

                          • WesleySmits
                            WesleySmits last edited by

                            I see a link to http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/HeutinkICT.aspx from http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx.

                            The block at the bottom left of the page produces this link, and links like it add up to a big nesting effect.

                            • r.nijkamp
                              r.nijkamp last edited by

                              Wow, big mistakes were made on Home.

                              Maybe because of the .aspx extension? All pages have SEO-friendly URLs.

                              Thanks, Wesley and PaddyDisplays.

                              • r.nijkamp
                                r.nijkamp last edited by

                                Answer from Screaming Frog!

                                The SEO Spider is crawling these URLs because of incorrect relative linking on the site, starting from the login URL.
                                It happens when the spider crawls the login page, http://www.website.com/login?returnurl=%2F, which leads to http://www.website.com/Home/ctl/SendPassword?returnurl=http:/www.website.com/ and then to this /home/ subdirectory URL, http://www.website.com/Home/ctl/page/dogs.aspx, which links to http://www.website.com/Home/ctl/page/page/dogs.aspx, and so on. This is the path to the incorrect relative linking (attached for you).

                                To stop this, you can fix the relative linking or, more simply, exclude the login page.
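
For anyone scripting their own crawl, the "exclude the login page" option can be sketched as a URL filter. This is an illustration in Python, not Screaming Frog's or Moz's configuration syntax. `EXCLUDE_PATTERNS` and `should_crawl` are hypothetical names, and the pattern is an assumption based on the login URL quoted above.

```python
import re

# ASSUMPTION: illustrative pattern, modeled on the /login?returnurl=... URL
# quoted in the answer above.
EXCLUDE_PATTERNS = [re.compile(r"/login(?:[/?]|$)", re.IGNORECASE)]


def should_crawl(url: str) -> bool:
    """Return False for URLs matching any exclude pattern."""
    return not any(pattern.search(url) for pattern in EXCLUDE_PATTERNS)


print(should_crawl("http://www.website.com/login?returnurl=%2F"))  # False
print(should_crawl("http://www.website.com/dogs/dog.html"))        # True
```

Excluding the entry point only hides the nested URLs from the crawl. The broken relative links stay on the site until they are fixed.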
