The Moz Q&A Forum

    How to extract URLs from a site (without bringing the server down!)

    Technical SEO Issues
    • neooptic

      Hi everybody.

      One of my clients is migrating to a new ecommerce platform, and we need a list of URLs from the existing site so we can start mapping out the 301 redirects. Usually, I'd use a tool like Xenu or Integrity to crawl the site and output a list.

      However, the database and server setup is so bad that it can't handle the requests from these tools, and the site goes down. This, unsurprisingly, is one of the reasons for the migration.

      Does anybody know of a way to get a full list of URLs without making a bunch of HTTP requests that will kill the site? Any advice would be much appreciated!

      • YannickVeys

        • Scrape Google?

        • Make your own scraper and keep the requests per second really low?

        • Maybe the site has an automatically generated sitemap somewhere?

        • Google Webmaster Tools: download the "Internal links" table
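
If the site does turn out to have an XML sitemap, a single request for that file can yield most of the URL list. A minimal sketch in Python, assuming the file follows the sitemaps.org 0.9 format; the function name `extract_sitemap_urls` and the example.com URLs are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org 0.9 protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_sitemap_urls(xml_text):
    """Return every <loc> value found in a sitemap (or sitemap index) document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


if __name__ == "__main__":
    # Hypothetical sitemap content; in practice this would be the saved file.
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/</loc></url>"
        "<url><loc>https://example.com/products/widget</loc></url>"
        "</urlset>"
    )
    for url in extract_sitemap_urls(sample):
        print(url)
```

For a sitemap index, the returned values are the child sitemap files, so each of those would need to be downloaded and parsed the same way. A sitemap only lists what the platform chose to include, so it may not cover every URL.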

        • neooptic @YannickVeys

          Thanks Yannick, I don't know why I didn't think of using a scraper! Can you recommend any good code (PHP perhaps)?
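
A rough sketch of the low-rate scraper idea, in Python rather than PHP and using only the standard library. It fetches one page at a time and pauses between requests. The names `crawl` and `fetch_html`, the 5-second default delay and the 1000-page cap are illustrative assumptions, not tuned values, and the sketch does not read robots.txt or retry failed pages:

```python
import time
import urllib.request
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Download one page; return "" on any error so the crawl keeps going."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except Exception:
        return ""


def crawl(start_url, delay_seconds=5.0, max_pages=1000,
          fetch=fetch_html, sleep=time.sleep):
    """Breadth-first crawl of a single host, pausing between requests.

    Returns every same-host URL discovered, including any that were
    queued but not fetched because max_pages was reached.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        if fetched:
            sleep(delay_seconds)  # throttle: wait before every request after the first
        html = fetch(url)
        fetched += 1
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop #fragments before comparing.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(seen)
```

Calling `crawl("http://www.example.com/")` would then return a sorted list of the URLs found on that host (the address is a placeholder). If the server still struggles, raising `delay_seconds` and running the crawl outside peak hours are the obvious knobs to try.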

          • AlanMosley @neooptic

            Why not find the links to the site? You will only need to 301 the URLs that have external links; let the rest 404. I use Bing WMT, as it has the most complete collection, IMO. It also exports to a CSV.
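
Once an inbound-links CSV export is in hand, the redirect candidates are just the distinct linked-to pages. A small sketch in Python; the column header "Target URL" is an assumption, since the real export may name it differently, and `linked_urls` is a made-up helper name:

```python
import csv
import io


def linked_urls(csv_file, column="Target URL"):
    """Return the unique, non-empty values of one CSV column, in first-seen order."""
    seen = {}
    for row in csv.DictReader(csv_file):
        url = (row.get(column) or "").strip()
        if url:
            seen.setdefault(url, None)  # dict keys keep insertion order
    return list(seen)


if __name__ == "__main__":
    # Hypothetical export contents; a real file would be opened with
    # open(path, newline="", encoding="utf-8") instead.
    export = io.StringIO(
        "Source URL,Target URL\n"
        "http://blog.example.org/post,http://www.example.com/widgets\n"
        "http://news.example.net/item,http://www.example.com/widgets\n"
        "http://forum.example.org/t/1,http://www.example.com/gadgets\n"
    )
    for url in linked_urls(export):
        print(url)
```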

            • Dan-Petrovic

              Copy the site, set it up on a staging server and run http://www.xml-sitemaps.com/ on it?

              • Dr-Pete

                Just a follow-up to my endorsement. It looks like Screaming Frog will let you control the number of pages crawled per second, but to do a full crawl you'll need to get the paid version (the free version only crawls 500 URLs):

                http://www.screamingfrog.co.uk/seo-spider/

                It's a good tool, and nice to have around, IMO.
