The Moz Q&A Forum

    Site Spider/ Crawler/ Scraper Software

    • AlexThomas

      Short of coding up your own web crawler, does anyone know of, or have any experience with, a good piece of software that runs through all the pages on a single domain?

      (And potentially on linked domains 1 hop away...)

      This could be either server or desktop based.

      Useful capabilities would include:

      • Scraping (XPath parameters)
      • Number of clicks from homepage (site architecture)
      • http headers
      • Multi threading
      • Use of proxies
      • Robots.txt compliance option
      • csv output
      • Anything else you can think of...
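
For anyone weighing the build-it-yourself route, here is a minimal sketch of a few of the items above (single-domain crawl, clicks from the homepage, an optional robots.txt-style filter, CSV output), using only the Python standard library. The names `crawl`, `to_csv`, `fetch` and `allowed` are made up for illustration and do not come from any of the tools mentioned in this thread. Fetching is left to a caller-supplied function, so threading, proxies and HTTP headers are not covered.

```python
import csv
import io
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, allowed=lambda url: True, max_pages=1000):
    """Breadth-first crawl restricted to the domain of start_url.

    fetch(url) is a hypothetical caller-supplied function that returns the
    page HTML as a string, or None if the page could not be retrieved.
    allowed(url) is an optional filter, e.g. a robots.txt check.
    Returns {url: number_of_clicks_from_start_url} for every URL discovered.
    """
    domain = urlparse(start_url).netloc
    depths = {start_url: 0}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)
        fetched += 1
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc != domain:
                continue  # stay on the starting domain
            if link in depths or not allowed(link):
                continue
            # BFS guarantees the first time we see a link is the shortest path.
            depths[link] = depths[url] + 1
            queue.append(link)
    return depths


def to_csv(depths):
    """Renders the crawl result as CSV text, shallowest pages first."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["url", "clicks_from_home"])
    for url, depth in sorted(depths.items(), key=lambda kv: (kv[1], kv[0])):
        writer.writerow([url, depth])
    return buf.getvalue()


# Usage with an in-memory "site" standing in for real HTTP requests.
site = {
    "http://example.com/": '<a href="/a">A</a><a href="http://other.com/">X</a>',
    "http://example.com/a": '<a href="/b#top">B</a><a href="/">Home</a>',
    "http://example.com/b": "",
}
print(to_csv(crawl("http://example.com/", site.get)))
```

In real use, `fetch` would wrap an HTTP client, and `allowed` could be backed by the standard library's `urllib.robotparser.RobotFileParser.can_fetch`. Following linked domains one hop away would mean relaxing the domain check for links found on the starting domain only.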

      Perhaps an opportunity for an additional SEOmoz tool here, since they do it already!

      Cheers!

      Note:
      I've had a look at:

      • Nutch
        http://nutch.apache.org/
      • Heritrix
        https://webarchive.jira.com/wiki/display/Heritrix/Heritrix
      • Scrapy
        http://doc.scrapy.org/en/latest/intro/overview.html
      • Mozenda (does scraping but doesn't appear to be extensible)

      Any experience/ preferences with these or others?

      • iPullRank

        Hey Alex,

        Screaming Frog is hands down the best desktop crawling software and it has most of what you are looking for.

        -Mike
