The Moz Q&A Forum

    Internal Duplicate Content Question...

    Intermediate & Advanced SEO
    • tdawson09

      We are looking for an internal duplicate content checker that is capable of crawling a site that has over 300,000 pages. We have looked over Moz's duplicate content tool and it seems like it is somewhat limited in how deep it crawls. Are there any suggestions on the best "internal" duplicate content checker that crawls deep in a site?

      • Ria_

        Check out Siteliner. I've never tried it with a site that big, personally. But it's free, so worth a shot to see what you can get out of it.

        • BlueprintMarketing

          If you are looking for the most powerful tool for crawling websites, DeepCrawl (deepcrawl.com) is the king. Screaming Frog is good, but it depends on the RAM in your desktop and does not have as many features as DeepCrawl.

          https://www.deepcrawl.com/knowledge/news/google-webmaster-hangout-highlights-08102015/

          • BlueprintMarketing

            If the tool has to crawl past a crawl depth of 100, it is very common to find something that's able to do it. Like I said, DeepCrawl, Screaming Frog, and Moz can, but you're talking about finding duplicate content on a site that isn't going to be restructured.

            • tdawson09

              Correct, Thomas. We are not looking to restructure the site at this time but we are looking for a program that will crawl 300,000 plus pages and let us know which internal pages are duplicated.
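To make the requirement concrete, here is a minimal Python sketch of the simplest form of internal duplicate detection: hash each page's normalized body text and group URLs that share a hash. It is not how any of the tools mentioned in this thread work internally; the function names and the sample `pages` data are hypothetical, and it only catches exact duplicates after whitespace and case normalization, not near-duplicates.

```python
import hashlib
import re
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_exact_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose normalized body text is identical."""
    groups = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.sha256(normalize(body).encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Only groups with more than one URL are duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical sample data standing in for crawled page bodies.
pages = {
    "/widgets/red": "Our red widget ships free.",
    "/widgets/red?ref=nav": "Our  red widget ships FREE.",
    "/widgets/blue": "Our blue widget ships free.",
}
duplicates = find_exact_duplicates(pages)
print(duplicates)
```

Storing one fixed-size digest per page keeps memory use modest even at 300,000 pages, which is one reason hashing is a common first pass before any fuzzier comparison.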

              • BlueprintMarketing

                By far the best is going to be DeepCrawl. It automatically connects to Google Webmaster Tools and Google Analytics.

                It can crawl continuously. The real advantage is setting it to five URLs per second; depending on the speed of your server, it will hold that rate consistently. I would not go over five pages per second. Make sure that you pick dynamic IP addressing if you do not have a strong web application firewall. If you do, pick a single static IP; then you can crawl the entire site without issue by whitelisting it. This is my personal opinion, but I know that what you're asking can be accomplished in practically no time with DeepCrawl (deepcrawl.com) compared to other systems.

                It will show you what duplicate content is contained inside your website: duplicate URLs, duplicate title tags, you name it.

                https://www.deepcrawl.com/knowledge/best-practice/seven-duplicate-content-issues/

                https://www.deepcrawl.com/knowledge/news/google-webmaster-hangout-highlights-08102015/

                You have a decent-sized website, and I would recommend adding the free edition of Robotto.org. Robotto can detect whether a preferred www or non-www option has been configured correctly.
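As a rough illustration of what a www versus non-www check can mean, the Python sketch below compares where the two hostname variants end up after redirects. It says nothing about how Robotto actually performs its check. The function name is hypothetical, and the final URLs are passed in as plain strings (an assumption made so the example runs without network access); in practice you would obtain them by requesting both variants and following redirects.

```python
from urllib.parse import urlsplit


def preferred_host_configured(final_for_www: str, final_for_bare: str) -> bool:
    """Return True when both hostname variants end up on the same host.

    final_for_www:  final URL reached when requesting the www variant
    final_for_bare: final URL reached when requesting the non-www variant
    """
    host_a = urlsplit(final_for_www).hostname
    host_b = urlsplit(final_for_bare).hostname
    return host_a is not None and host_a == host_b


# Both variants land on www: a preferred host is in effect.
print(preferred_host_configured("https://www.example.com/", "https://www.example.com/"))

# Each variant serves its own copy: every page exists twice.
print(preferred_host_configured("https://www.example.com/", "https://example.com/"))
```

When the second case occurs, every page on the site is reachable under two hostnames, which is itself a source of internal duplicate content.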

                A lot of issues with web application firewalls, CDNs, you name it, can be detected using this tool, and the combination of the two is a real one-two punch. I honestly think that you will be happy with this tool. I have had issues with anything local, like Screaming Frog, when crawling such large websites; you do not want to depend on your desktop RAM. I hope you will let me know if this is a good solution for you. I know that it works very, very well, and it will not stop crawling until it finds everything. Your site will be finished before 24 hours are done.
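The 24-hour figure lines up with the suggested rate of five URLs per second. A quick Python check, assuming the rate is sustained for the whole crawl and each of the 300,000 pages is fetched exactly once (real crawls also spend time on retries, redirects, and assets, so treat this as a lower bound):

```python
def crawl_hours(pages: int, urls_per_second: float) -> float:
    """Hours needed to fetch `pages` URLs at a constant rate."""
    return pages / urls_per_second / 3600


hours = crawl_hours(300_000, 5)
print(f"{hours:.1f} hours")  # 16.7 hours
```

At that rate the crawl takes 60,000 seconds, which leaves some headroom inside a day.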

                • BlueprintMarketing

                  If you want a free test crawl, use this:

                  https://www.deepcrawl.com/forms/free-crawl-report/

                  Please remember that URIs and URLs are different, so your site with 300,000 URLs might have 600,000 URIs. If you want to see how it works for free, you can sign up for a free crawl of your first 10,000 pages.
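One way to read the point about counts, offered here as an assumption rather than as DeepCrawl's definition: a crawler counts every distinct address it fetches, so a page reachable under several address variants is counted several times. The Python sketch below shows the effect with a hypothetical `page_key` helper that treats trailing slashes and query strings as variants of the same page. Dropping the query string is itself an assumption, since on many sites the query changes the content.

```python
from urllib.parse import urlsplit


def page_key(address: str) -> tuple[str, str]:
    """Reduce an address to (host, path), ignoring query, fragment, and trailing slash."""
    parts = urlsplit(address)
    path = parts.path.rstrip("/") or "/"
    return (parts.hostname or "", path)


# Hypothetical addresses a crawler might discover.
crawled = [
    "https://example.com/shoes",
    "https://example.com/shoes/",
    "https://example.com/shoes?sort=price",
    "https://example.com/hats",
]

distinct_addresses = len(set(crawled))
distinct_pages = len({page_key(a) for a in crawled})
print(distinct_addresses, distinct_pages)  # 4 2
```

If a plan is priced per crawled address, it is worth estimating this ratio on a small sample before committing to a full crawl.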

                  I am not affiliated with the company aside from being a very happy customer.

