The Moz Q&A Forum

    Checking for content duplication against content on your own site.

    • MichealGooden

      We are currently rewriting our product descriptions, and I'm afraid some of the salespeople writing them are plagiarizing one another's work. Is there a content duplication checker that lets you check a piece of writing against a specific site rather than the whole web?

      • Cornel_Ilea

        Hi Michael,

        Maybe this can help you. http://www.copyscape.com/

        Cornel

        • MichealGooden @Cornel_Ilea

          That site searches the entire web for copies. I'm looking for something to crawl my own site for duplicate content.

          • Cornel_Ilea @MichealGooden

            Duplicate content on your website is shown in the SEOmoz tools.

            Check the Crawl Diagnostics Summary.

            Cornel

            • MichealGooden @MichealGooden

              That's for pages that are already published and crawled. I want to be able to search my site for entire sentences and/or paragraphs of text that I have yet to publish, so I can make sure they aren't already used elsewhere on the site. The crawl diagnostics tell me I have duplicate content after the fact. I'm trying to take a proactive approach rather than a reactive one.

              • Jinx14678 @MichealGooden

                Just off the top of my head, there are a few low-tech ways to do it.

                If you have Windows 7, its search has improved greatly. Copy all of the site's files to a local machine and search that directory for the content you want to check. It will list every file that contains the words (though the results can get overwhelming).

                If you have Dreamweaver or another enterprise-level editor, almost all of them have a site search function that lets you search code or text and either step through the matching pages one by one or list them all at once.

                Other than that, maybe a custom script, or a Google search for an HTML profiler, might help?

                Shane
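
The "copy the files locally and search the directory" idea above can also be scripted. Here is a minimal sketch, assuming the site's pages are available as local `.html` files; the function names, the file pattern, and the crude tag stripping are illustrative assumptions, not part of any tool mentioned in this thread.

```python
import re
from pathlib import Path


def normalize(text):
    """Strip HTML tags (crudely), collapse whitespace, and lowercase.

    Assumption: a regex is good enough to drop tags for matching purposes.
    It will not handle scripts, comments, or malformed markup correctly.
    """
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def files_containing(root, passage, pattern="*.html"):
    """Return every file under `root` whose normalized text contains `passage`."""
    needle = normalize(passage)
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        page = path.read_text(encoding="utf-8", errors="ignore")
        if needle in normalize(page):
            hits.append(path)
    return hits


if __name__ == "__main__":
    # Hypothetical folder and sentence, for illustration only.
    for match in files_containing("site_copy", "hand-stitched leather wallet"):
        print(match)
```

This only finds exact passages (after whitespace and case are normalized), so it suits checking one sentence at a time rather than spotting loosely reworded copies.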

                • MichealGooden @MichealGooden

                  Those are good answers and would work on a smaller site. We currently have over 17,000 product pages, so I can't really use either method. It's looking like a Google Custom Search is the best bet, even though I can't search an entire paragraph at a time.
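
One way to work around the "can't search an entire paragraph" limit is to break the paragraph into short quoted phrases and search for each one. A rough sketch follows, assuming exact-phrase queries in double quotes and the `site:` operator of regular Google search. The function name, the 12-word chunk size, and the domain are arbitrary assumptions. In a custom search engine that is already scoped to your site, the `site:` prefix would be unnecessary.

```python
def site_queries(paragraph, domain, max_words=12):
    """Split a paragraph into quoted exact-phrase queries limited to one domain.

    max_words is an arbitrary chunk size chosen for illustration; tune it to
    whatever query length your search tool handles well.
    """
    # Drop double quotes so they cannot break the quoted phrase.
    words = paragraph.replace('"', " ").split()
    queries = []
    for start in range(0, len(words), max_words):
        chunk = " ".join(words[start:start + max_words])
        queries.append(f'site:{domain} "{chunk}"')
    return queries


if __name__ == "__main__":
    draft = "Hand-stitched leather wallet with six card slots and a coin pocket."
    for query in site_queries(draft, "example.com", max_words=6):
        print(query)
```

Keep in mind that this only finds text on pages the search engine has already indexed, so it cannot catch overlap between two unpublished drafts.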

                  • Cornel_Ilea

                    Hi Michael,

                    Having a website that big means that you might have a test or dev environment.

                    If not, create one.

                    If you have something like test.yourwebsite.com and submit it to the SEOmoz tools as a new project, you can see a report before your website goes live.

                    Cornel

                    • MichealGooden @Cornel_Ilea

                      I have two dev servers, and on one of them it is possible to do what you're talking about, but that is the absolute least efficient tool to use for this.

                      The crawl diagnostics are updated about once a week, which means I would have to post the new content and hope I got it online in time for the crawl. If I didn't, I would have to wait an additional week to see results.

                      The crawl diagnostics also limit the number of pages they will crawl on your site to 10,000. I stated before that I have over 17,000 pages. So even if I did use this method, the chances of that page being crawled are little better than 50/50.

                      Also, the crawl diagnostics only tell you which pages have duplicate content, not the exact content that was duplicated. That means I'd have to manually find the page I'm targeting, then follow the supposed duplicate content suggestions proposed by the crawler and find the similarities myself.

                      I think it's very safe to say that neither the crawl diagnostics nor any other product that SEOmoz provides is an answer to my issue. If I thought one was, I would already be using it and would not have posted this question.

                      • CleverPhD @MichealGooden

                        I assume that you have an admin section in the CMS where you are editing and entering these articles before they go live.

                        You need to get a developer to write a simple search routine that runs when you create a new article, before it goes live: it takes sections of your content and looks for matches/duplicates in what is already published. You can require a match on a string of at least 4 to 5 words, plus other such limits, so that you are not matching too many items. It will take a few tests to find the sweet spot between too many matches and not enough.

                        With 17K pages, this is the only way you can really do this efficiently; you need some IT support/development. They may have to create a reporting layer as well to help you sift through the results.

                        Good luck.
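
The matching approach described above (flag a draft when it shares a run of 4 to 5 words with an existing page) can be sketched with word n-grams, often called shingles. This is a minimal illustration under stated assumptions, not a finished CMS feature: the function names, the in-memory dictionary of pages, and the default of 5 words are all hypothetical, and a real version would read published descriptions from the CMS database.

```python
import re
from collections import defaultdict


def shingles(text, n=5):
    """Return the set of all n-word sequences in the text (lowercased)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def build_index(pages, n=5):
    """Map each n-word sequence to the ids of the pages that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for shingle in shingles(text, n):
            index[shingle].add(page_id)
    return index


def find_overlaps(draft, index, n=5, min_shared=1):
    """Report, per page, the n-word sequences it shares with the draft.

    min_shared is the tuning knob: raise it to cut down on noisy matches.
    Pages with the most shared sequences come first.
    """
    shared = defaultdict(set)
    for shingle in shingles(draft, n):
        for page_id in index.get(shingle, ()):
            shared[page_id].add(shingle)
    report = {p: sorted(s) for p, s in shared.items() if len(s) >= min_shared}
    return dict(sorted(report.items(), key=lambda item: -len(item[1])))


if __name__ == "__main__":
    # Hypothetical product descriptions keyed by a made-up SKU.
    pages = {
        "sku-1": "Hand-stitched leather wallet with six card slots and a coin pocket.",
        "sku-2": "Waxed canvas tote bag with leather handles.",
    }
    index = build_index(pages)
    draft = "This leather wallet with six card slots fits any pocket."
    print(find_overlaps(draft, index))
```

Building the index once and looking drafts up against it keeps each check fast even with 17K pages, and the returned phrases double as a simple version of the reporting layer, since they show exactly which text was reused.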
