The Moz Q&A Forum

    Excel tips or tricks for duplicate content madness?

    Moz Tools
    • FMLLC

      Dearest SEO Friends,

      I'm working on a site that has over 2,400 instances of duplicate content (yikes!).

      I'm hoping somebody could offer some Excel tips or tricks for managing my SEOMoz crawl diagnostics summary data file in a meaningful way, because right now this spreadsheet is not really helpful. Here's a hypothetical situation to describe why:

      Say we had three columns of duplicate content. The data is displayed thusly:

      | Column A | Column B | Column C |
      | URL A    | URL B    | URL C    |

      In a perfect world, this is easy to understand. I want URL A to be the canonical. But unfortunately, the way my spreadsheet is populated, this ends up happening:

      | Column A | Column B | Column C |
      | URL A    | URL B    | URL C    |
      | URL B    | URL A    | URL C    |
      | URL C    | URL A    | URL B    |

      Essentially, every one of these URLs ends up being called the canonical, which defeats the purpose of the tag. On a site with only a few errors, this has never been a problem, because I can just spot-check my steps. But the site I'm working on has thousands of instances, making it really hard to identify these patterns accurately, let alone at scale.

      This is particularly problematic as some of these URLs are identified as duplicates 50+ times! So my spreadsheet has well over 100K cells!!! Madness!!! Obviously, I can't go through it manually. It would take me years to ensure the accuracy, and I'm assuming that's not really a scalable approach.

      Here's what I would love, but I'm not getting my hopes up. Does anyone know of a formulaic way that Excel could identify row matches and think, "Oh! These are all the same rows of data, just in a different order. I'll kill off the duplicate rows, so only one truly unique row of data exists for this particular set"? Or some other workaround that could help me with my duplicate content madness?

      Much appreciated, you Excel Gurus you!

      • rmontanez

        FMLLC,

        I use Excel 2010 so my approach would be as follows:

        1. Make a backup copy of your file before you start.

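        2. You will need to sort the values within each row, so that rows containing the same URLs in a different order become identical. Excel's built-in sort works on one range at a time, so sorting every row individually calls for a macro.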

        3. Assuming your data starts in A1 and has no header row, paste the macro below into a general module, go back to Excel, activate your sheet, then run the macro from Tools=>Macro=>Macros (or press Alt+F8).

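        Sub SortEachRowHorizontal()
            ' Sorts the values within each row, left to right, so that rows
            ' holding the same URLs in a different order become identical.
            Dim rng As Range, rw As Range

            ' Contiguous block of data around A1 (assumes no header row).
            Set rng = Range("A1").CurrentRegion

            For Each rw In rng.Rows
                rw.Sort Key1:=rw.Cells(1, 1), _
                        Order1:=xlAscending, _
                        Header:=xlNo, _
                        OrderCustom:=1, _
                        MatchCase:=False, _
                        Orientation:=xlLeftToRight
            Next rw
        End Sub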

        4. Then highlight all your cells and go to Data -> Remove Duplicates.

        The result should be all unique rows. I hope this helps.
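
For anyone who would rather do this outside Excel, the same idea (sort the values within each row, then drop rows that have become identical) can be sketched in Python. This is only an illustration, not part of the macro above: the function name `dedupe_url_rows` and the sample values are made up, and it assumes the crawl export has already been read into a list of rows, one list of URL strings per row.

```python
def dedupe_url_rows(rows):
    """Collapse rows that list the same URLs in a different order.

    Each row is a list of URL strings; empty cells are ignored.
    Returns one sorted row per distinct combination, in first-seen order.
    """
    seen = set()
    unique_rows = []
    for row in rows:
        # Sorting within the row makes permutations of the same URLs identical.
        key = tuple(sorted(cell.strip() for cell in row if cell and cell.strip()))
        if key and key not in seen:
            seen.add(key)
            unique_rows.append(list(key))
    return unique_rows


# Hypothetical sample mirroring the three rows in the question.
rows = [
    ["URL A", "URL B", "URL C"],
    ["URL B", "URL A", "URL C"],
    ["URL C", "URL A", "URL B"],
]
print(dedupe_url_rows(rows))
```

Unlike the macro, this sketch compares URLs case-sensitively, so lowercase the cells first if your export mixes cases.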

        • Chenzo

          Choose one of the URLs as the authoritative version and remove the duplicated content from the others.
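
Once each group of duplicates is down to a single row, picking the authoritative URL can be scripted as well. The Python sketch below is a rough illustration under stated assumptions: the name `build_canonical_map` is hypothetical, the groups are assumed not to overlap, and the default "shortest URL wins" rule is only a placeholder for whatever rule suits your site.

```python
def build_canonical_map(groups, pick=None):
    """Map every URL in each duplicate group to one canonical URL."""
    if pick is None:
        # Assumed rule, for illustration only: the shortest URL wins,
        # with ties broken alphabetically.
        def pick(urls):
            return min(urls, key=lambda u: (len(u), u))

    canonical_for = {}
    for group in groups:
        canonical = pick(group)
        for url in group:
            # If a URL shows up in more than one group, the first decision
            # is kept; overlapping groups should be merged beforehand.
            canonical_for.setdefault(url, canonical)
    return canonical_for


# Hypothetical group of duplicate URLs.
groups = [[
    "https://example.com/page",
    "https://example.com/page?ref=1",
    "https://example.com/page/index.html",
]]
for url, canonical in build_canonical_map(groups).items():
    print(url, "->", canonical)
```

Passing your own `pick` function lets you swap in a different rule, for example preferring the URL with the most inbound links.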
